Bug 1596300 - Increase lbaas_activation_timeout to avoid kuryr-controller pod crash
Summary: Increase lbaas_activation_timeout to avoid kuryr-controller pod crash
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.10.z
Assignee: Luis Tomas Bolivar
QA Contact: Jon Uriarte
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-06-28 14:42 UTC by Luis Tomas Bolivar
Modified: 2019-01-30 15:13 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-30 15:13:18 UTC
Target Upstream Version:


Attachments


Links
System ID Priority Status Summary Last Updated
Github openshift/openshift-ansible pull 9015 None None None 2018-06-28 14:48:18 UTC
Github openshift/openshift-ansible pull 9133 None None None 2018-07-10 14:44:52 UTC
OpenStack gerrit 579559 None None None 2018-07-02 14:37:11 UTC
OpenStack gerrit 579846 None None None 2018-07-04 06:39:24 UTC
Red Hat Product Errata RHBA-2019:0206 None None None 2019-01-30 15:13:25 UTC

Description Luis Tomas Bolivar 2018-06-28 14:42:46 UTC
When using Octavia as LBaaS in slow (or busy) environments, provisioning the Amphora VM may take longer than the 300 seconds that the kuryr-controller waits for it by default.
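
For reference, the timeout in question is the kuryr-controller's lbaas_activation_timeout option in kuryr.conf; a minimal sketch of the relevant stanza, with the section name assumed from upstream kuryr-kubernetes:

[neutron_defaults]
# Seconds the kuryr-controller waits for an Octavia load balancer
# (i.e. its Amphora VM) to become ACTIVE before raising ResourceNotReady.
lbaas_activation_timeout = 300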

If this timeout is reached, kuryr will retry the action:

2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging [-] Failed to handle event {u'object': {u'kind': u'Endpoints', u'subsets': [{u'addresses': [{u'ip': u'192.168.99.12', u'targetRef': {u'kind': u'Pod', u'resourceVersion': u'6149', u'namespace': u'default', u'name': u'router-1-fgh8f', u'uid': u'82a3d212-7ac6-11e8-b89e-fa163ec618b0'}, u'nodeName': u'infra-node-0.openshift.example.com'}], u'ports': [{u'protocol': u'TCP', u'name': u'1936-tcp', u'port': 1936}, {u'protocol': u'TCP', u'name': u'80-tcp', u'port': 80}, {u'protocol': u'TCP', u'name': u'443-tcp', u'port': 443}]}], u'apiVersion': u'v1', u'metadata': {u'name': u'router', u'labels': {u'router': u'router'}, u'namespace': u'default', u'resourceVersion': u'6150', u'creationTimestamp': u'2018-06-28T10:45:54Z', u'annotations': {u'openstack.org/kuryr-lbaas-spec': u'{"versioned_object.data": {"ip": "172.30.211.178", "lb_ip": null, "ports": [{"versioned_object.data": {"name": "80-tcp", "port": 80, "protocol": "TCP"}, "versioned_object.name": "LBaaSPortSpec", "versioned_object.namespace": "kuryr_kubernetes", "versioned_object.version": "1.0"}, {"versioned_object.data": {"name": "443-tcp", "port": 443, "protocol": "TCP"}, "versioned_object.name": "LBaaSPortSpec", "versioned_object.namespace": "kuryr_kubernetes", "versioned_object.version": "1.0"}, {"versioned_object.data": {"name": "1936-tcp", "port": 1936, "protocol": "TCP"}, "versioned_object.name": "LBaaSPortSpec", "versioned_object.namespace": "kuryr_kubernetes", "versioned_object.version": "1.0"}], "project_id": "d85bdba083204fe2845349a86cb87d82", "security_groups_ids": ["1cd5ff23-545f-4af7-a79a-555b5b772b47"], "subnet_id": "e6d320d4-50ff-4c05-a4d8-ad0d2b7cc2ca", "type": "ClusterIP"}, "versioned_object.name": "LBaaSServiceSpec", "versioned_object.namespace": "kuryr_kubernetes", "versioned_object.version": "1.0"}'}, u'selfLink': u'/api/v1/namespaces/default/endpoints/router', u'uid': u'6e97e23e-7ac0-11e8-b89e-fa163ec618b0'}}, u'type': u'MODIFIED'}: ResourceNotReady: Resource not ready: LBaaSLoadBalancer(id=7f3fc0e8-37bc-4b59-87eb-520a4bd625db,ip=172.30.211.178,name='default/router',port_id=c85e5e0b-2b98-4e77-9739-bdb05152fda4,project_id='d85bdba083204fe2845349a86cb87d82',provider='octavia',security_groups=[1cd5ff23-545f-4af7-a79a-555b5b772b47],subnet_id=e6d320d4-50ff-4c05-a4d8-ad0d2b7cc2ca)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging Traceback (most recent call last):
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/logging.py", line 37, in __call__
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/retry.py", line 63, in __call__
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     self._handler.set_health_status(healthy=False)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     self.force_reraise()
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     six.reraise(self.type_, self.value, self.tb)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/retry.py", line 55, in __call__
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 72, in __call__
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     self.on_present(obj)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 243, in on_present
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     if self._sync_lbaas_members(endpoints, lbaas_state, lbaas_spec):
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 318, in _sync_lbaas_members
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     if self._sync_lbaas_pools(endpoints, lbaas_state, lbaas_spec):
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 428, in _sync_lbaas_pools
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     if self._sync_lbaas_listeners(endpoints, lbaas_state, lbaas_spec):
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 486, in _sync_lbaas_listeners
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     if self._add_new_listeners(endpoints, lbaas_spec, lbaas_state):
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 504, in _add_new_listeners
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     port=port)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 164, in ensure_listener
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     self._find_listener)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 411, in _ensure_provisioned
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     self._wait_for_provisioning(loadbalancer, remaining)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 450, in _wait_for_provisioning
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging     raise k_exc.ResourceNotReady(loadbalancer)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging ResourceNotReady: Resource not ready: LBaaSLoadBalancer(id=7f3fc0e8-37bc-4b59-87eb-520a4bd625db,ip=172.30.211.178,name='default/router',port_id=c85e5e0b-2b98-4e77-9739-bdb05152fda4,project_id='d85bdba083204fe2845349a86cb87d82',provider='octavia',security_groups=[1cd5ff23-545f-4af7-a79a-555b5b772b47],subnet_id=e6d320d4-50ff-4c05-a4d8-ad0d2b7cc2ca)
2018-06-28 11:34:44.529 1 ERROR kuryr_kubernetes.handlers.logging 


However, the load balancer creation had already been triggered, and due to a bug in Octavia when listing existing load balancers with filters (https://storyboard.openstack.org/#!/story/2001944), the retried operation cannot find the existing load balancer, so the kuryr-controller fails to perform the remaining operations and throws an exception:

2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy LoadBalancerHandler: InternalServerError: 500-{u'debuginfo': u'Traceback (most recent call last):\n\n  File "/usr/lib/python2.7/site-packages/wsmeext/pecan.py", line 85, in callfunction\n    result = f(self, *args, **kwargs)\n\n  File "/opt/stack/octavia/octavia/api/v2/controllers/load_balancer.py", line 83, in get_all\n    **query_filter)\n\n  File "/opt/stack/octavia/octavia/db/repositories.py", line 145, in get_all\n    query, self.model_class)\n\n  File "/opt/stack/octavia/octavia/api/common/pagination.py", line 232, in apply\n    query = model.apply_filter(query, model, self.filters)\n\n  File "/opt/stack/octavia/octavia/db/base_models.py", line 123, in apply_filter\n    query = query.filter_by(**translated_filters)\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 1632, in filter_by\n    for key, value in kwargs.items()]\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/operators.py", line 344, in __eq__\n    return self.operate(eq, other)\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/attributes.py", line 180, in operate\n    return op(self.comparator, *other, **kwargs)\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/relationships.py", line 1039, in __eq__\n    other, adapt_source=self.adapter))\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/relationships.py", line 1372, in _optimized_compare\n    state = attributes.instance_state(state)\n\nAttributeError: \'dict\' object has no attribute \'_sa_instance_state\'\n', u'faultcode': u'Server', u'faultstring': u"'dict' object has no attribute '_sa_instance_state'"}
Neutron server returns request_ids: ['req-8b7c13c4-b52b-4025-8a65-f82120881f71']
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry Traceback (most recent call last):
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/retry.py", line 55, in __call__
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     self._handler(event)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 75, in __call__
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     self.on_present(obj)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 243, in on_present
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     if self._sync_lbaas_members(endpoints, lbaas_state, lbaas_spec):
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 318, in _sync_lbaas_members
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     if self._sync_lbaas_pools(endpoints, lbaas_state, lbaas_spec):
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 428, in _sync_lbaas_pools
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     if self._sync_lbaas_listeners(endpoints, lbaas_state, lbaas_spec):
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 483, in _sync_lbaas_listeners
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     if self._sync_lbaas_loadbalancer(endpoints, lbaas_state, lbaas_spec):
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 573, in _sync_lbaas_loadbalancer
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     service_type=lbaas_spec.type)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 60, in ensure_loadbalancer
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     self._find_loadbalancer)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 404, in _ensure
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     result = find(obj)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 271, in _find_loadbalancer
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     vip_subnet_id=loadbalancer.subnet_id)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 1124, in list_loadbalancers
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     retrieve_all, **_params)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 369, in list
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     for r in self._pagination(collection, path, **params):
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 384, in _pagination
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     res = self.get(path, params=params)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 354, in get
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     headers=headers, params=params)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 331, in retry_request
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     headers=headers, params=params)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 294, in do_request
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     self._handle_fault_response(status_code, replybody, resp)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 269, in _handle_fault_response
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     exception_handler_v20(status_code, error_body)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 93, in exception_handler_v20
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry     request_ids=request_ids)
2018-06-28 11:34:46.005 1 ERROR kuryr_kubernetes.handlers.retry InternalServerError: 500-{u'debuginfo': u'Traceback (most recent call last):\n\n  File "/usr/lib/python2.7/site-packages/wsmeext/pecan.py", line 85, in callfunction\n    result = f(self, *args, **kwargs)\n\n  File "/opt/stack/octavia/octavia/api/v2/controllers/load_balancer.py", line 83, in get_all\n    **query_filter)\n\n  File "/opt/stack/octavia/octavia/db/repositories.py", line 145, in get_all\n    query, self.model_class)\n\n  File "/opt/stack/octavia/octavia/api/common/pagination.py", line 232, in apply\n    query = model.apply_filter(query, model, self.filters)\n\n  File "/opt/stack/octavia/octavia/db/base_models.py", line 123, in apply_filter\n    query = query.filter_by(**translated_filters)\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 1632, in filter_by\n    for key, value in kwargs.items()]\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/operators.py", line 344, in __eq__\n    return self.operate(eq, other)\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/attributes.py", line 180, in operate\n    return op(self.comparator, *other, **kwargs)\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/relationships.py", line 1039, in __eq__\n    other, adapt_source=self.adapter))\n\n  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/relationships.py", line 1372, in _optimized_compare\n    state = attributes.instance_state(state)\n\nAttributeError: \'dict\' object has no attribute \'_sa_instance_state\'\n', u'faultcode': u'Server', u'faultstring': u"'dict' object has no attribute '_sa_instance_state'"}
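
For context, here is a minimal sketch of the find-or-create-then-wait pattern visible in the tracebacks above (illustrative only; the names and structure are simplified assumptions, not the actual kuryr-kubernetes code):

import time

class ResourceNotReady(Exception):
    """Raised when the load balancer is not ACTIVE within the timeout."""

def ensure_provisioned(lb_spec, find, create, get_status, timeout=300):
    # Reuse the load balancer if a previous (timed-out) attempt already
    # created it. This 'find' step is where the Octavia list-with-filters
    # bug (story 2001944) returns a 500 instead of the existing LB.
    loadbalancer = find(lb_spec) or create(lb_spec)
    deadline = time.time() + timeout  # timeout ~ lbaas_activation_timeout
    while time.time() < deadline:
        if get_status(loadbalancer) == 'ACTIVE':
            return loadbalancer
        time.sleep(1)
    # Timed out waiting for the Amphora VM: the handler raises and the
    # whole event is retried, re-entering 'find' above.
    raise ResourceNotReady(loadbalancer)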

To avoid this, the time that Kuryr waits for the load balancer to be provisioned needs to be increased.
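
The linked openshift-ansible pull requests raise this value at installation time. On an already-deployed cluster, the equivalent manual change would look roughly like the following (the configmap name and pod label are assumptions; check them with 'oc -n openshift-infra get configmap' and 'oc -n openshift-infra get pods --show-labels'):

$ oc -n openshift-infra edit configmap kuryr-config           # configmap name assumed
# in the embedded kuryr.conf, raise the timeout, e.g. to the value used in the verification below:
lbaas_activation_timeout = 1200
$ oc -n openshift-infra delete pod -l name=kuryr-controller   # label assumed; restart so the new value takes effect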

Comment 1 Luis Tomas Bolivar 2018-06-28 14:46:30 UTC
It seems this Octavia patch, https://review.openstack.org/#/c/559842/, may fix the https://storyboard.openstack.org/#!/story/2001944 bug regarding listing the existing load balancer with filters.

Comment 2 Luis Tomas Bolivar 2018-06-29 13:06:50 UTC
Until the Octavia issue is fixed, we need this on the kuryr-kubernetes (controller) side: https://review.openstack.org/#/c/579144/

Comment 3 Luis Tomas Bolivar 2018-07-06 12:24:09 UTC
https://review.openstack.org/#/c/579846/ has been merged, so https://review.openstack.org/#/c/579144/ is no longer needed.

Comment 4 Scott Dodson 2018-08-14 21:40:07 UTC
Should be in openshift-ansible-3.10.28-1

Comment 5 Jon Uriarte 2018-09-25 12:58:44 UTC
Verified in openshift-ansible-3.10.50-1.git.0.96a93c5.el7.noarch.

Verification steps:

1. Deploy OCP 3.10 on OSP 3.10, with kuryr enabled
2. Check lbaas_activation_timeout value in kuryr config:

$ oc -n openshift-infra get configmap -o yaml | grep lbaas
      lbaas_activation_timeout = 1200

3. Create a new project, deploy and scale an app:

$ oc new-project test
$ oc run --image kuryr/demo demo
$ oc scale dc/demo --replicas=2

$ oc get pods --all-namespaces -o wide
NAMESPACE         NAME                                                READY     STATUS    RESTARTS   AGE       IP              NODE
default           docker-registry-1-j9q8p                             1/1       Running   0          21h       10.11.0.11      infra-node-0.openshift.example.com
default           registry-console-1-hqrx4                            1/1       Running   0          21h       10.11.0.3       master-0.openshift.example.com
default           router-1-rpjg7                                      1/1       Running   0          21h       192.168.99.5    infra-node-0.openshift.example.com
kube-system       master-api-master-0.openshift.example.com           1/1       Running   0          21h       192.168.99.15   master-0.openshift.example.com
kube-system       master-controllers-master-0.openshift.example.com   1/1       Running   1          21h       192.168.99.15   master-0.openshift.example.com
kube-system       master-etcd-master-0.openshift.example.com          1/1       Running   1          21h       192.168.99.15   master-0.openshift.example.com
openshift-infra   kuryr-cni-ds-9xs42                                  2/2       Running   0          21h       192.168.99.5    infra-node-0.openshift.example.com
openshift-infra   kuryr-cni-ds-k9b6c                                  2/2       Running   0          21h       192.168.99.10   app-node-0.openshift.example.com
openshift-infra   kuryr-cni-ds-nw82s                                  2/2       Running   0          21h       192.168.99.15   master-0.openshift.example.com
openshift-infra   kuryr-cni-ds-znwrt                                  2/2       Running   0          21h       192.168.99.4    app-node-1.openshift.example.com
openshift-infra   kuryr-controller-59fc7f478b-zwfvb                   1/1       Running   0          33m       192.168.99.15   master-0.openshift.example.com
openshift-node    sync-fpmst                                          1/1       Running   0          21h       192.168.99.15   master-0.openshift.example.com
openshift-node    sync-qzzvp                                          1/1       Running   0          21h       192.168.99.5    infra-node-0.openshift.example.com
openshift-node    sync-s7xzt                                          1/1       Running   0          21h       192.168.99.4    app-node-1.openshift.example.com
openshift-node    sync-zmqbh                                          1/1       Running   0          21h       192.168.99.10   app-node-0.openshift.example.com
test              demo-1-7v9xf                                        1/1       Running   0          1m        10.11.0.17      app-node-0.openshift.example.com
test              demo-1-njzfx                                        1/1       Running   0          2m        10.11.0.28      app-node-1.openshift.example.com

$ curl 10.11.0.17:8080                                                                                                                                                                       
demo-1-7v9xf: HELLO! I AM ALIVE!!!

$ curl 10.11.0.28:8080                                                                                                                                                                       
demo-1-njzfx: HELLO! I AM ALIVE!!!

4. Create a service and kill the kuryr controller pod while the load balancer is being created:
$ oc expose dc/demo --port 80 --target-port 8080
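
The load balancer status below was then checked on the OpenStack side, presumably via:

$ openstack loadbalancer list        # command assumed; only its output appears in the original transcript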

+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
| id                                   | name                                           | project_id                       | vip_address   | provisioning_status | provider |
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
| 49bf0999-621c-4c6b-9540-5c2536c76102 | test/demo                                      | 4379cba1109242639a99af8ad04c7208 | 172.30.101.63 | PENDING_CREATE      | octavia  |
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+

$ oc -n openshift-infra delete pod kuryr-controller-59fc7f478b-zwfvb

$ oc -n openshift-infra get pods -o wide
NAME                                READY     STATUS    RESTARTS   AGE       IP              NODE
kuryr-cni-ds-9xs42                  2/2       Running   0          21h       192.168.99.5    infra-node-0.openshift.example.com
kuryr-cni-ds-k9b6c                  2/2       Running   0          21h       192.168.99.10   app-node-0.openshift.example.com
kuryr-cni-ds-nw82s                  2/2       Running   0          21h       192.168.99.15   master-0.openshift.example.com
kuryr-cni-ds-znwrt                  2/2       Running   0          21h       192.168.99.4    app-node-1.openshift.example.com
kuryr-controller-59fc7f478b-skrpf   1/1       Running   0          38s       192.168.99.4    app-node-1.openshift.example.com
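
$ openstack loadbalancer list        # command assumed, as above; shows the LB reaching ACTIVE after the controller restart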

+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
| id                                   | name                                           | project_id                       | vip_address   | provisioning_status | provider |
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
| 49bf0999-621c-4c6b-9540-5c2536c76102 | test/demo                                      | 4379cba1109242639a99af8ad04c7208 | 172.30.101.63 | ACTIVE              | octavia  |
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+

$ oc get svc
NAME      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
demo      ClusterIP   172.30.101.63   <none>        80/TCP    2m

$ curl 172.30.101.63
demo-1-7v9xf: HELLO! I AM ALIVE!!!

$ curl 172.30.101.63
demo-1-njzfx: HELLO! I AM ALIVE!!!

5. Check there are no restarts or errors in kuryr controller.
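
For example (commands illustrative, not part of the original verification transcript):

$ oc -n openshift-infra get pods                  # the RESTARTS column should stay at 0
$ oc -n openshift-infra logs kuryr-controller-59fc7f478b-skrpf | grep -i error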

6. Delete the project:
$ oc delete project test

Another test:

1. Deploy OCP 3.10 on OSP 3.10, with kuryr enabled

2. Create a new project, deploy and expose an app:

$ oc new-project test-timeout
$ oc run --image kuryr/demo demo                                                                                                                                                             
$ oc get pods -o wide
NAME           READY     STATUS    RESTARTS   AGE       IP          NODE
demo-1-6nczk   1/1       Running   0          21s       10.11.0.7   app-node-0.openshift.example.com

$ curl 10.11.0.7:8080
demo-1-6nczk: HELLO! I AM ALIVE!!!

$ oc expose dc/demo --port 80 --target-port 8080 

$ oc get svc
NAME      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
demo      ClusterIP   172.30.176.122   <none>        80/TCP    4s

$ curl 172.30.176.122
demo-1-6nczk: HELLO! I AM ALIVE!!!

$ oc -n openshift-infra get pods                                                                                                                                                             
NAME                                READY     STATUS    RESTARTS   AGE
kuryr-cni-ds-9xs42                  2/2       Running   0          22h
kuryr-cni-ds-k9b6c                  2/2       Running   0          22h
kuryr-cni-ds-nw82s                  2/2       Running   0          22h
kuryr-cni-ds-znwrt                  2/2       Running   0          22h
kuryr-controller-59fc7f478b-tlfll   1/1       Running   0          10m

3. Delete the kuryr controller pod:

$ oc -n openshift-infra delete pod kuryr-controller-59fc7f478b-tlfll
pod "kuryr-controller-59fc7f478b-tlfll" deleted

$ oc -n openshift-infra get pods
NAME                                READY     STATUS    RESTARTS   AGE
kuryr-cni-ds-9xs42                  2/2       Running   0          22h
kuryr-cni-ds-k9b6c                  2/2       Running   0          22h
kuryr-cni-ds-nw82s                  2/2       Running   0          22h
kuryr-cni-ds-znwrt                  2/2       Running   0          22h
kuryr-controller-59fc7f478b-rh2f2   1/1       Running   0          15s

$ oc get pods -o wide
NAME           READY     STATUS    RESTARTS   AGE       IP          NODE
demo-1-6nczk   1/1       Running   0          3m        10.11.0.7   app-node-0.openshift.example.com

$ curl 172.30.176.122                                                                                                                                                                        
demo-1-6nczk: HELLO! I AM ALIVE!!!

4. Scale the app:

$ oc scale dc/demo --replicas=2

$ oc get pods -o wide
NAME           READY     STATUS    RESTARTS   AGE       IP           NODE
demo-1-6nczk   1/1       Running   0          4m        10.11.0.7    app-node-0.openshift.example.com
demo-1-ft659   1/1       Running   0          58s       10.11.0.25   app-node-1.openshift.example.com


5. Check the new member was added correctly to the load balancer:

$ curl 172.30.176.122                                                                                                                                                                        
demo-1-ft659: HELLO! I AM ALIVE!!!

$ curl 172.30.176.122
demo-1-6nczk: HELLO! I AM ALIVE!!!

6. Delete the project:
$ oc delete project test-timeout

Comment 7 errata-xmlrpc 2019-01-30 15:13:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0206

