Bug 1508091 - Bug in L3 agent code while cleaning up a router namespace
Summary: Bug in L3 agent code while cleaning up a router namespace
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 9.0 (Mitaka)
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: zstream
Target Release: 9.0 (Mitaka)
Assignee: Brian Haley
QA Contact: Roee Agiman
URL:
Whiteboard:
Depends On:
Blocks: 1510157 1510159 1510162
Reported: 2017-10-31 20:05 UTC by Andreas Karis
Modified: 2018-04-09 12:54 UTC (History)

Fixed In Version: openstack-neutron-8.4.0-9.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1510157 1510159 1510162
Environment:
Last Closed: 2018-03-15 12:41:29 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0537 None None None 2018-03-15 12:42:44 UTC

Description Andreas Karis 2017-10-31 20:05:14 UTC
Description of problem:
Bug in L3 agent code while cleaning up a router namespace

Version-Release number of selected component (if applicable):
neutron 8.4.0-6

How reproducible:
The customer has several other, unrelated issues in neutron. After upgrading the neutron RPM to the latest version, and then banning and clearing the resource on one of the controllers, the L3 agent logs the following:
~~~
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent     pm.enable()
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/external_process.py", line 94, in enable
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent     run_as_root=self.run_as_root)
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 958, in execute
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent     log_fail_as_error=log_fail_as_error, **kwargs)
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 146, in execute
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent     raise ProcessExecutionError(msg, returncode=returncode)
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Guru mediation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent Option "verbose" from group "DEFAULT" is deprecated for removal.  Its value may be silently ignored in the future.
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent Option "notification_driver" from group "DEFAULT" is deprecated. Use option "driver" from group "oslo_messaging_notifications".
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent [-] Error while deleting router 9127bd7b-1bad-43f6-83e8-e70b731c85c5
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 369, in _router_added
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent     ri.delete()
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent TypeError: delete() takes exactly 2 arguments (1 given)
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 9127bd7b-1bad-43f6-83e8-e70b731c85c5
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 523, in _process_router_update
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 460, in _process_router_if_compatible
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self._process_added_router(router)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 465, in _process_added_router
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self._router_added(router['id'], router)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     router_id)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self.force_reraise()
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 361, in _router_added
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     ri.initialize(self.process_monitor)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 118, in initialize
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self.spawn_state_change_monitor(process_monitor)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 351, in spawn_state_change_monitor
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     pm.enable()
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/external_process.py", line 94, in enable
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     run_as_root=self.run_as_root)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 958, in execute
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     log_fail_as_error=log_fail_as_error, **kwargs)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 146, in execute
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     raise ProcessExecutionError(msg, returncode=returncode)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Guru mediation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent Option "verbose" from group "DEFAULT" is deprecated for removal.  Its value may be silently ignored in the future.
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent Option "notification_driver" from group "DEFAULT" is deprecated. Use option "driver" from group "oslo_messaging_notifications".
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent
~~~

This looks like a bug:

/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py
~~~
(...)
    321     def _create_router(self, router_id, router):
    322         args = []
    323         kwargs = {
    324             'router_id': router_id,
    325             'router': router,
    326             'use_ipv6': self.use_ipv6,
    327             'agent_conf': self.conf,
    328             'interface_driver': self.driver,
    329         }
    330
    331         if router.get('distributed'):
    332             kwargs['agent'] = self
    333             kwargs['host'] = self.host
    334
    335         if router.get('distributed') and router.get('ha'):
    336             if self.conf.agent_mode == l3_constants.L3_AGENT_MODE_DVR_SNAT:
    337                 kwargs['state_change_callback'] = self.enqueue_state_change
    338                 return dvr_edge_ha_router.DvrEdgeHaRouter(*args, **kwargs)
    339
    340         if router.get('distributed'):
    341             if self.conf.agent_mode == l3_constants.L3_AGENT_MODE_DVR_SNAT:
    342                 return dvr_router.DvrEdgeRouter(*args, **kwargs)
    343             else:
    344                 return dvr_local_router.DvrLocalRouter(*args, **kwargs)
    345
    346         if router.get('ha'):
    347             kwargs['state_change_callback'] = self.enqueue_state_change
    348             return ha_router.HaRouter(*args, **kwargs)
    349
    350         return legacy_router.LegacyRouter(*args, **kwargs)
(...)
    352     def _router_added(self, router_id, router):
    353         ri = self._create_router(router_id, router)
    354         registry.notify(resources.ROUTER, events.BEFORE_CREATE,
    355                         self, router=ri)
    356
    357         self.router_info[router_id] = ri
    358
    359         # If initialize() fails, cleanup and retrigger complete sync
    360         try:
    361             ri.initialize(self.process_monitor)
    362         except Exception:
    363             with excutils.save_and_reraise_exception():
    364                 del self.router_info[router_id]
    365                 LOG.exception(_LE('Error while initializing router %s'),
    366                               router_id)
    367                 self.namespaces_manager.ensure_router_cleanup(router_id)
    368                 try:
    369                     ri.delete()
    370                 except Exception:
    371                     LOG.exception(_LE('Error while deleting router %s'),
    372                                   router_id)
(...)
~~~

~~~
/usr/lib/python2.7/site-packages/neutron/agent/l3/legacy_router.py:class LegacyRouter(router.RouterInfo):
~~~

~~~
/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py:class HaRouter(router.RouterInfo):
~~~

And from /usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py
~~~
    413     def delete(self, agent):
    414         self.destroy_state_change_monitor(self.process_monitor)
    415         self.disable_keepalived()
    416         self.ha_network_removed()
    417         super(HaRouter, self).delete(agent)
    418
~~~

/usr/lib/python2.7/site-packages/neutron/agent/l3/legacy_router.py
~~~
(...)
    362     def delete(self, agent):
    363         self.router['gw_port'] = None
    364         self.router[l3_constants.INTERFACE_KEY] = []
    365         self.router[l3_constants.FLOATINGIP_KEY] = []
    366         self.process_delete(agent)
    367         self.disable_radvd()
    368         self.router_namespace.delete()
(...)
~~~


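Note the argument mismatch: both delete() implementations shown above take an agent argument, but the cleanup path in _router_added() (agent.py line 369) calls ri.delete() with no arguments. That is what produces "TypeError: delete() takes exactly 2 arguments (1 given)" in the log. ri.delete() should be called with 2 arguments (self plus the agent).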
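
The mismatch can be reproduced in isolation. The sketch below is only an illustration: RouterInfo, HaRouter and FakeAgent here are hypothetical stand-ins that copy nothing from neutron except the delete(self, agent) signature quoted above, and passing the agent is just one plausible way to resolve the error, not necessarily what the actual patch does. On Python 3 the TypeError wording differs from the Python 2.7 message in the log, so the example checks only the exception type.

```python
class RouterInfo(object):
    """Hypothetical stand-in for neutron's RouterInfo (not the real class)."""

    def __init__(self, router_id):
        self.router_id = router_id
        self.deleted = False

    def delete(self, agent):
        # Same signature as in the excerpts above: an agent is required.
        self.deleted = True


class HaRouter(RouterInfo):
    """Hypothetical stand-in for HaRouter."""

    def delete(self, agent):
        super(HaRouter, self).delete(agent)


class FakeAgent(object):
    """Hypothetical stand-in for the L3 agent's cleanup path."""

    def cleanup_buggy(self, ri):
        # Mirrors the call at agent.py line 369: no agent is passed.
        try:
            ri.delete()
        except TypeError as exc:
            return exc
        return None

    def cleanup_with_agent(self, ri):
        # Assumed fix for illustration: pass the agent so the call
        # matches delete(self, agent).
        ri.delete(self)


agent = FakeAgent()

buggy_error = agent.cleanup_buggy(HaRouter("router-1"))
print(type(buggy_error).__name__)

fixed_router = HaRouter("router-2")
agent.cleanup_with_agent(fixed_router)
print(fixed_router.deleted)
```

Because the real cleanup code wraps ri.delete() in its own try/except, the TypeError is only logged ("Error while deleting router ...") and the original ProcessExecutionError is still re-raised, which matches the two tracebacks in the log above.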

- Andreas

Comment 3 Brian Haley 2017-11-06 14:22:35 UTC
I have a change for this that I tried to push upstream, but the affected stable branches are already closed, so I'll just push it downstream.

Comment 13 Roee Agiman 2018-02-26 08:47:39 UTC
Verified.
[stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed 
9   -p 2018-02-19.1
[stack@undercloud-0 ~]$ rpm -qa | grep neutron-
openstack-neutron-8.4.0-17.el7ost.noarch
python-neutron-8.4.0-17.el7ost.noarch

Comment 16 errata-xmlrpc 2018-03-15 12:41:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0537

