Bug 1510162

Summary: Bug in L3 agent code while cleaning up a router namespace
Product: Red Hat OpenStack
Reporter: Brian Haley <bhaley>
Component: openstack-neutron
Assignee: Brian Haley <bhaley>
Status: CLOSED ERRATA
QA Contact: Toni Freger <tfreger>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 7.0 (Kilo)
CC: akaris, amuller, bhaley, chrisw, ihrachys, jjoyce, nyechiel, pneedle, ragiman, sclewis, srevivo, tfreger
Target Milestone: zstream
Keywords: Triaged, ZStream
Target Release: 7.0 (Kilo)
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: openstack-neutron-2015.1.4-26.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1508091
Environment:
Last Closed: 2017-12-05 10:47:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1508091
Bug Blocks: 1510157, 1510159

Comment 3 Toni Freger 2017-11-26 07:04:16 UTC
Brian,

I ran the Rally benchmark test (creation and deletion of 30 routers, 3 concurrent iterations) on version openstack-neutron-2015.1.4-26.el7ost.noarch.

You can find the test here: https://github.com/openstack/rally/blob/793735c152a573d72391a8ac21e2d908b631195a/samples/tasks/scenarios/neutron/create-and-delete-routers.json


2017-11-26 06:14:13.156 15052 ERROR neutron.agent.l3.ha_router [-] Unable to process HA router 59fad7c2-d393-464f-820b-334927047e64 without HA port
2017-11-26 06:14:13.156 15052 TRACE neutron.agent.l3.ha_router None
2017-11-26 06:14:13.156 15052 TRACE neutron.agent.l3.ha_router
2017-11-26 06:14:13.157 15052 ERROR neutron.agent.l3.agent [-] Error while initializing router 59fad7c2-d393-464f-820b-334927047e64
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 335, in _router_added
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     ri.initialize(self.process_monitor)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 83, in initialize
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     raise Exception(msg)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Exception: Unable to process HA router 59fad7c2-d393-464f-820b-334927047e64 without HA port
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent
2017-11-26 06:14:13.157 15052 ERROR neutron.agent.l3.agent [-] Error while deleting router 59fad7c2-d393-464f-820b-334927047e64
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 342, in _router_added
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     ri.delete(self)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 359, in delete
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     self.destroy_state_change_monitor(self.process_monitor)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent AttributeError: 'HaRouter' object has no attribute 'process_monitor'
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent
2017-11-26 06:14:13.157 15052 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '59fad7c2-d393-464f-820b-334927047e64'
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 509, in _process_router_update
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     self._process_router_if_compatible(router)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 450, in _process_router_if_compatible
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     self._process_added_router(router)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 455, in _process_added_router
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     self._router_added(router['id'], router)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 345, in _router_added
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     router_id)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 335, in _router_added
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     ri.initialize(self.process_monitor)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 83, in initialize
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     raise Exception(msg)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Exception: Unable to process HA router 59fad7c2-d393-464f-820b-334927047e64 without HA port
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent

Comment 5 Brian Haley 2017-11-27 20:03:59 UTC
Hi Toni,

The first backtrace in Comment #3 looks like another bug in this code path that would be present in all releases.  self.process_monitor is only initialized in a super() call from the HA router initialize code.  In this case initialize() failed early and super() was never called.  I need to open an upstream bug and propose a change there.  This would have been triggered even without the new code from what I can tell and was just a race condition waiting to happen.
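
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the neutron code: the classes are simplified stand-ins, and router_added(), destroyed_monitors and GuardedHaRouter are hypothetical names invented for this example. The getattr() guard is only one possible way to make the cleanup tolerant of a failed initialize(), and is not necessarily what the upstream patch does.

```python
class RouterInfo(object):
    """Simplified stand-in for the base router class (hypothetical)."""

    def __init__(self, router_id):
        self.router_id = router_id

    def initialize(self, process_monitor):
        # The attribute is created here, not in __init__.
        self.process_monitor = process_monitor

    def delete(self):
        pass


class HaRouter(RouterInfo):
    """Simplified stand-in for the HA router class (hypothetical)."""

    def __init__(self, router_id, ha_port=None):
        super(HaRouter, self).__init__(router_id)
        self.ha_port = ha_port
        self.destroyed_monitors = []  # illustration-only bookkeeping

    def initialize(self, process_monitor):
        if not self.ha_port:
            # Fails early, so the base class never sets process_monitor.
            raise Exception("Unable to process HA router %s without HA port"
                            % self.router_id)
        super(HaRouter, self).initialize(process_monitor)

    def destroy_state_change_monitor(self, process_monitor):
        self.destroyed_monitors.append(process_monitor)

    def delete(self):
        # Unguarded access: raises AttributeError if initialize() failed early.
        self.destroy_state_change_monitor(self.process_monitor)
        super(HaRouter, self).delete()


class GuardedHaRouter(HaRouter):
    """Same router, with a defensive delete() (one possible guard)."""

    def delete(self):
        monitor = getattr(self, "process_monitor", None)
        if monitor is not None:
            self.destroy_state_change_monitor(monitor)
        RouterInfo.delete(self)


def router_added(router, process_monitor):
    """Mimic the initialize-then-clean-up-on-failure flow.

    Returns (initialized, cleanup_error), where cleanup_error is the name
    of the exception raised during cleanup, or None if cleanup succeeded
    or was not needed.
    """
    try:
        router.initialize(process_monitor)
    except Exception:
        try:
            router.delete()
        except AttributeError as exc:
            return (False, type(exc).__name__)
        return (False, None)
    return (True, None)
```

Calling router_added() with an HaRouter that has no HA port reproduces the secondary AttributeError during cleanup, while the guarded variant cleans up without raising.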

The second backtrace in Comment #4 is possibly something new, or could have been fixed upstream already as it looks familiar.  Since it's unrelated I guess I wouldn't necessarily hold things for it.

Let me look at the other bug updates you posted to see if the trace is similar.

Comment 7 Brian Haley 2017-11-28 16:38:48 UTC
Hi Scott,

The second issue (from Comment #4) is unrelated to the changes, so I would consider it new to OSP7.

The first issue (from Comment #3) is related to the changes, but is actually a new bug, i.e. fixing one bug uncovered another.  I am fine with merging this small change and the related one for https://bugzilla.redhat.com/show_bug.cgi?id=1496916, since together they make the original failure more recoverable and avoid filling the log files unnecessarily.

Hopefully Toni will agree.

Comment 9 Brian Haley 2017-11-30 21:55:01 UTC
Scott,

I think we should ship this as-is and I can fix any new bugs going forward.

Toni,

I opened https://bugs.launchpad.net/neutron/+bug/1735557 and have a patch up to fix the other l3-agent issue; I am not sure whether you have opened a downstream bug for this yet.  I will need to take a look at the other issue you found as time permits.

Comment 10 Toni Freger 2017-12-04 16:22:54 UTC
Since functionality wasn't damaged, it is reasonable to work on the new bugs separately and to move this one to verified.

Comment 13 errata-xmlrpc 2017-12-05 10:47:49 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3381