Bug 1518221 - [UPDATES] Error response from Zaqar. Code: 503. Title: Service temporarily unavailable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ga
Target Release: 12.0 (Pike)
Assignee: mathieu bultel
QA Contact: Yurii Prokulevych
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-28 12:55 UTC by Yurii Prokulevych
Modified: 2018-02-05 19:18 UTC (History)

Fixed In Version: python-tripleoclient-7.3.3-7.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-13 22:23:28 UTC


Links
System ID Priority Status Summary Last Updated
RDO 10741 None pike-rdo: MERGED openstack/tripleoclient-distgit: Add python-zaqarclient as requires for tripleoclient (I481f86ea251ae27e259e9a6a0224fe82... 2017-11-30 16:52:27 UTC
Red Hat Product Errata RHEA-2017:3462 normal SHIPPED_LIVE Red Hat OpenStack Platform 12.0 Enhancement Advisory 2018-02-16 01:43:25 UTC
OpenStack gerrit 523500 None master: MERGED python-tripleoclient: Catch zaqar exception when no message to claim (I802ffd553c54e4a4f9998420645aa078f508b9f0) 2017-11-30 16:52:14 UTC
OpenStack gerrit 524128 None stable/pike: NEW python-tripleoclient: Catch zaqar exception when no message to claim (I802ffd553c54e4a4f9998420645aa078f508b9f0) 2017-11-30 16:52:07 UTC
Launchpad 1734957 None None None 2017-11-28 19:04:59 UTC

Description Yurii Prokulevych 2017-11-28 12:55:20 UTC
Description of problem:
-----------------------
During a minor update of RHOS-12, the following error occurred:
openstack overcloud update stack --nodes Networker
...
 u'TError response from Zaqar. Code: 503. Title: Service temporarily unavailable. Description: Claim could not be created. Please try again in a few seconds..
ASK [Set host puppet debugging fact string] ***********************************',
 u'skipping: [192.168.24.8]',
 u'',
 u'TASK [Write the config_step hieradata] *****************************************',
 u'changed: [192.168.24.8]',
 u'',
 u'TASK [Run puppet host configuration for step 4] ********************************',
 u'changed: [192.168.24.8]']

and this causes the playbook to fail:
...

SG:

non-zero return code

changed: [undercloud-0] => (item=Messaging)                                                                                                     

msg: All items completed
        to retry, use: --limit @/root/IR2/IR-SEALUSA-7/plugins/tripleo-upgrade/infrared_plugin/main.retry

PLAY RECAP ********************************************************************************************************************************************************************************************************
undercloud-0               : ok=17   changed=2    unreachable=0    failed=1   

ERROR   Playbook "/root/IR2/IR-SEALUSA-7/plugins/tripleo-upgrade/infrared_plugin/main.yml" failed!



Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-zaqar-5.0.0-3.el7ost.noarch
python-zaqarclient-1.7.0-1.el7ost.noarch
puppet-zaqar-11.3.0-3.el7ost.noarch
openstack-tripleo-puppet-elements-7.0.1-1.el7ost.noarch
openstack-tripleo-common-containers-7.6.3-4.el7ost.noarch
python-tripleoclient-7.3.3-5.el7ost.noarch
puppet-tripleo-7.4.3-9.el7ost.noarch
openstack-tripleo-common-7.6.3-4.el7ost.noarch
openstack-tripleo-ui-7.4.3-4.el7ost.noarch
openstack-tripleo-validations-7.4.2-1.el7ost.noarch
openstack-tripleo-heat-templates-7.0.3-13.el7ost.noarch
openstack-tripleo-image-elements-7.0.1-1.el7ost.noarch

Steps to Reproduce:
-------------------
1. Run an update of a composable deployment (~15 nodes)
2. Unfortunately, this is not always reproducible


Actual results:
---------------
Update fails and has to be re-run


Expected results:
-----------------
Such events/tracebacks are handled and retried
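
To illustrate the kind of handling meant here, a minimal sketch follows. It is not the actual python-tripleoclient change: ServiceUnavailable and claim_messages are hypothetical names standing in for the 503 error and for whatever call creates the claim, and the attempt count and delay are arbitrary assumptions.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for a transient 503 response from the queue service."""


def claim_with_retry(claim_messages, attempts=5, delay=2.0, sleep=time.sleep):
    """Call claim_messages(), retrying while it raises ServiceUnavailable.

    claim_messages is a hypothetical zero-argument callable that returns the
    claimed messages or raises ServiceUnavailable. The last failure is
    re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return claim_messages()
        except ServiceUnavailable:
            if attempt == attempts:
                raise  # out of attempts, surface the error to the caller
            sleep(delay)  # "Please try again in a few seconds"
```

The sleep function is passed in as a parameter only so the wait can be replaced in tests; a caller would normally leave it at the default.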


Additional info:
----------------
Virtual setup: 3 controllers + 3 messaging + 3 database + 2 networker + 2 computes + 3 ceph

Comment 2 Yurii Prokulevych 2017-11-28 13:06:52 UTC
From zaqar.log:
...
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims [(None,) 664ef39f4cff49ec8109f901af05eff8 8793d5e72bf74354b8b8194940c56daa - - -] Queue update does not exist for project 8793d5e72bf74354b8b8194940c56daa: QueueDoesNotExist: Queue update does not exist for project 8793d5e72bf74354b8b8194940c56daa
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims Traceback (most recent call last):
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims   File "/usr/lib/python2.7/site-packages/zaqar/transport/wsgi/v2_0/claims.py", line 85, in on_post
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims     **claim_options)
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims   File "/usr/lib/python2.7/site-packages/zaqar/common/pipeline.py", line 97, in consumer
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims     tmp = target(*args, **kwargs)
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims   File "/usr/lib/python2.7/site-packages/zaqar/storage/swift/claims.py", line 107, in create
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims     include_claimed=False)
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims   File "/usr/lib/python2.7/site-packages/zaqar/storage/swift/messages.py", line 102, in _list
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims     raise errors.QueueDoesNotExist(queue, project)
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims QueueDoesNotExist: Queue update does not exist for project 8793d5e72bf74354b8b8194940c56daa
2017-11-28 07:29:33.736 1564 ERROR zaqar.transport.wsgi.v2_0.claims

Comment 3 Mike Orazi 2017-11-28 20:58:44 UTC
Can you elaborate on how frequently this occurs and whether or not modifying timeouts would likely resolve the issue?

I'm leaning towards saying this is not a blocker but it would help to understand the frequency + impact that bug actually has before making that statement.

Comment 4 Mike Orazi 2017-11-29 05:18:06 UTC
And the other question is -- will a re-run of update reliably fix this?

Comment 9 Julie Pichon 2017-11-30 16:48:21 UTC
Link to spec change on pike-rdo: https://review.rdoproject.org/r/#/c/10741/

Comment 12 Yurii Prokulevych 2017-12-13 15:25:44 UTC
Verified with python-tripleoclient-7.3.3-7.el7ost.noarch

Comment 14 errata-xmlrpc 2017-12-13 22:23:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462

