Bug 1066936 - cinder: volume stuck in creating/deleting when command is sent while qpid is down and then started (restart qpid race)
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 6.0 (Juno)
Assignee: Sergey Gotliv
QA Contact: Dafna Ron
URL:
Whiteboard: storage
Depends On:
Blocks:
 
Reported: 2014-02-19 10:57 UTC by Dafna Ron
Modified: 2018-02-08 10:12 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-26 19:56:05 UTC


Attachments
logs (deleted)
2014-02-19 10:57 UTC, Dafna Ron


Links
System ID Priority Status Summary Last Updated
Launchpad 1282017 None None None Never

Description Dafna Ron 2014-02-19 10:57:33 UTC
Created attachment 865024
logs

Description of problem:

I think it's a race, but if you slow it down it reproduces 100% of the time.
Volume create/delete gets stuck with no timeout when the command is sent while qpid is down and qpid is then started.

Version-Release number of selected component (if applicable):

[root@puma31 ~]# rpm -qa |grep qpid
qpid-cpp-client-0.18-14.el6.x86_64
qpid-cpp-server-0.18-14.el6.x86_64
python-qpid-0.18-4.el6.noarch

[root@orange-vdsf ~(keystone_admin)]# rpm -qa |grep cinder
openstack-cinder-2013.2.2-1.el6ost.noarch
python-cinderclient-1.0.7-2.el6ost.noarch
python-cinder-2013.2.2-1.el6ost.noarch


How reproducible:

100%

Steps to Reproduce:

My setup is remote cinder and controller
 
1. stop qpid service
2. create a volume 
3. start qpid

Actual results:

The command appears to be sent, and the volume status changes to creating.
In actuality, the command only shows up in the API log (so I don't think it is actually sent), and there is no timeout.

Expected results:

We should:
1. fail the command, either right away or after a timeout
2. change the volume status to error
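
The expected behavior above could be sketched roughly as follows. This is only an illustration, not cinder code: `wait_for_volume`, `VolumeStuckError`, and the `get_status` / `set_status` callables are hypothetical names, and the clock and sleep functions are injectable purely so the sketch can be exercised without real waiting.

```python
import time


class VolumeStuckError(Exception):
    """Raised when a volume stays in 'creating' past the deadline."""


def wait_for_volume(get_status, set_status, timeout=60.0, interval=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll a volume until it leaves 'creating'.

    If the deadline passes first, mark the volume 'error' and raise,
    instead of leaving it stuck forever.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status != "creating":
            # The volume reached some final state (e.g. 'available').
            return status
        if clock() >= deadline:
            # Expected result 2: flip the status to error.
            set_status("error")
            # Expected result 1: fail the command after a timeout.
            raise VolumeStuckError(
                "volume still 'creating' after %.1f seconds" % timeout)
        sleep(interval)
```

A caller would pass whatever it uses to read and update the volume record; the point is only that a volume does not stay in creating indefinitely.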

Additional info: logs

Comment 1 Flavio Percoco 2014-03-17 09:18:29 UTC
Looks like the API node (and most probably this needs to be fixed in the scheduler node too) has all the info needed to change the status. I'd assume this happens with other commands too.

In the case of volume creation - in stable/havana - this call may need to be wrapped[0], although this sounds like something that could be improved in taskflow too. I think this was fixed in Icehouse; at the very least, Icehouse should have a better way to handle this kind of failure.

[0] https://github.com/openstack/cinder/blob/stable/havana/cinder/volume/flows/create_volume/__init__.py#L1504
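
As a rough illustration of what "wrapping" the call could mean, here is a minimal sketch. It is not the cinder or taskflow implementation: `guarded_cast` and the `cast` callable are hypothetical stand-ins for the real RPC call, the volume is modeled as a plain dict, and a standard-library thread pool is used here only to bound how long the caller waits (cinder itself may use a different concurrency model).

```python
import concurrent.futures as cf


def guarded_cast(volume, cast, timeout=5.0):
    """Send a (hypothetical) create request without blocking forever.

    The volume is marked 'creating', then `cast(volume)` runs in a worker
    thread. If it raises, or does not return within `timeout` seconds,
    the volume is marked 'error' and the exception is re-raised.
    """
    volume["status"] = "creating"
    pool = cf.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(cast, volume)
    try:
        # Raises cf.TimeoutError if the broker never accepts the message.
        future.result(timeout=timeout)
    except Exception:
        volume["status"] = "error"
        raise
    finally:
        # Do not wait for a worker that may still be blocked on the broker.
        pool.shutdown(wait=False)
    return volume
```

Note that the worker thread is not cancelled on timeout; the sketch only guarantees that the caller regains control and the status reflects the failure.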

