Bug 1688649 - "rabbitmqctl list_consumers" gets stuck after restarting all controller nodes one by one
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rabbitmq-server
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Peter Lemenkov
QA Contact: pkomarov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-14 07:27 UTC by Takashi Kajinami
Modified: 2019-04-09 03:07 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description Takashi Kajinami 2019-03-14 07:27:59 UTC
Description of problem:

Under normal conditions, the following command lists the consumers of the RabbitMQ cluster.

 $ sudo docker exec $(sudo docker ps -f name=rabbitmq-bundle -q) rabbitmqctl list_consumers

However, after we restart all the controller nodes one by one, the command no longer returns a complete result:
it prints part of the list and then gets stuck.

~~~
[heat-admin@controller-0 ~]$ sudo docker exec $(sudo docker ps -f name=rabbitmq-bundle -q) rabbitmqctl list_consumers 
Listing consumers
l3_agent_fanout_73c3e66599a648379e3149a85af6bd8e	<rabbit@controller-1.3.5979.0>	3	true	0	[]
q-l3-plugin_fanout_b2e4eb3ef5d641dfb189002c4d6f9133	<rabbit@controller-0.3.10747.0>	3	true	0	[]
...
neutron-vo-SubPort-1.0.compute-0.localdomain	<rabbit@controller-1.3.6308.0>	2	true	0	[]
q-reports-plugin_fanout_46dab178092e4ac986bb4bbad877ac99	<rabbit@controller-1.3.6053.0>	3	true	0	[]
q-reports-plugin_fanout_94b1171459d24ad5b79f483902c03223	<rabbit@controller-1.3.5904.0>	3	true	0	[]
trunk_fanout_b5424ee9eeaf47f480394907b4e03e49	<rabbit@controller-0.3.6260.0>	3	true	0	[]
~~~

We do not see any specific error on the console or in the RabbitMQ logs.
Also, to recover from this situation, we need to restart all of the controller nodes.


Version-Release number of selected component (if applicable):

~~~
$ sudo docker images | grep rabbitmq
192.168.24.1:8787/rhosp13/openstack-rabbitmq                    2019-01-10.1        766efb5b9b38        3 months ago        635 MB
192.168.24.1:8787/rhosp13/openstack-rabbitmq                    pcmklatest          766efb5b9b38        3 months ago        635 MB
$ sudo docker exec $(sudo docker ps -f name=rabbitmq-bundle -q) rpm -qa | grep rabbitmq
rabbitmq-server-3.6.15-3.el7ost.noarch
puppet-rabbitmq-8.1.1-0.20180216013831.d4b06b7.el7ost.noarch
~~~


RHOSP13

How reproducible:

Always

Steps to Reproduce:
1. Restart all the controllers one by one with "sudo reboot"
2. Run the command from the description on one of the controllers

Actual results:

The command gets stuck partway through its output.

Expected results:

The command should return the complete list of consumers without any error or hang.

Additional info:
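
Because the command hangs silently, it can help to run it under a deadline so that a reproduction attempt ends with a clear pass/fail result instead of a blocked terminal. The following is only a minimal sketch: the helper name run_with_deadline and the 60-second limit in the usage comment are illustrative assumptions, not part of this report, and it relies on the coreutils "timeout" utility being available on the controller.

```shell
# Hypothetical helper (not from the original report): run a command under a
# deadline and report whether it finished or had to be interrupted.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
    deadline="$1"
    shift
    status=0
    # coreutils `timeout` exits with status 124 when the deadline expires.
    timeout "$deadline" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "HUNG: no completion within ${deadline}s: $*" >&2
    fi
    return "$status"
}

# Example invocation on a controller (60 seconds is an arbitrary choice):
#   run_with_deadline 60 sudo docker exec \
#       "$(sudo docker ps -f name=rabbitmq-bundle -q)" rabbitmqctl list_consumers
```

An exit status of 124 indicates that the deadline expired, which matches the stuck behaviour described above. Any other status comes from the command itself. Note that interrupting the "docker exec" client may leave the rabbitmqctl process running inside the container, so this only detects the hang and does not clean it up.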

Comment 1 Takashi Kajinami 2019-03-14 23:22:59 UTC
Could this be related to the following bug in rabbitmq?

 https://bugzilla.redhat.com/show_bug.cgi?id=1592528

