
Bug 1509845

Summary: In distribute volume after glusterd restart, brick goes offline
Product: [Community] GlusterFS
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Reporter: Atin Mukherjee <amukherj>
Assignee: Atin Mukherjee <amukherj>
QA Contact:
Docs Contact:
CC: akrai, bmekala, bugs, rhs-bugs, storage-qa-internal, sunnikri, vbellur
Target Milestone: ---
Target Release: ---
Keywords: Triaged
Whiteboard:
Fixed In Version: glusterfs-4.0.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1509102
Clones: 1511293, 1511301
Environment:
Last Closed: 2018-03-15 11:20:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:
Bug Blocks: 1509102, 1511293, 1511301

Comment 1 Atin Mukherjee 2017-11-06 08:01:10 UTC
Description of problem:
After restarting glusterd on the same node, the brick goes offline.

Version-Release number of selected component (if applicable):
mainline

How reproducible:
3/3

Steps to Reproduce:
1. Create a distribute volume with 3 bricks, one from each node, and start it.
2. Stop glusterd on the other two nodes and check the volume status on the node where glusterd is still running.
3. Restart glusterd on that node and check the volume status again.
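The steps above can be sketched as gluster CLI commands. This is an illustrative transcript, not a copy of the reporter's session: the hostnames (`node1`..`node3`) and brick paths are assumptions, with `node1` being the node that keeps glusterd running.

```shell
# On node1: create a 3-brick distribute volume (one brick per node) and start it
gluster volume create testvol node1:/bricks/brick0/testvol \
    node2:/bricks/brick0/testvol node3:/bricks/brick0/testvol
gluster volume start testvol

# On node2 and node3: stop glusterd
systemctl stop glusterd

# On node1: check status, then restart glusterd and check again
gluster volume status testvol
systemctl restart glusterd
gluster volume status testvol   # the local brick now shows Online: N (the bug)
```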

Actual results:
Before restarting glusterd:

Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.52:/bricks/brick0/testvol    49160     0          Y       17734
 
Task Status of Volume testvol
------------------------------------------------------------------------------
There are no active volume tasks

After restarting glusterd:

Status of volume: testvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.52:/bricks/brick0/testvol    N/A       N/A        N       N/A  
 
Task Status of Volume testvol
------------------------------------------------------------------------------
There are no active volume tasks


Expected results:
The brick must be online after glusterd is restarted.

Additional info:
Glusterd is stopped on the other two nodes.

Comment 2 Worker Ant 2017-11-06 08:02:33 UTC
REVIEW: https://review.gluster.org/18669 (glusterd: restart the brick if quorum status is NOT_APPLICABLE_QUORUM) posted (#1) for review on master by Atin Mukherjee

Comment 3 Worker Ant 2017-11-09 05:11:06 UTC
COMMIT: https://review.gluster.org/18669 committed in master with a commit message- glusterd: restart the brick if quorum status is NOT_APPLICABLE_QUORUM

If a volume does not have server quorum enabled and all the glusterd
instances from the other peers in the trusted storage pool are down, then
on restarting glusterd the brick start trigger doesn't happen, resulting
in the brick not coming up.

Change-Id: If1458e03b50a113f1653db553bb2350d11577539
BUG: 1509845
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
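The decision this patch changes can be sketched as a small shell function. This is a hypothetical model, not the actual glusterd C code: when server quorum is not enabled on the volume, the quorum status is NOT_APPLICABLE_QUORUM and the brick should be started on glusterd restart regardless of how many peers are reachable.

```shell
#!/bin/sh
# Hypothetical sketch of the brick-start decision on glusterd restart.
# $1 = quorum status of the volume, $2 = "met" or "not-met" pool quorum state.
should_start_brick() {
    if [ "$1" = "NOT_APPLICABLE_QUORUM" ]; then
        echo "start"        # server quorum disabled: always start the brick
    elif [ "$2" = "met" ]; then
        echo "start"        # quorum enabled and currently met
    else
        echo "skip"         # quorum enabled but not met
    fi
}

should_start_brick NOT_APPLICABLE_QUORUM not-met   # prints "start" after the fix
```

Before the fix, the NOT_APPLICABLE_QUORUM case fell through to the "skip" path when peers were down, which is why the brick never came back.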

Comment 4 Worker Ant 2018-01-03 13:32:44 UTC
REVIEW: https://review.gluster.org/19134 (glusterd: connect to an existing brick process when quorum status is NOT_APPLICABLE_QUORUM) posted (#1) for review on master by Atin Mukherjee

Comment 5 Worker Ant 2018-01-05 07:32:10 UTC
COMMIT: https://review.gluster.org/19134 committed in master by "Atin Mukherjee" <amukherj@redhat.com> with a commit message- glusterd: connect to an existing brick process when quorum status is NOT_APPLICABLE_QUORUM

First of all, this patch reverts commit 635c1c3 as the same is causing a
regression with bricks not coming up on time when a node is rebooted.
This patch tries to fix the problem in a different way by just trying to
connect to an existing running brick when quorum status is not
applicable.

Change-Id: I0efb5901832824b1c15dcac529bffac85173e097
BUG: 1509845
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
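The revised approach can likewise be sketched as a shell function. Again a hypothetical model of the C logic, not the actual patch: instead of forcibly restarting the brick (which regressed bricks coming up on node reboot), glusterd first checks for an already-running brick process and just reconnects to it when quorum status is NOT_APPLICABLE_QUORUM.

```shell
#!/bin/sh
# Hypothetical sketch of the revised brick handling on glusterd restart.
# $1 = quorum status, $2 = "running" or "stopped" state of the brick process.
handle_brick() {
    if [ "$1" != "NOT_APPLICABLE_QUORUM" ]; then
        echo "quorum-path"   # normal quorum-driven handling
    elif [ "$2" = "running" ]; then
        echo "attach"        # reuse the existing brick process, no restart
    else
        echo "start"         # no brick process yet: start one
    fi
}

handle_brick NOT_APPLICABLE_QUORUM running   # prints "attach"
```

The design choice here is to make the no-quorum path idempotent: reconnecting to a live brick avoids the restart race that commit 635c1c3 introduced.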

Comment 6 Shyamsundar 2018-03-15 11:20:03 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed in glusterfs-4.0.0, please open a new bug report.

glusterfs-4.0.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-March/000092.html
[2] https://www.gluster.org/pipermail/gluster-users/