Bug 1336764 - Bricks don't come online after reboot [ Brick Full ]
Summary: Bricks don't come online after reboot [ Brick Full ]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: posix
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Ashish Pandey
QA Contact: Karan Sandha
URL:
Whiteboard:
Duplicates: 1361517 (view as bug list)
Depends On:
Blocks: 1351522 1360679 1364354 1364365
TreeView+ depends on / blocked
 
Reported: 2016-05-17 12:14 UTC by Karan Sandha
Modified: 2017-03-23 05:31 UTC
CC List: 5 users

Fixed In Version: glusterfs-3.8.4-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1360679 (view as bug list)
Environment:
Last Closed: 2017-03-23 05:31:15 UTC
Target Upstream Version:


Attachments: none


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0486 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update 2017-03-23 09:18:45 UTC

Description Karan Sandha 2016-05-17 12:14:01 UTC
************** This bug is a clone of upstream bug 1333341.


Description of problem:
Rebooted brick2 and started renaming the files on brick1, which is full. Brick2 did not come back online after the reboot. Errors were seen in the brick logs:
"Creation of unlink directory failed"

sosreport kept at rhsqe-repo.lab.eng.blr.redhat.com://var/www/html/sosreports/<bugid>
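
For reference, that the brick is full can be confirmed on the node hosting brick1 with something like the following (a minimal sketch; the brick path /rhs/brick1/arbiter is taken from the brick logs below and may differ on other setups):

# block usage should show ~100% use / no space left on the brick filesystem
df -h /rhs/brick1/arbiter
# inode exhaustion produces the same ENOSPC errors, so check inodes as well
df -i /rhs/brick1/arbiter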

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create a replica 3 volume and mount it on the client using FUSE.
2. Create files using:
for (( i=1; i <= 50; i++ ))
do
 dd if=/dev/zero of=file$i count=1000 bs=5M status=progress

done
3. After file creation completes, reboot the node hosting the second brick.
4. Start renaming the files to test$i (see the sketch after the log below).
5. When the second brick comes back up, it fails with the errors below.

[2016-05-05 14:37:45.826772] E [MSGID: 113096] [posix.c:6443:posix_create_unlink_dir] 0-arbiter-posix: Creating directory /rhs/brick1/arbiter/.glusterfs/unlink failed [No space left on device]
[2016-05-05 14:37:45.826856] E [MSGID: 113096] [posix.c:6866:init] 0-arbiter-posix: Creation of unlink directory failed
[2016-05-05 14:37:45.826880] E [MSGID: 101019] [xlator.c:433:xlator_init] 0-arbiter-posix: Initialization of volume 'arbiter-posix' failed, review your volfile again
[2016-05-05 14:37:45.826925] E [graph.c:322:glusterfs_graph_init] 0-arbiter-posix: initializing translator failed
[2016-05-05 14:37:45.826943] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2016-05-05 14:37:45.828349] W [glusterfsd.c:1251:cleanup_and_exit] (-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x331) [0x7f6ba63797d1] -->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x120) [0x7f6ba6374150] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x69) [0x7f6ba6373739] ) 0-: received signum (0), shutting down
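
The rename step (4) can be scripted with a loop similar to the file-creation loop above; a minimal sketch, run from the FUSE mount point (the test$i naming is taken from step 4):

for (( i=1; i <= 50; i++ ))
do
 # rename each file on the already-full volume while brick2 is rebooting
 mv file$i test$i
done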


 
Actual results:
[root@dhcp43-167 arbiter]# gluster volume status
Status of volume: arbiter
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp42-58.lab.eng.blr.redhat.com:/rhs
/brick1/arbiter                             N/A       N/A        N       N/A  
Brick dhcp43-142.lab.eng.blr.redhat.com:/rh
s/brick1/arbiter                            49157     0          Y       2120 
Brick dhcp43-167.lab.eng.blr.redhat.com:/rh
s/brick1/arbiter                            49156     0          Y       2094 
NFS Server on localhost                     2049      0          Y       2679 
Self-heal Daemon on localhost               N/A       N/A        Y       3172 
NFS Server on dhcp42-58.lab.eng.blr.redhat.
com                                         2049      0          Y       2195 
Self-heal Daemon on dhcp42-58.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       2816 
NFS Server on dhcp43-142.lab.eng.blr.redhat
.com                                        2049      0          Y       3072 
Self-heal Daemon on dhcp43-142.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       3160 
 
Task Status of Volume arbiter
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: arbiternfs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp42-58.lab.eng.blr.redhat.com:/rhs
/brick2/arbiternfs                          N/A       N/A        N       N/A  
Brick dhcp43-142.lab.eng.blr.redhat.com:/rh
s/brick2/arbiternfs                         49158     0          Y       2128 
Brick dhcp43-167.lab.eng.blr.redhat.com:/rh
s/brick2/arbiternfs                         49157     0          Y       2109 
NFS Server on localhost                     2049      0          Y       2679 
Self-heal Daemon on localhost               N/A       N/A        Y       3172 
NFS Server on dhcp42-58.lab.eng.blr.redhat.
com                                         2049      0          Y       2195 
Self-heal Daemon on dhcp42-58.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       2816 
NFS Server on dhcp43-142.lab.eng.blr.redhat
.com                                        2049      0          Y       3072 
Self-heal Daemon on dhcp43-142.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       3160 
 
Task Status of Volume arbiternfs
------------------------------------------------------------------------------
There are no active volume tasks
 
*************************************************************
[root@dhcp43-142 arbiter]# gluster volume heal arbiternfs info
Brick dhcp42-58.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs
Status: Transport endpoint is not connected
Number of entries: -

Brick dhcp43-142.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs
/file4 
/file5 
Status: Connected
Number of entries: 2

Brick dhcp43-167.lab.eng.blr.redhat.com:/rhs/brick2/arbiternfs
/file4 
/file5 
Status: Connected
Number of entries: 2

[root@dhcp43-142 arbiter]# 
[root@dhcp43-142 arbiter]# 
[root@dhcp43-142 arbiter]# 
[root@dhcp43-142 arbiter]# gluster volume heal arbiter info
Brick dhcp42-58.lab.eng.blr.redhat.com:/rhs/brick1/arbiter
Status: Transport endpoint is not connected
Number of entries: -

Brick dhcp43-142.lab.eng.blr.redhat.com:/rhs/brick1/arbiter
/ - Possibly undergoing heal

Status: Connected
Number of entries: 1

Brick dhcp43-167.lab.eng.blr.redhat.com:/rhs/brick1/arbiter
/ 
Status: Connected
Number of entries: 1

[root@dhcp43-142 arbiter]# 

Expected results:
The bricks should be up and running, and the files should have been renamed.

Additional info:

Comment 6 Pranith Kumar K 2016-08-02 08:10:54 UTC
*** Bug 1361517 has been marked as a duplicate of this bug. ***

Comment 7 Atin Mukherjee 2016-08-09 04:17:37 UTC
Upstream mainline patch http://review.gluster.org/15030 is merged.

Comment 9 Atin Mukherjee 2016-09-17 14:33:15 UTC
Upstream mainline : http://review.gluster.org/15030
Upstream 3.8 : http://review.gluster.org/15093

And the fix is available in rhgs-3.2.0 as part of rebase to GlusterFS 3.8.4.
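
Whether an installed build already carries the fix can be checked against the Fixed In Version field (glusterfs-3.8.4-1); a minimal sketch:

# the fix is included if the installed packages are glusterfs-3.8.4-1 or later
rpm -q glusterfs glusterfs-server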

Comment 12 Karan Sandha 2016-10-04 11:47:30 UTC
Verified the bug on 

[root@dhcp47-144 brick0]# gluster --version
glusterfs 3.8.4 built on Sep 29 2016 12:20:30
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

Steps:-
1. Create a replica 3 volume and mount it on the client using FUSE.
2. Create files using:
for (( i=1; i <= 50; i++ ))
do
 dd if=/dev/zero of=file$i count=1000 bs=5M status=progress

done
3. After file creation completes, reboot the node hosting the second brick.
4. Start renaming the files to test$i.
5. When the second brick comes back up, check whether the errors reported above reappear.

No error or warning messages were observed.
Brick functionality was as expected.
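
For reference, the brick log can be scanned for error-level messages after the reboot with something like the following (a minimal sketch; the log file name is derived from the brick path and may differ):

# error-level glusterfs log entries are marked with " E "
grep ' E \[' /var/log/glusterfs/bricks/rhs-brick1-arbiter.log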

Comment 14 errata-xmlrpc 2017-03-23 05:31:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html

