Bug 988286 - restarting glusterd doesn't start the brick, nfs and self-heal daemon process
Summary: restarting glusterd doesn't start the brick, nfs and self-heal daemon process
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: Sudhir D
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-07-25 09:07 UTC by spandura
Modified: 2014-01-17 11:50 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-17 11:50:46 UTC
Target Upstream Version:


Attachments
SOS Reports (deleted)
2013-07-25 09:32 UTC, spandura

Description spandura 2013-07-25 09:07:06 UTC
Description of problem:
========================
In a replicate volume ( 1 x 2, storage_node1 and storage_node2 ), all the brick, NFS, self-heal daemon and glusterd processes were killed on both nodes ( killall glusterfs glusterfsd glusterd ), and glusterd was then started ( service glusterd start ) on one of the nodes. glusterd fails to start the brick, NFS and self-heal daemon processes on the node where it was restarted.

Version-Release number of selected component (if applicable):
============================================================
root@rhs-client11 [Jul-25-2013-14:30:55] >rpm -qa | grep glusterfs-server
glusterfs-server-3.4.0.12rhs.beta6-1.el6rhs.x86_64

root@rhs-client11 [Jul-25-2013-14:30:58] >gluster --version
glusterfs 3.4.0.12rhs.beta6 built on Jul 23 2013 16:20:03

How reproducible:
=================
Often

Steps to Reproduce:
=====================
1. Create a replicate volume ( 1 x 2 ). Start the volume.

2. Create NFS and FUSE mounts. Create files/dirs from both mounts.

3. Run killall glusterfs glusterfsd glusterd on both storage nodes.

4. From one of the storage nodes, start glusterd (service glusterd start); a consolidated sketch of these steps follows below.
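
A minimal command sketch of the reproduction steps above (assumes a two-node 1 x 2 replicate volume such as vol_rep from this report; names are illustrative):

# On both storage nodes: kill every gluster process
killall glusterfs glusterfsd glusterd

# On one node only: bring glusterd back up
service glusterd start

# Check whether glusterd respawned the brick, NFS and self-heal daemon processes
gluster volume status
ps -ef | grep gluster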

Actual results:
================
1. Brick, NFS and self-heal daemon processes are not started. 

2. Even after restarting glusterd multiple times, it does not start any of the processes.

root@rhs-client11 [Jul-25-2013-13:55:18] >killall glusterfs glusterfsd glusterd
root@rhs-client11 [Jul-25-2013-13:55:31] >
root@rhs-client11 [Jul-25-2013-13:55:36] >service glusterd start
Starting glusterd:                                         [  OK  ]
root@rhs-client11 [Jul-25-2013-13:55:44] >
root@rhs-client11 [Jul-25-2013-13:55:44] >
root@rhs-client11 [Jul-25-2013-13:55:44] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/b0			N/A	N	N/A
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	N	N/A
 
There are no active volume tasks
root@rhs-client11 [Jul-25-2013-13:55:47] >
root@rhs-client11 [Jul-25-2013-13:55:48] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/b0			N/A	N	N/A
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	N	N/A
 
There are no active volume tasks
root@rhs-client11 [Jul-25-2013-13:55:49] >
root@rhs-client11 [Jul-25-2013-13:55:53] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/b0			N/A	N	N/A
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	N	N/A
 
There are no active volume tasks
root@rhs-client11 [Jul-25-2013-13:55:54] >
root@rhs-client11 [Jul-25-2013-13:55:59] >ps -ef | grep gluster
root     23631 23619  0 11:49 pts/2    00:00:00 tail -f /var/log/glusterfs/glustershd.log
root     24906     1  0 13:55 ?        00:00:00 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid
root     25058 22781  0 13:56 pts/0    00:00:00 grep gluster

root@rhs-client11 [Jul-25-2013-14:01:14] >service glusterd status
glusterd (pid  24906) is running...
root@rhs-client11 [Jul-25-2013-14:01:22] >
root@rhs-client11 [Jul-25-2013-14:01:22] >service glusterd restart
Starting glusterd:                                         [  OK  ]
root@rhs-client11 [Jul-25-2013-14:01:28] >
root@rhs-client11 [Jul-25-2013-14:01:29] >service glusterd status
glusterd (pid  25231) is running...
root@rhs-client11 [Jul-25-2013-14:01:31] >gluster v status
Status of volume: vol_rep
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/b0			N/A	N	N/A
NFS Server on localhost					N/A	N	N/A
Self-heal Daemon on localhost				N/A	N	

Expected results:
================
glusterd should restart the brick, NFS and self-heal daemon processes. 
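
For reference, a quick check that the daemons actually came back could look like the following (vol_rep is the volume used in this report; the pgrep pattern is only illustrative):

service glusterd restart
gluster volume status vol_rep        # Online column should show Y with valid ports/PIDs
pgrep -lf 'glusterfsd|glustershd'    # brick and self-heal daemon processes should be listed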

Additional info:
=================
root@rhs-client11 [Jul-25-2013-14:35:00] >gluster v info
 
Volume Name: vol_rep
Type: Replicate
Volume ID: f7928cb5-76bf-4a9f-93b2-a4ce3073519b
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: rhs-client11:/rhs/brick1/b0
Brick2: rhs-client12:/rhs/brick1/b1

Comment 2 spandura 2013-07-25 09:32:11 UTC
Created attachment 778159 [details]
SOS Reports

Comment 3 Pranith Kumar K 2014-01-03 13:22:36 UTC
Tried to re-create. It works fine.

root@pranithk-vm3 - ~/RPMS 
18:22:56 :) ⚡ killall glusterfs glusterd glusterfsd

root@pranithk-vm3 - ~/RPMS 
18:23:03 :) ⚡ ps aux | grep gluster
root     16432  0.0  0.0 103244   832 pts/0    S+   18:23   0:00 grep gluster

root@pranithk-vm3 - ~/RPMS 
18:23:06 :) ⚡ gluster volume start r2
Connection failed. Please check if gluster daemon is operational.

root@pranithk-vm3 - ~/RPMS 
18:23:12 :( ⚡ service glusterd start
Starting glusterd:                                         [  OK  ]

root@pranithk-vm3 - ~/RPMS 
18:23:24 :) ⚡ ps aux | grep gluster
root     16479  3.1  0.8 360520 16456 ?        Ssl  18:23   0:00 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid
root     16778  0.6  0.9 642520 19700 ?        Ssl  18:23   0:00 /usr/sbin/glusterfsd -s 10.70.43.148 --volfile-id r2.10.70.43.148.brick-2 -p /var/lib/glusterd/vols/r2/run/10.70.43.148-brick-2.pid -S /var/run/d3d3
root     16790  2.6  3.0 389888 62400 ?        Ssl  18:23   0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/55e1ee0bebt
root     16795  1.3  0.9 329196 20504 ?        Ssl  18:23   0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustersh1
root     16814  0.0  0.0 103248   840 pts/0    S+   18:23   0:00 grep gluster

root@pranithk-vm3 - ~/RPMS 
18:23:27 :) ⚡ rpm -qa | grep gluster
glusterfs-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-server-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-debuginfo-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-libs-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-api-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-api-devel-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-devel-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.53rhs-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.53rhs-1.el6rhs.x86_64

Comment 4 Vivek Agarwal 2014-01-17 11:50:46 UTC
Based on comment 3, closing this bug.

