Bug 1687671 - Brick process has coredumped, when starting glusterd
Summary: Brick process has coredumped, when starting glusterd
Keywords:
Status: NEW
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: rhhi
Version: rhhiv-1.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Sahina Bose
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1687641 1687705 1688218
Blocks:
Reported: 2019-03-12 06:04 UTC by SATHEESARAN
Modified: 2019-04-11 07:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1687641
Environment:
Last Closed:
Target Upstream Version:



Description SATHEESARAN 2019-03-12 06:04:56 UTC
Description of problem:
------------------------
This issue is seen in an RHHI setup. RHHI nodes have at least 2 networks, one dedicated to VM traffic and the other to gluster traffic. 4 gluster volumes (replica 3 and arbitrated replicate) were created and started. When the RHHI-V node was restarted post upgrade, the interface corresponding to the gluster network did not pick up its IP address. This happened because the BOOTPROTO parameter was missing from the network configuration file.

Because of this, the bricks did not come up, although glusterd did. The brick status showed the bricks as down. The network was then set up and glusterd was restarted.
The bricks corresponding to 3 volumes came up, but one of the bricks coredumped.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHGS-3.4.4 nightly ( glusterfs-3.12.2-45.el7rhgs )

How reproducible:
-----------------
1/1

Steps to Reproduce:
--------------------
1. With 2 network interfaces, bring down the network corresponding to gluster (the one used for peer probe and volume creation)
        Hint: Remove BOOTPROTO for a DHCP network, or use ONBOOT=no in the interface configuration file
2. Restart the node
3. Post reboot, observe that the gluster brick processes are down, with glusterd up
4. Fix the gluster network so that it is up
5. Restart glusterd

Actual results:
---------------
Brick process coredumped

Expected results:
-----------------
All brick processes should be up

Comment 1 SATHEESARAN 2019-03-12 06:05:27 UTC
[2019-03-11 19:24:46.737551] I [rpcsvc.c:2582:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2019-03-11 19:24:46.737635] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-vmstore-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2019-03-11 19:24:46
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.2
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9d)[0x7fa1d4cf8b9d]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7fa1d4d03114]
/lib64/libc.so.6(+0x36280)[0x7fa1d3334280]
/lib64/libpthread.so.0(pthread_mutex_lock+0x0)[0x7fa1d3b36c30]
/usr/lib64/glusterfs/3.12.2/xlator/protocol/server.so(+0x985d)[0x7fa1bf2ef85d]
/lib64/libgfrpc.so.0(+0x7685)[0x7fa1d4a94685]
/lib64/libgfrpc.so.0(rpcsvc_notify+0x65)[0x7fa1d4a98985]
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fa1d4a9aae3]
/usr/lib64/glusterfs/3.12.2/rpc-transport/socket.so(+0xce77)[0x7fa1c98c1e77]
/lib64/libglusterfs.so.0(+0x8a870)[0x7fa1d4d57870]
/lib64/libpthread.so.0(+0x7dd5)[0x7fa1d3b34dd5]
/lib64/libc.so.6(clone+0x6d)[0x7fa1d33fbead]
---------

Comment 2 SATHEESARAN 2019-03-12 06:05:39 UTC
Core was generated by `/usr/sbin/glusterfsd -s rhsqa-grafton12.lab.eng.blr.redhat.com --volfile-id vms'.
Program terminated with signal 11, Segmentation fault.
#0  __GI___pthread_mutex_lock (mutex=mutex@entry=0x40) at ../nptl/pthread_mutex_lock.c:65
65        unsigned int type = PTHREAD_MUTEX_TYPE_ELISION (mutex);
Missing separate debuginfos, use: debuginfo-install openssl-libs-1.0.2k-16.el7_6.1.x86_64
(gdb) bt
#0  __GI___pthread_mutex_lock (mutex=mutex@entry=0x40) at ../nptl/pthread_mutex_lock.c:65
#1  0x00007fa1bf2ef85d in server_rpc_notify (rpc=<optimized out>, xl=<optimized out>, event=<optimized out>, data=0x7fa1ac000c40) at server.c:538
#2  0x00007fa1d4a94685 in rpcsvc_program_notify (listener=0x7fa1b8040bb0, event=event@entry=RPCSVC_EVENT_ACCEPT, data=data@entry=0x7fa1ac000c40) at rpcsvc.c:405
#3  0x00007fa1d4a98985 in rpcsvc_accept (new_trans=0x7fa1ac000c40, listen_trans=0x7fa1b803fd90, svc=<optimized out>) at rpcsvc.c:428
#4  rpcsvc_notify (trans=0x7fa1b803fd90, mydata=<optimized out>, event=<optimized out>, data=0x7fa1ac000c40) at rpcsvc.c:999
#5  0x00007fa1d4a9aae3 in rpc_transport_notify (this=this@entry=0x7fa1b803fd90, event=event@entry=RPC_TRANSPORT_ACCEPT, data=data@entry=0x7fa1ac000c40) at rpc-transport.c:557
#6  0x00007fa1c98c1e77 in socket_server_event_handler (fd=<optimized out>, idx=<optimized out>, gen=<optimized out>, data=0x7fa1b803fd90, poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0,
    event_thread_died=0 '\000') at socket.c:2946
#7  0x00007fa1d4d57870 in event_dispatch_epoll_handler (event=0x7fa1beae3e70, event_pool=0x555ec37eec00) at event-epoll.c:643
#8  event_dispatch_epoll_worker (data=0x7fa1b803d790) at event-epoll.c:759
#9  0x00007fa1d3b34dd5 in start_thread (arg=0x7fa1beae4700) at pthread_create.c:307
#10 0x00007fa1d33fbead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Comment 3 SATHEESARAN 2019-03-20 03:09:02 UTC
This bug is not accepted for RHGS 3.4.4; removing the acks.

