Bug 1510268 - Self-Heal not complete and pending frames, signal received: 11
Summary: Self-Heal not complete and pending frames, signal received: 11
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: quick-read
Version: 3.10
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-07 05:07 UTC by jhkim
Modified: 2018-04-02 11:36 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-02 11:36:19 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description jhkim 2017-11-07 05:07:05 UTC
Description of problem:

Self-heal has not completed for 30 days, and the client-side log shows heal failures.
I ran the command "find "" -exec file " at the mount point to find broken files,
but the client crashed with pending frames and signal received: 11.

The message "W [MSGID: 122002] [ec-common.c:122:ec_heal_report] : Heal failed [Input/output error]" repeated 91 times between [2017-11-07 02:24:55.378087] and [2017-11-07 02:25:29.401347]
pending frames:
frame : type(1) op(READ)
frame : type(1) op(OPEN)
frame : type(1) op(READ)
frame : type(1) op(READ)
frame : type(1) op(FLUSH)
frame : type(1) op(FLUSH)
frame : type(1) op(READ)
frame : type(1) op(READ)
frame : type(1) op(READ)
frame : type(1) op(READ)
frame : type(1) op(READ)
frame : type(1) op(READ)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2017-11-07 02:25:31
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.1
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7fd482105e92]
/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7fd4821224ed]
/lib64/libc.so.6(+0x35670)[0x7fd4807f4670]
/lib64/libc.so.6(+0x147dc9)[0x7fd480906dc9]
/usr/lib64/glusterfs/3.7.1/xlator/performance/quick-read.so(qr_readv_cached+0x119)[0x7fd46f7cd329]
/usr/lib64/glusterfs/3.7.1/xlator/performance/quick-read.so(qr_readv+0x4a)[0x7fd46f7cd57a]
/lib64/libglusterfs.so.0(default_readv_resume+0x13c)[0x7fd482116bec]
/lib64/libglusterfs.so.0(call_resume_wind+0x242)[0x7fd482135b52]
/lib64/libglusterfs.so.0(call_resume+0x7d)[0x7fd48213614d]
/usr/lib64/glusterfs/3.7.1/xlator/performance/open-behind.so(open_and_resume+0xb8)[0x7fd46f5c3678]
/usr/lib64/glusterfs/3.7.1/xlator/performance/open-behind.so(ob_readv+0x7f)[0x7fd46f5c588f]
/usr/lib64/glusterfs/3.7.1/xlator/performance/md-cache.so(mdc_readv+0x157)[0x7fd46f3b63e7]
/usr/lib64/glusterfs/3.7.1/xlator/debug/io-stats.so(io_stats_readv+0x171)[0x7fd46f19a8d1]
/lib64/libglusterfs.so.0(default_readv+0x80)[0x7fd48210a510]
/usr/lib64/glusterfs/3.7.1/xlator/meta.so(meta_readv+0x4e)[0x7fd46ef84ffe]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(fuse_readv_resume+0x224)[0x7fd478ce7664]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(+0x8a65)[0x7fd478cdfa65]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(+0x87a8)[0x7fd478cdf7a8]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(+0x8aae)[0x7fd478cdfaae]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(fuse_resolve_continue+0x23)[0x7fd478cdf023]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(+0x8748)[0x7fd478cdf748]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(+0x8a8e)[0x7fd478cdfa8e]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(fuse_resolve_and_resume+0x20)[0x7fd478cdfad0]
/usr/lib64/glusterfs/3.7.1/xlator/mount/fuse.so(+0x1b6ce)[0x7fd478cf26ce]
/lib64/libpthread.so.0(+0x7dc5)[0x7fd480f6edc5]
/lib64/libc.so.6(clone+0x6d)[0x7fd4808b528d]


Version-Release number of selected component (if applicable):
CentOS Linux release 7.2.1511 (Core) 
glusterfs 3.7.1

Comment 1 Sanoj Unnikrishnan 2017-11-07 09:59:17 UTC
Is this scenario reproducible?

Looks like a segfault in the qr_readv_cached function.
Did you get a core dump, and can you share it?
Could you generate a core and share the core file for further analysis?

Comment 2 jhkim 2017-11-07 10:16:39 UTC
(In reply to Sanoj Unnikrishnan from comment #1)
> Is this scenario reproducible?
> 
> Looks like a segfault in the qr_readv_cached function.
> Did you get a core dump, and can you share it?
> Could you generate a core and share the core file for further analysis?

(gdb) bt
#0  __memmove_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:1614
#1  0x00007f924b9dc329 in memcpy (__len=1576, __src=<optimized out>, __dest=<optimized out>) at /usr/include/bits/s
#2  qr_readv_cached (frame=frame@entry=0x7f925be7fb7c, qr_inode=0x7f92300c8110, size=size@entry=4096, offset=offset
#3  0x00007f924b9dc57a in qr_readv (frame=0x7f925be7fb7c, this=0x7f924c0eb300, fd=0x7f923001cfa0, size=4096, offset
#4  0x00007f925e36bbec in default_readv_resume (frame=0x7f925be685ec, this=0x7f924c0ec780, fd=0x7f923001cfa0, size=
#5  0x00007f925e38ab52 in call_resume_wind (stub=<optimized out>) at call-stub.c:2118
#6  0x00007f925e38b14d in call_resume (stub=0x7f925b90b5a0) at call-stub.c:2576
#7  0x00007f924b7d2678 in open_and_resume (this=this@entry=0x7f924c0ec780, fd=fd@entry=0x7f923001cfa0, stub=stub@en
#8  0x00007f924b7d488f in ob_readv (frame=0x7f925be685ec, this=0x7f924c0ec780, fd=<optimized out>, size=<optimized
#9  0x00007f924b5c53e7 in mdc_readv (frame=0x7f925be8c1b0, this=0x7f924c0edb40, fd=0x7f923001d00c, size=4096, offse
#10 0x00007f924b3a98d1 in io_stats_readv (frame=0x7f925be931e4, this=0x7f924c0eef60, fd=0x7f923001d00c, size=4096,
#11 0x00007f925e35f510 in default_readv (frame=0x7f925be931e4, this=0x7f924c0f04c0, fd=0x7f923001d00c, size=4096, o
#12 0x00007f924b193ffe in meta_readv (frame=0x7f925be931e4, this=0x7f924c0f04c0, fd=0x7f923001d00c, size=4096, offs
#13 0x00007f9254f3c664 in fuse_readv_resume (state=0x7f9220135ce0) at fuse-bridge.c:2210
#14 0x00007f9254f34a65 in fuse_resolve_done (state=<optimized out>) at fuse-resolve.c:644
#15 fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:671
#16 0x00007f9254f347a8 in fuse_resolve (state=0x7f9220135ce0) at fuse-resolve.c:635
#17 0x00007f9254f34aae in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:667
#18 0x00007f9254f34023 in fuse_resolve_continue (state=state@entry=0x7f9220135ce0) at fuse-resolve.c:687
#19 0x00007f9254f34748 in fuse_resolve_fd (state=0x7f9220135ce0) at fuse-resolve.c:547
#20 fuse_resolve (state=0x7f9220135ce0) at fuse-resolve.c:624
#21 0x00007f9254f34a8e in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:660
#22 0x00007f9254f34ad0 in fuse_resolve_and_resume (state=0x7f9220135ce0, fn=0x7f9254f3c440 <fuse_readv_resume>) at
#23 0x00007f9254f476ce in fuse_thread_proc (data=0x7f925f003d50) at fuse-bridge.c:4903
#24 0x00007f925d1c3dc5 in start_thread (arg=0x7f922638a700) at pthread_create.c:308
#25 0x00007f925cb0a28d in getxattr () at ../sysdeps/unix/syscall-template.S:81
#26 0x0000000000000000 in ?? ()
(gdb) f 4
#4  0x00007f925e36bbec in default_readv_resume (frame=0x7f925be685ec, this=0x7f924c0ec780, fd=0x7f923001cfa0, size=4096, offset=0, flags=32768, xdata=0x0) at defaults.c:1405
1405            STACK_WIND (frame, default_readv_cbk, FIRST_CHILD(this),
(gdb) list
1400
1401    int32_t
1402    default_readv_resume (call_frame_t *frame, xlator_t *this, fd_t *fd,
1403                          size_t size, off_t offset, uint32_t flags, dict_t *xdata)
1404    {
1405            STACK_WIND (frame, default_readv_cbk, FIRST_CHILD(this),
1406                        FIRST_CHILD(this)->fops->readv, fd, size, offset, flags, xdata);
1407            return 0;
1408    }
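Frame #1 above shows the crash inside memcpy called from qr_readv_cached, with __len=1576 for a 4096-byte read at offset 0. As a rough illustration only (hypothetical names and structure, not the actual GlusterFS code), the kind of defect this backtrace suggests is a cached-read copy whose length or source buffer is not validated against what the cache actually holds; a minimal sketch of a bounds-clamped version:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a quick-read inode cache entry. */
struct qr_cache {
    char   *data; /* cached file contents, may be NULL if invalidated */
    size_t  size; /* number of valid bytes in data */
};

/* Copy up to `size` bytes starting at `offset` from the cache into `dst`.
 * Returns the number of bytes copied, or 0 when the cache is empty or the
 * request lies beyond the cached region.  Clamping the length keeps memcpy
 * from reading past the end of cache->data. */
static size_t cached_read(const struct qr_cache *cache, char *dst,
                          size_t size, size_t offset)
{
    if (cache->data == NULL || offset >= cache->size)
        return 0;
    size_t avail = cache->size - offset;
    size_t len   = size < avail ? size : avail;
    memcpy(dst, cache->data + offset, len);
    return len;
}
```

This is only a sketch of the clamping idea for readers triaging similar traces; the real fix for this crash is the upstream patch referenced in comment 5.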

Comment 4 Milind Changire 2017-11-09 06:05:59 UTC
jhkim,
I'd suggest you upgrade to the latest bits: 3.12.2.
You seem to be using an old gluster release: 3.7.1.

Let me know if the upgrade to 3.12.2 helps, and then close the BZ appropriately.

Comment 5 Milind Changire 2017-11-09 06:15:22 UTC
Patch https://review.gluster.org/18146, which addresses this issue, is available upstream in version 3.12.2.

