Bug 1362540 - glfs_fini() crashes with SIGSEGV
Summary: glfs_fini() crashes with SIGSEGV
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: libgfapi
Version: 3.8.1
Hardware: x86_64
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Soumya Koduri
QA Contact: Sudhir D
URL:
Whiteboard:
Duplicates: 1365748 (view as bug list)
Depends On: 1364026
Blocks:
 
Reported: 2016-08-02 13:13 UTC by Prashanth Pai
Modified: 2016-08-24 10:20 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.8.3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1364026 (view as bug list)
Environment:
Last Closed: 2016-08-24 10:20:46 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:


Attachments
Sample reproducer using libgfapi-python (deleted)
  2016-08-02 13:13 UTC, Prashanth Pai
Compressed core dump (deleted)
  2016-08-02 13:24 UTC, Prashanth Pai

Description Prashanth Pai 2016-08-02 13:13:38 UTC
Created attachment 1186818 [details]
Sample reproducer using libgfapi-python

Description of problem:
I was trying to benchmark the libgfapi-python bindings with a filesystem walk (metadata-intensive) workload, to compare them against FUSE for the same workload. The program crashes during virtual unmount (glfs_fini()).

Setup details:

[root@f24 ~]# rpm -qa | grep gluster
glusterfs-3.8.1-1.fc24.x86_64
python-gluster-3.8.1-1.fc24.noarch
glusterfs-libs-3.8.1-1.fc24.x86_64
glusterfs-client-xlators-3.8.1-1.fc24.x86_64
glusterfs-cli-3.8.1-1.fc24.x86_64
glusterfs-fuse-3.8.1-1.fc24.x86_64
glusterfs-server-3.8.1-1.fc24.x86_64
glusterfs-api-3.8.1-1.fc24.x86_64
glusterfs-debuginfo-3.8.1-1.fc24.x86_64

[root@f24 ~]# uname -a
Linux f24 4.6.4-301.fc24.x86_64 #1 SMP Tue Jul 12 11:50:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@f24]# gluster volume info
 
Volume Name: test
Type: Distributed-Replicate
Volume ID: e675d53a-c9b6-468f-bb8f-7101828bec70
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: f24:/export/brick1/data
Brick2: f24:/export/brick2/data
Brick3: f24:/export/brick3/data
Brick4: f24:/export/brick4/data
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

NOTE: The volume is created with 4 RAM disks (1GB each) as bricks.

[root@f24 ~]# df -h | grep 'ram\|Filesystem'
Filesystem               Size  Used Avail Use% Mounted on
/dev/ram1                971M  272M  700M  28% /export/brick1
/dev/ram2                971M  272M  700M  28% /export/brick2
/dev/ram3                971M  272M  700M  28% /export/brick3
/dev/ram4                971M  272M  700M  28% /export/brick4

[root@f24 ~]# df -ih | grep 'ram\|Filesystem'
Filesystem              Inodes IUsed IFree IUse% Mounted on
/dev/ram1                 489K  349K  140K   72% /export/brick1
/dev/ram2                 489K  349K  140K   72% /export/brick2
/dev/ram3                 489K  349K  140K   72% /export/brick3
/dev/ram4                 489K  349K  140K   72% /export/brick4


How reproducible:
Always and consistently (at least on my Fedora 24 test VM)

Steps to Reproduce:
1. Create a large nested directory tree using a FUSE mount.
2. Unmount the FUSE mount. This is only to generate the initial data.
3. Use the libgfapi-python bindings with patch http://review.gluster.org/#/c/14770/. The crash is reproducible without this patch too, but the patch makes the crawling faster.
4. Run the Python script that walks the tree using libgfapi (see the sketch after these steps).

[root@f24 ~]# ./reproducer.py 
Segmentation fault (core dumped)

There *could* be a simple reproducer too but I haven't had the time to look further into it.
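For illustration only, here is a rough, self-contained C sketch of the same access pattern written directly against the gfapi C API. This is not the attached libgfapi-python script; the volume name "test" and host "f24" are taken from the setup above, and the walk logic is a hypothetical stand-in for the actual reproducer:

/* build (assumption): gcc -o reproducer_sketch reproducer_sketch.c -lgfapi */
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <dirent.h>
#include <sys/stat.h>
#include <glusterfs/api/glfs.h>

/* Recursively list and descend into every entry under "path". */
static void walk(glfs_t *fs, const char *path)
{
        glfs_fd_t *fd = glfs_opendir(fs, path);
        struct dirent *entry;
        struct stat st;
        char child[PATH_MAX];

        if (!fd)
                return;

        while ((entry = glfs_readdir(fd)) != NULL) {
                if (!strcmp(entry->d_name, ".") || !strcmp(entry->d_name, ".."))
                        continue;
                snprintf(child, sizeof(child), "%s/%s", path, entry->d_name);
                if (glfs_lstat(fs, child, &st) == 0 && S_ISDIR(st.st_mode))
                        walk(fs, child);
        }
        glfs_closedir(fd);
}

int main(void)
{
        glfs_t *fs = glfs_new("test");          /* volume name from the setup above */

        if (!fs)
                return 1;
        glfs_set_volfile_server(fs, "tcp", "f24", 24007);
        if (glfs_init(fs) != 0) {
                fprintf(stderr, "glfs_init failed\n");
                return 1;
        }

        walk(fs, "/");   /* metadata-heavy crawl; fills the inode lru list */

        /* The reported SIGSEGV happens inside this call once the crawled
         * tree is large enough. */
        return glfs_fini(fs);
}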

Excerpt from bt:

#0  list_add_tail (head=0x90, new=0x7fcac00040b8) at list.h:41
41		new->prev = head->prev;
[Current thread is 1 (Thread 0x7fcae3dbb700 (LWP 4165))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.23.1-8.fc24.x86_64 keyutils-libs-1.5.9-8.fc24.x86_64 krb5-libs-1.14.1-8.fc24.x86_64 libacl-2.2.52-11.fc24.x86_64 libattr-2.4.47-16.fc24.x86_64 libcom_err-1.42.13-4.fc24.x86_64 libffi-3.1-9.fc24.x86_64 libselinux-2.5-9.fc24.x86_64 libuuid-2.28-3.fc24.x86_64 openssl-libs-1.0.2h-1.fc24.x86_64 pcre-8.39-2.fc24.x86_64 zlib-1.2.8-10.fc24.x86_64
(gdb) bt
#0  list_add_tail (head=0x90, new=0x7fcac00040b8) at list.h:41
#1  list_move_tail (head=0x90, list=0x7fcac00040b8) at list.h:107
#2  __inode_retire (inode=0x7fcac0004030) at inode.c:439
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
#4  0x00007fcad7642f22 in inode_table_destroy (inode_table=0x7fcac0004040) at inode.c:1808
#5  0x00007fcad7642fee in inode_table_destroy_all (ctx=ctx@entry=0x55f82360b430) at inode.c:1733
#6  0x00007fcad7f3fde6 in pub_glfs_fini (fs=0x55f823537950) at glfs.c:1204


Expected results:
No crash

Additional info:

[root@f24 ~]# rpm -qa | grep glusterfs-debuginfo
glusterfs-debuginfo-3.8.1-1.fc24.x86_64
[root@f24 ~]# rpm -qa | grep python-debuginfo
python-debuginfo-2.7.12-1.fc24.x86_64

[root@f24 coredump]# ls -lh
total 311M
-rw-r--r--. 1 root root 350M Aug  2 17:31 core.python.0.21e34182be6844658e00bba43a55dfa0.4165.1470139182000000000000
-rw-r-----. 1 root root  22M Aug  2 17:29 core.python.0.21e34182be6844658e00bba43a55dfa0.4165.1470139182000000000000.lz4

Can provide the coredump offline.

Comment 1 Prashanth Pai 2016-08-02 13:24:40 UTC
Created attachment 1186826 [details]
Compressed core dump

Comment 2 Prashanth Pai 2016-08-02 13:35:30 UTC
Also, I can't seem to reproduce this when the test filesystem tree is small enough.

Comment 3 Soumya Koduri 2016-08-04 10:14:55 UTC
I suspect the following could have caused the issue:

In inode_table_destroy(), we first purge all the lru entries, but the lru count is not adjusted accordingly. So when inode_table_prune() is called later and the lru count is still larger than the lru limit (as can be seen in the core), we end up accessing invalid memory.

(gdb) f 3
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
1521	                        __inode_retire (entry);
(gdb) p table->lru_size
$4 = 132396
(gdb) p table->lru_limit
$5 = 131072
(gdb) p table->lru
$6 = {next = 0x90, prev = 0xcafecafe}
(gdb) p &&table->lru
A syntax error in expression, near `&&table->lru'.
(gdb) p &table->lru
$7 = (struct list_head *) 0x7fcac00040b8
(gdb) 

I will send a fix for it.
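
To make the failure mode concrete, here is a minimal, self-contained toy illustration of the accounting problem described above. It is not the actual inode.c code; the names lru, lru_size and lru_limit simply mirror the fields seen in the core, and a printf stands in for the real crash:

#include <stdio.h>
#include <stdlib.h>

struct node { struct node *next; };

struct table {
        struct node *lru;       /* stand-in for the lru list */
        int lru_size;
        int lru_limit;
};

/* Mirrors the idea of inode_table_prune(): trust lru_size to decide how
 * many entries still need to be retired. */
static void prune(struct table *t)
{
        while (t->lru_size > t->lru_limit) {
                struct node *n = t->lru;
                if (!n) {
                        /* In the real code there is no such check: the head
                         * of an already-emptied list is dereferenced, which
                         * is the SIGSEGV in list_add_tail() in the bt. */
                        printf("prune: lru_size=%d but the list is empty\n",
                               t->lru_size);
                        return;
                }
                t->lru = n->next;
                free(n);
                t->lru_size--;
        }
}

/* Mirrors the pre-fix destroy path: empties the list without adjusting
 * lru_size, then calls prune(). */
static void destroy_buggy(struct table *t)
{
        while (t->lru) {
                struct node *n = t->lru;
                t->lru = n->next;
                free(n);
                /* BUG: t->lru_size is not decremented here. */
        }
        prune(t);       /* still sees lru_size > lru_limit */
}

int main(void)
{
        struct table t = { .lru = NULL, .lru_size = 0, .lru_limit = 2 };

        for (int i = 0; i < 5; i++) {   /* 5 entries against a limit of 2 */
                struct node *n = malloc(sizeof(*n));
                n->next = t.lru;
                t.lru = n;
                t.lru_size++;
        }
        destroy_buggy(&t);
        return 0;
}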

Comment 4 Raghavendra G 2016-08-09 11:30:21 UTC
(In reply to Soumya Koduri from comment #3)
> I suspect below could have caused the issue - 
> 
> In inode_table_destroy(), we first purge all the lru entries but the lru
> count is not adjusted accordingly.

This is not true. Can you point out the code where an inode is moved into/out of the lru list but lru_size is not modified atomically?

I think we need to explore further to find out why lru_size and lru_list were out of sync, before considering this bug as closed with patch http://review.gluster.org/#/c/15087/

Comment 5 Raghavendra G 2016-08-09 11:34:36 UTC
(In reply to Raghavendra G from comment #4)
> (In reply to Soumya Koduri from comment #3)
> > I suspect below could have caused the issue - 
> > 
> > In inode_table_destroy(), we first purge all the lru entries but the lru
> > count is not adjusted accordingly.
> 
> This is not true. Can you point out the code where an inode is moved
> into/out of lru list but lru_size is not modified atomically?

Sorry, I missed that inode_table_destroy() is the culprit. We need to fix inode_table_destroy() to update lru_size while moving inodes from the lru list to the purge list (just like inode_table_prune() does).

> 
> I think we need to explore further to find out why lru_size and lru_list
> were out of sync, before considering this bug as closed with patch
> http://review.gluster.org/#/c/15087/

Comment 6 Soumya Koduri 2016-08-09 11:49:53 UTC
(In reply to Raghavendra G from comment #5)
> (In reply to Raghavendra G from comment #4)
> > (In reply to Soumya Koduri from comment #3)
> > > I suspect below could have caused the issue - 
> > > 
> > > In inode_table_destroy(), we first purge all the lru entries but the lru
> > > count is not adjusted accordingly.
> > 
> > This is not true. Can you point out the code where an inode is moved
> > into/out of lru list but lru_size is not modified atomically?
> 
> Sorry I missed out that inode_table_destroy is the culprit. We need to fix
> inode_table_destroy to update lru_size while moving inodes from lru to purge
> list (just like inode_table_prune).
> 

Yes, and that is exactly what http://review.gluster.org/#/c/15087/ does. Sorry, am I missing anything?
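
For completeness, a toy continuation of the earlier sketch showing the shape of such a fix (again, not the actual diff from review 15087; same simplified types as before): the destroy path keeps the counter in step with the list, so a later prune pass that trusts lru_size has nothing left to do.

#include <stdio.h>
#include <stdlib.h>

struct node { struct node *next; };

struct table {
        struct node *lru;
        int lru_size;
        int lru_limit;
};

/* Destroy path with the accounting fix applied. */
static void destroy_fixed(struct table *t)
{
        while (t->lru) {
                struct node *n = t->lru;
                t->lru = n->next;
                free(n);
                t->lru_size--;  /* keep the counter in sync with the list */
        }
}

int main(void)
{
        struct table t = { .lru = NULL, .lru_size = 0, .lru_limit = 2 };

        for (int i = 0; i < 5; i++) {
                struct node *n = malloc(sizeof(*n));
                n->next = t.lru;
                t.lru = n;
                t.lru_size++;
        }
        destroy_fixed(&t);
        printf("after destroy: lru_size=%d (limit=%d), nothing to prune\n",
               t.lru_size, t.lru_limit);
        return 0;
}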

Comment 7 Vijay Bellur 2016-08-10 07:27:11 UTC
REVIEW: http://review.gluster.org/15129 (inode: Adjust lru_size while retiring entries in lru list) posted (#2) for review on release-3.8 by Oleksandr Natalenko (oleksandr@natalenko.name)

Comment 8 Soumya Koduri 2016-08-10 07:27:40 UTC
*** Bug 1365748 has been marked as a duplicate of this bug. ***

Comment 9 Vijay Bellur 2016-08-13 08:04:04 UTC
COMMIT: http://review.gluster.org/15129 committed in release-3.8 by Niels de Vos (ndevos@redhat.com) 
------
commit dae860ab7e1c5a205646393f2cb80a0a06986c30
Author: Soumya Koduri <skoduri@redhat.com>
Date:   Thu Aug 4 16:00:31 2016 +0530

    inode: Adjust lru_size while retiring entries in lru list
    
    As part of inode_table_destroy(), we first retire entries
    in the lru list but the lru_size is not adjusted accordingly.
    This may result in invalid memory reference in inode_table_prune
    if the lru_size > lru_limit.
    
    > Reviewed-on: http://review.gluster.org/15087
    > Smoke: Gluster Build System <jenkins@build.gluster.org>
    > CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    > NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    > Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    > Reviewed-by: Prashanth Pai <ppai@redhat.com>
    
    BUG: 1362540
    Change-Id: I29ee3c03b0eaa8a118d06dc0cefba85877daf963
    Signed-off-by: Soumya Koduri <skoduri@redhat.com>
    Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
    Reviewed-on: http://review.gluster.org/15129
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Prashanth Pai <ppai@redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>

Comment 10 Niels de Vos 2016-08-24 10:20:46 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.3, please open a new bug report.

glusterfs-3.8.3 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/announce/2016-August/000059.html
[2] https://www.gluster.org/pipermail/gluster-users/

