Bug 1364026 - glfs_fini() crashes with SIGSEGV
Summary: glfs_fini() crashes with SIGSEGV
Alias: None
Product: GlusterFS
Classification: Community
Component: libgfapi
Version: mainline
Hardware: x86_64
OS: All
Target Milestone: ---
Assignee: Soumya Koduri
QA Contact: Sudhir D
Depends On:
Blocks: 1362540
Reported: 2016-08-04 10:15 UTC by Soumya Koduri
Modified: 2017-03-27 18:18 UTC
CC: 7 users

Fixed In Version: glusterfs-3.9.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1362540
Last Closed: 2017-03-27 18:18:24 UTC
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:


Description Soumya Koduri 2016-08-04 10:15:25 UTC
+++ This bug was initially created as a clone of Bug #1362540 +++

Description of problem:
I was trying to benchmark the libgfapi-python bindings for a filesystem-walk (metadata-intensive) workload, to compare against FUSE on the same workload. The program crashes during virtual unmount (fini).

Setup details:

[root@f24 ~]# rpm -qa | grep gluster

[root@f24 ~]# uname -a
Linux f24 4.6.4-301.fc24.x86_64 #1 SMP Tue Jul 12 11:50:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@f24]# gluster volume info
Volume Name: test
Type: Distributed-Replicate
Volume ID: e675d53a-c9b6-468f-bb8f-7101828bec70
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Brick1: f24:/export/brick1/data
Brick2: f24:/export/brick2/data
Brick3: f24:/export/brick3/data
Brick4: f24:/export/brick4/data
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

NOTE: The volume is created with 4 RAM disks (1GB each) as bricks.

[root@f24 ~]# df -h | grep 'ram\|Filesystem'
Filesystem               Size  Used Avail Use% Mounted on
/dev/ram1                971M  272M  700M  28% /export/brick1
/dev/ram2                971M  272M  700M  28% /export/brick2
/dev/ram3                971M  272M  700M  28% /export/brick3
/dev/ram4                971M  272M  700M  28% /export/brick4

[root@f24 ~]# df -ih | grep 'ram\|Filesystem'
Filesystem              Inodes IUsed IFree IUse% Mounted on
/dev/ram1                 489K  349K  140K   72% /export/brick1
/dev/ram2                 489K  349K  140K   72% /export/brick2
/dev/ram3                 489K  349K  140K   72% /export/brick3
/dev/ram4                 489K  349K  140K   72% /export/brick4

How reproducible:
Always and consistently (at least on my Fedora 24 test VM).

Steps to Reproduce:
1. Create large nested directory tree using FUSE mount.
2. Unmount FUSE mount. This is just to generate initial data.
3. Use the python-libgfapi bindings with a patch applied. (It's reproducible without this patch too, but the patch makes the crawling faster.)
4. Run the python script that walks the tree using libgfapi.

[root@f24 ~]# ./ 
Segmentation fault (core dumped)

There *could* be a simple reproducer too but I haven't had the time to look further into it.

Excerpt from bt:

#0  list_add_tail (head=0x90, new=0x7fcac00040b8) at list.h:41
41		new->prev = head->prev;
[Current thread is 1 (Thread 0x7fcae3dbb700 (LWP 4165))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.23.1-8.fc24.x86_64 keyutils-libs-1.5.9-8.fc24.x86_64 krb5-libs-1.14.1-8.fc24.x86_64 libacl-2.2.52-11.fc24.x86_64 libattr-2.4.47-16.fc24.x86_64 libcom_err-1.42.13-4.fc24.x86_64 libffi-3.1-9.fc24.x86_64 libselinux-2.5-9.fc24.x86_64 libuuid-2.28-3.fc24.x86_64 openssl-libs-1.0.2h-1.fc24.x86_64 pcre-8.39-2.fc24.x86_64 zlib-1.2.8-10.fc24.x86_64
(gdb) bt
#0  list_add_tail (head=0x90, new=0x7fcac00040b8) at list.h:41
#1  list_move_tail (head=0x90, list=0x7fcac00040b8) at list.h:107
#2  __inode_retire (inode=0x7fcac0004030) at inode.c:439
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
#4  0x00007fcad7642f22 in inode_table_destroy (inode_table=0x7fcac0004040) at inode.c:1808
#5  0x00007fcad7642fee in inode_table_destroy_all (ctx=ctx@entry=0x55f82360b430) at inode.c:1733
#6  0x00007fcad7f3fde6 in pub_glfs_fini (fs=0x55f823537950) at glfs.c:1204

Expected results:
No crash


[root@f24 ~]# rpm -qa | grep glusterfs-debuginfo
[root@f24 ~]# rpm -qa | grep python-debuginfo

[root@f24 coredump]# ls -lh
total 311M
-rw-r--r--. 1 root root 350M Aug  2 17:31 core.python.0.21e34182be6844658e00bba43a55dfa0.4165.1470139182000000000000
-rw-r-----. 1 root root  22M Aug  2 17:29 core.python.0.21e34182be6844658e00bba43a55dfa0.4165.1470139182000000000000.lz4

Can provide the coredump offline.

--- Additional comment from Prashanth Pai on 2016-08-02 09:24 EDT ---

--- Additional comment from Prashanth Pai on 2016-08-02 09:35:30 EDT ---

Also, I can't seem to reproduce this when the test filesystem tree is small enough.

--- Additional comment from Soumya Koduri on 2016-08-04 06:14:55 EDT ---

I suspect the following could have caused the issue:

In inode_table_destroy(), we first purge all the lru entries, but the lru count is not adjusted accordingly. So when inode_table_prune() is called, if the lru count is larger than the lru limit (as can be seen in the core), we end up accessing invalid memory.

(gdb) f 3
#3  0x00007fcad764100e in inode_table_prune (table=table@entry=0x7fcac0004040) at inode.c:1521
1521	                        __inode_retire (entry);
(gdb) p table->lru_size
$4 = 132396
(gdb) p table->lru_limit
$5 = 131072
(gdb) p table->lru
$6 = {next = 0x90, prev = 0xcafecafe}
(gdb) p &table->lru
$7 = (struct list_head *) 0x7fcac00040b8

I will send a fix for it.

Comment 1 Vijay Bellur 2016-08-04 11:31:06 UTC
REVIEW: inode: Adjust lru_size while retiring entries in lru list, posted (#1) for review on master by soumya k

Comment 2 Soumya Koduri 2016-08-04 11:31:57 UTC

Could you please test the posted fix? Thanks!

Comment 3 Prashanth Pai 2016-08-08 07:07:48 UTC
The fix works and the issue is no longer reproducible. If possible, please backport this to the 3.7 branch as well. Thanks for the quick fix.

Comment 4 Vijay Bellur 2016-08-09 08:15:30 UTC
COMMIT: committed in master by Raghavendra G
commit 567304eaa24c0715f8f5d8ca5c70ac9e2af3d2c8
Author: Soumya Koduri <>
Date:   Thu Aug 4 16:00:31 2016 +0530

    inode: Adjust lru_size while retiring entries in lru list

    As part of inode_table_destroy(), we first retire entries
    in the lru list but the lru_size is not adjusted accordingly.
    This may result in invalid memory reference in inode_table_prune
    if the lru_size > lru_limit.
    Change-Id: I29ee3c03b0eaa8a118d06dc0cefba85877daf963
    BUG: 1364026
    Signed-off-by: Soumya Koduri <>
    Smoke: Gluster Build System <>
    Reviewed-by: Raghavendra G <>
    Reviewed-by: Prashanth Pai <>
    CentOS-regression: Gluster Build System <>
    NetBSD-regression: NetBSD Build System <>

Comment 5 Shyamsundar 2017-03-27 18:18:24 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

