Bug 1514140 - [GSS] quota reporting incorrect values on disperse volume
Summary: [GSS] quota reporting incorrect values on disperse volume
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: quota
Version: rhgs-3.2
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: hari gowtham
QA Contact: Rahul Hinduja
Whiteboard: Accounting
Depends On:
Reported: 2017-11-16 17:36 UTC by Pan Ousley
Modified: 2018-11-02 20:53 UTC
4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-11-02 20:53:10 UTC
Target Upstream Version:

Attachments (Terms of Use)
log accounting script (deleted)
2017-11-17 12:43 UTC, Sanoj Unnikrishnan
no flags Details
xattr_parsing_script (deleted)
2017-11-17 12:45 UTC, Sanoj Unnikrishnan
no flags Details
script_to_restore_quota_limits (deleted)
2017-11-22 11:31 UTC, Sanoj Unnikrishnan
no flags Details
control_cpu_load (deleted)
2017-12-07 09:04 UTC, Sanoj Unnikrishnan
no flags Details
FS_walk_and_account (deleted)
2017-12-12 17:54 UTC, Sanoj Unnikrishnan
no flags Details
walk_2 (deleted)
2017-12-13 12:33 UTC, Sanoj Unnikrishnan
no flags Details (deleted)
2018-01-11 19:52 UTC, Sanoj Unnikrishnan
no flags Details

Description Pan Ousley 2017-11-16 17:36:57 UTC
Description of problem: The affected path (see forthcoming comments) has a 2 TB quota set, of which 1.6 TB is actually used by files, but gluster reports all but 34 GB as used. Each brick of this 1x(4+2) disperse set shows 386 GB used by this folder, but the trusted.glusterfs.quota.size.1 xattr matches this value on only three of the six nodes. We expect quota to report something much closer to the 1.6 TB actually used by files.
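For reference, the trusted.glusterfs.quota.size.1 value that getfattr prints is a packed binary blob. Assuming upstream gluster's quota_meta_t layout of three big-endian signed 64-bit fields (size in bytes, file_count, dir_count; this layout is an assumption, not confirmed in this bug), a minimal Python sketch to decode it:

```python
import struct

def decode_quota_size(hex_value):
    """Decode a trusted.glusterfs.quota.size.* xattr value.

    Assumed layout (quota_meta_t): three big-endian signed 64-bit
    integers: size (bytes), file_count, dir_count.
    """
    raw = bytes.fromhex(hex_value.removeprefix("0x"))
    size, file_count, dir_count = struct.unpack(">qqq", raw)
    return {"size": size, "file_count": file_count, "dir_count": dir_count}

# Hypothetical example: 386 GB, 1200 files, 40 directories packed the same way.
packed = struct.pack(">qqq", 386 * 1024**3, 1200, 40).hex()
print(decode_quota_size("0x" + packed))
```

Comparing the decoded size per brick against du on the brick is what localizes which bricks disagree.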

Where are you experiencing the behavior?  What environment? RHGS 3.2, disperse volume 1x(4+2)=6

Version-Release number of selected component (if applicable): glusterfs-3.8.4-18.7.el7rhgs.x86_64

Additional info: This issue has persisted across quota remove and quota change operations for this path.

To preserve confidentiality, I will provide further details in forthcoming private comments. Please let me know if more information is needed.

Comment 3 Sanoj Unnikrishnan 2017-11-17 12:41:37 UTC
This seems to be an accounting issue, as observed in the previous comments.
To narrow it down further, we need the xattrs from the backend and the source of the accounting error (which set of files/directories has gone wrong) on the bricks where accounting is incorrect.

I am attaching two scripts to identify this.

Below is an example of how to run them:

/export/b1 is my brick path.
/mnt is where volume v1 is mounted

#find /export/b1 | xargs  getfattr -d -m. -e hex  > /tmp/log_gluster_xattr
getfattr: Removing leading '/' from absolute path names
# <xattr_parsing_script> /tmp/log_gluster_xattr > /tmp/gluster_quota_xattr

Note: Collect the xattrs from all bricks that have incorrect accounting.

From any node that has a client mount, run the attached log accounting script:
# <log_accounting_script> /mnt v1
This should generate a tarball named gluster_quota_files.tar under /tmp.

Please reply with the xattr logs (/tmp/gluster_quota_xattr from each brick) and the tar file.
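The parsing script itself is attached above (and has since been deleted). As a rough illustration of the idea, here is a minimal sketch that pulls quota-related xattrs per path out of a standard getfattr -d -m . -e hex dump (the helper name and the sample paths are hypothetical):

```python
def quota_xattrs(dump_text):
    """Extract quota-related xattrs per path from a 'getfattr -d -m. -e hex' dump.

    The dump format is '# file: <path>' header lines followed by
    'name=0x...' pairs, separated by blank lines.
    """
    result = {}
    current = None
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith("# file: "):
            current = line[len("# file: "):]
        elif current and "=" in line:
            name, _, value = line.partition("=")
            if "quota" in name:  # keep only quota-related xattrs
                result.setdefault(current, {})[name] = value
    return result

# Hypothetical sample dump with two directories.
sample = """\
# file: export/b1/dir1
trusted.gfid=0x1234
trusted.glusterfs.quota.size.1=0x00000000000000ff

# file: export/b1/dir2
trusted.glusterfs.quota.dirty=0x3000
"""
print(quota_xattrs(sample))
```

The real attached script presumably does more (decoding values, diffing subtrees); this only shows the per-path grouping step.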

Comment 4 Sanoj Unnikrishnan 2017-11-17 12:43:58 UTC
Created attachment 1354138 [details]
log accounting script

Script to compare du accounting with quota accounting across the FS.
Also collects hard-link count information.

Comment 5 Sanoj Unnikrishnan 2017-11-17 12:45:12 UTC
Created attachment 1354139 [details]

Script that collects backend xattrs to isolate the accounting issue to a specific subtree/directory.

Comment 9 Sanoj Unnikrishnan 2017-11-20 12:08:46 UTC

I tested the attached scripts on a large FS with about 2 million files and a
directory tree up to 20 levels deep, to measure how long they take.
For me, the ls -lRt and xattr collection took only a couple of minutes (see below), while log_accounting took much longer because of the du command.

Hence, if your filesystem is of similar scale, could you please run the following command and share the logs:

find <brick-path> | xargs  getfattr -d -m. -e hex  > /tmp/log_gluster_xattr 2>/dev/null

Note 1: The disk usage of your FS does not determine the time taken to crawl; the total number of files and directories does.

Note 2: The generated file (log_gluster_xattr) can be quite large if there are many files.

# time find /export/b1/ | xargs  getfattr -d -m. -e hex  > /tmp/log_gluster_xattr 2>/dev/null

real	2m49.622s
user	0m19.431s
sys	2m38.766s

# ls -lh  /tmp/log_gluster_xattr
-rw-r--r--. 1 root root 1.7G Nov 20 16:26 /tmp/log_gluster_xattr

# time ls -lRt /mnt/ | wc -l

real	2m56.784s
user	0m5.245s
sys	0m11.995s

A direct workaround is to take a backup of the configured limits, do a quota disable followed by a quota enable, and then restore the limits individually.

However, with the output of the above command, we may be able to identify the specific paths in the tree that have incorrect accounting and repair those alone. This would also help us root-cause the issue.
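The attached restore script is no longer available, but the restore step can be illustrated with a small hypothetical sketch that turns backed-up "path hard-limit" pairs into gluster volume quota ... limit-usage commands (the backup file format here is an assumption; adapt it to however you saved the limits):

```python
def restore_commands(volume, backup_lines):
    """Build 'gluster volume quota ... limit-usage' commands from a backup.

    backup_lines: hypothetical lines like '/projects/a 2TB', saved from
    'gluster volume quota <vol> list' output before running quota disable.
    """
    cmds = []
    for line in backup_lines:
        parts = line.split()
        if len(parts) >= 2:
            path, limit = parts[0], parts[1]
            cmds.append(f"gluster volume quota {volume} limit-usage {path} {limit}")
    return cmds

backup = ["/projects/a 2TB", "/scratch 500GB"]
for cmd in restore_commands("v1", backup):
    print(cmd)
```

Each emitted command would then be run once quota enable has finished its crawl.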

Comment 13 Sanoj Unnikrishnan 2017-11-22 11:31:42 UTC
Created attachment 1357404 [details]

script to restore quota with a previously backed limit

Comment 16 Sanoj Unnikrishnan 2017-11-23 13:59:46 UTC
Quota disable and enable does indeed crawl the filesystem. However, it should not affect users of the system, except that there will be a window of time during which enforcement does not happen. Once the crawl finishes, the limits will be honored as before.
Hence, please use the suggested workaround.

From the xattr values, one of the issues we saw was:
=====# file: gluster/brick4/brick4/=======
{'dir_count': 18446744073704775721L, 'file_count': 18446744073685675058L, 'dirty': False, 'size': '16383P'}

This could cause user IO impact; hence disable followed by enable and resetting the limits was suggested.
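Counter values like 18446744073704775721 are 64-bit integers that have wrapped around below zero; reinterpreting them as signed integers makes the underflow explicit (a small illustrative sketch):

```python
def to_signed64(u):
    """Reinterpret an unsigned 64-bit counter as signed to expose underflow."""
    return u - 2**64 if u >= 2**63 else u

# The dir_count and file_count from the brick4 xattr above:
print(to_signed64(18446744073704775721))  # -4775895
print(to_signed64(18446744073685675058))  # -23876558
```

So the counters have gone negative, i.e. more unaccounting than accounting happened on that brick, which is consistent with the wildly wrong quota list output.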

To cross check this, could you please share the quota list output as well.

It is not possible to repair or isolate the cause of the incorrect accounting without a full crawl. We shall look at options to make the script more efficient and to reduce its memory footprint if possible, so that we can collect these logs in the future.
Hence, for now, please use the workaround.

Comment 30 Sanoj Unnikrishnan 2017-12-07 09:04:09 UTC
Created attachment 1364117 [details]

Script to cap CPU load using cgroups.

Comment 37 Sanoj Unnikrishnan 2017-12-12 17:54:10 UTC
Created attachment 1366839 [details]

Walks over the FS, accounts its size, and compares it with the size in the xattr.
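The attachment has since been deleted. As an illustration of the walk-and-account idea, here is a minimal sketch that computes cumulative per-directory sizes bottom-up, which could then be compared against the decoded quota xattr (all names here are hypothetical):

```python
import os
import tempfile

def account_sizes(root):
    """Walk a tree bottom-up and compute cumulative file sizes per directory,
    the same figure quota's size counter is supposed to track."""
    totals = {}
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        total = sum(os.path.getsize(os.path.join(dirpath, f)) for f in filenames)
        # Children were visited already (topdown=False), so fold in their totals.
        total += sum(totals.get(os.path.join(dirpath, d), 0) for d in dirnames)
        totals[dirpath] = total
    return totals

# Demo on a throwaway tree: 10 bytes at the top, 5 bytes in a subdirectory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
with open(os.path.join(root, "a.dat"), "wb") as f:
    f.write(b"x" * 10)
with open(os.path.join(root, "sub", "b.dat"), "wb") as f:
    f.write(b"x" * 5)
totals = account_sizes(root)
print(totals[root])  # 15
```

Any directory whose walked total disagrees with its quota.size xattr is a candidate for the accounting bug.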

Comment 40 Sanoj Unnikrishnan 2017-12-13 12:33:01 UTC
Created attachment 1367325 [details]

Changed to reduce the amount of logs generated.

Comment 54 Sanoj Unnikrishnan 2018-01-11 19:52:43 UTC
Created attachment 1380215 [details]

Comment 64 Amar Tumballi 2018-09-17 04:51:01 UTC
This also falls under the KCS article about quota fsck. The quota fsck script is now merged upstream, and we should consider using it to solve the issue rather than fixing it in the code for now (as that can be more complex).

Also reducing the priority/severity, as the issue gets resolved with the script.

Comment 65 hari gowtham 2018-09-17 06:00:02 UTC
We can get the backend xattr values for all the bricks before running the script (if the size of the volume is small enough). This will help us identify which directory had an issue and what caused it. This information might come in handy for resolving further issues.

Comment 66 hari gowtham 2018-10-29 06:07:13 UTC
Hi Pan,

As we aren't going to actively work on quota, would you mind closing the bug if the workaround helped?

