Bug 1062550 - quota+self-heal: hitting "D" state for "dd" command
Summary: quota+self-heal: hitting "D" state for "dd" command
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: quota
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1282724
 
Reported: 2014-02-07 09:15 UTC by Saurabh
Modified: 2016-09-17 12:40 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1282724 (view as bug list)
Environment:
Last Closed: 2015-11-17 09:06:53 UTC
Target Upstream Version:



Description Saurabh 2014-02-07 09:15:53 UTC
Description of problem:
A dd command hits "D" (uninterruptible sleep) state. The dd command is being used to create data inside a directory.

The volume in question has quota enabled, and the directory had a quota limit-set of 500GB. The data creation consisted simply of creating files of random size. While data creation was happening, I killed a few bricks and brought them back after a long time (about 10 hours); the bricks were brought back in order to trigger self-heal. Data creation continued uninterrupted throughout this set of operations.
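
The exact data-creation script is not attached to this bug; a minimal sketch of the kind of loop suggested by the dd line under "Actual results" (the file-name pattern, size range and working directory are assumptions) would be:

# hypothetical loop, run from inside the quota-limited directory on the NFS mount
i=0
while true; do
    blocks=$(( (RANDOM % 20000) + 1 ))     # random file size, in 1K blocks
    dd bs=1024 count=$blocks if=/dev/urandom of=./f.$i
    i=$(( i + 1 ))
done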

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.59rhs-1.el6rhs.x86_64

How reproducible:
happened once; the test case itself has only been tried once

Steps to Reproduce:
set up: RHS nodes A,B,C,D
clients: c1 and c2

1. create a volume of 6x2, start it
2. enable quota
3. mount the volume over NFS: from node A on c1 and from node C on c2.
4. create 2 directories, say dir1 and dir2
5. set limits of 500GB on these directories: for "dir1" from client c1 and for "dir2" from client c2.
6. start creating files of random sizes in these directories.
7. after some time kill the brick processes on RHS nodes 1 and 2.
8. bring the brick processes back, so that self-heal gets triggered (a rough command sketch follows below).
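
For reference, a rough command sketch of steps 1-8, assuming a volume named "testvol" with bricks under /rhs and mounts under /mnt/testvol (the volume name, brick layout, mount points and the choice of bricks killed in step 7 are all assumptions, not taken from the original report):

# step 1: 6x2 distribute-replicate volume across nodes A,B,C,D (brick layout assumed)
gluster volume create testvol replica 2 \
    A:/rhs/brick1 B:/rhs/brick1 C:/rhs/brick1 D:/rhs/brick1 \
    A:/rhs/brick2 B:/rhs/brick2 C:/rhs/brick2 D:/rhs/brick2 \
    A:/rhs/brick3 B:/rhs/brick3 C:/rhs/brick3 D:/rhs/brick3
gluster volume start testvol

# step 2: enable quota
gluster volume quota testvol enable

# step 3: NFS mounts
mount -t nfs -o vers=3 A:/testvol /mnt/testvol      # on c1
mount -t nfs -o vers=3 C:/testvol /mnt/testvol      # on c2

# steps 4-5: directories and 500GB limits
mkdir /mnt/testvol/dir1                             # from c1
mkdir /mnt/testvol/dir2                             # from c2
gluster volume quota testvol limit-usage /dir1 500GB
gluster volume quota testvol limit-usage /dir2 500GB

# step 7: kill the brick (glusterfsd) processes on the chosen nodes
pkill -f 'glusterfsd.*testvol'

# step 8: restart the killed bricks so self-heal gets triggered
gluster volume start testvol force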


Actual results:
step 8: self-heal finishes,
but one of the dd operations goes into D state, as can be seen here:

root     30434  0.0  0.0 105180   644 pts/1    D+   10:37   0:03 dd bs=1024 count=18130 skip=15099 if=/dev/urandom of=./f.29368
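
For reference, processes stuck in uninterruptible sleep can be listed on the client with something like the following (a generic check, not specific to this setup):

# list tasks in D state together with the kernel symbol they are blocked in
ps -eo pid,stat,wchan:32,args | awk '$2 ~ /^D/'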


Expected results:
the dd operation should not be affected

Additional info:
dmesg from client,
INFO: task dd:30434 blocked for more than 120 seconds.
      Not tainted 2.6.32-431.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dd            D 0000000000000001     0 30434  30433 0x00000080
 ffff88011cc79c78 0000000000000086 ffff8801ffffffff 000dfb094886950c
 ffff88003795c7c0 ffff880118e753d0 00000000110c3942 ffffffffad274724
 ffff88011a77d098 ffff88011cc79fd8 000000000000fbc8 ffff88011a77d098
Call Trace:
 [<ffffffff810a70a1>] ? ktime_get_ts+0xb1/0xf0
 [<ffffffff8111f930>] ? sync_page+0x0/0x50
 [<ffffffff815280a3>] io_schedule+0x73/0xc0
 [<ffffffff8111f96d>] sync_page+0x3d/0x50
 [<ffffffff81528b6f>] __wait_on_bit+0x5f/0x90
 [<ffffffff8111fba3>] wait_on_page_bit+0x73/0x80
 [<ffffffff8109b320>] ? wake_bit_function+0x0/0x50
 [<ffffffff81135bf5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff8111ffcb>] wait_on_page_writeback_range+0xfb/0x190
 [<ffffffff81120198>] filemap_write_and_wait_range+0x78/0x90
 [<ffffffff811baa3e>] vfs_fsync_range+0x7e/0x100
 [<ffffffff811bab2d>] vfs_fsync+0x1d/0x20
 [<ffffffffa02e58b0>] nfs_file_flush+0x70/0xa0 [nfs]
 [<ffffffff81185b6c>] filp_close+0x3c/0x90
 [<ffffffff81185c65>] sys_close+0xa5/0x100
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
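
The trace above came from dmesg; equivalent stacks can also be pulled on demand, assuming root access on the client and that /proc/<pid>/stack and sysrq are available on this kernel:

# kernel stack of the stuck dd (pid taken from the ps output above)
cat /proc/30434/stack

# or ask the kernel to dump all blocked (D state) tasks to the log
echo w > /proc/sysrq-trigger
dmesg | tail -n 100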

Comment 3 Vijaikumar Mallikarjuna 2015-11-17 09:06:53 UTC
As 2.1 is EOL'ed, closing this bug; filed 3.1 bug #1282724.

