Bug 157710 - rename(2) can deadlock on a distributed filesystem.
Summary: rename(2) can deadlock on a distributed filesystem.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Alexander Viro
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-05-13 22:14 UTC by Michael Gaughen
Modified: 2012-06-20 16:18 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-20 16:18:03 UTC


Attachments
Proposed patch to fix rename(2) deadlock. (deleted)
2005-05-13 22:16 UTC, Michael Gaughen

Description Michael Gaughen 2005-05-13 22:14:55 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.2) Gecko/20040803

Description of problem:
On a distributed filesystem, there is no guarantee that path_lookup() will
return a valid dentry while another node is executing rename(2) on the same
path hierarchy.  lock_rename() performs this check:

  struct dentry *lock_rename(struct dentry *p1, struct dentry *p2)
  {
        ...                                       
        if (p1 == p2) {
                down(&p1->d_inode->i_sem);
                return NULL;
        }
        ...

and in the case of a distributed filesystem, the dentries (p1 and p2) can be
different, yet refer to the same inode.  In that case, the p1 == p2 check above
does not match, and a later attempt to do:

        ...
        down(&p2->d_inode->i_sem);
        down(&p1->d_inode->i_sem);
        ...

will result in an attempt to down() the *same* ->i_sem twice, resulting in a
deadlock.
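
The failure mode can be modelled in userspace.  The following is a minimal
sketch, not kernel code: the model_* types and functions are hypothetical
stand-ins, model_down() returns false where the real down() would sleep
forever, and the ancestor checks elided by the "..." above are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel definitions. */
struct model_sem    { bool held; };
struct model_inode  { struct model_sem i_sem; };
struct model_dentry { struct model_inode *d_inode; };

/* Non-blocking stand-in for down(): returns false where the real
 * down() would sleep forever because the semaphore is already held. */
static bool model_down(struct model_sem *sem)
{
        assert(sem != NULL);
        if (sem->held)
                return false;
        sem->held = true;
        return true;
}

/* Follows only the two steps quoted in the report.
 * Returns false if the second down() would self-deadlock. */
static bool model_lock_rename(struct model_dentry *p1,
                              struct model_dentry *p2)
{
        /* Same pointer comparison as the quoted check. */
        if (p1 == p2)
                return model_down(&p1->d_inode->i_sem);

        /* Distinct dentries: take both semaphores, p2 first. */
        if (!model_down(&p2->d_inode->i_sem))
                return false;
        return model_down(&p1->d_inode->i_sem);
}
```

With two distinct model_dentry objects whose d_inode points at the same
model_inode, model_lock_rename() returns false, which corresponds to the
second down() never returning.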


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
I don't have a good test case to reproduce this.  It requires multiple nodes,
performing renames on the same path hierarchy, on a distributed filesystem.
  

Additional info:

Comment 1 Michael Gaughen 2005-05-13 22:16:39 UTC
Created attachment 114367
Proposed patch to fix rename(2) deadlock.

Instead of comparing the two dentries for equality, the patch changes
lock_rename() and unlock_rename() to compare the dentries' ->i_sem.
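
The attachment itself is no longer available, so the following is only a
guess at the shape of that comparison, written as a userspace sketch.  All
sketch_* names are hypothetical, the semaphore is modelled as a plain
counter, and the s_vfs_rename_sem and ancestor handling of the real
functions is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; count is 1 when free, 0 when held. */
struct sketch_sem    { int count; };
struct sketch_inode  { struct sketch_sem i_sem; };
struct sketch_dentry { struct sketch_inode *d_inode; };

/* Compare the semaphores the dentries lead to, not the dentries. */
static bool sketch_same_i_sem(const struct sketch_dentry *p1,
                              const struct sketch_dentry *p2)
{
        assert(p1 && p2 && p1->d_inode && p2->d_inode);
        return &p1->d_inode->i_sem == &p2->d_inode->i_sem;
}

/* Stand-in for down(); the assert marks where the real call would block. */
static void sketch_down(struct sketch_sem *sem)
{
        assert(sem->count == 1);
        sem->count = 0;
}

/* Stand-in for up(). */
static void sketch_up(struct sketch_sem *sem)
{
        assert(sem->count == 0);
        sem->count = 1;
}

/* Returns the number of semaphores taken (1 or 2). */
static int sketch_lock_rename(struct sketch_dentry *p1,
                              struct sketch_dentry *p2)
{
        if (sketch_same_i_sem(p1, p2)) {
                sketch_down(&p1->d_inode->i_sem);
                return 1;
        }
        sketch_down(&p2->d_inode->i_sem);
        sketch_down(&p1->d_inode->i_sem);
        return 2;
}

/* Releases exactly what sketch_lock_rename() took. */
static void sketch_unlock_rename(struct sketch_dentry *p1,
                                 struct sketch_dentry *p2)
{
        sketch_up(&p1->d_inode->i_sem);
        if (!sketch_same_i_sem(p1, p2))
                sketch_up(&p2->d_inode->i_sem);
}
```

Using the same comparison in both functions keeps the lock and unlock paths
symmetric, so an aliased pair of dentries takes and releases the shared
semaphore exactly once.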

Comment 2 Alexander Viro 2005-06-14 09:08:47 UTC
Which distributed fs are we talking about and what other changes of
locking scheme does it make?  If we ever get multiple dentries for
a directory inode, we are in much more trouble than just lock_rename()
deadlock.

Comment 3 Michael Gaughen 2005-06-22 19:41:36 UTC
We are talking about PolyServe's PSFS filesystem.  I haven't tried to reproduce
this problem on other distributed filesystems (e.g. GFS), so I can't say for sure
whether they would encounter this deadlock, though it seems likely.  The problem
is that there is no guarantee that the path_lookup()s, inside of do_rename(),
will return valid old/new dentry/inode pairs when multiple nodes are renaming
the same path hierarchy.  And (at least for us) that is alright as our
filesystem can deal with that.  However, lock_rename() deadlocks before we are
even called.  Of course this problem doesn't exist on a single node, and may or
may not exist on other distributed filesystems.

Comment 4 Jiri Pallich 2012-06-20 16:18:03 UTC
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review is now End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.

