Bug 454286 - problems bringing up lockd after it has been taken down
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jeff Layton
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2008-07-07 14:52 UTC by Jeff Layton
Modified: 2009-05-20 19:29 UTC
CC List: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-09-29 12:02:02 UTC



Description Jeff Layton 2008-07-07 14:52:16 UTC
Do this on a 2.6.18-92.1.6.el5debug kernel after a fresh reboot:

1) mount a tcp NFSv3 filesystem
2) unmount it
3) service nfs start

...nfsd will fail to start because lockd_up fails. From dmesg:

FS-Cache: Loaded
FS-Cache: netfs 'nfs' registered for caching
SELinux: initialized (dev 0:17, type nfs), uses genfs_contexts
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
SELinux: initialized (dev nfsd, type nfsd), uses genfs_contexts
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
lockd_up: makesock failed, error=-98
lockd_down: no lockd running.
nfsd: last server has exited
nfsd: unexporting all filesystems
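
For the record, the error value in the makesock line is a negated errno: on
Linux, 98 is EADDRINUSE ("Address already in use"), which suggests lockd's
port was still bound when we tried to bring it back up. A trivial userspace
snippet to decode such values:

#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* prints: error 98: Address already in use */
    printf("error %d: %s\n", EADDRINUSE, strerror(EADDRINUSE));
    return 0;
}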

...then if you do a "service nfs restart":

lockd_up: no pid, 2 users??
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
nfsd: last server has exited
nfsd: unexporting all filesystems

...so I think we have a couple of bugs here. Something is causing the makesock
to fail, and when this occurs lockd_up isn't handling the error condition
appropriately, which throws off the nlmsvc_users counter.

I suspect this is a regression from 5.1, but I need to confirm it.
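
To make the counter problem concrete, here is a small userspace simulation of
an increment-before-error-handling pattern that would explain the messages
above. This is only a sketch, not the actual fs/lockd/svc.c source; makesock()
is a stub that always fails the way the log shows:

#include <errno.h>
#include <stdio.h>

static int nlmsvc_pid;            /* pid of the lockd thread, 0 if not running */
static unsigned int nlmsvc_users; /* number of callers that want lockd up */

/* stub: pretend the socket setup always fails with -EADDRINUSE (-98) */
static int makesock(void)
{
    return -EADDRINUSE;
}

static int lockd_up(void)
{
    int error = 0;

    /* the user count is bumped before we know lockd can start */
    nlmsvc_users++;

    if (nlmsvc_pid)
        goto out;    /* already running */

    if (nlmsvc_users > 1)
        printf("lockd_up: no pid, %u users??\n", nlmsvc_users);

    error = makesock();
    if (error < 0) {
        printf("lockd_up: makesock failed, error=%d\n", error);
        /* a correct error path would decrement nlmsvc_users here */
        goto out;
    }
    /* ...otherwise we would start the lockd thread and set nlmsvc_pid... */
out:
    return error;
}

int main(void)
{
    int i;

    /* each nfsd start/restart attempt calls lockd_up(); the leaked count
       grows, reproducing the escalating "no pid, N users??" messages */
    for (i = 0; i < 3; i++)
        lockd_up();
    return 0;
}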

Comment 1 Jeff Layton 2008-07-07 16:27:14 UTC
Actually, this doesn't appear to be a regression. When I do the same test on
-8.el5, I get these messages:

lockd_up: makesock failed, error=-98
lockd_up: no pid, 2 users??
lockd_up: no pid, 3 users??
lockd_up: no pid, 4 users??
lockd_up: no pid, 5 users??
lockd_up: no pid, 6 users??
lockd_up: no pid, 7 users??
lockd_up: no pid, 8 users??


...and lockd isn't started. Since no one has complained about this, I'll put
this on 5.4 proposed for now. If the fix turns out to be simple, I may move it
to 5.3...



Comment 2 Jeff Layton 2008-07-08 15:46:03 UTC
This problem has strangely "fixed itself". Yesterday, I could reliably reproduce
this. Today, I can't make it happen.

The host where I saw this was a RHEL5 FV xen guest. It looked like the power
blinked at the office and the xen dom0 rebooted. I brought my RHEL5 image back
up, and now this isn't happening anymore. It seems unlikely, but maybe this has
something to do with being a guest on a long-running dom0?

I'll leave this open for now in case it happens again...


Comment 3 Jeff Layton 2008-09-29 12:01:44 UTC
Closing this out. I've not seen this problem since, though it still worries me that I saw it at all. I'll reopen it if it returns.

Comment 4 Ram Kesavan 2009-05-20 19:29:53 UTC
I am not sure if this is important, but you will get this error if the
portmapper is not running. Start the portmapper ("/etc/init.d/portmap start")
and try the mount again; it will work properly.
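
One way to check that angle from code (it gives the same information as
running "rpcinfo -p") is to ask the portmapper for its registration list and
for lockd's port. This is just a diagnostic sketch against the classic SunRPC
interface built into glibc of this era; 100021 is the NLM (lockd) RPC program
number, and version 4 is the one NFSv3 uses:

#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <rpc/rpc.h>
#include <rpc/pmap_clnt.h>

#define NLM_PROG 100021UL

int main(void)
{
    struct sockaddr_in addr;
    u_short port;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* pmap_getmaps() returns NULL if no portmapper answers on port 111 */
    if (pmap_getmaps(&addr) == NULL) {
        printf("portmapper is not running on localhost\n");
        return 1;
    }

    port = pmap_getport(&addr, NLM_PROG, 4, IPPROTO_UDP);
    if (port == 0)
        printf("portmapper is up, but lockd (NLM) is not registered\n");
    else
        printf("lockd is registered on UDP port %u\n", (unsigned int)port);
    return 0;
}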

