
Bug 454286

Summary: problems bringing up lockd after it has been taken down
Summary: problems bringing up lockd after it has been taken down
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Target Milestone: rc
Reporter: Jeff Layton <jlayton>
Assignee: Jeff Layton <jlayton>
QA Contact: Martin Jenner <mjenner>
CC: ram_kesavan, staubach, steved
Doc Type: Bug Fix
Last Closed: 2008-09-29 12:02:02 UTC

Description Jeff Layton 2008-07-07 14:52:16 UTC
Do this on 2.6.18-92.1.6.el5debug kernel after a fresh reboot:

1) mount a tcp NFSv3 filesystem
2) unmount it
3) service nfs start

...nfsd will fail to start because lockd_up fails. From dmesg:

FS-Cache: Loaded
FS-Cache: netfs 'nfs' registered for caching
SELinux: initialized (dev 0:17, type nfs), uses genfs_contexts
Installing knfsd (copyright (C) 1996
SELinux: initialized (dev nfsd, type nfsd), uses genfs_contexts
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
lockd_up: makesock failed, error=-98
lockd_down: no lockd running.
nfsd: last server has exited
nfsd: unexporting all filesystems

...then if you do a "service nfs restart":

lockd_up: no pid, 2 users??
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
nfsd: last server has exited
nfsd: unexporting all filesystems

I think we have a couple of bugs here. Something is causing the makesock
to fail, and when that happens, lockd_up isn't handling the error condition
appropriately and is throwing off the nlmsvc_users counter.

I suspect this is a regression from 5.1, but I need to confirm it.

Comment 1 Jeff Layton 2008-07-07 16:27:14 UTC
Actually, this doesn't appear to be a regression. When I do the same test on
-8.el5, I get these messages:

lockd_up: makesock failed, error=-98
lockd_up: no pid, 2 users??
lockd_up: no pid, 3 users??
lockd_up: no pid, 4 users??
lockd_up: no pid, 5 users??
lockd_up: no pid, 6 users??
lockd_up: no pid, 7 users??
lockd_up: no pid, 8 users??

...and lockd isn't started. Since no one has complained about this, I'll put
this on 5.4 proposed for now. If the fix turns out to be simple I may move it to

Comment 2 Jeff Layton 2008-07-08 15:46:03 UTC
This problem has strangely "fixed itself". Yesterday, I could reliably reproduce
this. Today, I can't make it happen.

The host where I saw this was a RHEL5 fully-virtualized Xen guest. It looks
like the power blinked at the office and the Xen dom0 rebooted. I brought my
RHEL5 image back up and now this isn't happening anymore. It seems unlikely,
but maybe this has something to do with being a guest on a long-running dom0?

I'll leave this open for now in case it happens again...

Comment 3 Jeff Layton 2008-09-29 12:01:44 UTC
Closing this out. I've not seen this problem since, though it still worries me that I saw it at all. I'll reopen it if it returns.

Comment 4 Ram Kesavan 2009-05-20 19:29:53 UTC
I am not sure if this is important, but you will get this error if the portmapper is not running. Start the portmapper (/etc/init.d/portmapper), then try the mount again and it will work properly.