Bug 155313 - NFS over UDP timeouts due to oversmall RPC_RTO_MIN
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-04-19 03:06 UTC by Damian Menscher
Modified: 2007-11-30 22:07 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-10-19 19:04:25 UTC



Description Damian Menscher 2005-04-19 03:06:40 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)

Description of problem:
When using NFS over UDP, the timeo mount option is ignored in favor of an adaptive round-trip time (RTT) estimator. That this contradicts all known documentation is a bug in itself. The minimum timeout the adaptive estimator allows is set in .../net/sunrpc/timer.c as
#define RPC_RTO_MIN (HZ/30)
Note that on fast hardware, the timeo will tune itself down to about 0.04s.  With the default retransmit value (retrans=3) this gives a server just 0.6s to respond.  This will lead to frequent timeouts, which can cause data corruption in the case of a soft mount.

I recommend using the value from recent kernels: HZ/10, which will give the server a minimum of 1.5s to respond.
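
The arithmetic behind the 0.6s and 1.5s figures can be sketched in C. This is only an illustration, not kernel code: it assumes the timeout doubles on each retransmission, which is the model that reproduces both figures quoted above. The function name response_window_ms is hypothetical, and 40 ms stands in for the "about 0.04s" floor that was observed.

```c
/*
 * Sketch only. Assumption: the first attempt waits rto_ms, and each of
 * the `retrans` retransmissions waits twice as long as the attempt
 * before it. The server's total window is the sum of those waits.
 */
static unsigned response_window_ms(unsigned rto_ms, unsigned retrans)
{
    unsigned total = 0;
    unsigned timeout = rto_ms;
    unsigned attempt;

    /* attempt 0 is the original request; 1..retrans are retransmits */
    for (attempt = 0; attempt <= retrans; attempt++) {
        total += timeout;
        timeout *= 2;   /* assumed exponential backoff */
    }
    return total;
}
```

Under that assumption, response_window_ms(40, 3) evaluates to 600 (the 0.6s case) and response_window_ms(100, 3) evaluates to 1500 (the 1.5s case with HZ/10).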

Version-Release number of selected component (if applicable):
kernel-2.4.21-27.0.2.EL

How reproducible:
Always

Steps to Reproduce:
1. Run "du" on a soft-mounted partition (with UDP as the transport); this will often show the problem.  It is easier, though, to just read the kernel source.


Actual Results:  The RPC call returns ETIMEDOUT, which is passed to the calling program as EIO, and an error is logged to the syslog.
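
For illustration, here is a minimal sketch of how a calling program might notice that EIO rather than silently treating a failed read as complete data. It uses only standard POSIX calls; the helper name read_checked is hypothetical, and nothing here is specific to NFS beyond the errno value described above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read up to `cap` bytes from `path` into `buf`.
 * Returns the number of bytes read, or -1 with errno preserved.
 * On a soft mount, the timeout described above would surface here
 * as errno == EIO.
 */
static long read_checked(const char *path, char *buf, size_t cap)
{
    size_t used = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    while (used < cap) {
        ssize_t n = read(fd, buf + used, cap - used);
        if (n < 0) {
            int saved;
            if (errno == EINTR)
                continue;        /* interrupted by a signal, retry */
            saved = errno;       /* close() may overwrite errno */
            close(fd);
            errno = saved;
            return -1;
        }
        if (n == 0)
            break;               /* end of file */
        used += (size_t)n;
    }
    close(fd);
    return (long)used;
}
```

A caller would compare the return value against -1 and inspect errno; treating a short or failed read as success is one way an unwarranted timeout can turn into corrupted output.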

Additional info:

This one should be a no-brainer... it's been fixed in the mainstream kernel for a fairly long time.  Marking as "high" severity since data corruption could result from an unwarranted NFS timeout.

Comment 3 Damian Menscher 2005-04-19 21:32:03 UTC
It's probably worth mentioning (for the benefit of others with this problem) 
that setting retrans=5 (or larger) as a mount option will approximate 
the "correct" behavior.  You need to umount/mount, as the remount option 
doesn't appear to let you change NFS mount options (see Bug 155392).
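
As a rough check on why retrans=5 is the smallest value that works, the sketch below finds the minimum retrans whose total window reaches a target. It is an illustration under an assumption, not kernel code: the timeout is taken to double on each retransmission, starting from a floor of about 40 ms (the "about 0.04s" from the description). The name min_retrans_for_window is hypothetical.

```c
/*
 * Sketch only. Assumption: the first attempt waits rto_ms and every
 * retransmission waits twice as long as the attempt before it.
 * Returns the smallest retrans whose summed waits reach target_ms.
 */
static unsigned min_retrans_for_window(unsigned rto_ms, unsigned target_ms)
{
    unsigned retrans = 0;
    unsigned timeout = rto_ms;
    unsigned total = rto_ms;

    if (rto_ms == 0)
        return 0;   /* guard: a zero timeout would never reach the target */

    while (total < target_ms) {
        timeout *= 2;       /* assumed exponential backoff */
        total += timeout;
        retrans++;
    }
    return retrans;
}
```

With a 40 ms floor and the 1.5s window that HZ/10 would give, min_retrans_for_window(40, 1500) evaluates to 5: under the same assumption, retrans=4 only reaches 1240 ms, while retrans=5 reaches 2520 ms.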

Comment 4 RHEL Product and Program Management 2007-10-19 19:04:25 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.

