Bug 820004 - nfs client hangs older nfs servers
Summary: nfs client hangs older nfs servers
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jeff Layton
QA Contact: Filesystem QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-05-08 20:46 UTC by Vilius Šumskas
Modified: 2012-12-11 11:09 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-12-11 11:09:51 UTC
Target Upstream Version:


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 770592 None None None Never

Description Vilius Šumskas 2012-05-08 20:46:30 UTC
Description of problem:
We are running RHEL 6.2 with the RHEL 6.3 beta kernel as an NFS client. This RHEL machine also runs vsftpd, which is configured to access the mounted NFS volume. The beta kernel was installed because of the issue in https://bugzilla.redhat.com/show_bug.cgi?id=770592 . Now we have another problem: whenever at least 5 users connect over FTP and start uploading data to the NFS volume, the actual NFS server (Mac OS X Server 10.5.8) hangs. We tried mounting volumes from two different Mac OS X Servers and the behavior is always the same: the NFS server just freezes. The amount of data coming in through FTP is ~50-70 Mbps. Reverting to the RHEL 6.1 kernel (2.6.32-131.17.1.el6.x86_64) fixes the problem.

Version-Release number of selected component (if applicable):
kernel-2.6.32-262.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install RHEL6.3 beta kernel.
2. Mount NFS volume from Mac OS X 10.5.8 server. Fstab entry:

10.1.1.223:/Volumes/Laidos/ftp  /mnt/ftp        nfs     nosuid,exec,nodev,proto=tcp  0 0

3. Install and configure vsftpd on RHEL to allow uploads into /mnt/ftp
4. Connect as many clients as you can and start uploading from all of them.
  
Actual results:
The NFS server hangs.

Expected results:
It should not hang.


Additional info:
I suspect RHEL may still be randomly losing the connection or packets, as in https://bugzilla.redhat.com/show_bug.cgi?id=770592 , but this just doesn't show up in the logs anymore. As the NFS protocol does not tolerate packet loss well, this could freeze the server.

Comment 2 RHEL Product and Program Management 2012-05-13 04:04:41 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 3 Vilius Šumskas 2012-06-25 14:59:35 UTC
The issue still persists with kernel-2.6.32-279.el6.x86_64 in the RHEL 6.3 release.

Can someone take a look into it?

Comment 4 Tore H. Larsen 2012-07-17 19:18:34 UTC
cc

Comment 5 Steve Dickson 2012-07-24 18:55:24 UTC
Could you please get a network trace of what's going over the wire? Something similar to:

tshark -w /tmp/data.pcap host <server>
bzip2 /tmp/data.pcap

Comment 6 Vilius Šumskas 2012-07-31 11:56:06 UTC
Here you go: http://www.tekila.lt/public/data.pcap.bz2

The capture was started right after rebooting into the RHEL 6.3 kernel. I then started uploading data through FTP to the NFS share, and the NFS server hung.

Comment 7 Jeff Layton 2012-07-31 12:15:57 UTC
Hmmm...sounds more like a problem with the server here. Even if the client is doing something it doesn't like, hanging is not really a good way to handle it.

Perhaps you should consider getting Apple's support organization involved?

Comment 8 Vilius Šumskas 2012-07-31 12:28:03 UTC
I completely support your statement that hanging is not a really good way to handle it, but if something can be done on the client side to make it more compatible with Apple (and other) products, it would be great.

I can try to report this to Apple, but their support is beyond terrible when it comes to server products. The response times are YEARS (literally) and their standard response is "we don't support versions other than the current version", even for companies with support contracts.

Comment 9 Jeff Layton 2012-07-31 12:34:31 UTC
To be clear, we're happy to help, but without some idea of why the server is hanging, it's going to end up being a game of "try this and see if it works", if we can even come up with things to try...

What you might also want to do is to get a capture of some of the network traffic between the client and server for the "working" case as well so you can compare and contrast between the two.
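
As a rough starting point for that comparison, here is a minimal Python sketch that summarizes a capture (packet count, captured bytes, time span) so the hanging and working traces can be set side by side. It assumes the files are in the classic libpcap format with microsecond timestamps; pcapng is not handled. The function name `summarize_pcap` and the file names in the comments are hypothetical, and a real analysis would still need Wireshark or tshark to look at the NFS and TCP details.

```python
import struct


def summarize_pcap(data):
    """Return packet count, captured bytes and time span of a classic pcap blob."""
    if len(data) < 24:
        raise ValueError("too short to be a pcap file")
    # The magic number tells us the byte order the file was written in.
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("unsupported format (pcapng and nanosecond pcap are not handled)")

    offset = 24  # skip the 24-byte global header
    packets = 0
    captured = 0
    first_sec = last_sec = None
    while offset + 16 <= len(data):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len
        ts_sec, _ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        offset += 16
        if offset + incl_len > len(data):
            break  # truncated final record, stop counting
        offset += incl_len
        packets += 1
        captured += incl_len
        if first_sec is None:
            first_sec = ts_sec
        last_sec = ts_sec

    duration = None if first_sec is None else last_sec - first_sec
    return {"packets": packets, "captured_bytes": captured, "duration_s": duration}


# Hypothetical usage with bzip2-compressed captures:
#   import bz2
#   for name in ("hang.pcap.bz2", "working.pcap.bz2"):
#       with bz2.open(name, "rb") as fh:
#           print(name, summarize_pcap(fh.read()))
```

A large difference in packets or bytes per second between the two captures would only be a hint about where to look, not a diagnosis.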

Comment 10 Vilius Šumskas 2012-07-31 13:33:15 UTC
A working case capture with RHEL 6.1 kernel: http://www.tekila.lt/public/data_working.pcap.bz2

Comment 11 Steve Dickson 2012-10-08 13:30:12 UTC
(In reply to comment #10)
> A working case capture with RHEL 6.1 kernel:
> http://www.tekila.lt/public/data_working.pcap.bz2

Unfortunately I don't see the hang in that trace... But I do see a number of
[TCP previous segment lost] packets, which might point to a network issue...

Comment 12 Vilius Šumskas 2012-10-08 15:08:03 UTC
I have already ruled that out by changing the router.

Even if it is a network problem, that doesn't explain why the server works with the 6.1 kernel and doesn't work with 6.3.

Comment 13 Jeff Layton 2012-12-11 11:09:51 UTC
Whether Apple's support is terrible or not, a hung server indicates a bug in the server. A network server of any sort ought to be able to handle anything the client throws at it without hanging. I've looked over the captures and I don't see anything wrong with what's being sent to the server here.

Without more to go on, I don't see anything that we can do. At this point, I'm going to call this a bug in Apple's product and close this as NOTABUG. If you can get their support organization involved, and they point out something wrong with what we're sending to them, then please reopen this and we'll be happy to take another look.

Also, with complex multi-vendor cases like this, it's generally a good idea to open a support case with RH support.

