Bug 158039 - nfsd oopses on testing kernel update for FC3
Summary: nfsd oopses on testing kernel update for FC3
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 3
Hardware: i686
OS: Linux
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Brian Brock
Depends On:
Reported: 2005-05-18 01:10 UTC by Alexandre Oliva
Modified: 2007-11-30 22:11 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2005-05-19 14:10:41 UTC

Attachments
Oopses (deleted)
2005-05-18 01:11 UTC, Alexandre Oliva

Description Alexandre Oliva 2005-05-18 01:10:26 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.8) Gecko/20050512 Fedora/1.0.4-2 Firefox/1.0.4

Description of problem:
Got all of these oopses on the same box over the past few weeks, running various different kernels.  It might be faulty hardware, so take it with a grain of salt, but I don't have any other boxes with an identical hardware configuration to tell whether it's something specific to the set of modules involved, nor easy local access to run hardware tests.  There are two ext3 oopses and some nfsd oopses from the stable kernel as well.  Could this all be caused by filesystem corruption?  I'm thinking of bringing the system down for an fsck.

Version-Release number of selected component (if applicable):

How reproducible:
Didn't try

Steps to Reproduce:
1. Boot up either the stable or the testing 2.6.11 FC3 kernel and let it run for days.

Actual Results:  Oopses I'll attach.

Expected Results:  No such oopses.

Additional info:

Comment 1 Alexandre Oliva 2005-05-18 01:11:26 UTC
Created attachment 114493

Comment 2 Alexandre Oliva 2005-05-18 03:00:59 UTC
fsck didn't find any inconsistencies, but a local user recently reported a
suspicion of overheating, and the failures appear to be related to peak use.

Comment 3 Steve Dickson 2005-05-18 11:38:07 UTC
Oopses are never good for data integrity.
Why do you think this is faulty hardware?

Comment 4 Alexandre Oliva 2005-05-18 15:15:15 UTC
That was the suspicion of another sysadmin.  Apparently the box has never been
exactly rock solid, with some programs crashing every now and then, odd messages
on cron mail, and so on, but this had never (apparently) affected its ability to
serve out filesystems over nfs.  The box was recently taken off to a computer
repair facility at the uni, and they suspected the goop that attaches the cooler
to the processor might be at fault, and replaced it, but that had no effect
whatsoever.  If anything, crashes are now more frequent.

Besides, we have many other boxes running NFS servers with the very same
software, although not exactly the same hardware, so I found it unlikely that
things would crash so often for one box and not for others.  This one isn't even
the most heavily used server.  I figured that if such oopses were hitting
others, you'd know about it, so I thought I'd file it, but don't waste too much
time on it until we can get better assurance that it's not caused by hardware
problems.  I downgraded to 2.6.10-1.670_FC3 yesterday, and now the box is
offline.  I can't tell whether it crashed or was taken to the repair facility
again.  Aah, the wonders of being a remote sysadmin :-)

Comment 5 Alexandre Oliva 2005-05-19 14:10:41 UTC
The box failed again, and was taken to the repair office again.  They ran a
memtest again, and found both memory modules to be defective.  I'll probably
have to go on site and verify the testing, but we're now pretty sure it's
hardware failure.  Sorry about the noise.

(s/1.670_FC3/1.770_FC3/ in the previous comment, BTW)
