Bug 79726 - strange system load on rh 7.2 web servers
Summary: strange system load on rh 7.2 web servers
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Larry Woodman
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-12-16 09:28 UTC by Tobias Meier
Modified: 2007-11-30 22:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-09-28 11:40:59 UTC


Attachments
here is the kernel debug stream (deleted), 2002-12-18 10:04 UTC, Tobias Meier
new debug stream (deleted), 2002-12-18 15:06 UTC, Tobias Meier
debug output advanced server (deleted), 2003-01-09 13:10 UTC, Tobias Meier
the same file again / without auto-detect content type (deleted), 2003-01-09 13:15 UTC, Tobias Meier
lspci output (deleted), 2003-01-09 14:00 UTC, Tobias Meier
/etc/fstab (deleted), 2003-01-09 14:04 UTC, Tobias Meier
interesting top output --> kswapd (deleted), 2003-01-09 15:38 UTC, Tobias Meier
new rh 7.2 kernel debug stream (deleted), 2003-01-13 21:56 UTC, Tobias Meier
top vmstat and ps output while the load is high (deleted), 2003-01-14 18:14 UTC, Tobias Meier

Description Tobias Meier 2002-12-16 09:28:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

Description of problem:
hi,
we use FSC Primary P250 boxes with 2 x 2.4 GHz xeon processors and an adaptec
2100S raid controller. the os is redhat 7.2 with a 2.4.18-18-7.xsmp kernel.
we installed squid and an apache server. the normal system load is between 0.1
and 0.5 (/proc/loadavg), but sometimes the load goes up to 4-6 and we don't
know why. the cpus are 98% idle and there is no io traffic.
after 2 or 3 hours the load goes back down to 0.1. the same problem appears with
a uniprocessor kernel.
as soon as we stop squid, the load goes back down to a normal value, but
after restarting squid, the problem appears again.

interesting: when we start bonnie++ while the load is up, the load rises to
10-12. killing the bonnie++ process 2 minutes later makes the load drop back
to a normal value (0.2), and it stays down.

sometimes top and the sar tool show wrong values. i saw an idle time
of 234567.98, for example, or a cpu usage of 234567%.
the other values suggest that the system is nearly idle, and the squid response
times look good :-)

we used 4 different setups: 4 boxes with smp kernel and redhat squid, 3 with up
kernel and redhat squid, 4 with smp kernel and our own squid with mod_gzip, and 2
with up kernel and our squid. the problem appears on all of these systems.

   tobias
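
A high load average with idle CPUs is possible on Linux because the load average counts tasks in uninterruptible sleep (state D) as well as runnable tasks. The sketch below is one way to check for that while the load is high; it only reads the state letter from /proc/<pid>/stat. The function names are hypothetical, and nothing in this report confirms that D-state tasks are the cause here.

```python
import os


def state_from_stat(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the LAST closing parenthesis.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_states(proc_root="/proc"):
    """Count processes per state letter (R, S, D, ...) under proc_root."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                state = state_from_stat(fh.read())
        except (OSError, IndexError):
            continue  # process exited while we were scanning
        counts[state] = counts.get(state, 0) + 1
    return counts
```

Calling count_states() during a load spike and looking at the "D" entry shows how many tasks are blocked in the kernel; a count close to the reported load of 4-6 would point at blocked tasks rather than CPU work.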

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. install a fsc xeon box with rh 7.2
2. install squid and apache
3. simulate traffic (100 requests/s)
4. wait 1 or 2 weeks and monitor the load
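
Step 4 can be automated so that a spike is noticed without watching the box for weeks. This is a minimal sketch that parses the contents of /proc/loadavg; the function names and the threshold of 2.0 are assumptions chosen for illustration (the report gives 0.1-0.5 as normal and 4-6 as abnormal).

```python
def parse_loadavg(text):
    """Parse /proc/loadavg text into (load1, load5, load15, runnable, total)."""
    fields = text.split()
    load1, load5, load15 = (float(f) for f in fields[:3])
    # The fourth field looks like "1/80": runnable entities / total entities.
    runnable, total = (int(n) for n in fields[3].split("/"))
    return load1, load5, load15, runnable, total


def is_spike(load1, threshold=2.0):
    """True when the 1-minute load is at or above the (assumed) threshold."""
    return load1 >= threshold
```

A cron job could read /proc/loadavg once a minute, pass the text to parse_loadavg, and log a timestamp whenever is_spike returns True.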
    

Additional info:

Comment 1 Arjan van de Ven 2002-12-16 09:52:18 UTC
if you can enable sysrq ("echo 1 > /proc/sys/kernel/sysrq") then using the
alt-sysrq-t key combination will spew a kernel debug stream to syslog. based on
that it's possible to see why/where the load is so high; please attach such
output here. (but only in the problem scenario; in the "healthy" case it's no use)

Comment 2 Ben LaHaise 2002-12-16 16:31:50 UTC
There are two separate bugs here: the obviously incorrect cpu usage (which may
be related to a missed timer tick or incorrect time accounting in the kernel),
and the high load triggering process death (likely a vm issue).
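
To illustrate how inconsistent time accounting could produce values such as 234567%, here is a small sketch. It assumes a monitoring tool that takes the difference of two idle-tick counters as a 32-bit unsigned value and divides it by the ticks that elapsed between samples; whether top or sar on this kernel work this way is an assumption, not something this report establishes.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned arithmetic


def idle_percent(prev_idle, cur_idle, elapsed_ticks):
    """Idle percentage from two idle-tick samples and the elapsed tick count."""
    # If the counter appears to go backwards (for example after an
    # accounting glitch), the unsigned difference wraps to a huge number.
    delta = (cur_idle - prev_idle) & MASK32
    return 100.0 * delta / elapsed_ticks


# Consistent samples: 500 idle ticks out of 500 elapsed ticks.
normal = idle_percent(1000, 1500, 500)

# Inconsistent samples: the counter is 10 ticks lower than before.
bogus = idle_percent(1000, 990, 500)
```

With consistent samples the result stays between 0 and 100. With a counter that moves backwards, the same formula returns an absurdly large percentage.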

Comment 3 Tobias Meier 2002-12-18 10:04:02 UTC
Created attachment 88791 [details]
here is the kernel debug stream

Comment 4 Arjan van de Ven 2002-12-18 10:19:08 UTC
looks like it got stuck on NFS :(

Comment 5 Tobias Meier 2002-12-18 10:32:32 UTC
we had the same problems without nfs. we can unmount the nfs devices and send
you a new debug stream.

Comment 6 Tobias Meier 2002-12-18 15:06:40 UTC
Created attachment 88795 [details]
new debug stream

after unmounting all nfs shares, stopping the nfs services, and removing the
nfs kernel module with rmmod, the load is still 4.

Comment 7 Tobias Meier 2002-12-18 15:45:18 UTC
 fyi: we have the same problems as described in bug id: 64984. perhaps our
load problems result from this bug.
 

Comment 8 Tobias Meier 2003-01-09 13:00:57 UTC
ok, we have the same problems with redhat advanced server. we solved our nfs
problems and the load problem is still here.

Comment 9 Tobias Meier 2003-01-09 13:10:42 UTC
Created attachment 89234 [details]
debug output advanced server

Comment 10 Tobias Meier 2003-01-09 13:15:26 UTC
Created attachment 89235 [details]
the same file again / without auto-detect content type

Comment 11 Tobias Meier 2003-01-09 14:00:17 UTC
Created attachment 89236 [details]
lspci output

Comment 12 Tobias Meier 2003-01-09 14:04:48 UTC
Created attachment 89237 [details]
/etc/fstab

Comment 13 Tobias Meier 2003-01-09 14:36:38 UTC
is there a way to get the kernel debug stream without pressing the sysrq keys ?
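
On kernels that provide /proc/sysrq-trigger, writing a single command character to that file has the same effect as the key combination, so "t" requests the task dump without keyboard access. The sketch below assumes that file exists and that the caller is root; the 2.4.18 kernel from this report may not have it. The function name is hypothetical.

```python
def trigger_sysrq(key="t", path="/proc/sysrq-trigger"):
    """Write one sysrq command character to the trigger file.

    Assumes the running kernel exposes `path` and that sysrq is enabled
    (see /proc/sys/kernel/sysrq). Requires root on a real system.
    """
    if len(key) != 1:
        raise ValueError("sysrq key must be a single character")
    with open(path, "w") as fh:
        fh.write(key)
```

The output lands in the kernel log and syslog, the same place as with alt-sysrq-t.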

Comment 14 Tobias Meier 2003-01-09 15:38:47 UTC
Created attachment 89243 [details]
interesting top output --> kswapd

Comment 15 Tobias Meier 2003-01-13 21:56:27 UTC
Created attachment 89341 [details]
new rh 7.2 kernel debug stream

Comment 16 Bastien Nocera 2003-01-14 17:02:28 UTC
Hello Tobias,

We would need some more information, related to your finding about kswapd.

We would need the data from:
- readprofile
- top
- vmstat
when the problem occurs. It will give us more insight into the problem, now that
we know that kswapd might be involved.

Cheers
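
The three data sources requested above can be captured in one pass while the load is high. This is a sketch under assumptions: the helper names are hypothetical, the System.map path is a guess that must match the running kernel, and readprofile only returns data if the kernel was booted with profiling enabled (the profile= boot parameter).

```python
import subprocess


def diagnostic_commands(system_map="/boot/System.map"):
    """Command lines for the requested tools; system_map is an assumed path."""
    return {
        "top": ["top", "-b", "-n", "1"],        # one batch-mode snapshot
        "vmstat": ["vmstat", "1", "5"],          # five samples, one second apart
        "readprofile": ["readprofile", "-m", system_map],
    }


def collect(commands, run=subprocess.run):
    """Run each command and return its output keyed by name."""
    results = {}
    for name, argv in commands.items():
        try:
            proc = run(argv, capture_output=True, text=True, timeout=60)
            results[name] = proc.stdout or proc.stderr
        except (OSError, subprocess.SubprocessError) as exc:
            results[name] = "failed: %s" % exc  # tool missing or timed out
    return results
```

Calling collect(diagnostic_commands()) and writing each entry to a file gives one attachment per tool.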

Comment 17 Tobias Meier 2003-01-14 18:14:53 UTC
Created attachment 89356 [details]
top vmstat and ps output while the load is high

Comment 18 Larry Woodman 2005-09-28 11:40:59 UTC
Let me know if this is still a problem.

Larry Woodman

