Bug 80279 - ksoftirqd_CPU0 hits 100% when running iostat
Summary: ksoftirqd_CPU0 hits 100% when running iostat
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Don Howard
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-12-23 22:08 UTC by Anthony Marusic
Modified: 2007-11-30 22:06 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-07-26 18:12:19 UTC



Description Anthony Marusic 2002-12-23 22:08:21 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020918

Description of problem:


While running 9i RAC tests, we monitored overall performance with the
top utility.  All 4 CPUs were sharing the workload evenly, as shown by
user CPU time on every CPU ranging between 85% and 100%.
When we ran "iostat -x 3" to check disk I/O performance, the
ksoftirqd_CPU0 process climbed to 100% system CPU time and stayed at 100% for the
rest of the test.  At the same time, CPU1, CPU2, and CPU3 dropped to 1-3% user and
1-3% system CPU time.  The ksoftirqd_CPU0 process showed the same behavior
when we started a second test.  The condition cleared only when we restarted
the database.  In addition, many of the counters for "iostat -x 3"
(%util, avgqu-sz, avgrq-sz, svctm, etc.) appeared to show cumulative results
rather than resetting to give proper 3-second statistics.

Background Information: 
2-way, 2.8GHz server with Hyper-Threading turned on
      * kernel -- 2.4.9-e.8 enterprise.AS2.1 i686
      * sysstat 4.0.1 Release 2 
      * Oracle testing: 
      * Mixed read/write IO on 4 tablespaces, across 4 CPUs 
                Running 60 oracle processes 
                One Oracle Instance 

The 2 major problems are:

1) CPU0 ends up running at 100% system time, severely impacting performance

2) iostat does not seem to clear its counters for the specified interval, or
ever for that matter.
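
For reference, interval statistics of the kind "iostat -x 3" is expected to
print are normally derived by subtracting the previous sample of a cumulative
counter from the current one. The following is only a minimal sketch of that
arithmetic, not sysstat's actual code: the function name, the counter names
and the sample values are all hypothetical.

```python
def interval_rates(prev, curr, seconds):
    """Turn two cumulative counter snapshots into per-second rates.

    prev and curr map a counter name to its cumulative value; seconds is
    the time between the two snapshots. All names here are hypothetical.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    rates = {}
    for name, value in curr.items():
        # Subtract the previous sample so only this interval is counted.
        delta = value - prev.get(name, 0)
        rates[name] = delta / seconds
    return rates


# Hypothetical cumulative counters sampled 3 seconds apart.
first = {"reads": 1000, "writes": 400}
second = {"reads": 1300, "writes": 430}

rates = interval_rates(first, second, 3)

# For comparison only: with no previous sample to subtract, the figures
# track the running totals instead of the last 3 seconds.
no_baseline = interval_rates({}, second, 3)
```

With these made-up numbers, rates["reads"] is 100.0 per second while
no_baseline["reads"] is well above that. This illustrates what "cumulative"
output would look like; it is not a diagnosis of the behavior reported here.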

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Start 60 Oracle processes, issuing continuous reads/writes to 4 tablespaces.
Oracle tests run multiple select & insert/update statements.
2. Run iostat during the tests.
3. top will show CPU0 at 100% system util while CPU1, CPU2 & CPU3 are at .03%
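
To record the per-CPU imbalance from step 3 without relying on top, the
system-time share of each CPU can be computed from two snapshots of the
per-CPU lines in /proc/stat. This is a rough sketch that assumes each "cpuN"
line begins with the user, nice, system and idle counters in that order. The
function names and the sample text below are made up for illustration and are
not taken from the affected machine.

```python
def parse_cpu_lines(text):
    """Return {cpu_name: [user, nice, system, idle]} from /proc/stat-style text."""
    cpus = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 ..."; skip the aggregate "cpu" line.
        if fields and fields[0].startswith("cpu") and fields[0][3:].isdigit():
            cpus[fields[0]] = [int(x) for x in fields[1:5]]
    return cpus


def system_share(before, after):
    """Percentage of each CPU's elapsed ticks spent in system time."""
    shares = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue
        deltas = [n - o for n, o in zip(new, old)]
        total = sum(deltas)
        shares[name] = 100.0 * deltas[2] / total if total else 0.0
    return shares


# Made-up snapshots in /proc/stat layout, taken some seconds apart.
SAMPLE_BEFORE = """cpu  200 0 200 1600
cpu0 100 0 100 800
cpu1 100 0 100 800
"""
SAMPLE_AFTER = """cpu  203 0 503 1894
cpu0 100 0 400 800
cpu1 103 0 103 1094
"""

shares = system_share(parse_cpu_lines(SAMPLE_BEFORE),
                      parse_cpu_lines(SAMPLE_AFTER))
```

On a live system the two text snapshots would be read from /proc/stat a few
seconds apart; in the sample data cpu0 spends the whole interval in system
time while cpu1 is almost idle.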
    

Additional info:

kernel -- 2.4.9-e.8 enterprise.AS2.1 i686
sysstat 4.0.1 Release 2

Comment 1 jeffrey.buchsbaum 2003-02-13 17:32:31 UTC
This is the same bug really as 83789.  I have the same problem with just one
telnet session running.

Dell 530, dual 2.4GHz
4GB ECC RAM

Comment 2 Alan Cox 2003-06-05 14:51:14 UTC
I'm unconvinced they are the same thing.


Comment 3 edward dertouzas 2003-07-27 05:32:21 UTC
    6 root      34  19     0    0     0 SWN   0.0  0.0 319:40 ksoftirqd_CPU0
   10 root      15   0     0    0     0 SW    0.0  0.0  95:25 kswapd
   13 root      15   0     0    0     0 SW    0.0  0.0 168:12 bdflush

Linux 2.4.9-e.12enterprise #1 SMP Tue Feb 11 01:29:18 EST 2003 i686 unn

This happens in a production environment running Oracle. Every few days the 
system becomes completely unresponsive except for rudimentary functioning 
of the TCP/IP stack. connect() returns success, but the remote host just idles 
from that point forward. The server is unresponsive on the console until it either 
recovers (anywhere between 5 and 45 minutes, usually after the Oracle listener and 
db have died) or the host is manually power-cycled.

Note: this is not connected with any orinoco problems.

Comment 4 Nils Philippsen 2004-01-13 15:38:13 UTC
Does the problem still show with recent kernels?

Anyway, this sounds like a kernel/scheduling problem to me, even more
so because the problem shows not only when iostat runs.

Comment 5 Charlie Bennett 2004-09-23 19:29:43 UTC
handing off to the kernel group

Comment 6 Jason Baron 2004-09-23 19:46:23 UTC
This is an old one. I think we should start with reproducing it on the
latest RHEL 2.1 kernel, e.49. Thanks.

Comment 7 Don Howard 2006-03-16 21:57:13 UTC
This is a truly ancient report.  If there is no update here in the next two
weeks demonstrating this bug on a current 2.1 kernel, this ticket will be closed.

