Bug 84109 - scheduler priority bug
Summary: scheduler priority bug
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-02-12 11:46 UTC by Marc Schmitt
Modified: 2007-04-18 16:51 UTC
CC List: 2 users

Fixed In Version: 2.4.20-13.7smp
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-05-27 12:53:31 UTC


Attachments:

Description Marc Schmitt 2003-02-12 11:46:59 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20021120
Netscape/7.01

Description of problem:
We always run low-priority jobs in the background. The scheduler gets
confused and ends up running jobs with nice +19 at almost top priority.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. launch 2 jobs with default priority
2. launch 4 jobs with nice +19
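
A minimal reproduction sketch in C, assuming the background jobs behave like
plain CPU-bound processes (the reporter's actual workloads, darwin and
Switch4.LI, are not available, so the busy loop below is only a stand-in):

/* repro.c - illustrative reproduction: spawn CPU-bound children at nice 0
 * and at nice +19, then watch them in top.
 * Build: gcc -O2 -o repro repro.c
 * Stop:  Ctrl-C kills the whole process group.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>

static void spin(void)
{
    volatile unsigned long x = 0;
    for (;;)                        /* pure CPU burner, never exits */
        x++;
}

int main(void)
{
    int i;

    for (i = 0; i < 2; i++)         /* 2 jobs at default priority (nice 0) */
        if (fork() == 0)
            spin();

    for (i = 0; i < 4; i++)         /* 4 jobs at nice +19 */
        if (fork() == 0) {
            if (setpriority(PRIO_PROCESS, 0, 19) != 0)
                perror("setpriority");
            spin();
        }

    pause();                        /* parent just waits */
    return 0;
}

On a two-CPU box with a correctly behaving scheduler, the two nice 0 jobs
should each end up close to a full CPU, with the four nice +19 jobs sharing
only the small remainder.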

    

Actual Results:
213 processes: 203 sleeping, 8 running, 2 zombie, 0 stopped
CPU0 states: 91.2% user,  8.4% system,  2.2% nice,  0.0% idle
CPU1 states: 99.0% user,  0.1% system, 99.4% nice,  0.0% idle
Mem:   513100K av,  504868K used,    8232K free,       0K shrd,   28092K buff
Swap: 2096472K av,  186560K used, 1909912K free                  223376K cached
PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
26495 darwin    25   0 48988  47M   436 R    47.1  9.5   8:30 darwin
26464 darwin    25   0 50524  49M   436 R    46.9  9.8   8:51 darwin
26614 gonnet    39  19  2416 2416   300 R N  33.4  0.4   4:08 Switch4.LI
26735 gonnet    39  19  2396 2396   300 R N  33.4  0.4   2:02 Switch4.LI
24154 gonnet    39  19  1548 1548   304 R N  33.2  0.3  21:50 Switch4.LI
25323 gonnet    39  19  2400 2400   304 R N   2.7  0.4   9:52 Switch4.LI
. . . . 

Notice that the first two jobs have nice 0 and together get 100% of one CPU.
The other four jobs all have nice 19 and together get 100% of the other CPU.



Expected Results:  Jobs should run at their assigned priorities.

Additional info:
Kernel 2.4.18-19.7.xsmp (athlon) on a Tyan Tiger MPX dual athlon motherboard.

Comment 1 Roderick Johnstone 2003-03-17 15:05:05 UTC
We are seeing something similar.

 PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CO
 2175 maw       39  19 47864  46M  2276 R N  99.2  4.6   6:51 cloudy.exe
15198 derosa    25   0  7140 7140  3532 R    50.2  0.6 116:51 xspec
 1748 swa       25   0 21304  14M  1288 R    49.7  1.4   8:05 cosmomc

Here, two nice 0 jobs are sharing one CPU while a nice 19 job has the other
CPU to itself.

Surely this is a scheduling bug?

The kernel version is 2.4.18-24.7.xsmp for Athlon on Red Hat 7.3, running on a
Tyan Tiger MP board with two Athlon MP processors.

Comment 2 Marc Schmitt 2003-03-25 10:53:52 UTC
I upgraded to 2.4.18-27.7.xsmp; the problem remains:

 11:42am  up 2 days,  2:01, 33 users,  load average: 3.01, 3.29, 3.16
203 processes: 198 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 96.2% user,  3.3% system, 11.3% nice,  0.0% idle
CPU1 states: 95.0% user,  4.2% system, 94.1% nice,  0.0% idle
Mem:  2064336K av, 1933252K used,  131084K free,       0K shrd,  110096K buff
Swap: 2096472K av,       0K used, 2096472K free                 1236936K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
10364 gonnet    39  19  2428 2428   304 R N  95.0  0.1   1:29 Switch4.LI
10405 gonnet    25   0  322M 322M   596 R    84.3 15.9   2:18 mapleTTY
10685 gonnet    39  19  2416 2416   296 R N  12.2  0.1   0:02 Switch4.LI
 1917 root       5 -10  308M  51M  5116 S <   4.1  2.5 154:25 X
20918 gonnet    15   0 16604  16M 14804 R     2.1  0.8   0:14 kdeinit
10678 gonnet    15   0  1212 1212   916 R     1.1  0.0   0:00 top
20857 gonnet    15   0 13768  13M 13012 S     0.1  0.6   4:15 kdeinit
20891 gonnet    15   0 18060  17M 15264 S     0.1  0.8   1:44 kdeinit
20902 gonnet    15   0 16652  16M 14804 S     0.1  0.8   0:15 kdeinit
    1 root      15   0   480  480   420 S     0.0  0.0   0:07 init

You can see that the top process is at nice 19, while the one at nice 0
does not get as much CPU. Under normal circumstances, the nice 0 job should
get 100% of one CPU and the two nice 19 jobs should get about 50% each of
the other.

Could someone look into this, please? Thanks.
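
As a cross-check independent of top, per-process CPU usage can be sampled
directly from /proc/<pid>/stat, whose 14th and 15th fields are utime and
stime in clock ticks. A minimal sketch (the file name cpushare.c and the
5-second interval are arbitrary choices, not part of this report):

/* cpushare.c - sample one process's CPU share the way top does,
 * from utime+stime in /proc/<pid>/stat.
 * Build: gcc -O2 -o cpushare cpushare.c
 * Usage: ./cpushare <pid>
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

static long ticks(pid_t pid)
{
    char path[64], buf[1024], *p;
    unsigned long utime, stime;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* the comm field may contain spaces, so parse from the closing ')' */
    p = strrchr(buf, ')');
    if (!p || sscanf(p + 2, "%*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
                     &utime, &stime) != 2)
        return -1;
    return (long)(utime + stime);
}

int main(int argc, char **argv)
{
    pid_t pid;
    long hz, before, after;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    pid = (pid_t)atoi(argv[1]);
    hz = sysconf(_SC_CLK_TCK);
    before = ticks(pid);
    sleep(5);                               /* sampling interval */
    after = ticks(pid);
    if (before < 0 || after < 0) {
        fprintf(stderr, "could not read /proc/%d/stat\n", (int)pid);
        return 1;
    }
    printf("pid %d used %.1f%% of a CPU over 5 s\n",
           (int)pid, 100.0 * (after - before) / hz / 5.0);
    return 0;
}

Running it against the nice 0 and nice 19 PIDs above should show the same
imbalance that top reports.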

Comment 3 Marc Schmitt 2003-05-27 12:53:31 UTC
The problem is gone in 2.4.20-13.7smp, so I'm closing this bug. Thanks!

