Bug 80023 - Kernel BUG at page_alloc.c:220!
Summary: Kernel BUG at page_alloc.c:220!
Keywords:
Status: CLOSED DUPLICATE of bug 79924
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-12-18 21:46 UTC by Paul Zimdars
Modified: 2006-02-21 18:50 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-02-21 18:50:24 UTC



Description Paul Zimdars 2002-12-18 21:46:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0

Description of problem:
We have a 64-node cluster. We run a scientific job that depends heavily on
memory and CPU.

Here is the uname output from a node:

Linux mach-0-0 2.4.18-17.7.xsmp #6 Tue Dec 17 16:41:44 PST 2002 i686 unknown

The error below can be triggered in any process (bash, sh, kswapd, etc.).
With SMP turned off, the test ran without a single crash. When I turned SMP
back on, the nodes started to die. We lose between 5 and 10 nodes out of 64
on each run, usually within the first 10-15 minutes.


Nov 22 18:51:59 mach-0-35 kernel: kernel BUG at page_alloc.c:220!
Nov 22 18:51:59 mach-0-35 kernel: invalid operand: 0000
Nov 22 18:51:59 mach-0-35 kernel: CPU:    0
Nov 22 18:51:59 mach-0-35 kernel: EIP:    0010:[rmqueue+525/592]    Not tainted
Nov 22 18:51:59 mach-0-35 kernel: EIP:    0010:[<c0132c6d>]    Not tainted
Nov 22 18:51:59 mach-0-35 kernel: EFLAGS: 00010202
Nov 22 18:51:59 mach-0-35 kernel: eax: 00000040   ebx: c23bc8f0   ecx: 00038000   edx: 0006942f
Nov 22 18:51:59 mach-0-35 kernel: esi: c028b128   edi: 00048000   ebp: c1000020   esp: efe31dcc
Nov 22 18:51:59 mach-0-35 kernel: ds: 0018   es: 0018   ss: 0018
Nov 22 18:51:59 mach-0-35 kernel: Process mlsl2 (pid: 1928, stackpage=efe31000)
Nov 22 18:51:59 mach-0-35 kernel: Stack: 00038000 0003142f 00000296 00000000 c028b128 c028b200 000001ff 00000000
Nov 22 18:51:59 mach-0-35 kernel:        00000025 c0132f01 c028b128 c028b1fc 000001d2 00000018 00104025 00000000
Nov 22 18:51:59 mach-0-35 kernel:        00000001 00000025 c0127ded 69430025 00000000 f69451c0 f61bec60 efef2118
Nov 22 18:51:59 mach-0-35 kernel: Call Trace:    [__alloc_pages+81/384] [do_anonymous_page+93/368] [do_no_page+71/576] [it_real_fn+16/80] [handle_mm_fault+154/288]
Nov 22 18:51:59 mach-0-35 kernel: Call Trace:    [<c0132f01>] [<c0127ded>] [<c0127f47>] [<c011c5e0>] [<c01281da>]
Nov 22 18:51:59 mach-0-35 kernel:   [<c011d57b>] [<c011d431>] [<c012900a>] [<c010a64d>] [<c011472a>] [<c012939b>]
Nov 22 18:51:59 mach-0-35 kernel:   [<c01293ab>] [<c010ea9e>] [<c0114570>] [<c0108bfc>]
Nov 22 18:51:59 mach-0-35 kernel: Code: 0f 0b dc 00 81 4b 25 c0 8b 43 18 a9 80 00 00 00 74 08 0f 0b
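
Editorial sketch (not part of the original report): the "Code:" bytes of the oops can be cross-checked against the reported BUG location. The sketch below assumes the 2.4-era i386 BUG() layout, in which the ud2 opcode (0f 0b) is followed by a 16-bit line number and a 32-bit pointer to the file name string; that layout is an assumption here, and decode_bug is a hypothetical helper name. If the assumption holds, the decoded line number should agree with the "page_alloc.c:220" in the first line of the oops.

```python
import struct

# Code bytes copied from the "Code:" line of the oops in this report.
CODE = "0f 0b dc 00 81 4b 25 c0 8b 43 18 a9 80 00 00 00 74 08 0f 0b"


def decode_bug(code_hex):
    """Decode a BUG() site: ud2, 16-bit line number, 32-bit file pointer.

    Assumes the 2.4-era i386 BUG() layout (an assumption, see lead-in).
    Returns None if the bytes do not start with the ud2 opcode (0f 0b).
    """
    raw = bytes.fromhex(code_hex.replace(" ", ""))
    if len(raw) < 8 or raw[:2] != b"\x0f\x0b":
        return None
    # "<HI" = little-endian unsigned 16-bit line, unsigned 32-bit address,
    # read starting right after the two opcode bytes.
    line, file_ptr = struct.unpack_from("<HI", raw, 2)
    return line, file_ptr


if __name__ == "__main__":
    line, file_ptr = decode_bug(CODE)
    print(f"line={line} file_ptr=0x{file_ptr:08x}")
```

The file pointer is a kernel virtual address and is only meaningful against the matching kernel image, so the line number is the part that can be checked from the log alone.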


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. We have a program that processes satellite data using PVM.
2. We tried it with and without PVM, with the same results.


    

Actual Results:  5-10 nodes would die.

Expected Results:  No crash.

Additional info:

64-node cluster configuration. The drives are IDE; we used Red Hat Linux 7.2,
ext3, 2 GB virtual memory and 4 GB swap.
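
Editorial sketch (not part of the original report): to quantify how many nodes hit the BUG in a given run, syslog lines of the form quoted in this report could be tallied per host. The sketch below assumes the logs from all nodes have already been gathered into one iterable of lines; how they are collected is not described in the report, and tally_bugs is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Nov 22 18:51:59 mach-0-35 kernel: kernel BUG at page_alloc.c:220!
BUG_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+(?P<host>\S+)\s+kernel:\s+"
    r"kernel BUG at (?P<file>[\w./-]+):(?P<line>\d+)!"
)


def tally_bugs(lines):
    """Count BUG hits per (host, file, line) in an iterable of syslog lines."""
    hits = Counter()
    for text in lines:
        m = BUG_RE.match(text)
        if m:
            hits[(m["host"], m["file"], int(m["line"]))] += 1
    return hits


if __name__ == "__main__":
    # Sample lines taken from the oops quoted in this report.
    sample = [
        "Nov 22 18:51:59 mach-0-35 kernel: kernel BUG at page_alloc.c:220!",
        "Nov 22 18:51:59 mach-0-35 kernel: invalid operand: 0000",
    ]
    for (host, src, line), count in sorted(tally_bugs(sample).items()):
        print(host, f"{src}:{line}", count)
```

Keying the count on host as well as file and line makes it easy to see both how many distinct nodes died and whether they all failed at the same BUG site.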

Comment 1 Paul Zimdars 2002-12-18 21:49:01 UTC
Ack, sorry, ignore this one. I had the wrong window open and hit Enter. I must
have recreated the same bug as #79924.

Comment 2 Dave Jones 2003-12-17 02:28:39 UTC

*** This bug has been marked as a duplicate of 79924 ***

Comment 3 Red Hat Bugzilla 2006-02-21 18:50:24 UTC
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.

