Bug 455310 - LS21 locks up booting MRG RT kernel
Summary: LS21 locks up booting MRG RT kernel
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 1.0
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Red Hat Real Time Maintenance
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-07-14 19:29 UTC by Clark Williams
Modified: 2008-08-14 21:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-08-14 21:31:18 UTC


Attachments
Boot log for LS21 lockup (deleted)
2008-07-14 19:29 UTC, Clark Williams

Description Clark Williams 2008-07-14 19:29:41 UTC
Description of problem:
The RT kernel fails to boot on blade1 of the HSV bladecenter (blade2 boots the
kernel and runs fine). The failing blade boots and runs the RHEL5.2 kernel.

Version-Release number of selected component (if applicable):

kernel-rt-2.6.24.7-72.el5rt

How reproducible:

Every time.

Steps to Reproduce:
1. Install RHEL5.2
2. Install MRG RT kernel
3. Boot RT kernel
  
Actual results:

Kernel hangs after reporting amount of memory available (see attached console
output).


Expected results:

Running kernel

Additional info:

Debugging printk()s indicate that the hang is occurring in
calibrate_delay_direct(). Jiffies are not incrementing, so the calibration loop
never terminates.
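
The failure mode described above can be illustrated with a toy model. This is only a sketch in Python, not the kernel's calibrate_delay_direct() code: the names wait_for_tick, Ticker, and max_polls are hypothetical and exist only for this illustration. It shows a loop that spins until a tick counter changes, and why such a loop cannot exit when the counter never advances. The max_polls bound is an assumption added here so the example terminates.

```python
def wait_for_tick(read_jiffies, max_polls=None):
    """Spin until the tick counter changes; return True if it advanced.

    max_polls is a hypothetical safety bound added for this sketch so the
    stalled case returns instead of spinning forever.
    """
    start = read_jiffies()
    polls = 0
    while read_jiffies() == start:
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False  # counter never moved within the bound
    return True


class Ticker:
    """Toy tick source: advances every `period` reads, or never if period is None."""

    def __init__(self, period):
        self.period = period
        self.reads = 0
        self.jiffies = 0

    def read(self):
        self.reads += 1
        if self.period is not None and self.reads % self.period == 0:
            self.jiffies += 1
        return self.jiffies


# A working timer lets the loop exit; a stalled one only exits via the bound.
healthy = wait_for_tick(Ticker(period=5).read, max_polls=1000)
stalled = wait_for_tick(Ticker(period=None).read, max_polls=1000)
print(healthy, stalled)
```

Without the bound (max_polls=None), the stalled case would spin indefinitely, which matches the observed hang during boot.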

Comment 1 Clark Williams 2008-07-14 19:29:41 UTC
Created attachment 311759
Boot log for LS21 lockup

Comment 2 Clark Williams 2008-07-14 21:17:04 UTC
I swapped the two LS21s that were in slots 1 and 2. The failing blade
(formerly in slot 1) then reported double-bit errors on DIMM slots 5 and 6,
disabled the two slots, and booted.

Here's a cut-n-paste from the web interface to the event log:

1  E  BLADE_02  07/14/08, 21:07:55  (SN#YK10A269W03L) DIMM number 5 failed.
2  E  BLADE_02  07/14/08, 21:07:55  (SN#YK10A269W03L) POSTBIOS: 289 Board 1 DIMM Pair 3 Double Bit Error.
3  E  BLADE_02  07/14/08, 21:07:54  (SN#YK10A269W03L) DIMM number 6 failed.
4  E  BLADE_02  07/14/08, 21:07:54  (SN#YK10A269W03L) POSTBIOS: 289 Board 1 DIMM Pair 3 Double Bit Error.
5  I  BLADE_02  07/14/08, 21:07:25  (SN#YK10A269W03L) System Reboot



Comment 3 Clark Williams 2008-08-14 21:31:18 UTC
Closing due to a confirmed hardware error.

