Bug 229880 - Hang in RHEL3U8 with serveraid 4H and ips driver ver 7.10
Summary: Hang in RHEL3U8 with serveraid 4H and ips driver ver 7.10
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.8
Hardware: i386
OS: Linux
Target Milestone: ---
Assignee: Red Hat Kernel Manager
QA Contact: Brian Brock
Depends On:
Reported: 2007-02-23 22:53 UTC by Gary M. Gaydos
Modified: 2007-11-17 01:14 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2007-10-19 18:38:24 UTC
Target Upstream Version:

Attachments
spec file for custom patched kernel backporting ips 7.12.02 (deleted)
2007-02-23 22:56 UTC, Gary M. Gaydos
patch file referenced in spec file (deleted)
2007-02-23 22:58 UTC, Gary M. Gaydos

Description Gary M. Gaydos 2007-02-23 22:53:23 UTC
Description of problem:
SCSI reset on the console followed by a dead hang.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux ES release 3 (Taroon Update 8)

Linux 2.4.21-47.0.1.ELsmp #1 SMP Fri Oct 13 17:56:20 EDT 2006 i686 i686 i386 GNU/Linux

[mrx@ltc-eth1000 mrx]$ cat /proc/scsi/ips/0

IBM ServeRAID General Information:

        Controller Type                   : ServeRAID 4H
        IO region                         : 0x2300 (256 bytes)
        Memory region                     : 0xedf00000 (1048576 bytes)
        Shared memory address             : 0xf883f000
        IRQ number                        : 24
        BIOS Version                      : 7.10.23
        Firmware Version                  : 7.10.24
        Boot Block Version                : 4.00.26
        Driver Version                    : 7.10.18
        Driver Build                      : 731
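
A minimal sketch, not part of the original report, of how the output above could be checked automatically for an affected controller and driver combination. The function names are hypothetical, and the 7.12 threshold is an assumption taken from the README line quoted under "Additional info"; it has not been confirmed as the fix for this hang.

```python
import re

# Sample text copied from the /proc/scsi/ips/0 output in this report.
SAMPLE = """\
IBM ServeRAID General Information:

        Controller Type                   : ServeRAID 4H
        BIOS Version                      : 7.10.23
        Firmware Version                  : 7.10.24
        Driver Version                    : 7.10.18
"""


def parse_ips_info(text):
    """Return a dict built from the 'Key : value' lines of the ips proc output."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip the heading and blank lines
            info[key.strip()] = value.strip()
    return info


def version_tuple(version):
    """Turn a dotted version such as '7.12.02' into (7, 12, 2)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def needs_update(text, minimum=(7, 12)):
    """True for a ServeRAID 4H whose ips driver is older than `minimum`.

    The default minimum is an assumption based on the README reference
    in this report, not a confirmed fix version.
    """
    info = parse_ips_info(text)
    if "4H" not in info.get("Controller Type", ""):
        return False
    return version_tuple(info["Driver Version"]) < minimum
```

On a live system the text would presumably come from reading /proc/scsi/ips/0 (the controller index 0 matches this report and may differ elsewhere), for example `needs_update(open("/proc/scsi/ips/0").read())`.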

How reproducible:
less than once per month.

Steps to Reproduce:
1.  Fails during normal server operation under heavy I/O load, typically while
cron.daily or cron.weekly is running.  The server runs innd, postfix, antivirus,
and a web server.
Actual results:
SCSI reset followed by a dead hang.  A power-off and restart is required to recover.

Expected results:
no hang on scsi reset

Additional info:
The errata kernels on our RHEL4 U4 systems ship with version 7.12.05 of the ips
driver.  We have not encountered hangs due to SCSI resets on those systems.

Version 7.12 ips firmware and ips driver

Reference in the README:
2.1  ServeRAID Family 7.10 to 7.12
Fixed problem with ServeRAID-4H firmware

Discussion thread about scsi resets on ips

4) We've patched a RHEL3 U8 2.4.21-47.0.1.ELsmp kernel with a version 7.12.02 ips
driver.  It appears stable on two test systems running stress under heavy I/O
load; one of them periodically has SCSI resets and does not hang.  Obviously
this isn't tested as thoroughly as you QC your kernels.  We plan to deploy
our patched kernel on our crashing production server, along with ServeRAID
firmware and hard drive firmware updates, on Feb 28.  I'll attach the patch and
.spec file for your examination.

Please consider updating RHEL3 errata kernels with newer ips drivers if they
contain bug fixes.

Let me know if you're going to pass this back to IBM, I have a similar bug
opened there.

7) The hanging server is System ID 1005928737 in

Comment 1 Gary M. Gaydos 2007-02-23 22:56:52 UTC
Created attachment 148725
spec file for custom patched kernel backporting ips 7.12.02

Comment 2 Gary M. Gaydos 2007-02-23 22:58:14 UTC
Created attachment 148726
patch file referenced in spec file

Comment 3 RHEL Product and Program Management 2007-10-19 18:38:24 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
For more information on the RHEL errata support policy, please visit:
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.
