Bug 162065 - aacraid driver hangs if Adaptec 2230SLP array not optimal
Summary: aacraid driver hangs if Adaptec 2230SLP array not optimal
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: athlon
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Tom Coughlan
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 168424
Reported: 2005-06-29 16:12 UTC by David Milburn
Modified: 2007-11-30 22:07 UTC
CC List: 7 users

Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-15 16:09:59 UTC


Attachments
This patch to remove the aac_handle_aif() code did not help. (deleted)
2005-06-29 16:15 UTC, David Milburn
Patch to turn on dprintk and add more debug printks, attaching console messages. (deleted)
2005-06-29 16:16 UTC, David Milburn
Console messages showing the driver stuck in aac_queue_get() (deleted)
2005-06-29 16:17 UTC, David Milburn
Test patch to use old comm interface, after syncing to latest, proving that it wasn't an old_comm problem. (deleted)
2005-09-14 21:11 UTC, David Milburn
Patch RHEL3 U5 driver to not touch InboundMailbox7 register and reduce number of fibs (deleted)
2005-09-14 21:13 UTC, David Milburn


Links
System: Red Hat Product Errata
ID: RHSA-2006:0144
Priority: qe-ready
Status: SHIPPED_LIVE
Summary: Moderate: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 7
Last Updated: 2006-03-15 05:00:00 UTC

Description David Milburn 2005-06-29 16:12:21 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050302 Firefox/1.0.1 Fedora/1.0.1-1.3.2

Description of problem:
Using an Adaptec 2230SLP RAID controller with two 73GB disks in a RAID-1 setup. If the array is not "optimal", RHEL stops responding to keyboard, mouse, and network (system hung). The system is in a state where fib_adapter_complete() calls aac_queue_get(), which in turn calls aac_get_entry(). aac_get_entry() always returns 0, leaving the driver stuck in the following loop in aac_queue_get():

else if (qid == AdapHighRespQueue || qid == AdapNormRespQueue)
{
        while (!aac_get_entry(dev, qid, &entry, index, nonotify))
        {
                /* if no entries wait for some if caller wants to */
                DPRINTK("RespQueue: No entries, wait...\n");
        }
}
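
For readers unfamiliar with this kind of hang, the following is a minimal user-space sketch of why the loop above cannot exit, not the driver's code and not the fix that was eventually shipped. It assumes a simple ring in which "no entry" means the ring is full; struct sim_queue, sim_get_entry() and sim_queue_get_bounded() are hypothetical names invented for illustration. If nothing ever advances the consumer index, every retry sees the same state, so an unbounded while loop spins forever, whereas a bounded retry at least returns an error to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue shared with the adapter.
 * The real aacraid structures differ; only the index comparison matters. */
struct sim_queue {
    unsigned producer;  /* advanced by the side that adds entries */
    unsigned consumer;  /* advanced by the side that removes entries */
    unsigned size;      /* number of slots in the ring */
};

/* Assumption: "no entry" is modelled as "ring full".
 * Returns true and stores the slot index when a slot is free. */
static bool sim_get_entry(const struct sim_queue *q, unsigned *index)
{
    assert(q != NULL && index != NULL && q->size > 1);
    unsigned next = (q->producer + 1) % q->size;
    if (next == q->consumer)
        return false;   /* full: nothing changes until the consumer moves */
    *index = q->producer;
    return true;
}

/* Bounded variant of the wait loop: gives up after max_tries attempts.
 * Returns the number of attempts used on success, or -1 on timeout. */
static int sim_queue_get_bounded(const struct sim_queue *q,
                                 unsigned *index, int max_tries)
{
    for (int tries = 1; tries <= max_tries; tries++) {
        if (sim_get_entry(q, index))
            return tries;
    }
    return -1;  /* the unbounded loop in the report would still be spinning */
}
```

With producer = 3, consumer = 4 and size = 8 the ring is full, so sim_queue_get_bounded() returns -1 however large max_tries is; the same state fed to an unbounded loop would never terminate.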


Version-Release number of selected component (if applicable):
kernel-2.4.21-32.0.1.EL

How reproducible:
Always

Steps to Reproduce:
1. Boot with RAID array not optimal.

Actual Results:  System will hang, no response from keyboard, mouse, or networking.

Expected Results:  System should boot up and function as normal.

Additional info:

Based on Alan Cox's comments on the 2.6 driver
(http://lkml.org/lkml/2005/1/14/252), I tried removing the aac_handle_aif()
code from the 2.4 driver, but the system still hung when booting with the RAID
array not optimal. I also turned on dprintk and added more debug statements;
the console messages are attached.

Comment 1 David Milburn 2005-06-29 16:15:24 UTC
Created attachment 116133
This patch to remove the aac_handle_aif() code did not help.

Comment 2 David Milburn 2005-06-29 16:16:53 UTC
Created attachment 116134
Patch to turn on dprintk and add more debug printks, attaching console messages.

Comment 3 David Milburn 2005-06-29 16:17:55 UTC
Created attachment 116135
Console messages showing the driver stuck in aac_queue_get()

Comment 49 Tom Coughlan 2005-11-01 23:51:18 UTC
Please test the kernel located at:

http://people.redhat.com/coughlan/.2.4.21-37.7.ELdrvrtest2/

to verify that it solves the problem. 

This contains version 1.1.5-2412 of the aacraid driver. This is the latest from
Adaptec, and is a candidate for U7. 

Comment 55 Ernie Petrides 2005-11-23 00:36:27 UTC
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.10.EL).


Comment 62 Red Hat Bugzilla 2006-03-15 16:10:00 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html


