Bug 596927 - PCI card fails to work until reboot when hotplugged
Summary: PCI card fails to work until reboot when hotplugged
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Don Zickus
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-05-27 19:07 UTC by Mike Gahagan
Modified: 2010-06-30 14:13 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-06-30 14:13:17 UTC
Target Upstream Version:



Description Mike Gahagan 2010-05-27 19:07:30 UTC
Description of problem:

When attempting to hot-plug a PCI network card, the card remains non-functional until the system is rebooted. The card does show up in lspci.


lspci entry for the card:
0c:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)


Version-Release number of selected component (if applicable):
RHEL6.0-Snapshot-5
2.6.32-30.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot system, add card, power on slot
  
Actual results:

Card shows up in lspci and dmesg, but is unusable until the system is rebooted. The appropriate kernel module loads, but so far I can't tell if the problem is a driver issue or related to hotplug.

Expected results:

Card works when hotplugged.


Additional info:

pciehp 0000:00:07.0:pcie04: Button pressed on Slot(2)
pciehp 0000:00:07.0:pcie04: PCI slot #2 - powering on due to button press.
pci 0000:0b:00.0: PME# supported from D0 D3hot D3cold
pci 0000:0b:00.0: PME# disabled
pci 0000:0b:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
pci 0000:0c:00.0: reg 10 64bit mmio: [0x000000-0x1ffffff]
pci 0000:0c:00.0: reg 30 32bit mmio pref: [0x6140000-0x615ffff]
pci 0000:0c:00.0: PME# supported from D3hot D3cold
pci 0000:0c:00.0: PME# disabled
pci 0000:0b:00.0: bridge io port: [0x00-0xfff]
pci 0000:0b:00.0: bridge 32bit mmio: [0x000000-0x0fffff]
pci 0000:0b:00.0: bridge 64bit mmio pref: [0x000000-0x0fffff]
pci 0000:0b:00.0: BAR 14: can't allocate mem resource [0x9a000000-0x9a8fffff]
pci 0000:0c:00.0: BAR 0: can't allocate mem resource [0x000000-0x1ffffff]
pci 0000:0b:00.0: PCI bridge, secondary bus 0000:0c
pci 0000:0b:00.0:   bridge window [io  disabled]
pci 0000:0b:00.0:   bridge window [mem disabled]
pci 0000:0b:00.0:   bridge window [0x91900000-0x919fffff]
pci 0000:0b:00.0: enabling device (0000 -> 0002)
pci 0000:0b:00.0: setting latency timer to 64
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.8 (Feb 15, 2010)
bnx2 0000:0c:00.0: PCI INT A -> GSI 30 (level, low) -> IRQ 30
bnx2 0000:0c:00.0: Cannot find PCI device base address, aborting.
bnx2 0000:0c:00.0: PCI INT A disabled
pciehp 0000:00:07.0:pcie04: Button pressed on Slot(2)
pciehp 0000:00:07.0:pcie04: PCI slot #2 - powering off due to button press.
pciehp 0000:00:07.0:pcie04: Button pressed on Slot(2)
pciehp 0000:00:07.0:pcie04: PCI slot #2 - powering on due to button press.
pci 0000:0b:00.0: PME# supported from D0 D3hot D3cold
pci 0000:0b:00.0: PME# disabled
pci 0000:0b:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
pci 0000:0c:00.0: reg 10 64bit mmio: [0x000000-0x1ffffff]
pci 0000:0c:00.0: reg 30 32bit mmio pref: [0x8f4a0000-0x8f4bffff]
pci 0000:0c:00.0: PME# supported from D3hot D3cold
pci 0000:0c:00.0: PME# disabled
pci 0000:0b:00.0: bridge io port: [0x00-0xfff]
pci 0000:0b:00.0: bridge 32bit mmio: [0x000000-0x0fffff]
pci 0000:0b:00.0: bridge 64bit mmio pref: [0x000000-0x0fffff]
pci 0000:0b:00.0: BAR 14: can't allocate mem resource [0x9a000000-0x9a8fffff]
pci 0000:0c:00.0: BAR 0: can't allocate mem resource [0x000000-0x1ffffff]
pci 0000:0b:00.0: PCI bridge, secondary bus 0000:0c
pci 0000:0b:00.0:   bridge window [io  disabled]
pci 0000:0b:00.0:   bridge window [mem disabled]
pci 0000:0b:00.0:   bridge window [0x91900000-0x919fffff]
pci 0000:0b:00.0: enabling device (0000 -> 0002)
pci 0000:0b:00.0: setting latency timer to 64
bnx2 0000:0c:00.0: PCI INT A -> GSI 30 (level, low) -> IRQ 30
bnx2 0000:0c:00.0: Cannot find PCI device base address, aborting.
bnx2 0000:0c:00.0: PCI INT A disabled

Comment 2 Don Zickus 2010-06-25 15:32:47 UTC
Hmm...

pci 0000:0b:00.0: BAR 14: can't allocate mem resource [0x9a000000-0x9a8fffff]
pci 0000:0c:00.0: BAR 0: can't allocate mem resource [0x000000-0x1ffffff]

For some reason PCI can't allocate memory for it...

pci 0000:0b:00.0: PCI bridge, secondary bus 0000:0c
pci 0000:0b:00.0:   bridge window [io  disabled]
pci 0000:0b:00.0:   bridge window [mem disabled]
pci 0000:0b:00.0:   bridge window [0x91900000-0x919fffff]
pci 0000:0b:00.0: enabling device (0000 -> 0002)
pci 0000:0b:00.0: setting latency timer to 64
bnx2 0000:0c:00.0: PCI INT A -> GSI 30 (level, low) -> IRQ 30
bnx2 0000:0c:00.0: Cannot find PCI device base address, aborting.
bnx2 0000:0c:00.0: PCI INT A disabled 

As a result, bnx2 can't find a memory region through which to talk to the device, so it aborts.  bnx2 probably should have failed to install here, but instead it stays registered against the device, and the device shows up in the lspci table even though the driver can't talk to it. :-/

I'll try to figure out why PCI can't allocate memory for the card.
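
To pick the failed allocations out of a longer dmesg capture, something like the following sketch may help. It assumes the message format is exactly as in the lines quoted in this bug; the regular expression and the failed_bars() helper are illustrative names, not part of any kernel tooling.

```python
import re

# Matches lines such as:
#   pci 0000:0c:00.0: BAR 0: can't allocate mem resource [0x000000-0x1ffffff]
# The pattern is an assumption based only on the dmesg lines quoted in this bug.
BAR_FAIL = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): can't allocate mem resource "
    r"\[(?P<start>0x[0-9a-f]+)-(?P<end>0x[0-9a-f]+)\]"
)


def failed_bars(dmesg_text):
    """Return (device, BAR number, start, end) for each failed allocation."""
    return [
        (m["dev"], int(m["bar"]), int(m["start"], 16), int(m["end"], 16))
        for m in BAR_FAIL.finditer(dmesg_text)
    ]


# Two lines copied from the log above.
sample = """\
pci 0000:0b:00.0: BAR 14: can't allocate mem resource [0x9a000000-0x9a8fffff]
pci 0000:0c:00.0: BAR 0: can't allocate mem resource [0x000000-0x1ffffff]
"""

for dev, bar, start, end in failed_bars(sample):
    print(f"{dev} BAR {bar}: {end - start + 1:#x} bytes not allocated")
```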

Comment 6 Prarit Bhargava 2010-06-30 14:13:17 UTC
Mike,

Don did a bit of investigating and found the following:

1.  When booting with the bnx2 card in the hotplug slot, the system boots and the card is available for system use.

dmesg contains:

Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.8 (Feb 15, 2010)
bnx2 0000:84:00.0: PCI INT A -> GSI 54 (level, low) -> IRQ 54
bnx2 0000:84:00.0: setting latency timer to 64
bnx2 0000:84:00.0: firmware: requesting bnx2/bnx2-mips-09-5.0.0.j9.fw
bnx2 0000:84:00.0: firmware: requesting bnx2/bnx2-rv2p-09-5.0.0.j10.fw

and

bnx2 0000:84:00.1: PCI INT B -> GSI 61 (level, low) -> IRQ 61
bnx2 0000:84:00.1: setting latency timer to 64
bnx2 0000:84:00.1: firmware: requesting bnx2/bnx2-mips-09-5.0.0.j9.fw
bnx2 0000:84:00.1: firmware: requesting bnx2/bnx2-rv2p-09-5.0.0.j10.fw

2.  When booting without the bnx2 card in the hotplug slot, the system boots but the card is not available for system use because there is not enough memory reserved in the slot's parent bus.

From dmesg:

pci 0000:84:00.0: BAR 0: can't allocate mem resource [0xd2000000-0xd1ffffff]
pci 0000:84:00.1: BAR 0: can't allocate mem resource [0xd2000000-0xd1ffffff]

****This is an expected failure of hotplugging this particular device on this system****

From lspci -t

 +-[0000:80]-+-00.0-[81]--
 |           +-01.0-[82]--
 |           +-03.0-[83]--
 |           +-07.0-[84-86]--+-00.0
 |           |               \-00.1

(bus 0000:80:07.0 has child PCI buses 0000:84:00 through 0000:86:00.  The bnx2 card is on 0000:84:00 and has two devices, function 0 and function 1, i.e. 0000:84:00.0 and 0000:84:00.1 respectively)

From the case where we booted with the card in the hotplug slot, bus 0000:84:00 has a memory window of 

d0000000-d3ffffff : PCI Bus 0000:84

and each function of the card has been allocated PCI memory of

  d0000000-d1ffffff : 0000:84:00.0
    d0000000-d1ffffff : bnx2

  d2000000-d3ffffff : 0000:84:00.1
    d2000000-d3ffffff : bnx2

Note that each function requires 0x2000000 bytes of memory (the inclusive range 0x0000000-0x1ffffff, i.e. 32M).

From the case where we booted without the card in the hotplug slot and added it later, bus 0000:84:00 has a memory window of

d1000000-d1ffffff : PCI Bus 0000:84

When the card is enabled, the PCI subsystem attempts to allocate memory to the card.  This fails because the available memory is less than the memory required to bring the card into service, ie) both functions together require 64M, but the bus's memory window quoted above (d1000000-d1ffffff) spans only 16M.
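
The comparison can be checked mechanically. The sketch below takes the /proc/iomem-style ranges exactly as quoted in this comment and compares the total size of the two function BARs against each bus window. The helper names are made up for illustration, and the check is deliberately simplified: it only compares sizes and ignores the alignment constraints the kernel's real allocator has to honour.

```python
MIB = 1 << 20


def parse_range(line):
    """Parse 'start-end : name' (hex, inclusive) into (start, end, name)."""
    span, name = line.split(" : ", 1)
    start, end = (int(x, 16) for x in span.strip().split("-"))
    return start, end, name.strip()


def size_mib(start, end):
    # Ranges are inclusive, so the size is end - start + 1.
    return (end - start + 1) // MIB


def fits(window_line, bar_lines):
    """True if the summed BAR sizes are no larger than the window (size only)."""
    w_start, w_end, _ = parse_range(window_line)
    needed = sum(size_mib(s, e) for s, e, _ in map(parse_range, bar_lines))
    return needed <= size_mib(w_start, w_end)


# Ranges copied from this comment.
bars = [
    "d0000000-d1ffffff : 0000:84:00.0",
    "d2000000-d3ffffff : 0000:84:00.1",
]
boot_window = "d0000000-d3ffffff : PCI Bus 0000:84"     # card present at boot
hotplug_window = "d1000000-d1ffffff : PCI Bus 0000:84"  # card added after boot

for label, window in (("boot", boot_window), ("hotplug", hotplug_window)):
    start, end, _ = parse_range(window)
    print(f"{label}: window {size_mib(start, end)}M, fits={fits(window, bars)}")
```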

This, as mentioned before, is expected to fail.

The problem is the BIOS; it is not leaving an acceptable amount of memory on each hotplug bus at boot time.

My suggestion is that you work with a Rip-And-Replace hotswap method on these systems, ie) boot with the card in the slot and hotswap from there.

Closing as NOTABUG.

P.

