Bug 158030 - e1000 network driver sleeps when it should not....
Summary: e1000 network driver sleeps when it should not....
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 3
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Brian Brock
Depends On:
Reported: 2005-05-17 22:57 UTC by Tom Mitchell
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2005-05-19 17:44:46 UTC

Attachments
Full console dump... (deleted)
2005-05-17 23:00 UTC, Tom Mitchell

Description Tom Mitchell 2005-05-17 22:57:15 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050416 Fedora/1.0.3-1.3.1 Firefox/1.0.3

Description of problem:
The e1000 network driver goes to sleep when it should not,
and later the nmi_watchdog kicks things over.

A stack trace looks like --

       <ffffffff8010fe68>{oops_end+40} <ffffffff8010fe61>{oops_end+33}
       <ffffffff80122afb>{do_page_fault+1963} <ffffffff80211910>{vgacon_cursor+0}
       <ffffffff80138a8d>{release_console_sem+333} <ffffffff80138ac9>{release_console_sem+393}
       <ffffffff80138d30>{vprintk+528} <ffffffff8010f041>{error_exit+0}
       <ffffffff80211910>{vgacon_cursor+0} <ffffffff80131ab4>{dequeue_task+4}
       <ffffffff80131e05>{deactivate_task+21} <ffffffff803460d0>{schedule+512}
       <ffffffff80112d8a>{timer_interrupt+1066} <ffffffff8015bd3c>{handle_IRQ_event+44}
       <ffffffff8015bebc>{__do_IRQ+332} <ffffffff801419cd>{__mod_timer+317}
       <ffffffff803477ad>{schedule_timeout+253} <ffffffff801425a0>{process_timeout+0}
       <ffffffff8014260d>{msleep+93} <ffffffff880b7cc8>{:e1000:e1000_config_dsp_after_link_change+744}
       <ffffffff880b4d2a>{:e1000:e1000_watchdog+42} <ffffffff801419cd>{__mod_timer+317}
       <ffffffff880b4d00>{:e1000:e1000_watchdog+0} <ffffffff80141e7e>{run_timer_softirq+398}
       <ffffffff8013daf1>{__do_softirq+113} <ffffffff8013dba5>{do_softirq+53}
       <ffffffff8010eea5>{apic_timer_interrupt+133}  <EOI> <ffffffff8010c720>{default_idle+0}
       <ffffffff8010c740>{default_idle+32} <ffffffff8010c88f>{cpu_idle+63}

The trick to reproducing this is to take a network link connector
that is not up to snuff and wiggle it. The link goes down as expected,
but the driver does an unsafe sleep and the watchdog cries wolf
(as it should)....  Does a wolf sound like: Aiee, Aiee, Aiee in the night?

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Activate ethN on top of the e1000 driver
2. Plug/unplug the connector
3. System panics with an Oops...

Actual Results:   <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
in_atomic():1, irqs_disabled():0

Call Trace:<ffffffff801327cf>{__might_sleep+191} <ffffffff80139349>{profile_task_exit+41}
<ffffffff8013ac72>{do_exit+34} <ffffffff8010fe68>{oops_end+40}
<ffffffff8011005d>{die_nmi+173} <ffffffff8011b26c>{nmi_watchdog_tick+220}
<ffffffff80110ab2>{default_do_nmi+130} <ffffffff8011b346>{do_nmi+134}
<ffffffff8010f423>{paranoid_exit+0} <ffffffff80348369>{.text.lock.spinlock+2}
 <EOE> <ffffffff8013198a>{task_rq_lock+74} <ffffffff80131fcb>{try_to_wake_up+43} 
       <ffffffff80133ce0>{__wake_up_common+64} <ffffffff80133d53>{__wake_up+67}
       <ffffffff802dd574>{sock_def_readable+68} <ffffffff803418b7>{unix_stream_sendmsg+711}
       <ffffffff802d9de9>{sock_sendmsg+297} <ffffffff8015d5fc>{find_get_page+92}
       <ffffffff8015e4dc>{filemap_nopage+396} <ffffffff8016ecd2>{handle_mm_fault+418}
       <ffffffff8014eec0>{autoremove_wake_function+0} <ffffffff802d9b00>{sockfd_lookup+32}
       <ffffffff802db6b9>{sys_sendto+233} <ffffffff8019494b>{do_ioctl+123}
       <ffffffff80194cab>{vfs_ioctl+827} <ffffffff80194d3a>{sys_ioctl+106}
Kernel panic - not syncing: Aiee, killing interrupt handler!

Expected Results:  The driver should take the link down....
and bring it back up when the connection is restored.

Additional info:

Dual processor, AMD Opteron, Kernel is 64 bit...

Comment 1 Tom Mitchell 2005-05-17 23:00:23 UTC
Created attachment 114489 [details]
Full console dump...

Just in case I pruned the text in the original post too much,
here is the full console listing of the Oops.

Comment 2 John W. Linville 2005-05-18 15:24:09 UTC
I believe this issue is fixed in the test kernels here: 
Wanna give them a try to confirm?  Thanks! 

Comment 3 Tom Mitchell 2005-05-18 23:05:40 UTC
I have 2.6.11-1.21_FC3.jwltest.9smp installed and running now.
I will poke and prod and try to reproduce the Oops.


Comment 4 Tom Mitchell 2005-05-19 01:54:14 UTC
Uptime is about 4 hours now and the cable/connector is clearly bad.

While networking was worthless when I had this link up,
I was able to debug it, bring it down, and bring up the other
link from the console......

# grep e1000_watchdog_task /var/log/messages | head -1
May 18 14:56:26 box-12 kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Up
1000 Mbps Full Duplex
# grep e1000_watchdog_task /var/log/messages | tail  -1
May 18 16:51:30 box-12 kernel: e1000: eth1: e1000_watchdog_task: NIC Link is Up
1000 Mbps Full Duplex

How many times in these two hours, you ask?  ;-)
# grep e1000_watchdog_task /var/log/messages | wc
    617    8648   56814
Some are up and some are down messages, so divide by two.
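
Halving the total only approximates the number of link flaps. A minimal
sketch of counting each direction separately follows. The function name
count_link_flaps is hypothetical, and the "NIC Link is Down" wording is an
assumption, since only the "Up" form is quoted in this report.

```shell
# Count e1000 link transitions per direction in a syslog-style file.
# count_link_flaps is a hypothetical helper name, not an existing tool.
count_link_flaps() {
    log="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, so "|| true" keeps the count of 0 usable.
    up=$(grep -c 'e1000_watchdog_task: NIC Link is Up' "$log" || true)
    # Assumed wording for the down message; adjust it to match the log.
    down=$(grep -c 'e1000_watchdog_task: NIC Link is Down' "$log" || true)
    echo "up=$up down=$down"
}
```

Called with no argument it reads /var/log/messages, which usually
requires root; pass another path to inspect a copy of the log.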

So it appears that my issue has been addressed.


Comment 5 John W. Linville 2005-05-19 17:44:46 UTC
Excellent!  Now, get yourself another cable and you should be set... :-) 
