
Bug 229816

Summary: firewire crashed (2940.fc7)
Product: [Fedora] Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
Status: CLOSED RAWHIDE
Severity: medium
Priority: medium
Reporter: Bill Nottingham <notting>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Brian Brock <bbrock>
CC: fenlason, krh, rvokal, wtogami
Doc Type: Bug Fix
Last Closed: 2007-12-04 01:36:28 UTC

Description Bill Nottingham 2007-02-23 17:01:42 UTC
Description of problem:

general protection fault: 0000 [1] SMP 
last sysfs file: /block/sda/size
CPU 1 
Modules linked in: loop nfs lockd nfs_acl i915 drm netconsole autofs4 hidp
rfcomm l2cap bluetooth sunrpc nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state
nf_conntrack nfnetlink ipt_REJECT iptable_filter ip_tables xt_tcpudp ip6t_REJECT
ip6table_filter ip6_tables x_tables cpufreq_ondemand dm_multipath video sbs
i2c_ec button dock battery asus_acpi ac ipv6 lp fw_sbp2 snd_hda_intel
snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
snd_seq_device snd_pcm_oss parport_pc snd_mixer_oss rtc_cmos parport rtc_core
serio_raw rtc_lib snd_pcm ata_generic i2c_i801 snd_timer i2c_core fw_ohci snd
pcspkr iTCO_wdt fw_core iTCO_vendor_support soundcore e1000 snd_page_alloc
shpchp sr_mod cdrom sg dm_snapshot dm_zero dm_mirror dm_mod pata_marvell ahci
libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd
Pid: 0, comm: swapper Not tainted 2.6.20-1.2940.fc7 #1
RIP: 0010:[<ffffffff88150588>]  [<ffffffff88150588>]
RSP: 0018:ffff81007e63bdd0  EFLAGS: 00010002
RAX: 000000000000ffc0 RBX: 6b6b6b6b6b6b6b5b RCX: ffffffff808b9400
RDX: 6b6b6b6b6b6b6b6b RSI: 000000000000016a RDI: ffff8100277b6010
RBP: ffff81007e63be20 R08: 000000000000016a R09: 0000000000000001
R10: ffffffff88150525 R11: ffff81007e607a80 R12: 0000000000000011
R13: ffff8100798aa498 R14: 00000000ffc00000 R15: ffff81007e63be30
FS:  0000000000000000(0000) GS:ffff81007e6a1e88(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000006c3988 CR3: 0000000069e52000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffff81007e634000, task ffff81007e696100)
Stack:  ffff81004f989318 ffff8100798aa000 ffc14520795efc78 0000000000000206
 0000ffc0798aa000 ffff81007987dc48 0000000000000000 0000000000000009
 0000000000000001 ffff8100798aae78 ffff81007e63bea0 ffffffff8817cfe4
Call Trace:
 <IRQ>  [<ffffffff8817cfe4>] :fw_ohci:handle_ar_packet+0xf5/0x100
 [<ffffffff802a35e4>] trace_hardirqs_on+0x11c/0x15a
 [<ffffffff8817de00>] :fw_ohci:ar_context_tasklet+0xd9/0xee
 [<ffffffff8029108a>] tasklet_action+0x5e/0xb2
 [<ffffffff80211c73>] __do_softirq+0x5f/0xe3
 [<ffffffff8025d2ac>] call_softirq+0x1c/0x28
 [<ffffffff8026c001>] do_softirq+0x3d/0xab
 [<ffffffff80290f38>] irq_exit+0x4e/0x50
 [<ffffffff8026c1b3>] do_IRQ+0x144/0x166
 [<ffffffff8025602a>] mwait_idle+0x0/0x50
 [<ffffffff8025c666>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff80261096>] __sched_text_start+0xb06/0xb3e
 [<ffffffff80256070>] mwait_idle+0x46/0x50
 [<ffffffff80248672>] enter_idle+0x22/0x24
 [<ffffffff8024886d>] cpu_idle+0xa1/0xc4
 [<ffffffff8027573e>] start_secondary+0x2bb/0x2ca

Code: 48 8b 53 10 0f 18 0a 48 8b 45 b8 48 8d 7b 10 48 83 c0 28 48 
RIP  [<ffffffff88150588>] :fw_core:fw_core_handle_response+0xab/0x126
 RSP <ffff81007e63bdd0>
Kernel panic - not syncing: Aiee, killing interrupt handler!

I wasn't actually *doing* anything with firewire at the time - the drive
was sitting idle. I suppose HAL may have been polling it.

Comment 1 Kristian Høgsberg 2007-03-01 23:28:12 UTC
There is a list corruption bug in the drivers that I've been chasing for a
while, and I hope this is the same issue.  It looks like the async receive
tasklet is touching freed memory in the transaction list, which is consistent
with the other crashes I've seen.

Has this happened again or was this a one-time crash?  Is there anything you can
do to reproduce it?  I might have tracked it down, but I won't have a fix for a
couple of weeks.
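
The freed-memory reading above can be cross-checked against the register dump in the description. The following is a minimal sketch, not part of the original report: it assumes the kernel was built with slab poisoning enabled, in which case freed objects are filled with the byte 0x6b (POISON_FREE in include/linux/poison.h). The helper name poisoned_registers and the min_bytes threshold are made up for illustration.

```python
import re

# Byte that slab debugging writes over freed objects (POISON_FREE).
POISON_FREE = 0x6B

# Two register lines copied verbatim from the oops in the description.
OOPS_REGISTERS = """\
RAX: 000000000000ffc0 RBX: 6b6b6b6b6b6b6b5b RCX: ffffffff808b9400
RDX: 6b6b6b6b6b6b6b6b RSI: 000000000000016a RDI: ffff8100277b6010
"""


def poisoned_registers(text, min_bytes=6):
    """Return {register: value} for 64-bit values made mostly of 0x6b bytes.

    min_bytes is an arbitrary threshold (hypothetical): a pointer computed
    from poisoned memory plus or minus a small offset still keeps most of
    its bytes equal to 0x6b.
    """
    hits = {}
    # Matches "RAX: <16 hex digits>"; RIP/RSP lines have a different shape
    # ("0010:[<...>]") and are deliberately not matched.
    for name, hexval in re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b", text):
        value = int(hexval, 16)
        if value.to_bytes(8, "big").count(POISON_FREE) >= min_bytes:
            hits[name] = value
    return hits


if __name__ == "__main__":
    for reg, val in poisoned_registers(OOPS_REGISTERS).items():
        print(f"{reg} looks like freed-slab poison: {val:#018x}")
```

Run against the two lines above, this flags RBX and RDX and nothing else, which fits a pointer being read out of an object that had already been freed.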

Comment 2 Bill Nottingham 2007-03-01 23:32:27 UTC
The device has been unplugged for a while, but I've seen it happen a few times.
It's happened when the box is (apparently) idle, so perhaps it involves HAL
poking the device.

Comment 3 Kristian Høgsberg 2007-03-16 00:50:16 UTC
I'm pretty sure this bug is fixed now. If you could give the device a try again
with kernel 1.2989.fc7 or later, that'd be cool.

Comment 4 Bill Nottingham 2007-08-29 16:52:30 UTC
Seems to work.