Bug 224134 - kswapd->prune_dcache crash on 2.6.9-42.0.3.EL
Summary: kswapd->prune_dcache crash on 2.6.9-42.0.3.EL
Keywords:
Status: CLOSED DUPLICATE of bug 177357
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Kernel Maintainer List
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2007-01-24 11:44 UTC by Colin.Simpson
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-02-05 21:57:26 UTC



Description Colin.Simpson 2007-01-24 11:44:56 UTC
Description of problem:

Kernel panic on 2.6.9-42.0.3.EL. It may be related to high load. The only
evidence I have for this is that two systems running the same internal app have
had a similar kernel panic. The first system:
Jan 17 09:39:03 wheelhouse kernel: Unable to handle kernel paging request at
virtual address 0005005a
Jan 17 09:39:03 wheelhouse kernel:  printing eip:
Jan 17 09:39:03 wheelhouse kernel: c018b280
Jan 17 09:39:03 wheelhouse kernel: *pde = 0fd3e067
Jan 17 09:39:03 wheelhouse kernel: Oops: 0000 [#1]
Jan 17 09:39:03 wheelhouse kernel: Modules linked in: nfs nfsd exportfs nfs_acl
parport_pc lp parport autofs4 lockd sunrpc ipt_REJECT ipt_state ip_conntrack
iptable_filter ip_tables dm_mirror dm_mod button battery ac nvidia(U) i2c_core
md5 ipv6 joydev uhci_hcd ehci_hcd hw_random snd_intel8x0 snd_ac97_codec
snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart
snd_rawmidi snd_seq_device snd soundcore e1000 ext3 jbd ata_piix libata sd_mod
scsi_mod
Jan 17 09:39:03 wheelhouse kernel: CPU:    0
Jan 17 09:39:04 wheelhouse kernel: EIP:    0060:[<c018b280>]    Tainted: P      VLI
Jan 17 09:39:04 wheelhouse kernel: EFLAGS: 00010202   (2.6.9-42.0.3.EL) 
Jan 17 09:39:04 wheelhouse kernel: EIP is at iput+0x25/0x61
Jan 17 09:39:04 wheelhouse kernel: eax: 00050046   ebx: cb2d37dc   ecx: e0d603f3
  edx: cb2d37dc
Jan 17 09:39:04 wheelhouse kernel: esi: cb2d37dc   edi: cf982040   ebp: 00000007
  esp: dfd1fee4
Jan 17 09:39:04 wheelhouse kernel: ds: 007b   es: 007b   ss: 0068
Jan 17 09:39:04 wheelhouse kernel: Process kswapd0 (pid: 46, threadinfo=dfd1f000
task=dfcfe0b0)
Jan 17 09:39:04 wheelhouse kernel: Stack: d065dac8 c0186408 00000000 00000000
000000e1 00000000 dff6e9e0 c0186ec9 
Jan 17 09:39:04 wheelhouse kernel:        c015630b 00039d00 00000000 00000066
00000000 00000906 000000d0 00000020 
Jan 17 09:39:04 wheelhouse kernel:        c036c860 00000000 c036c860 00000000
c015794b dfd1ff60 00000906 dfd1ff9c 
Jan 17 09:39:04 wheelhouse kernel: Call Trace:
Jan 17 09:39:04 wheelhouse kernel:  [<c0186408>] prune_dcache+0x501/0x70c
Jan 17 09:39:04 wheelhouse kernel:  [<c0186ec9>] shrink_dcache_memory+0x16/0x2d
Jan 17 09:39:04 wheelhouse kernel:  [<c015630b>] shrink_slab+0xf7/0x14c
Jan 17 09:39:04 wheelhouse kernel:  [<c015794b>] balance_pgdat+0x1b3/0x2cb
Jan 17 09:39:04 wheelhouse kernel:  [<c0157b1c>] kswapd+0xb9/0xbb
Jan 17 09:39:04 wheelhouse kernel:  [<c0121853>] autoremove_wake_function+0x0/0x2d
Jan 17 09:39:04 wheelhouse kernel:  [<c0318d7e>] ret_from_fork+0x6/0x14
Jan 17 09:39:04 wheelhouse kernel:  [<c0121853>] autoremove_wake_function+0x0/0x2d
Jan 17 09:39:04 wheelhouse kernel:  [<c0157a63>] kswapd+0x0/0xbb
Jan 17 09:39:04 wheelhouse kernel:  [<c01041dd>] kernel_thread_helper+0x5/0xb
Jan 17 09:39:04 wheelhouse kernel: Code: ff e9 72 fd ff ff 53 85 c0 89 c3 74 58
83 bb 9c 01 00 00 20 8b 80 d4 00 00 00 8b
 40 24 75 08 0f 0b 54 04 76 08 33 c0 85 c0 74 0b <8b> 50 14 85 d2 74 04 89 d8 ff
d2 8d 43 1c ba f0 e0 36 c0 e8 18 
Jan 17 09:39:04 wheelhouse kernel:  <0>Fatal exception: panic in 5 seconds

And the second system,
Jan 23 13:56:58 clubba kernel: Unable to handle kernel paging request at virtual
address 0040f82e
Jan 23 13:56:58 clubba kernel:  printing eip:
Jan 23 13:56:58 clubba kernel: c0171af6
Jan 23 13:56:58 clubba kernel: *pde = 00000000
Jan 23 13:56:58 clubba kernel: Oops: 0000 [#1]
Jan 23 13:56:58 clubba kernel: SMP 
Jan 23 13:56:58 clubba kernel: Modules linked in: vmnet(U) vmmon(U) nvidia(U)
nfs nfsd exportfs nfs_acl parport_pc lp parport autofs4 i2c_dev i2c_core lockd
sunrpc ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables dm_mirror
dm_mod button battery ac md5 ipv6 joydev uhci_hcd ehci_hcd hw_random
snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer
snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tg3
floppy ext3 jbd ata_piix libata sd_mod scsi_mod
Jan 23 13:56:58 clubba kernel: CPU:    1
Jan 23 13:56:58 clubba kernel: EIP:    0060:[<c0171af6>]    Tainted: P      VLI
Jan 23 13:56:58 clubba kernel: EFLAGS: 00010202   (2.6.9-42.0.3.ELsmp) 
Jan 23 13:56:58 clubba kernel: EIP is at iput+0x25/0x61
Jan 23 13:56:58 clubba kernel: eax: 0040f81a   ebx: e8fcebc4   ecx: f8c01e4a  
edx: e8fcebc4
Jan 23 13:56:58 clubba kernel: esi: ebb3a344   edi: ebb3a34c   ebp: e8fcebc4  
esp: f7cfaee0
Jan 23 13:56:58 clubba kernel: ds: 007b   es: 007b   ss: 0068
Jan 23 13:56:58 clubba kernel: Process kswapd0 (pid: 53, threadinfo=f7cfa000
task=f7d1b7b0)
Jan 23 13:56:58 clubba kernel: Stack: c221e640 c016f6b6 00000000 00000062
00000000 00000080 00000000 f7ffe9a0 
Jan 23 13:56:58 clubba kernel:        c016fa7c c01497f8 0004e200 00000000
00000001 00000000 00031fc4 000000d0 
Jan 23 13:56:58 clubba kernel:        00000020 c032a380 00000001 c0328f80
00000001 c014aa93 c02d26bd 00031fc4 
Jan 23 13:56:58 clubba kernel: Call Trace:
Jan 23 13:56:58 clubba kernel:  [<c016f6b6>] prune_dcache+0x29d/0x31b
Jan 23 13:56:58 clubba kernel:  [<c016fa7c>] shrink_dcache_memory+0x16/0x2d
Jan 23 13:56:58 clubba kernel:  [<c01497f8>] shrink_slab+0xf8/0x161
Jan 23 13:56:58 clubba kernel:  [<c014aa93>] balance_pgdat+0x1e1/0x30e
Jan 23 13:56:58 clubba kernel:  [<c02d26bd>] schedule+0x86d/0x8db
Jan 23 13:56:58 clubba kernel:  [<c0120420>] prepare_to_wait+0x12/0x4c
Jan 23 13:56:58 clubba kernel:  [<c014ac8a>] kswapd+0xca/0xcc
Jan 23 13:56:58 clubba kernel:  [<c01204f5>] autoremove_wake_function+0x0/0x2d
Jan 23 13:56:58 clubba kernel:  [<c02d46e6>] ret_from_fork+0x6/0x14
Jan 23 13:56:58 clubba kernel:  [<c01204f5>] autoremove_wake_function+0x0/0x2d
Jan 23 13:56:58 clubba kernel:  [<c014abc0>] kswapd+0x0/0xcc
Jan 23 13:56:58 clubba kernel:  [<c01041f5>] kernel_thread_helper+0x5/0xb
Jan 23 13:56:58 clubba kernel: Code: ff e9 e5 fe ff ff 53 85 c0 89 c3 74 58 83
bb 3c 01 00 00 20 8b 80 a4 00 00 00 8b 40 24 75 08 
0f 0b 54 04 34 c3 2e c0 85 c0 74 0b <8b> 50 14 85 d2 74 04 89 d8 ff d2 8d 43 1c
ba 70 de 32 c0 e8 62 
Jan 23 13:56:58 clubba kernel:  <0>Fatal exception: panic in 5 seconds
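
A quick cross-check of the two oopses above, as a sketch only. In both reports
the bytes marked <8b> 50 14 in the Code: line decode on i386 as
mov 0x14(%eax),%edx, so the faulting load should target eax + 0x14. The Python
below confirms that this matches the reported virtual address on both machines,
which would be consistent with iput dereferencing a bad pointer held in eax.
The helper names (decode_mov_load_disp8, faulting_load_address) are
illustrative and not from the report, and which kernel structure field that
pointer came from is not established here.

```python
# Register numbering used by the i386 ModRM byte.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Fault address and eax copied from the two oops reports above.
OOPSES = {
    "wheelhouse": {"fault": 0x0005005A, "eax": 0x00050046},
    "clubba": {"fault": 0x0040F82E, "eax": 0x0040F81A},
}

# The three bytes starting at the <8b> marker in both Code: lines.
FAULTING_BYTES = bytes.fromhex("8b5014")


def decode_mov_load_disp8(code):
    """Decode `8b /r` with an 8-bit displacement (mod == 01, no SIB byte).

    Returns (destination register, base register, signed displacement).
    """
    opcode, modrm, disp = code
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("not a simple mov disp8(%reg),%reg load")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REGS[reg], REGS[rm], disp


def faulting_load_address(regs, code=FAULTING_BYTES):
    """Address the decoded load would read, given the register dump."""
    _dest, base, disp = decode_mov_load_disp8(code)
    return (regs[base] + disp) & 0xFFFFFFFF


for host, oops in OOPSES.items():
    addr = faulting_load_address(oops)
    print(f"{host}: computed {addr:#010x}, reported {oops['fault']:#010x}, "
          f"match={addr == oops['fault']}")
```

Both hosts report a match. The eax values (0x00050046 and 0x0040f81a) are low,
unaligned addresses rather than plausible kernel pointers, so the same
instruction is faulting on a garbage pointer in both cases.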



Version-Release number of selected component (if applicable):
kernel 2.6.9-42.0.3

How reproducible:
Not very, as I'm not quite sure what caused it.


Comment 1 Daniel J Blueman 2007-01-25 11:43:38 UTC
Multiple people have reported this in bug 177357, which was (incorrectly) marked
as a duplicate of another problem and closed.

I'm hitting this kswapd->prune_dcache bug quite frequently - around once a week,
on a few machines under constant processor-bound load and high(er) memory pressure.

Configuration is stock RHEL4 U4 plus the latest errata kernel 2.6.9-42.0.3,
dual-SMP, x86-64 (so not only i686 as above), with 4GB of memory.

Crash signature is:

__down_read_trylock+18      prune_dcache+568
shrink_dcache_memory+20     shrink_slab+188
balance_pgdat+538           kswapd+252
autoremove_wake_function+0  autoremove_wake_function+0
child_rip+8                 kswapd+0
child_rip+0

RIP: _spin_lock_irqsave+40
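
To compare the signature above with the two i686 traces in the description, a
small sketch in Python can pull out the symbol names and list the frames that
all three share. The helper names are made up for illustration, and the trace
text is abbreviated here to the frames down to the first kswapd entry.

```python
import re

# Matches "symbol+0x1f" (i686 style) or "symbol+31" (x86-64 style above).
SYMBOL = re.compile(r"([A-Za-z_]\w*)\+(?:0x[0-9a-fA-F]+|\d+)")

# Frames copied from this report, truncated after the first kswapd frame.
TRACES = {
    "wheelhouse (i686)": """
        [<c0186408>] prune_dcache+0x501/0x70c
        [<c0186ec9>] shrink_dcache_memory+0x16/0x2d
        [<c015630b>] shrink_slab+0xf7/0x14c
        [<c015794b>] balance_pgdat+0x1b3/0x2cb
        [<c0157b1c>] kswapd+0xb9/0xbb
    """,
    "clubba (i686 SMP)": """
        [<c016f6b6>] prune_dcache+0x29d/0x31b
        [<c016fa7c>] shrink_dcache_memory+0x16/0x2d
        [<c01497f8>] shrink_slab+0xf8/0x161
        [<c014aa93>] balance_pgdat+0x1e1/0x30e
        [<c02d26bd>] schedule+0x86d/0x8db
        [<c0120420>] prepare_to_wait+0x12/0x4c
        [<c014ac8a>] kswapd+0xca/0xcc
    """,
    "comment 1 (x86-64)": """
        __down_read_trylock+18      prune_dcache+568
        shrink_dcache_memory+20     shrink_slab+188
        balance_pgdat+538           kswapd+252
    """,
}


def symbols(text):
    """Return the function names in a trace, in order of appearance."""
    return [m.group(1) for m in SYMBOL.finditer(text)]


def common_frames(traces):
    """Frames present in every trace, ordered as in the first trace."""
    lists = [symbols(t) for t in traces.values()]
    shared = set(lists[0]).intersection(*lists[1:])
    ordered = []
    for name in lists[0]:
        if name in shared and name not in ordered:
            ordered.append(name)
    return ordered


print(" <- ".join(common_frames(TRACES)))
```

On the traces quoted here the shared frames are prune_dcache,
shrink_dcache_memory, shrink_slab, balance_pgdat and kswapd. Only the
innermost frame differs (iput on i686, __down_read_trylock on x86-64).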

Let me know if getting a crash-dump and making it available to someone would help.

[suggest changing summary to "kswapd->prune_dcache crash on 2.6.9-42.0.3.EL"]

Comment 2 Colin.Simpson 2007-01-25 11:53:08 UTC
Changed Summary. I'd struggle to get a crash dump as it seems to hit machines
randomly. Has anyone raised this through an RHN subscription to escalate it? I
will if no one else has.

Comment 3 Daniel J Blueman 2007-01-25 11:58:44 UTC
It is crashing a number of systems I've seen, at random, which would suggest a race.

I have asked for bug 177357 to be re-opened, since there are others reporting it
too there; it's probably a good idea to escalate this.

Comment 4 Jason Baron 2007-02-05 21:57:26 UTC

*** This bug has been marked as a duplicate of 177357 ***

