Bug 232705 - LSPP: getting slab corruption messages
Summary: LSPP: getting slab corruption messages
Keywords:
Status: CLOSED DUPLICATE of bug 223919
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Eric Paris
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks: RHEL5LSPPCertTracker
Reported: 2007-03-16 18:25 UTC by Steve Grubb
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-03-20 14:45:39 UTC
Target Upstream Version:


Attachments
proposed patch (deleted), 2007-03-19 07:27 UTC, Alexander Viro

Description Steve Grubb 2007-03-16 18:25:33 UTC
Description of problem:
When running the lspp kernels 65-68, I get slab corruption messages when I
connect over VPN to read email:

Mar 15 07:02:55 localhost kernel: Slab corruption: (Not tainted) start=ffff81002b8c9000, len=4096
Mar 15 07:02:55 localhost kernel:
Mar 15 07:02:55 localhost kernel: Call Trace:
Mar 15 07:02:55 localhost kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 07:02:55 localhost kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 07:02:55 localhost kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 07:02:55 localhost kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 07:02:55 localhost kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 07:02:55 localhost kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 07:02:55 localhost kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 07:02:55 localhost kernel:  [<ffffffff800253c7>] __user_walk_fd+0x22/0x5c
Mar 15 07:02:55 localhost kernel:  [<ffffffff80057b05>] sys_linkat+0x65/0x12e
Mar 15 07:02:55 localhost kernel:  [<ffffffff8003e0e3>] hrtimer_try_to_cancel+0x4f/0x5b
Mar 15 07:02:55 localhost kernel:  [<ffffffff8005e098>] hrtimer_cancel+0x14/0x21
Mar 15 07:02:55 localhost kernel:  [<ffffffff800c14ca>] audit_syscall_entry+0x154/0x18a
Mar 15 07:02:55 localhost kernel:  [<ffffffff80072434>] syscall_trace_enter+0x9a/0x9e
Mar 15 07:02:55 localhost kernel:  [<ffffffff800ea6d6>] sys_link+0x19/0x1b
Mar 15 07:02:55 localhost kernel:  [<ffffffff80060d9a>] tracesys+0xd1/0xdb
Mar 15 07:02:55 localhost kernel:
Mar 15 07:02:55 localhost kernel: 000: 2f 68 6f 6d 65 2f 73 67 72 75 62 62 2f 2e 49 43
Mar 15 07:02:55 localhost kernel: 010: 45 61 75 74 68 6f 72 69 74 79 2d 6c 00 5a 5a 5a
Mar 15 07:02:55 localhost kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 07:02:55 localhost kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 07:02:55 localhost kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 07:02:55 localhost kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a


Version-Release number of selected component (if applicable):
lspp.68

How reproducible:
very - using x86_64 machine

Steps to Reproduce:
1. vpn in to network
2. browse email on imap server
3. look at syslog

Comment 1 Klaus Heinrich Kiwi 2007-03-16 22:28:20 UTC
I see similar messages, but apparently only when running the LTP suite a
second time in a row. Shortly after the message, the machine hangs with an
unidentified error (it looks like a trace) on the console (hand-copied below):
die
do_trap
do_invalid
free_block
printk
error_exit
release_console_sem
free_block
drain_array
cache_reap
run_workqueue
cache_reap
worker_thread
worker_thread
default_wake_function
worker_thread
kthread
trace_hardirqs_on_thunk
child_rip
_spin_unlock_irq
restore_args
kthread
child_rip

The machine is an HS21 blade server (Intel Xeon based, x86_64 arch).

Oddly, this same suite runs on the Opteron-based LS21 with no such problems.


/var/log/messages:
Mar 15 20:47:48 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:48 bracer2 kernel:
Mar 15 20:47:48 bracer2 kernel: Call Trace:
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff800c1c8a>] __audit_getname+0xc4/0x139
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff800577d8>] sys_symlinkat+0x36/0xf2
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff800c14d6>] audit_syscall_entry+0x154/0x18a
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80072474>] syscall_trace_enter+0x9a/0x9e
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff8004ebf5>] sys_symlink+0x11/0x13
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:48 bracer2 kernel:
Mar 15 20:47:48 bracer2 kernel: 000: 6f 62 6a 65 63 74 00 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:48 bracer2 kernel:
Mar 15 20:47:48 bracer2 kernel: Call Trace:
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff800c1c8a>] __audit_getname+0xc4/0x139
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800577d8>] sys_symlinkat+0x36/0xf2
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c14d6>] audit_syscall_entry+0x154/0x18a
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80072474>] syscall_trace_enter+0x9a/0x9e
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8004ebf5>] sys_symlink+0x11/0x13
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: 000: 6f 62 6a 65 63 74 00 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: Call Trace:
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c1c8a>] __audit_getname+0xc4/0x139
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800577d8>] sys_symlinkat+0x36/0xf2
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c14d6>] audit_syscall_entry+0x154/0x18a
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80072474>] syscall_trace_enter+0x9a/0x9e
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8004ebf5>] sys_symlink+0x11/0x13
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: 000: 6f 62 6a 65 63 74 00 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: Call Trace:
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c1c8a>] __audit_getname+0xc4/0x139
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800577d8>] sys_symlinkat+0x36/0xf2
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c14d6>] audit_syscall_entry+0x154/0x18a
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80072474>] syscall_trace_enter+0x9a/0x9e
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8004ebf5>] sys_symlink+0x11/0x13
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: 000: 73 79 6d 62 6f 6c 69 63 00 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: Call Trace:
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001c49c>] open_namei+0x5cb/0x6f4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001c49c>] open_namei+0x5cb/0x6f4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001c49c>] open_namei+0x5cb/0x6f4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001c49c>] open_namei+0x5cb/0x6f4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8002948b>] do_filp_open+0x28/0x46
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff80069211>] _spin_unlock+0x26/0x2a
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff80016a67>] get_unused_fd+0xfc/0x10d
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001ab59>] do_sys_open+0x4f/0xcd
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff800340de>] sys_open+0x1b/0x1d
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:50 bracer2 kernel:
Mar 15 20:47:50 bracer2 kernel: 000: 73 79 6d 62 6f 6c 69 63 00 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a

Comment 2 Steve Grubb 2007-03-17 22:59:25 UTC
I did a lot of test compiles and found that this bug does not exist in the base
kernel, so I removed patches until the problem was gone. The patch that is
causing this bug is 4.223919.2. If you decode the text in the slab dump, you'll
find it's an ASCII string, usually a path or URL.
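
For anyone who wants to check this against their own syslog, here is a minimal
sketch of decoding the dump rows. The helper names (slab_dump_bytes,
leading_string) are made up for illustration and are not part of any kernel or
audit tooling. It assumes the row layout seen in this report (a three-digit hex
offset after "kernel: ") and tolerates rows whose last bytes wrapped onto the
next line. The 5a filler after the NUL looks like the slab debug poison
pattern, so only the bytes before the first NUL are treated as text.

```python
import re

# A dump row looks like "... kernel: 010: 45 61 75 ..." (offset, then hex bytes).
ROW = re.compile(r"kernel: ([0-9a-f]{3}): ((?:[0-9a-f]{2} ?)+)\s*$")
# A wrapped continuation line holds nothing but hex bytes, e.g. "49 43".
CONT = re.compile(r"^((?:[0-9a-f]{2} ?)+)\s*$")


def slab_dump_bytes(lines):
    """Collect the raw bytes from the hex dump rows of a slab corruption report."""
    out = bytearray()
    in_dump = False
    for line in lines:
        line = line.rstrip()
        row = ROW.search(line)
        if row:
            out += bytes.fromhex(row.group(2))
            in_dump = True
        elif in_dump and CONT.match(line):
            # Tail of the previous row that was wrapped onto its own line.
            out += bytes.fromhex(line)
        else:
            in_dump = False
    return bytes(out)


def leading_string(data):
    """Return the text before the first NUL byte, decoded as ASCII."""
    return data.split(b"\x00", 1)[0].decode("ascii", errors="replace")


# First two dump rows from the description; the first is shown wrapped.
sample = [
    "Mar 15 07:02:55 localhost kernel:  [<ffffffff80060d9a>] tracesys+0xd1/0xdb",
    "Mar 15 07:02:55 localhost kernel: 000: 2f 68 6f 6d 65 2f 73 67 72 75 62 62 2f 2e",
    "49 43",
    "Mar 15 07:02:55 localhost kernel: 010: 45 61 75 74 68 6f 72 69 74 79 2d 6c 00 5a 5a 5a",
]

print(leading_string(slab_dump_bytes(sample)))  # /home/sgrubb/.ICEauthority-l
```

The dumps in comment 1 decode the same way, to "object" and "symbolic".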

Comment 3 Alexander Viro 2007-03-19 07:27:34 UTC
Created attachment 150353
proposed patch

Comment 4 Steve Grubb 2007-03-20 14:45:39 UTC
This problem was solved by the above patch, which should be combined with the
patch in bz #223919. So this is a duplicate of that bz.

*** This bug has been marked as a duplicate of 223919 ***

