Bug 232705

Summary: LSPP: getting slab corruption messages
Product: Red Hat Enterprise Linux 5
Reporter: Steve Grubb <sgrubb>
Component: kernel
Assignee: Eric Paris <eparis>
Status: CLOSED DUPLICATE
QA Contact: Martin Jenner <mjenner>
Severity: high
Priority: high
Version: 5.0
CC: aviro, iboverma, klaus, latten
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2007-03-20 14:45:39 UTC
Bug Blocks: 224041    
Attachments:
proposed patch (flags: none)

Description Steve Grubb 2007-03-16 18:25:33 UTC
Description of problem:
When running lspp kernels 65-68, I get slab corruption messages when I connect
over VPN to read email:

Mar 15 07:02:55 localhost kernel: Slab corruption: (Not tainted) start=ffff81002b8c9000, len=4096
Mar 15 07:02:55 localhost kernel:
Mar 15 07:02:55 localhost kernel: Call Trace:
Mar 15 07:02:55 localhost kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 07:02:55 localhost kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 07:02:55 localhost kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 07:02:55 localhost kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 07:02:55 localhost kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 07:02:55 localhost kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 07:02:55 localhost kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 07:02:55 localhost kernel:  [<ffffffff800253c7>] __user_walk_fd+0x22/0x5c
Mar 15 07:02:55 localhost kernel:  [<ffffffff80057b05>] sys_linkat+0x65/0x12e
Mar 15 07:02:55 localhost kernel:  [<ffffffff8003e0e3>] hrtimer_try_to_cancel+0x4f/0x5b
Mar 15 07:02:55 localhost kernel:  [<ffffffff8005e098>] hrtimer_cancel+0x14/0x21
Mar 15 07:02:55 localhost kernel:  [<ffffffff800c14ca>] audit_syscall_entry+0x154/0x18a
Mar 15 07:02:55 localhost kernel:  [<ffffffff80072434>] syscall_trace_enter+0x9a/0x9e
Mar 15 07:02:55 localhost kernel:  [<ffffffff800ea6d6>] sys_link+0x19/0x1b
Mar 15 07:02:55 localhost kernel:  [<ffffffff80060d9a>] tracesys+0xd1/0xdb
Mar 15 07:02:55 localhost kernel:
Mar 15 07:02:55 localhost kernel: 000: 2f 68 6f 6d 65 2f 73 67 72 75 62 62 2f 2e 49 43
Mar 15 07:02:55 localhost kernel: 010: 45 61 75 74 68 6f 72 69 74 79 2d 6c 00 5a 5a 5a
Mar 15 07:02:55 localhost kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 07:02:55 localhost kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 07:02:55 localhost kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 07:02:55 localhost kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a


Version-Release number of selected component (if applicable):
lspp.68

How reproducible:
very - using x86_64 machine

Steps to Reproduce:
1. vpn in to network
2. browse email on imap server
3. look at syslog

Comment 1 Klaus Heinrich Kiwi 2007-03-16 22:28:20 UTC
I see similar messages, apparently when running the LTP suite a second time in
a row. Shortly after the message, the machine hangs with an unidentified error
(it looks like a trace) on the console (hand-copied message below):
die
do_trap
do_invalid
free_block
printk
error_exit
release_console_sem
free_block
drain_array
cache_recap
run_workqueue
cache_recap
worker_thread
worker_thread
default_wake_function
worker_thread
kthread
trace_hardirqs_on_thunk
child_rip
_spin_unlock_irq
restore_args
kthread
child_rip

The machine is a blade server HS21 - Intel Xeon based, using x86_64 arch.

The funny thing is that this same suite runs on the Opteron-based LS21 with no
such problems.


/var/log/messages:
Mar 15 20:47:48 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:48 bracer2 kernel:
Mar 15 20:47:48 bracer2 kernel: Call Trace:
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff800c1c8a>] __audit_getname+0xc4/0x139
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff800577d8>] sys_symlinkat+0x36/0xf2
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff800c14d6>] audit_syscall_entry+0x154/0x18a
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80072474>] syscall_trace_enter+0x9a/0x9e
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff8004ebf5>] sys_symlink+0x11/0x13
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:48 bracer2 kernel:
Mar 15 20:47:48 bracer2 kernel: 000: 6f 62 6a 65 63 74 00 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:48 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:48 bracer2 kernel:
Mar 15 20:47:48 bracer2 kernel: Call Trace:
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff800c1c8a>] __audit_getname+0xc4/0x139
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:48 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800577d8>] sys_symlinkat+0x36/0xf2
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c14d6>] audit_syscall_entry+0x154/0x18a
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80072474>] syscall_trace_enter+0x9a/0x9e
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8004ebf5>] sys_symlink+0x11/0x13
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: 000: 6f 62 6a 65 63 74 00 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: Call Trace:
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c1c8a>] __audit_getname+0xc4/0x139
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800577d8>] sys_symlinkat+0x36/0xf2
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c14d6>] audit_syscall_entry+0x154/0x18a
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80072474>] syscall_trace_enter+0x9a/0x9e
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8004ebf5>] sys_symlink+0x11/0x13
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: 000: 6f 62 6a 65 63 74 00 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: Call Trace:
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c1c8a>] __audit_getname+0xc4/0x139
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80012eab>] getname+0x26/0x1c7
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800577d8>] sys_symlinkat+0x36/0xf2
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff800c14d6>] audit_syscall_entry+0x154/0x18a
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80072474>] syscall_trace_enter+0x9a/0x9e
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff8004ebf5>] sys_symlink+0x11/0x13
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: 000: 73 79 6d 62 6f 6c 69 63 00 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:49 bracer2 kernel: Slab corruption: (Not tainted) start=ffff8100526a0000, len=4096
Mar 15 20:47:49 bracer2 kernel:
Mar 15 20:47:49 bracer2 kernel: Call Trace:
Mar 15 20:47:49 bracer2 kernel:  [<ffffffff80007144>] check_poison_obj+0x7e/0x1d4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001c49c>] open_namei+0x5cb/0x6f4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001c49c>] open_namei+0x5cb/0x6f4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8000cb63>] cache_alloc_debugcheck_after+0x35/0x1cc
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001c49c>] open_namei+0x5cb/0x6f4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8000ad3e>] kmem_cache_alloc+0xe7/0xf7
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001c49c>] open_namei+0x5cb/0x6f4
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8002948b>] do_filp_open+0x28/0x46
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff80069211>] _spin_unlock+0x26/0x2a
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff80016a67>] get_unused_fd+0xfc/0x10d
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff8001ab59>] do_sys_open+0x4f/0xcd
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff800340de>] sys_open+0x1b/0x1d
Mar 15 20:47:50 bracer2 kernel:  [<ffffffff80060dda>] tracesys+0xd1/0xdb
Mar 15 20:47:50 bracer2 kernel:
Mar 15 20:47:50 bracer2 kernel: 000: 73 79 6d 62 6f 6c 69 63 00 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 020: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 030: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 040: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Mar 15 20:47:50 bracer2 kernel: 050: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a

Comment 2 Steve Grubb 2007-03-17 22:59:25 UTC
I did a lot of test compiles and found that this bug does not exist in the base
kernel. So I removed patches until the problem was gone. The patch that is
causing this bug is 4.223919.2. If you decipher the text in the slab, you'll
find it's an ASCII string, usually a path or URL.
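
For anyone who wants to decipher the slab text without doing it by hand, here
is a minimal Python sketch. It is only an illustration: the helper name
decode_slab_dump is made up for this comment, and it assumes the dump lines
look like the syslog excerpts pasted in this bug (an optional "kernel:" prefix,
a three-digit hex offset, then hex bytes). It stops at the first NUL byte; the
0x5a bytes that follow are the slab poison pattern, not part of the string.

```python
import re

def decode_slab_dump(lines):
    """Return the NUL-terminated ASCII string at the start of a slab hex dump."""
    data = bytearray()
    for line in lines:
        # Drop the syslog prefix ("Mar 15 ... kernel:") when there is one.
        payload = line.split("kernel:")[-1]
        # Drop the offset column ("000:", "010:", ...) when there is one.
        payload = re.sub(r"^\s*[0-9a-f]{3}:", "", payload)
        # Whatever remains is a run of two-digit hex bytes.
        data.extend(int(tok, 16) for tok in re.findall(r"\b[0-9a-f]{2}\b", payload))
    # Keep only what precedes the first NUL terminator.
    return bytes(data).split(b"\x00", 1)[0].decode("ascii", errors="replace")

dump = [
    "Mar 15 07:02:55 localhost kernel: 000: 2f 68 6f 6d 65 2f 73 67 72 75 62 62 2f 2e",
    "49 43",
    "Mar 15 07:02:55 localhost kernel: 010: 45 61 75 74 68 6f 72 69 74 79 2d 6c 00 5a",
    "5a 5a",
]
print(decode_slab_dump(dump))  # /home/sgrubb/.ICEauthority-l
```

Run against the dumps in comment 1, the same helper gives "object" and
"symbolic".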

Comment 3 Alexander Viro 2007-03-19 07:27:34 UTC
Created attachment 150353 [details]
proposed patch

Comment 4 Steve Grubb 2007-03-20 14:45:39 UTC
This problem was solved by the above patch, which should be combined with the
patch in bz #223919. So, this is a duplicate of that bz.

*** This bug has been marked as a duplicate of 223919 ***