Bug 456101 - F10 pv_ops xen: ext3:do_split() oops during yum update on i686
Summary: F10 pv_ops xen: ext3:do_split() oops during yum update on i686
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel-xen
Version: rawhide
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Xen Maintainance List
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: F10_XenPvOps
Reported: 2008-07-21 15:20 UTC by Mark McLoughlin
Modified: 2009-12-14 20:40 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-07-21 16:54:54 UTC


Attachments
config-2.6.27-0.2.rc0.git6.fc10.i686.xen (deleted)
2008-07-21 15:20 UTC, Mark McLoughlin

Description Mark McLoughlin 2008-07-21 15:20:09 UTC
With kernel-xen-2.6.26-0.1.rc6.git2.fc10.i686

Seen this a few times on i686 now, but not on x86_64. I don't have a better
reproducer than "during a yum update":

BUG: unable to handle kernel paging request at c7553000
IP: [<e08a0109>] :ext3:do_split+0x1f4/0x41f
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: bridge bnep rfcomm l2cap bluetooth autofs4 sunrpc ipt_REJECT
nf_conntrack_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp
nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables
ipv6 loop dm_multipath pcspkr xen_netfront dm_snapshot dm_zero dm_mirror dm_log
dm_mod xen_blkfront ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
[last unloaded: microcode]

Pid: 1878, comm: yum Tainted: G        W (2.6.26-0.1.rc6.git2.fc10.i686.xen #1)
EIP: 0061:[<e08a0109>] EFLAGS: 00210206 CPU: 0
EIP is at do_split+0x1f4/0x41f [ext3]
EAX: c7552b20 EBX: 000004e0 ECX: c7552ffe EDX: 00000000
ESI: 00000000 EDI: 00000800 EBP: d7ce3da8 ESP: d7ce3d30
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
Process yum (pid: 1878, ti=d7ce3000 task=d95fc7a0 task.ti=d7ce3000)
Stack: df205000 d7ce3e3c dfaaaee4 db473000 000004e0 d7ce3d8c 00001000 ccd0e820 
       d6127de0 c7553000 df204000 c7552000 0000009c c7552ff8 4ab2d388 00000fd4 
       d7ce3dac e08a03a0 df204000 c7552b20 00000014 64c74e0c 8f48d87c 00000002 
Call Trace:
 [<e08a03a0>] ? add_dirent_to_buf+0x6c/0x26c [ext3]
 [<e08a0968>] ? ext3_add_entry+0x3c8/0x787 [ext3]
 [<c0650015>] ? _spin_unlock+0x1d/0x20
 [<c040484f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c044b567>] ? lock_acquire+0x84/0x90
 [<e08a1ef2>] ? ext3_rename+0x1b5/0x434 [ext3]
 [<c0499011>] ? vfs_rename+0x273/0x3d2
 [<c0650015>] ? _spin_unlock+0x1d/0x20
 [<c049a704>] ? sys_renameat+0x188/0x1f3
 [<c064ef35>] ? mutex_unlock+0x8/0xa
 [<c040484f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c064ef25>] ? __mutex_unlock_slowpath+0x105/0x10d
 [<c064ef35>] ? mutex_unlock+0x8/0xa
 [<c064ef35>] ? mutex_unlock+0x8/0xa
 [<c04090bf>] ? do_IRQ+0xac/0xc5
 [<c05582c2>] ? xen_evtchn_do_upcall+0xe4/0x111
 [<c049a781>] ? sys_rename+0x12/0x14
 [<c0406bfa>] ? syscall_call+0x7/0xb
 =======================
Code: 89 56 04 89 79 f8 89 f1 3b 4d d4 77 d7 85 c0 74 07 8b 75 bc 31 c0 eb ee 8b
7d a0 31 f6 31 d2 8b 45 d4 8b 5d 98 d1 ef 8d 4c 18 fe <8b> 19 83 e9 08 89 d8 66
d1 e8 0f b7 c0 8d 04 02 39 f8 77 08 0f
EIP: [<e08a0109>] do_split+0x1f4/0x41f [ext3] SS:ESP 0069:d7ce3d30
---[ end trace 4eaa2a86a8e2da22 ]---


The git tree corresponding to this build is:

http://git.et.redhat.com/?p=linux-2.6-fedora-pvops.git;a=commit;h=8e2c4a66e3132aa5e5209906484f3a0ab50e7a44

Comment 1 Mark McLoughlin 2008-07-21 15:20:09 UTC
Created attachment 312270
config-2.6.27-0.2.rc0.git6.fc10.i686.xen

Comment 2 Mark McLoughlin 2008-07-21 16:01:08 UTC
See also bug #451068 and:

  http://www.kerneloops.org/search.php?search=do_split

That was a gcc bug supposedly fixed by gcc-4.3.1-3

However, looking at:

http://kojipkgs.fedoraproject.org/packages/kernel-xen-2.6/2.6.27/0.2.rc0.git6.fc10/data/logs/i686/root.log

this package was built with gcc-4.3.1-4



Comment 3 Jeremy Fitzhardinge 2008-07-21 16:04:27 UTC
So, to be clear, this oops happens only:
 - under Xen
 - on i386
 - in this place
?

The fault address looks perfectly reasonable, so I assume it's some kind of
use-after-free detected by DEBUG_PAGEALLOC.  I'll try to reproduce it, but at
first look it doesn't seem terribly Xen-specific.

Comment 4 Mark McLoughlin 2008-07-21 16:10:59 UTC
Yep, never mind this one Jeremy - most probably a gcc bug

Comment 5 Mark McLoughlin 2008-07-21 16:35:34 UTC
Looking at where it was previously miscompiled, we don't seem to have the
same issue:

    72e1:       8b 7d a0                mov    -0x60(%ebp),%edi
    72e4:       31 f6                   xor    %esi,%esi
    72e6:       31 d2                   xor    %edx,%edx
    72e8:       8b 45 d4                mov    -0x2c(%ebp),%eax
    72eb:       8b 5d 98                mov    -0x68(%ebp),%ebx
    72ee:       d1 ef                   shr    %edi
    72f0:       8d 4c 18 fe             lea    -0x2(%eax,%ebx,1),%ecx
    72f4:       66 8b 19                mov    (%ecx),%bx

With the previous gcc-4.3.1 bug, this last line was:

    7109:       8b 19                   mov    (%ecx),%ebx

i.e. %ebx vs. %bx was apparently the problem previously
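
To make the %ebx vs. %bx difference concrete against the oops in comment 0: the
faulting bytes there are "<8b> 19" with no 0x66 operand-size prefix, i.e. the
32-bit form, and ECX is c7552ffe. A rough Python sketch follows; the helper
names faulting_bytes and pages_touched are made up for illustration, and 4 KiB
pages are assumed. It suggests that a 4-byte load from that address spills into
the page at c7553000, which is the reported fault address, while a 2-byte load
stays within the page at c7552000:

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages on i686

# Tail of the "Code:" line from the oops in comment 0; <..> marks the byte at EIP.
OOPS_CODE = "8b 5d 98 d1 ef 8d 4c 18 fe <8b> 19 83 e9 08 89 d8 66 d1 e8"


def faulting_bytes(code_line, count=2):
    """Return `count` bytes starting at the <..> marker of an oops Code: line."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:start + count]]


def pages_touched(addr, width):
    """Return the sorted base addresses of the pages covered by a load."""
    pages = {((addr + i) // PAGE_SIZE) * PAGE_SIZE for i in range(width)}
    return sorted(pages)


if __name__ == "__main__":
    ecx = 0xC7552FFE  # ECX from the register dump in comment 0
    print([hex(b) for b in faulting_bytes(OOPS_CODE)])   # no leading 0x66 prefix
    print([hex(p) for p in pages_touched(ecx, 4)])       # mov (%ecx),%ebx
    print([hex(p) for p in pages_touched(ecx, 2)])       # mov (%ecx),%bx
```

This is consistent with the oops having come from a kernel that still contained
the 32-bit load, though on its own it does not prove the cause.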


Comment 6 Mark McLoughlin 2008-07-21 16:54:54 UTC
Bah, this seems to have been a total mixup:

(In reply to comment #0)
> With kernel-xen-2.6.26-0.1.rc6.git2.fc10.i686
...
> Pid: 1878, comm: yum Tainted: G        W (2.6.26-0.1.rc6.git2.fc10.i686.xen #1)

I should have been running kernel-xen-2.6.27-0.2.rc0.git6.fc10.i686

