Bug 453635 - kernel BUG at fs/ext4/mballoc.c:1648!
Summary: kernel BUG at fs/ext4/mballoc.c:1648!
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-07-01 16:45 UTC by Jeff Moyer
Modified: 2008-11-10 14:50 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-11-10 14:50:50 UTC



Description Jeff Moyer 2008-07-01 16:45:25 UTC
Description of problem:
------------[ cut here ]------------
kernel BUG at fs/ext4/mballoc.c:1648!
invalid opcode: 0000 [1] SMP 
CPU 0 
Modules linked in: nfsd auth_rpcgss exportfs nls_utf8 nfs lockd nfs_acl
usb_storage bridge bnep rfcomm l2cap bluetooth autofs4 fuse sunrpc ip6t_REJECT
xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables
x_tables cpufreq_ondemand acpi_cpufreq freq_table ext4dev jbd2 crc16 ext2
dm_mirror dm_multipath dm_mod ipv6 sr_mod cdrom ata_generic ppdev snd_hda_intel
floppy dcdbas parport_pc parport snd_seq_dummy snd_seq_oss i2c_i801 pcspkr
i2c_core firewire_ohci snd_seq_midi_event ata_piix sg iTCO_wdt firewire_core
snd_seq pata_acpi iTCO_vendor_support snd_seq_device crc_itu_t snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd tg3 joydev button
i82975x_edac soundcore edac_core ahci libata sd_mod scsi_mod ext3 jbd mbcache
uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 4357, comm: fio Not tainted 2.6.25.6-55.fc9.x86_64 #1
RIP: 0010:[<ffffffff8832ae8d>]  [<ffffffff8832ae8d>] :ext4dev:ext4_mb_new_blocks+0x1043/0x2175
RSP: 0018:ffff81006cca3a98  EFLAGS: 00010246
RAX: 0000000000008000 RBX: 0000000000008000 RCX: 0000000000008000
RDX: 0000000000008000 RSI: 0000000000008000 RDI: 000000000000000c
RBP: ffff81006cca3c58 R08: 000000000000000d R09: ffff81007b044fff
R10: 000000000000000d R11: 0000000000000001 R12: 0000000000000000
R13: ffff81001e9f31f8 R14: 0000000000000fce R15: ffff81001e9f3238
FS:  00007f5a30f966f0(0000) GS:ffffffff813f2000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000032c30dad60 CR3: 000000006ac42000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process fio (pid: 4357, threadinfo ffff81006cca2000, task ffff810016c7a000)
Stack:  ffff810027481f00 ffff810044c02420 0000000000000000 ffff81006cca3bf8
 0000000300000000 0000000000000000 0000000000000000 ffff8100587dc000
 0000000000000292 ffff81006cca3d64 ffff81006cca3cf8 ffff81007f6e1300
Call Trace:
 [<ffffffff88302e22>] ? :jbd2:find_revoke_record+0x5a/0x89
 [<ffffffff883032cf>] ? :jbd2:jbd2_journal_cancel_revoke+0x11c/0x163
 [<ffffffff8832699b>] :ext4dev:ext4_ext_get_blocks+0x83a/0xa2f
 [<ffffffff8128e800>] ? __down_read+0x1a/0x98
 [<ffffffff88317f41>] :ext4dev:ext4_get_blocks_wrap+0xd3/0x110
 [<ffffffff88323de3>] :ext4dev:ext4_fallocate+0x194/0x350
 [<ffffffff810b7e55>] ? notify_change+0x2fb/0x30e
 [<ffffffff8106c5bf>] ? audit_syscall_entry+0x126/0x15a
 [<ffffffff8106c290>] ? audit_syscall_exit+0x331/0x353
 [<ffffffff810a30ef>] sys_fallocate+0xfb/0x11f
 [<ffffffff8100c052>] tracesys+0xd5/0xda


Code: 88 48 c7 c6 70 1e 33 88 31 c0 e8 53 4d ff ff e9 da 00 00 00 49 8b 45 08 48
8b 80 58 02 00 00 48 8b 48 10 48 63 c2 48 39 c8 72 04 <0f> 0b eb fe 48 63 45 a4
48 39 c8 72 04 0f 0b eb fe 41 80 bd 82 
RIP  [<ffffffff8832ae8d>] :ext4dev:ext4_mb_new_blocks+0x1043/0x2175
 RSP <ffff81006cca3a98>
---[ end trace e1aedad6ea231792 ]---


Version-Release number of selected component (if applicable):
2.6.25.6-55.fc9.x86_64

How reproducible:
Not sure.

Steps to Reproduce:

Use the following fio work file:

[global]
ioengine=libaio
iodepth=64
bs=4k
; job files should be pre-allocated, and each file should be created
; in turn so as not to interleave disk blocks.
direct=1
size=1024m
overwrite=1
create_serialize=1
unlink=0
;thread

[aio-test1]
rw=write

[aio-test2]
rw=read

[aio-test3]
rw=randwrite

[aio-test4]
rw=randread
  
Actual results:
The kernel reports the backtrace above, and the file system no longer services I/O properly afterwards.

Comment 1 Jeff Moyer 2008-07-01 17:20:24 UTC
OK, this is reproducible.  Time for another reboot.

Comment 2 Jeff Moyer 2008-07-01 17:45:39 UTC
1640 static void ext4_mb_measure_extent(struct ext4_allocation_context *ac,
1641                                         struct ext4_free_extent *ex,
1642                                         struct ext4_buddy *e4b)
1643 {
1644         struct ext4_free_extent *bex = &ac->ac_b_ex;
1645         struct ext4_free_extent *gex = &ac->ac_g_ex;
1646 
1647         BUG_ON(ex->fe_len <= 0);
1648         BUG_ON(ex->fe_len >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1649         BUG_ON(ex->fe_start >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
1650         BUG_ON(ac->ac_status != AC_STATUS_CONTINUE);
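
To make the condition on line 1648 concrete, here is a small userspace sketch of the arithmetic. It assumes a 4 KiB file system block size and the default ext4 geometry of 8 * block_size blocks per group; neither is stated in this report. Under that assumption a group holds 32768 (0x8000) blocks, which matches the 0x8000 in both RAX and RCX in the oops. Reading those two registers as the operands of the failing comparison is an interpretation of the Code: bytes, not something the report confirms. The helper names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: default ext4 geometry, where a single bitmap block tracks
 * one group, so a group holds 8 * block_size blocks. */
static uint32_t blocks_per_group(uint32_t block_size)
{
    assert(block_size > 0);
    return 8u * block_size;
}

/* Mirrors the condition on line 1648: true means BUG_ON would fire. */
static bool fe_len_trips_check(int64_t fe_len, uint32_t bpg)
{
    return fe_len >= (int64_t)bpg;
}

/* Number of file system blocks needed to cover a byte range. */
static uint64_t blocks_for_bytes(uint64_t bytes, uint32_t block_size)
{
    return (bytes + block_size - 1) / block_size;
}
```

With these assumptions, the 0x40000000-byte request seen in the strace covers 262144 blocks, or eight full groups, and an extent length of exactly 0x8000 is enough to trip the `>=` comparison.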

Here's what fio is doing:

open("aio-test1.1.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 8
fstat(8, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
close(8)                                = 0
write(1, "aio-test1: Laying out IO file(s)"..., 55aio-test1: Laying out IO
file(s) (1 file(s) / 1024MiB)
) = 55
open("aio-test1.1.0", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 8
ftruncate(8, 1073741824)                = 0
syscall_285(0x8, 0, 0, 0x40000000, 0x40000000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Message from syslogd@segfault at Jul  1 13:42:01 ...
 kernel: ------------[ cut here ]------------



Comment 3 Jeff Moyer 2008-07-01 17:54:48 UTC
(sorry for so many updates, but this is crashing my desktop machine, so I can
only get so far before I need to reboot!)

And, of course, syscall 285 is fallocate (but we all knew that, given the stack
trace):

#define __NR_fallocate                          285
__SYSCALL(__NR_fallocate, sys_fallocate)
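
For anyone who wants to exercise the same path without fio, the layout calls from the strace in comment 2 can be replayed with a few lines of C. This is a minimal sketch, not fio's actual code: the function name `layout_file` is hypothetical, and it relies on the glibc `fallocate()` wrapper rather than the raw syscall number. The report does not establish that these calls alone reproduce the BUG, and on an affected kernel running it against an ext4dev mount may crash the machine.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Replays the layout sequence seen in the strace: open, ftruncate,
 * fallocate. Returns 0 on success or a negative errno value. */
static int layout_file(const char *path, off_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -errno;

    int ret = 0;
    if (ftruncate(fd, len) < 0)
        ret = -errno;
    /* mode 0, offset 0: the same leading arguments as
     * syscall_285(0x8, 0, 0, 0x40000000, ...) in the strace. */
    else if (fallocate(fd, 0, 0, len) < 0)
        ret = -errno;

    close(fd);
    return ret;
}
```

Calling `layout_file("aio-test1.1.0", 0x40000000)` in a directory on the file system under test would match the size used by the job file.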


Comment 4 Jeff Moyer 2008-07-01 19:05:50 UTC
FYI, I booted 2.6.26-rc8 and could not reproduce the problem with that kernel.

Comment 5 Chuck Ebbert 2008-11-10 14:50:50 UTC
Pretty sure this is fixed now.

