Bug 451068 - ext3: oops in do_split, miscompilation with gcc 4.3.1
Summary: ext3: oops in do_split, miscompilation with gcc 4.3.1
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 451487 451546 451573
Depends On:
Blocks:
Reported: 2008-06-12 17:01 UTC by Eric Sandeen
Modified: 2008-06-26 04:43 UTC (History)
CC List: 11 users

Fixed In Version: 4.3.1-3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-06-26 04:43:42 UTC


Attachments
first part of oops (deleted), 2008-06-12 17:02 UTC, Eric Sandeen
2nd part of oops (deleted), 2008-06-12 17:03 UTC, Eric Sandeen
3rd part of oops (deleted), 2008-06-12 17:03 UTC, Eric Sandeen
preprocessed namei.i from 2.6.26-0.57.rc5.git3.fc10.i686 (deleted), 2008-06-13 04:19 UTC, Eric Sandeen
do_split disassembly from 4.3.0 (deleted), 2008-06-13 04:21 UTC, Eric Sandeen
do_split disassembly from 4.3.1 (deleted), 2008-06-13 04:22 UTC, Eric Sandeen


Links
System                   ID     Priority  Status  Summary  Last Updated
GNU Compiler Collection  36533  None      None    None     Never

Description Eric Sandeen 2008-06-12 17:01:47 UTC
clumens & jeremy both hit this... see soon-to-be-attached jpegs

Comment 1 Eric Sandeen 2008-06-12 17:02:42 UTC
Created attachment 309103 [details]
first part of oops

Comment 2 Eric Sandeen 2008-06-12 17:03:12 UTC
Created attachment 309105 [details]
2nd part of oops

Comment 3 Eric Sandeen 2008-06-12 17:03:34 UTC
Created attachment 309106 [details]
3rd part of oops

Comment 5 Eric Sandeen 2008-06-12 20:34:31 UTC
actually I'll take this, I think it's my fault and I can reproduce it :)

Comment 6 Eric Sandeen 2008-06-12 22:15:36 UTC
I had a hunch that it might be gcc's fault: all the oopsing kernels were built
with the shiny new gcc 4.3.1, while I tested 4.3.0 and had no problems.

Thanks to Roland for all his help looking into this one....

<roland> the bug is that for ptr[-1].size it went from *(short*)&ptr[-1].size to
*(long*)&ptr[-1].size 
<roland> it's gcc's fault

I'll get a proper gcc bug report filed tonight or tomorrow... in the meantime
looks like gcc 4.3.1 in rawhide is slightly busted...

-Eric
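
To illustrate Roland's point, here is a minimal C sketch of the two ways of reading a 16-bit field. It is not taken from the kernel or from gcc: the helper names are hypothetical, and the value comparison assumes a little-endian machine such as i386. When the two bytes after the field happen to be readable, both reads return the same value, which would explain why a widened load goes unnoticed most of the time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read exactly the 2 bytes the source code asks for,
 * i.e. the *(short *) form of the access. */
static uint16_t read_size_narrow(const unsigned char *field)
{
    uint16_t v;

    assert(field != NULL);
    memcpy(&v, field, sizeof v);     /* touches field[0..1] only */
    return v;
}

/* Hypothetical helper: read 4 bytes and keep the low 16 bits, modelling the
 * *(long *) form (long is 4 bytes on i386). Assumes little-endian. */
static uint16_t read_size_wide(const unsigned char *field)
{
    uint32_t v;

    assert(field != NULL);
    memcpy(&v, field, sizeof v);     /* touches field[0..3] */
    return (uint16_t)v;
}
```

For a buffer holding the bytes 0x28 0x00 0xaa 0xbb, both helpers return 0x28 on a little-endian host. Only the second one requires field[2] and field[3] to be mapped.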

Comment 7 Eric Sandeen 2008-06-13 04:16:49 UTC
This is with:

[root@magnesium ~]# rpm -q gcc
gcc-4.3.1-1.i386

[root@magnesium ~]# gcc -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-cpu=generic --build=i386-redhat-linux
Thread model: posix
gcc version 4.3.1 20080609 (Red Hat 4.3.1-1) (GCC) 



Comment 8 Eric Sandeen 2008-06-13 04:19:47 UTC
Created attachment 309166 [details]
preprocessed namei.i from 2.6.26-0.57.rc5.git3.fc10.i686

Comment 9 Eric Sandeen 2008-06-13 04:21:53 UTC
Created attachment 309167 [details]
do_split disassembly from 4.3.0

Comment 10 Eric Sandeen 2008-06-13 04:22:37 UTC
Created attachment 309168 [details]
do_split disassembly from 4.3.1

Comment 11 Eric Sandeen 2008-06-13 04:28:44 UTC
The interesting bit:

        for (i = count-1; i >= 0; i--) {
                /* is more than half of this entry in 2nd half of the block? */
                if (size + map[i].size/2 > blocksize/2)
     906:       8b 7d a0                mov    -0x60(%ebp),%edi
     909:       31 f6                   xor    %esi,%esi
     90b:       31 d2                   xor    %edx,%edx
     90d:       8b 45 d4                mov    -0x2c(%ebp),%eax
     910:       8b 5d 98                mov    -0x68(%ebp),%ebx
     913:       d1 ef                   shr    %edi
     915:       8d 4c 18 fe             lea    -0x2(%eax,%ebx,1),%ecx
     919:       66 8b 19                mov    (%ecx),%bx

The only difference between the two compilers' output seems to be %bx vs. %ebx
on this last line.

map[i].size is a u16. What appears to happen is that when the load reads 4
bytes instead of 2, it crosses the page boundary and we get "BUG: unable to
handle kernel paging request at <first byte in next page>".

Thanks,
-Eric
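
The page-boundary arithmetic can be sketched in C as follows. This is only an illustration under stated assumptions: the struct layout (a 32-bit hash, then two 16-bit fields with size last) and the idea that the map entries are packed against the end of the block are assumptions not shown in this report, and both function names are hypothetical. Under those assumptions the size field of the last entry occupies the final 2 bytes of the block, which would be consistent with the -0x2 displacement in the lea above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of one map entry; only "size is a u16" comes from the report. */
struct map_entry {
    uint32_t hash;
    uint16_t offs;
    uint16_t size;
};

/* Offset, from the start of the block, of entry i's size field when `count`
 * entries are packed against the end of a block of block_size bytes. */
static size_t size_field_offset(size_t block_size, size_t count, size_t i)
{
    assert(i < count);
    assert(count * sizeof(struct map_entry) <= block_size);
    return block_size - (count - i) * sizeof(struct map_entry)
           + offsetof(struct map_entry, size);
}

/* Number of bytes beyond the end of the block that a load of `width` bytes
 * starting at `offset` would touch (0 if the load stays inside the block). */
static size_t bytes_past_end(size_t block_size, size_t offset, size_t width)
{
    size_t end = offset + width;

    return end > block_size ? end - block_size : 0;
}
```

With a 4096-byte block and 10 entries, size_field_offset(4096, 10, 9) is 4094. A 2-byte load there stays inside the block, while a 4-byte load touches 2 bytes of whatever follows, which faults if the next page is not mapped.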

Comment 12 Jakub Jelinek 2008-06-13 06:48:29 UTC
What exact gcc options were used to compile namei.i?

Comment 13 Eric Sandeen 2008-06-13 13:18:11 UTC
Sorry, knew I was forgetting something:

  gcc -Wp,-MD,/root/ext3/.namei.o.d  -nostdinc -isystem
/usr/lib/gcc/i386-redhat-linux/4.3.1/include -D__KERNEL__ -Iinclude  -include
include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs
-fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os  
-fno-stack-protector -m32 -msoft-float -mregparm=3 -freg-struct-return
-mpreferred-stack-boundary=2  -march=i686 -mtune=generic -mtune=generic
-ffreestanding -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe
-Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2
-mno-3dnow -Iinclude/asm-x86/mach-generic -Iinclude/asm-x86/mach-default
-fno-omit-frame-pointer -fno-optimize-sibling-calls -g
-Wdeclaration-after-statement -Wno-pointer-sign    -DMODULE -D"KBUILD_STR(s)=#s"
-D"KBUILD_BASENAME=KBUILD_STR(namei)"  -D"KBUILD_MODNAME=KBUILD_STR(ext3)" -c -o
/root/ext3/namei.o /root/ext3/namei.c


Comment 14 Eric Sandeen 2008-06-13 13:19:18 UTC
Ah that was namei.o; here's namei.i just to be exact about what you asked:

  gcc -E -Wp,-MD,/root/ext3/.namei.i.d  -nostdinc -isystem
/usr/lib/gcc/i386-redhat-linux/4.3.1/include -D__KERNEL__ -Iinclude  -include
include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs
-fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os  
-fno-stack-protector -m32 -msoft-float -mregparm=3 -freg-struct-return
-mpreferred-stack-boundary=2  -march=i686 -mtune=generic -mtune=generic
-ffreestanding -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe
-Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2
-mno-3dnow -Iinclude/asm-x86/mach-generic -Iinclude/asm-x86/mach-default
-fno-omit-frame-pointer -fno-optimize-sibling-calls -g
-Wdeclaration-after-statement -Wno-pointer-sign    -DMODULE -D"KBUILD_STR(s)=#s"
-D"KBUILD_BASENAME=KBUILD_STR(namei)"  -D"KBUILD_MODNAME=KBUILD_STR(ext3)"   -o
/root/ext3/namei.i /root/ext3/namei.c

Comment 15 Jakub Jelinek 2008-06-13 17:21:07 UTC
Caused by
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=135124


Comment 16 Chris Lumens 2008-06-16 12:45:36 UTC
*** Bug 451573 has been marked as a duplicate of this bug. ***

Comment 17 Chris Lumens 2008-06-16 13:13:45 UTC
*** Bug 451546 has been marked as a duplicate of this bug. ***

Comment 18 Chris Lumens 2008-06-16 13:21:01 UTC
*** Bug 451487 has been marked as a duplicate of this bug. ***

Comment 19 Eric Sandeen 2008-06-17 02:58:19 UTC
Jakub, any ETA on a fix for this?  Should we un-tag gcc 4.3.1 from rawhide for now?

Thanks,
-Eric

Comment 20 G.Wolfe Woodbury 2008-06-19 21:29:42 UTC
Meanwhile, as a workaround for rawhide installs, use ext2 instead of ext3 or
ext4; the bug hits ext4 filesystems as well.

Comment 21 Eric Sandeen 2008-06-19 21:33:44 UTC
Actually any ext* filesystem which enables the dir_index feature is likely
susceptible; another workaround would be to turn this feature off.

-Eric

Comment 22 Jakub Jelinek 2008-06-25 09:51:37 UTC
Should be fixed in gcc-4.3.1-3.

Comment 23 Eric Sandeen 2008-06-26 04:43:42 UTC
WORKSFORME, I rebuilt the latest kernel w/ this version, did a big yum update,
no problems.

I think 2.6.26-0.93.rc8.fc10 should be the first kernel built with this.

Thanks!

-Eric

