Bug 455179 - SIGKILL may crash in flush_old_exec/release_task
Summary: SIGKILL may crash in flush_old_exec/release_task
Keywords:
Status: CLOSED DUPLICATE of bug 452706
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jerome Marchand
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On: 311931
Blocks: 461297
Reported: 2008-07-13 14:37 UTC by Jan Kratochvil
Modified: 2008-11-26 15:37 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-11-26 15:27:19 UTC


Attachments
Testcase. (deleted)
2008-07-13 14:37 UTC, Jan Kratochvil

Description Jan Kratochvil 2008-07-13 14:37:45 UTC
Description of problem:
The attached testcase triggers a kernel BUG crash.
It sends SIGKILL to a process that is calling execve() in a loop.

Version-Release number of selected component (if applicable):
RHEL-4.7 kernel-smp-2.6.9-78.EL.x86_64
Tested without observing a crash:
RHEL-5.2 kernel-2.6.18-92.el5.x86_64
F-9 kernel-2.6.25.9-76.fc9.x86_64
F-9 kernel-vanilla-2.6.25.6-55.vanilla.fc9.x86_64
(but it is unknown whether the race is absent there or merely harder to hit)

How reproducible:
The crash occurs within several seconds at most.

Steps to Reproduce:
1. gcc -o exitcrash exitcrash.c -Wall -ggdb2 -pthread -D_GNU_SOURCE 
2. ./exitcrash

Actual results:
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at signal:377
invalid operand: 0000 [1] SMP
CPU 0
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 sunrpc ds yenta_socket
pcmcia_core cpufreq_powersave loop button battery ac uhci_hcd ehci_hcd hw_random
snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore
snd_page_alloc tg3 floppy sr_mod dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod
ahci libata sd_mod scsi_mod
Pid: 31269, comm: exe Not tainted 2.6.9-78.ELsmp
RIP: 0010:[<ffffffff80141f0a>] <ffffffff80141f0a>{__exit_signal+29}
RSP: 0018:0000010023895c58  EFLAGS: 00010046
RAX: 000001003d2d20d0 RBX: 0000000000000000 RCX: 0000000000000054
RDX: 000001000000c000 RSI: ffffffff8050e600 RDI: 000001003d2d2030
RBP: 000001003d2d2030 R08: 0000000000000000 R09: 00000001801ae824
R10: 0000000000000000 R11: ffffffff801ae824 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 000001002383f700
FS:  0000000000000000(0000) GS:ffffffff8050d280(005b) knlGS:00000000f7fdeba0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f7fdd388 CR3: 0000000000101000 CR4: 00000000000006e0
Process exe (pid: 31269, threadinfo 0000010023894000, task 00000100246457f0)
Stack: 000001003d2d2030 000001003d2d2030 000001003d2d2030 0000000000000000
       0000000000000000 ffffffff80139c21 000001000000c000 0000000000000010
       000001003d2d2030 000001003eb4dac0
Call Trace:<ffffffff80139c21>{release_task+126}
<ffffffff80185c9f>{flush_old_exec+1696}
       <ffffffff8017bbf1>{vfs_read+248} <ffffffff80130807>{load_elf32_binary+1673}
       <ffffffff801a6c26>{load_elf_binary+5452}
<ffffffff8015e3aa>{generic_file_aio_read+48}
       <ffffffff8017bacd>{do_sync_read+178} <ffffffff8013017e>{load_elf32_binary+0}
       <ffffffff80186789>{search_binary_handler+209}
<ffffffff801a3487>{compat_do_execve+398}
       <ffffffff80128757>{sys32_execve+53} <ffffffff801269cd>{ia32_ptregs_common+37}


Code: 0f 0b 8a 25 33 80 ff ff ff ff 79 01 8b 03 85 c0 75 0c 0f 0b
RIP <ffffffff80141f0a>{__exit_signal+29} RSP <0000010023895c58>
 <0>Kernel panic - not syncing: Oops

Expected results:
No crash.

Additional info:
The extra thread in the testcase may be redundant; the testcase is derived from
the ptrace-testsuite testcase late-ptrace-may-attach-check.c.
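
Since the attachment is no longer available, here is a minimal sketch of the kind of race described above. It is not the original exitcrash.c. It assumes that a threaded child racing execve() against SIGKILL is the relevant ingredient, and it loops in the parent (one exec per forked child) rather than re-executing in a loop inside the child. The names race_once, idle_thread and the RACE_DEMO_MAIN macro are made up for illustration, and "true" is assumed to be on PATH.

```c
/* Sketch only: NOT the original exitcrash.c attachment. */
#define _GNU_SOURCE
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Extra thread: does nothing, but makes execve() tear down a thread group. */
static void *idle_thread(void *arg)
{
    (void)arg;
    for (;;)
        pause();
    return NULL;
}

/*
 * One round: fork a child that starts a thread and then calls exec,
 * SIGKILL the child after delay_us microseconds, and reap it.
 * Returns the terminating signal, 0 if the child exited on its own
 * before the signal arrived, or -1 on error.
 */
int race_once(unsigned int delay_us)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_t tid;

        if (pthread_create(&tid, NULL, idle_thread, NULL) != 0)
            _exit(126);
        execlp("true", "true", (char *)NULL); /* assumed to be on PATH */
        _exit(127);                           /* only reached if exec fails */
    }

    usleep(delay_us);
    kill(pid, SIGKILL);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

#ifdef RACE_DEMO_MAIN /* hypothetical macro: define it for a standalone build */
int main(void)
{
    int killed = 0;

    /* Vary the delay so the signal lands at different points of the exec. */
    for (unsigned int i = 0; i < 1000; i++)
        if (race_once(i % 200) == SIGKILL)
            killed++;
    printf("children killed by SIGKILL: %d of 1000\n", killed);
    return 0;
}
#endif
```

It can be built in the same way as the testcase, for example with gcc -o racedemo racedemo.c -Wall -pthread -DRACE_DEMO_MAIN. Whether this variant hits the same window as the original testcase on an affected kernel is untested.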

Comment 1 Jan Kratochvil 2008-07-13 14:37:45 UTC
Created attachment 311664
Testcase.

Comment 2 Jan Kratochvil 2008-07-13 17:18:13 UTC
Threading appears to be required to crash it; Bug 311931 may need more fixes.

Kernel 2.6.9-78.ELsmp on an x86_64

RHTS Job 25225 - intel-s5000phb-01.rhts.bos.redhat.com
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at signal:377
invalid operand: 0000 [1] SMP 
CPU 5 
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 sunrpc ds yenta_socket
pcmcia_core cpufreq_powersave loop button battery ac uhci_hcd ehci_hcd
i5000_edac edac_mc hw_random e1000 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod
ata_piix libata mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
Pid: 1, comm: init Not tainted 2.6.9-78.ELsmp
RIP: 0010:[<ffffffff80141f0a>] <ffffffff80141f0a>{__exit_signal+29}
RSP: 0018:000001003fb61e68  EFLAGS: 00010046
RAX: 000001003ba47890 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000007fbfffd501 RSI: 0000000000000000 RDI: 000001003ba477f0
RBP: 000001003ba477f0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 000001003ba47918 R15: 0000007fbfffd584
FS:  0000002a95562360(0000) GS:ffffffff8050d500(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000409fe028 CR3: 0000000037e12000 CR4: 00000000000006e0
Process init (pid: 1, threadinfo 000001003fb60000, task 000001000153f7f0)
Stack: 000001003ba477f0 000001003ba477f0 00000000000064fa 0000000000000000 
       0000000000000000 ffffffff80139c21 0000007fbfffd501 000001003ba477f0 
       00000000000064fa 0000000000000000 
Call Trace:<ffffffff80139c21>{release_task+126} <ffffffff8013c3f2>{do_wait+2758} 
       <ffffffff80134709>{default_wake_function+0}
<ffffffff80134709>{default_wake_function+0} 
       <ffffffff8011037f>{sysret_signal+28} <ffffffff801102f6>{system_call+126} 
       

Code: 0f 0b 8a 25 33 80 ff ff ff ff 79 01 8b 03 85 c0 75 0c 0f 0b 
RIP <ffffffff80141f0a>{__exit_signal+29} RSP <000001003fb61e68>
 <0>Kernel panic - not syncing: Oops


Comment 4 RHEL Product and Program Management 2008-09-03 13:02:59 UTC
Updating PM score.

Comment 5 RHEL Product and Program Management 2008-09-19 13:52:51 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 Jerome Marchand 2008-10-24 13:43:02 UTC
I could not reproduce the bug as easily as stated above. I had to raise the timeout to a few minutes to reproduce it on x86_64, but it still reproduces every time. I haven't reproduced it on another arch so far, but I keep trying. I don't think it's x86_64-specific.

Comment 7 Jerome Marchand 2008-11-12 12:14:00 UTC
I still don't know much about why the crash happens, but at least I reproduced it on i686. How easily the bug reproduces depends a lot on the machine it runs on.

Comment 8 Jerome Marchand 2008-11-26 15:27:19 UTC
This is a duplicate of 452706. It's already fixed in recent kernels.

*** This bug has been marked as a duplicate of bug 452706 ***

Comment 9 Jan Kratochvil 2008-11-26 15:37:43 UTC
Denys,
I found out that this testcase and bug were never added to the ptrace testsuite or to the tests/kernel/syscalls/ptrace/BUGS list of RHEL bugs.

