Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 236375 - [RHEL5 RT] BUG: scheduling with irqs disabled:
Summary: [RHEL5 RT] BUG: scheduling with irqs disabled:
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 1.0
Hardware: All
OS: Linux
medium
medium
Target Milestone: ---
: ---
Assignee: Steven Rostedt
QA Contact:
URL: http://rhts.lab.boston.redhat.com/cgi...
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2007-04-13 14:38 UTC by Jeff Burke
Modified: 2008-02-27 19:58 UTC (History)
0 users

Fixed In Version: 2.6.21-31.el5rt
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-06-20 16:04:22 UTC
Target Upstream Version:


Attachments (Terms of Use)
fix for irqs disabled while sending segfault (deleted)
2007-06-01 23:27 UTC, Steven Rostedt
no flags Details | Diff

Description Jeff Burke 2007-04-13 14:38:25 UTC
Description of problem:
 On dl360-01.rhts.boston.redhat.com - x86_64 While running test
/kernel/distribution/ltp/20070228 the following BUG happens.

BUG: scheduling with irqs disabled: crash02/0x00000000/16756
caller is rt_spin_lock_slowlock+0x104/0x1b5

Version-Release number of selected component (if applicable):
 2.6.20-12.el5rt

How reproducible:
 Unknown

Actual results:

Call Trace:
 [<ffffffff8106d603>] dump_trace+0xb6/0x3ce
 [<ffffffff8106d962>] show_trace+0x47/0x68
 [<ffffffff8106d998>] dump_stack+0x15/0x17
 [<ffffffff810646a4>] schedule+0x82/0x102
 [<ffffffff81065691>] rt_spin_lock_slowlock+0x104/0x1b5
 [<ffffffff81065f8e>] rt_spin_lock+0x1f/0x21
 [<ffffffff8109b352>] force_sig_info+0x28/0xb4
 [<ffffffff81068c95>] do_page_fault+0x801/0x813
 [<ffffffff81066aed>] error_exit+0x0/0x84
 [<0000000000401a87>]

Expected results:
 Test should pass

Additional info:

Comment 2 Jeff Burke 2007-05-18 00:25:54 UTC
This also happens on the 2.6.21-6.el5rt kernel

<May/17 02:03 pm>BUG: scheduling with irqs disabled: crash02/0x00000000/18587
<May/17 02:03 pm>caller is rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:03 pm>Call Trace:
<May/17 02:03 pm> [<ffffffff8106d59a>] dump_trace+0xaa/0x32a
<May/17 02:03 pm> [<ffffffff8106d85b>] show_trace+0x41/0x5c
<May/17 02:03 pm> [<ffffffff8106d88b>] dump_stack+0x15/0x17
<May/17 02:03 pm> [<ffffffff8106468c>] schedule+0x82/0x102
<May/17 02:03 pm> [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:03 pm> [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
<May/17 02:03 pm> [<ffffffff8109aeb2>] force_sig_info+0x28/0xb4
<May/17 02:03 pm> [<ffffffff81068c5d>] do_page_fault+0x801/0x813
<May/17 02:03 pm> [<ffffffff81066a9d>] error_exit+0x0/0x84
<May/17 02:03 pm> [<0000000000401a87>]
<May/17 02:05 pm>BUG: scheduling with irqs disabled: crash02/0x00000000/18590
<May/17 02:05 pm>caller is rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:05 pm>Call Trace:
<May/17 02:05 pm> [<ffffffff8106d59a>] dump_trace+0xaa/0x32a
<May/17 02:05 pm> [<ffffffff8106d85b>] show_trace+0x41/0x5c
<May/17 02:05 pm> [<ffffffff8106d88b>] dump_stack+0x15/0x17
<May/17 02:05 pm> [<ffffffff8106468c>] schedule+0x82/0x102
<May/17 02:05 pm> [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:05 pm> [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
<May/17 02:05 pm> [<ffffffff8109aeb2>] force_sig_info+0x28/0xb4
<May/17 02:05 pm> [<ffffffff81068c5d>] do_page_fault+0x801/0x813
<May/17 02:05 pm> [<ffffffff81066a9d>] error_exit+0x0/0x84
<May/17 02:05 pm> [<0000000000401a87>]
<May/17 02:07 pm>BUG: scheduling with irqs disabled: crash02/0x00000000/18589
<May/17 02:07 pm>caller is rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:07 pm>Call Trace:
<May/17 02:07 pm> [<ffffffff8106d59a>] dump_trace+0xaa/0x32a
<May/17 02:07 pm> [<ffffffff8106d85b>] show_trace+0x41/0x5c
<May/17 02:07 pm> [<ffffffff8106d88b>] dump_stack+0x15/0x17
<May/17 02:07 pm> [<ffffffff8106468c>] schedule+0x82/0x102
<May/17 02:07 pm> [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:07 pm> [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
<May/17 02:07 pm> [<ffffffff8109aeb2>] force_sig_info+0x28/0xb4
<May/17 02:07 pm> [<ffffffff81068c5d>] do_page_fault+0x801/0x813
<May/17 02:07 pm> [<ffffffff81066a9d>] error_exit+0x0/0x84
<May/17 02:07 pm> [<0000000000401a87>]
<May/17 02:07 pm>BUG: scheduling with irqs disabled: crash02/0x00000000/18586
<May/17 02:07 pm>caller is rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:07 pm>Call Trace:
<May/17 02:07 pm> [<ffffffff8106d59a>] dump_trace+0xaa/0x32a
<May/17 02:07 pm> [<ffffffff8106d85b>] show_trace+0x41/0x5c
<May/17 02:07 pm> [<ffffffff8106d88b>] dump_stack+0x15/0x17
<May/17 02:07 pm> [<ffffffff8106468c>] schedule+0x82/0x102
<May/17 02:07 pm> [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:07 pm> [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
<May/17 02:07 pm> [<ffffffff8109aeb2>] force_sig_info+0x28/0xb4
<May/17 02:07 pm> [<ffffffff81068c5d>] do_page_fault+0x801/0x813
<May/17 02:07 pm> [<ffffffff81066a9d>] error_exit+0x0/0x84
<May/17 02:07 pm> [<0000000000401a87>]
<May/17 02:08 pm>BUG: scheduling with irqs disabled: crash02/0x00000000/18587
<May/17 02:08 pm>caller is rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:08 pm>Call Trace:
<May/17 02:08 pm> [<ffffffff8106d59a>] dump_trace+0xaa/0x32a
<May/17 02:08 pm> [<ffffffff8106d85b>] show_trace+0x41/0x5c
<May/17 02:08 pm> [<ffffffff8106d88b>] dump_stack+0x15/0x17
<May/17 02:08 pm> [<ffffffff8106468c>] schedule+0x82/0x102
<May/17 02:08 pm> [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:08 pm> [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
<May/17 02:08 pm> [<ffffffff8109aeb2>] force_sig_info+0x28/0xb4
<May/17 02:08 pm> [<ffffffff81068c5d>] do_page_fault+0x801/0x813
<May/17 02:08 pm> [<ffffffff81066a9d>] error_exit+0x0/0x84
<May/17 02:08 pm> [<0000000000401a87>]
<May/17 02:11 pm>BUG: scheduling with irqs disabled: crash02/0x00000000/18589
<May/17 02:11 pm>caller is rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:11 pm>Call Trace:
<May/17 02:11 pm> [<ffffffff8106d59a>] dump_trace+0xaa/0x32a
<May/17 02:11 pm> [<ffffffff8106d85b>] show_trace+0x41/0x5c
<May/17 02:11 pm> [<ffffffff8106d88b>] dump_stack+0x15/0x17
<May/17 02:11 pm> [<ffffffff8106468c>] schedule+0x82/0x102
<May/17 02:11 pm> [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:11 pm> [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
<May/17 02:11 pm> [<ffffffff8109aeb2>] force_sig_info+0x28/0xb4
<May/17 02:11 pm> [<ffffffff81068c5d>] do_page_fault+0x801/0x813
<May/17 02:11 pm> [<ffffffff81066a9d>] error_exit+0x0/0x84
<May/17 02:11 pm> [<0000000000401a87>]
<May/17 02:11 pm>logger: /usr/bin/rhts-test-runner.sh 1080 hearbeat...
<May/17 02:12 pm>logger: /usr/bin/rhts-test-runner.sh 1140 hearbeat...
<May/17 02:13 pm>logger: /usr/bin/rhts-test-runner.sh 1200 hearbeat...
<May/17 02:13 pm>BUG: scheduling with irqs disabled: crash02/0x00000000/18587
<May/17 02:13 pm>caller is rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:13 pm>Call Trace:
<May/17 02:13 pm> [<ffffffff8106d59a>] dump_trace+0xaa/0x32a
<May/17 02:13 pm> [<ffffffff8106d85b>] show_trace+0x41/0x5c
<May/17 02:13 pm> [<ffffffff8106d88b>] dump_stack+0x15/0x17
<May/17 02:13 pm> [<ffffffff8106468c>] schedule+0x82/0x102
<May/17 02:13 pm> [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
<May/17 02:13 pm> [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
<May/17 02:13 pm> [<ffffffff8109aeb2>] force_sig_info+0x28/0xb4
<May/17 02:13 pm> [<ffffffff81068c5d>] do_page_fault+0x801/0x813
<May/17 02:13 pm> [<ffffffff81066a9d>] error_exit+0x0/0x84
<May/17 02:13 pm> [<0000000000401a87>]

http://rhts.lab.boston.redhat.com/cgi-bin/rhts/test_log.cgi?id=68172
This was on ibm-defiant.rhts.boston.redhat.com running the racer test.

Comment 3 Jeff Burke 2007-05-23 13:03:21 UTC
In 2.6.21-11.el5rt I see the following output:
BUG: scheduling with irqs disabled: IRQ-9/0x00000001/105
caller is rt_spin_lock_slowlock+0xfe/0x1a1

Call Trace:
 [<ffffffff8106d5da>] dump_trace+0xaa/0x32a
 [<ffffffff8106d89b>] show_trace+0x41/0x5c
 [<ffffffff8106d8cb>] dump_stack+0x15/0x17
 [<ffffffff8106468c>] schedule+0x82/0x102
 [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
 [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
 [<ffffffff811bab3f>] serial8250_console_write+0x61/0x148
 [<ffffffff81091b94>] __call_console_drivers+0x68/0x79
 [<ffffffff81091c14>] _call_console_drivers+0x6f/0x73
 [<ffffffff81017357>] release_console_sem+0x139/0x1cf
 [<ffffffff810923fb>] vprintk+0x2c0/0x314
 [<ffffffff810924b6>] printk+0x67/0x69
 [<ffffffff8106db5b>] printk_address+0x93/0x9d
 [<ffffffff8106dc58>] print_trace_address+0xc/0xe
 [<ffffffff8106d4ac>] dump_trace_unwind+0x56/0xda
 [<ffffffff810b1b94>] unwind_init_running+0x20/0x22
 [<ffffffff8106d5da>] dump_trace+0xaa/0x32a
 [<ffffffff8106d89b>] show_trace+0x41/0x5c
 [<ffffffff8106d8cb>] dump_stack+0x15/0x17
 [<ffffffff810c5007>] __report_bad_irq+0x38/0x87
 [<ffffffff810c521a>] note_interrupt+0x1c4/0x1fc
 [<ffffffff810c452c>] thread_simple_irq+0x6c/0x7e
 [<ffffffff810c4c79>] do_irqd+0x13b/0x394
 [<ffffffff81033d40>] kthread+0xf5/0x128
 [<ffffffff8105ff68>] child_rip+0xa/0x12

BUG: scheduling while atomic: IRQ-9/0x00000001/105, CPU#0

Call Trace:
 [<ffffffff8106d5da>] dump_trace+0xaa/0x32a
 [<ffffffff8106d89b>] show_trace+0x41/0x5c
 [<ffffffff8106d8cb>] dump_stack+0x15/0x17
 [<ffffffff810636b8>] __sched_text_start+0x98/0xd20
 [<ffffffff810646ec>] schedule+0xe2/0x102
 [<ffffffff8106566a>] rt_spin_lock_slowlock+0xfe/0x1a1
 [<ffffffff81065f46>] rt_spin_lock+0x1f/0x21
 [<ffffffff811bab3f>] serial8250_console_write+0x61/0x148
 [<ffffffff81091b94>] __call_console_drivers+0x68/0x79
 [<ffffffff81091c14>] _call_console_drivers+0x6f/0x73
 [<ffffffff81017357>] release_console_sem+0x139/0x1cf
 [<ffffffff810923fb>] vprintk+0x2c0/0x314
 [<ffffffff810924b6>] printk+0x67/0x69
 [<ffffffff8106db5b>] printk_address+0x93/0x9d
 [<ffffffff8106dc58>] print_trace_address+0xc/0xe
 [<ffffffff8106d4ac>] dump_trace_unwind+0x56/0xda
 [<ffffffff810b1b94>] unwind_init_running+0x20/0x22
 [<ffffffff8106d5da>] dump_trace+0xaa/0x32a
 [<ffffffff8106d89b>] show_trace+0x41/0x5c
 [<ffffffff8106d8cb>] dump_stack+0x15/0x17
 [<ffffffff810c5007>] __report_bad_irq+0x38/0x87
 [<ffffffff810c521a>] note_interrupt+0x1c4/0x1fc
 [<ffffffff810c452c>] thread_simple_irq+0x6c/0x7e
 [<ffffffff810c4c79>] do_irqd+0x13b/0x394
 [<ffffffff81033d40>] kthread+0xf5/0x128
 [<ffffffff8105ff68>] child_rip+0xa/0x12

 [<ffffffff8106d89b>] show_trace+0x41/0x5c
syscalls/lseek/l [<ffffffff8106d8cb>] dump_stack+0x15/0x17

Comment 4 Steven Rostedt 2007-06-01 23:13:51 UTC
comment #3 is a different bug (different source of the problem, although it
gives the same error).

I put in a patch to do a might_sleep() in rt_spin_lock_fastlock (sent to LKML
too).  With the patch, I get the "scheduling with irqs disabled" everytime I
have a program crash (purposely).  So this is *not* fixed, and is *not* a dup of
237463.


Comment 5 Steven Rostedt 2007-06-01 23:27:06 UTC
Created attachment 155954 [details]
fix for irqs disabled while sending segfault

Comment 6 Steven Rostedt 2007-06-01 23:27:53 UTC
Now it's solved.

In do_page_fault if the faulting address from user code is > PAGE_OFFSET (say
(void*)-1) then the code will jump right to bad_area_nosemaphore.  The code then
checks if it was userspace that caused the fault, and if so it sends a signal. 
But it skipped the local_irq_enable() part, so interrupts are disabled here.

So faulting at something like (void*)1 would not cause a problem (IRQS would be
enabled since the condition of address > PAGE_OFFSET fails).  But if the user
code faults at (void*)-1 interrupts would never be enabled, since the test for
address > PAGE_OFFSET succeeds.

Attached is a patch to always enable interrupts for user faults.



Comment 7 Clark Williams 2007-06-20 16:04:22 UTC
this patch is now part of the -rt patch series


Note You need to log in before you can comment on or make changes to this bug.