Bug 454679 - RHEL3.9: mtrr race causes kernel hang on boot
Summary: RHEL3.9: mtrr race causes kernel hang on boot
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.9
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Don Howard
QA Contact: Martin Jenner
Depends On:
Reported: 2008-07-09 17:57 UTC by Eli Collins
Modified: 2012-06-20 16:09 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2012-06-20 16:09:20 UTC
Target Upstream Version:

Attachments
Boot logging (deleted)
2008-07-09 17:57 UTC, Eli Collins

Description Eli Collins 2008-07-09 17:57:20 UTC
There's a race in arch/x86_64/kernel/mtrr.c that can result in a kernel hang and
subsequent NMI lockup detection on boot on SMP systems. I've uploaded boot
output with loglevel 7 that illustrates this hang. It may also exist in i386.

The race is caused by set_mtrr_smp being re-entered before the other CPUs have
had a chance to leave ipi_handler. Here is pseudocode annotated with where the
CPUs in the uploaded serial log are hung. I determined this from the RIPs in the
panics and by disassembling smp_call_function and ipi_handler.

The function set_mtrr_smp is executed by a single master CPU:

1. wait_barrier_mtrr_disable = TRUE
   wait_barrier_execute = TRUE 
   wait_barrier_cache_enable = TRUE

2. undone_count = 7
3. send ipis and wait for responses      # CPU 3
4. disable interrupts
5. spin while undone_count > 0

6. undone_count = 7
7. wait_barrier_mtrr_disable = FALSE
8. spin while undone_count > 0

9. undone_count = 7
10. wait_barrier_execute = FALSE
11. spin while undone_count > 0

12. wait_barrier_cache_enable = FALSE
13. enable interrupts

The function ipi_handler is executed by the slave CPUs:

1. disable interrupts
2. undone_count--
3. spin while wait_barrier_mtrr_disable  # CPUs 0,1,2,5,7

4. undone_count--
5. spin while wait_barrier_execute

6. undone_count--
7. spin while wait_barrier_cache_enable  # CPUs 4,6

8. enable interrupts

When all slave CPUs have executed step 6 they unblock the master CPU at step 11.
If the master then leaves and re-enters set_mtrr_smp (via, say, back-to-back
calls to mtrr_add_page in mtrr_write) and sets wait_barrier_cache_enable back to
TRUE (step 1), any slave that has not yet gotten from step 6 past step 7 will
hang in step 7 with interrupts still disabled. The master gets to step 3, sends
IPIs and waits for all other CPUs to respond, which the hung slaves never do
since they are spinning with interrupts disabled. The system is hung at this
point and NMI lockup detection will kick in.
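
The interleaving described above can be replayed deterministically in user
space. The following is only a sketch of the pseudocode in this report, not the
kernel code: it models two slaves instead of seven, replaces the real IPI with a
hypothetical ipi_pending flag, and splits step 3 of set_mtrr_smp into "send"
(model step 3) and "wait for responses" (model step 4, standing in for the
interrupt disable, which is not modelled). The names master_step, slave_step and
simulate are made up for the illustration.

```c
#include <stdbool.h>

#define NSLAVES 2                 /* assumption: the reported machine had 7 slaves */

/* Shared state, named after the pseudocode in this report. */
static bool wait_barrier_mtrr_disable, wait_barrier_execute,
            wait_barrier_cache_enable;
static int  undone_count;
static bool ipi_pending[NSLAVES]; /* hypothetical stand-in for a real IPI */

static int master_pc;             /* next set_mtrr_smp step to run */
static int calls_done;            /* completed set_mtrr_smp calls */
static int slave_pc[NSLAVES];     /* next ipi_handler step, 0 = idle */

/* Run one master step; return true if the master made progress. */
static bool master_step(void)
{
    switch (master_pc) {
    case 1:
        wait_barrier_mtrr_disable = true;
        wait_barrier_execute = true;
        wait_barrier_cache_enable = true;
        break;
    case 2: case 6: case 9:
        undone_count = NSLAVES;
        break;
    case 3:                       /* step 3, first half: send the IPIs */
        for (int s = 0; s < NSLAVES; s++)
            ipi_pending[s] = true;
        break;
    case 4:                       /* step 3, second half: wait for responses */
        for (int s = 0; s < NSLAVES; s++)
            if (ipi_pending[s])
                return false;
        break;
    case 5: case 8: case 11:      /* spin while undone_count > 0 */
        if (undone_count > 0)
            return false;
        break;
    case 7:  wait_barrier_mtrr_disable = false; break;
    case 10: wait_barrier_execute = false; break;
    case 12: wait_barrier_cache_enable = false; break;
    case 13:                      /* return to the caller, ready to re-enter */
        calls_done++;
        master_pc = 1;
        return true;
    }
    master_pc++;
    return true;
}

/* Run one step of slave s; return true if it made progress. */
static bool slave_step(int s)
{
    switch (slave_pc[s]) {
    case 0:                       /* idle, interrupts enabled: can take an IPI */
        if (!ipi_pending[s])
            return false;
        ipi_pending[s] = false;
        slave_pc[s] = 2;          /* step 1 (disable interrupts) is implicit */
        return true;
    case 2: case 4: case 6:
        undone_count--;
        break;
    case 3: if (wait_barrier_mtrr_disable) return false; break;
    case 5: if (wait_barrier_execute) return false; break;
    case 7:
        if (wait_barrier_cache_enable)
            return false;
        slave_pc[s] = 0;          /* step 8: enable interrupts and return */
        return true;
    }
    slave_pc[s]++;
    return true;
}

/* Replay two back-to-back set_mtrr_smp calls. With delay_last_slave set, the
 * last slave is held between steps 6 and 7 of the first call until the master
 * has re-run step 1. Returns 1 if nothing can make progress, 0 otherwise. */
int simulate(bool delay_last_slave)
{
    master_pc = 1;
    calls_done = 0;
    undone_count = 0;
    for (int s = 0; s < NSLAVES; s++) {
        slave_pc[s] = 0;
        ipi_pending[s] = false;
    }
    for (;;) {
        bool progress = false, idle = true;

        if (calls_done < 2)
            progress = master_step();
        for (int s = 0; s < NSLAVES; s++) {
            bool held = delay_last_slave && s == NSLAVES - 1 &&
                        slave_pc[s] == 7 &&
                        !(calls_done == 1 && master_pc >= 2);
            if (!held && slave_step(s))
                progress = true;
            if (slave_pc[s] != 0)
                idle = false;
        }
        if (calls_done == 2 && idle)
            return 0;             /* both calls completed */
        if (!progress)
            return 1;             /* deadlock */
    }
}
```

In this model simulate(false) completes both calls, while simulate(true) ends
with the master waiting for IPI responses, the undelayed slave spinning in step
3 and the delayed slave spinning in step 7, which is the same shape as the hang
in the uploaded log.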

This race is unlikely to occur, since all slave CPUs will normally get from
step 6 past step 7 before the master CPU re-enters set_mtrr_smp. However, it may
occur more frequently in an overcommitted virtual environment (the uploaded
example occurred in a VM).

A simple fix would be to have the master CPU defer exiting set_mtrr_smp until
all slave CPUs have left ipi_handler.
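
One way to picture the proposed fix is an extra exit barrier after step 12.
The sketch below is not a patch: it models only the tail of the protocol (one
slave sitting at step 7, the master at step 12) as a tick-based simulation, and
exit_count, slave_hangs and delay_ticks are hypothetical names introduced for
the illustration. The slave is assumed to be delayed for delay_ticks, for
example because its vCPU was descheduled, before it polls the flag.

```c
#include <stdbool.h>

/* Returns 1 if the delayed slave is still stuck in step 7 at the end of the
 * run, 0 if it left ipi_handler. master_waits_for_exit selects the fix. */
int slave_hangs(bool master_waits_for_exit, int delay_ticks)
{
    bool wait_barrier_cache_enable = true; /* set in step 1 of the first call */
    int  exit_count = 1;                   /* hypothetical: slaves still in ipi_handler */
    bool slave_in_handler = true;
    enum { CLEAR_FLAG, WAIT_EXIT, REENTER, REENTERED } master = CLEAR_FLAG;

    for (int tick = 0; tick < delay_ticks + 8; tick++) {
        switch (master) {
        case CLEAR_FLAG:                   /* step 12 */
            wait_barrier_cache_enable = false;
            master = master_waits_for_exit ? WAIT_EXIT : REENTER;
            break;
        case WAIT_EXIT:                    /* the fix: spin until all slaves left */
            if (exit_count == 0)
                master = REENTER;
            break;
        case REENTER:                      /* step 1 of the next call */
            wait_barrier_cache_enable = true;
            master = REENTERED;
            break;
        case REENTERED:
            break;
        }

        /* Slave at step 7: once its delay is over, it polls the flag and,
         * if the flag is clear, leaves the handler and reports its exit. */
        if (slave_in_handler && tick >= delay_ticks &&
            !wait_barrier_cache_enable) {
            slave_in_handler = false;
            exit_count--;
        }
    }
    return slave_in_handler ? 1 : 0;
}
```

Without the exit barrier, slave_hangs(false, 5) reports a hang because the
master has already set the flag again by the time the slave looks at it. With
the barrier, slave_hangs(true, 5) does not, since the master cannot reach step 1
of the next call until the slave has left. In real code the counter would need
to be updated atomically, as undone_count presumably already is.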

Comment 1 Eli Collins 2008-07-09 17:57:20 UTC
Created attachment 311397 [details]
Boot logging

Comment 2 Jiri Pallich 2012-06-20 16:09:20 UTC
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review is now End of Life.
Please See

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.
