Bug 452693 - POSIX timer set to fire immediately does not fire
Summary: POSIX timer set to fire immediately does not fire
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 1.0
Hardware: i386
OS: Linux
Priority: low
Severity: high
Target Milestone: 1.0.1
Target Release: ---
Assignee: Steven Rostedt
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-06-24 14:52 UTC by Roland Westrelin
Modified: 2008-08-26 19:57 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-08-26 19:57:46 UTC


Attachments
reproducer for timer_settime(3p) bug (deleted)
2008-06-27 21:11 UTC, Clark Williams
updated reproducer for timer_settime(3p) problem (deleted)
2008-06-30 15:50 UTC, Clark Williams
shell script to run reproducer until it fails (deleted)
2008-06-30 16:40 UTC, Clark Williams
hrtimer: prevent migration for raising CPU (deleted)
2008-07-03 20:54 UTC, Clark Williams


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2008:0585 normal SHIPPED_LIVE Important: kernel security and bug fix update 2008-08-26 19:56:57 UTC

Description Roland Westrelin 2008-06-24 14:52:53 UTC
Description of problem:

We sometimes see strange behaviour with POSIX timers: we program
a timer to fire immediately (get the current time with clock_gettime()
and program the timer with timer_settime()), but the timer's signal is
never delivered.

This behaviour can be observed on all 2.6.24.7 kernels but not on the
2.6.24.4 kernels, which makes me suspect a kernel bug.
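
For illustration only, the pattern described above might look like the sketch below. It is not the original test case: the function name timer_fires_immediately, the choice of SIGRTMIN and CLOCK_REALTIME, and the use of sigtimedwait() to detect delivery are all assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: arms a one-shot timer whose absolute expiry is
 * "now" and waits up to timeout_sec seconds for its signal.
 * Returns 1 if the signal arrived, 0 if it did not, -1 on setup error. */
int timer_fires_immediately(int timeout_sec)
{
    sigset_t set;
    struct sigevent sev;
    struct itimerspec its;
    struct timespec timeout = { timeout_sec, 0 };
    timer_t timerid;
    int signo;

    /* Block SIGRTMIN so it stays pending until sigtimedwait() collects it. */
    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    if (sigprocmask(SIG_BLOCK, &set, NULL) == -1)
        return -1;

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGRTMIN;
    if (timer_create(CLOCK_REALTIME, &sev, &timerid) == -1)
        return -1;

    /* Expiry is the current time, interpreted as an absolute value,
     * so the timer is already due when timer_settime() returns. */
    memset(&its, 0, sizeof(its));
    if (clock_gettime(CLOCK_REALTIME, &its.it_value) == -1 ||
        timer_settime(timerid, TIMER_ABSTIME, &its, NULL) == -1) {
        timer_delete(timerid);
        return -1;
    }

    /* sigtimedwait() returns the signal number, or -1 on timeout/interrupt. */
    signo = sigtimedwait(&set, NULL, &timeout);
    timer_delete(timerid);
    return signo == SIGRTMIN ? 1 : 0;
}
```

On a kernel without the problem, a call such as timer_fires_immediately(2) would be expected to return 1; a return of 0 would correspond to the missing signal reported here. Older glibc versions may need -lrt at link time.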


Comment 1 Clark Williams 2008-06-27 21:11:35 UTC
Created attachment 310481 [details]
reproducer for timer_settime(3p) bug

Test program that shows the timer_settime(3p) bug

Comment 2 Clark Williams 2008-06-30 15:50:37 UTC
Created attachment 310594 [details]
updated reproducer for timer_settime(3p) problem

Changed to just detect that the signal fired rather than use pause()

Comment 3 Clark Williams 2008-06-30 15:53:01 UTC
When I wrote the above reproducer, I noticed that I was seeing the printf from
the signal handler before entering the pause. Luis changed the test case to just
set a variable, usleep for a bit after setting the timer, then check whether the
signal handler had modified the variable.
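
The attached reproducer is no longer available, so the following is only a rough sketch of the approach just described, not the actual test case. The names on_timer, timer_fired and check_timer_signal are hypothetical, SIGALRM is an arbitrary choice, and nanosleep() stands in for usleep().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Set by the handler; sig_atomic_t keeps the write async-signal-safe. */
static volatile sig_atomic_t timer_fired;

static void on_timer(int signo)
{
    (void)signo;
    timer_fired = 1;
}

/* Hypothetical helper: arms a timer that is due immediately, sleeps for
 * up to wait_ms milliseconds, then checks whether the handler ran.
 * Returns 1 if it ran, 0 if it did not, -1 on setup error. */
int check_timer_signal(long wait_ms)
{
    struct sigaction sa;
    struct sigevent sev;
    struct itimerspec its;
    struct timespec delay = { wait_ms / 1000, (wait_ms % 1000) * 1000000L };
    timer_t timerid;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGALRM;
    if (timer_create(CLOCK_REALTIME, &sev, &timerid) == -1)
        return -1;

    timer_fired = 0;
    memset(&its, 0, sizeof(its));
    if (clock_gettime(CLOCK_REALTIME, &its.it_value) == -1 ||
        timer_settime(timerid, TIMER_ABSTIME, &its, NULL) == -1) {
        timer_delete(timerid);
        return -1;
    }

    /* The handler may already have run by now, which is fine: unlike
     * pause(), this check does not depend on when the signal arrives.
     * If the signal arrives during the sleep, the sleep ends early. */
    nanosleep(&delay, NULL);

    timer_delete(timerid);
    return timer_fired ? 1 : 0;
}
```

Checking a flag after a bounded sleep avoids the race seen with pause(), where a signal delivered before pause() is entered would leave the program waiting forever.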

So far we have not seen a case on our -rt kernels where the signal has not been
delivered. Do you have a different reproducer we could try?

Comment 4 Clark Williams 2008-06-30 16:40:47 UTC
Created attachment 310598 [details]
shell script to run reproducer until it fails

Shell script to run the reproducer either as SCHED_OTHER or SCHED_FIFO until it
fails or until Ctrl-C is hit

Comment 5 Clark Williams 2008-06-30 16:42:09 UTC
Update since I last posted: we have not seen this behavior on a SCHED_OTHER
thread, but can reliably reproduce it on a SCHED_FIFO thread.
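
As a hedged illustration of how a test program could place itself under SCHED_FIFO before arming the timer, the sketch below uses sched_setscheduler(). The helper name use_sched_fifo is hypothetical, and this is not taken from the attached script or reproducer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper: switches the calling process to SCHED_FIFO at the
 * given priority. Returns 0 on success, or the errno value on failure
 * (typically EPERM when the caller lacks real-time privileges). */
int use_sched_fifo(int priority)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;

    /* A pid of 0 means "the calling process". */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
        return errno;
    return 0;
}
```

This normally requires root or an appropriate real-time priority limit; without it the call is expected to fail with EPERM and the process stays under SCHED_OTHER.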

Comment 6 Luis Claudio R. Goncalves 2008-06-30 17:59:51 UTC
After running different tests I also noticed that when the reproducer runs as
SCHED_FIFO it occasionally fails. It doesn't matter whether it runs at
priority 2, 30 or 97, as long as it runs as SCHED_FIFO.

I started a new set of tests to narrow this issue down.

As a side note, I was unable to reproduce this behavior with the rt-vanilla kernel.

Comment 7 Clark Williams 2008-07-03 20:54:22 UTC
Created attachment 310963 [details]
hrtimer: prevent migration for raising CPU

From: Steven Rostedt <srostedt@redhat.com>
Subject: hrtimer: prevent migration for raising CPU

Due to a possible deadlock, the waking of the softirq was pushed outside
of the hrtimer base locks. Unfortunately this allows the task to migrate
after setting up the softirq and raising it. Since softirqs run a queue that
is per-cpu we may raise the softirq on the wrong CPU and this will keep
the queued softirq task from running.

To solve this issue, this patch disables preemption around the releasing
of the hrtimer lock and raising of the softirq.

Comment 8 Roland Westrelin 2008-07-15 08:13:06 UTC
I confirm this is fixed in -72.

Comment 12 errata-xmlrpc 2008-08-26 19:57:46 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0585.html

