
Bug 152319

Summary: possible deadlock in ctime() called from signal handler
Product: [Fedora] Fedora
Reporter: Jason Vas Dias <jvdias>
Component: sysklogd
Assignee: Jason Vas Dias <jvdias>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: rawhide
CC: sundaram
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: 1.4.1-28
Doc Type: Bug Fix
Last Closed: 2005-09-05 00:01:08 UTC
Type: ---

Description Jason Vas Dias 2005-03-28 15:43:33 UTC
Description of problem:

 From: Miquel van Smoorenburg <>
 To: Chris Stromsoe <>
 Cc: Andrew Morton <>,, Erik
 Horn <>, Erik Horn <>
 Subject: syslogd hang / livelock (was: 2.6.10-rc3, syslogd hangs then
 processes get stuck in schedule_timeout)
 Date: Sat, 26 Mar 2005 12:37:39 +0100
 At Tue, 21 Dec 2004 16:39:43 -0800 (PST) Chris Stromsoe wrote:
 > I'm still seeing this problem.  It repeats every week or week and a
 > [half], usually after logs have been rotated and a dvd has been written.
 > [syslogd] stops writing output, then everything that does schedule_timeout()
 > [hangs], the process table fills, and everything grinds to a halt.
 > If the problem is detected early enough, syslogd can be manually
 > [killed] and restarted, unwedging everything and returning everything to normal
 > operation.
 I'm seeing the same problem here, and it is not a kernel bug. I'm only
 Cc'ing this to linux-kernel so that it shows up in the archives, since
 there have been postings about the same thing in the past.
 There are 2 issues here:
 1. syslogd hangs sometimes when running under a 2.6 kernel.
    This is because syslogd sets up a timer, fired via alarm() every
    20 minutes by default, which writes a "MARK" entry in one of
    the logfiles to show that syslogd is still alive.
    That code calls ctime(), which is not re-entrant, and recent
    glibc versions take a __libc_lock() around ctime() calls, which
    does nothing on a 2.4 kernel but uses a futex on a 2.6 kernel.
    So if syslogd happens to be inside ctime() in the main routine,
    SIGALRM hits, and ctime() is called again, syslogd locks up.
    A sysrq-T trace will show it hanging in futex_wait.
 2. syslog() uses blocking AF_UNIX SOCK_DGRAM sockets.
    When an application calls syslog(), and syslogd is not responding
    and the socket buffers get full, the app will hang in connect()
    or send(). This is different from BSD, where send() will return
    ENOBUFS in this case.
    Try killall -STOP syslogd, then generate some syslog traffic
    (say with while :; do logger hello; done) and try to ssh into
    the system - no go. Everything that uses syslog() hangs.
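
Issue 2 can be observed without stopping a real syslogd. The following C sketch is an illustration only; the function name fill_unread_socket and the message text are made up for this example. It creates an AF_UNIX SOCK_DGRAM socket pair, never reads from one end, and sends from the other end in non-blocking mode until the kernel refuses a datagram. On Linux the refused send() is expected to fail with EAGAIN; with a blocking socket, which is what syslog() uses according to the message above, the same call would sleep instead of returning.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill a datagram socket whose peer never reads, as when syslogd is stopped.
 * Returns the number of datagrams accepted before send() refused one and
 * stores the errno of the refused send() in *err.
 * Returns -1 if the sockets could not be set up. */
static int fill_unread_socket(int *err)
{
    int sv[2];
    const char msg[] = "<13>test: hello";   /* illustrative payload */
    int sent = 0;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    /* Without O_NONBLOCK the loop below would eventually block in send(). */
    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* sv[1] plays the stalled syslogd: nothing ever calls recv() on it.
     * The iteration cap is only a safety net for this sketch. */
    errno = 0;
    while (sent < 1000000 && send(sv[0], msg, sizeof msg, 0) >= 0)
        sent++;
    *err = errno;   /* save before close() can overwrite it */

    close(sv[0]);
    close(sv[1]);
    return sent;
}
```

Calling fill_unread_socket(&err) and printing strerror(err) shows how many datagrams the kernel queued and why the next one was refused. Whether a logging client should drop, retry, or block at that point is the open question raised in the message.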
 1. syslogd
     Run syslogd with the -m0 option so that it won't do MARKing.
     The real solution is to fix syslogd to use ctime_r, or better,
     to just let the ALRM handler set a flag and do the MARK
     logging in the main loop.
 2. syslog()
     Arguably syslog() shouldn't hang like it does now. But making it
     non-blocking could lead to information loss - a hacker generating
     lots of bogus syslog messages so that real messages get lost.
     On the other hand, she can do that anyway and fill up the disk
     which gives a similar (although more noticeable) effect. I'm not
     sure how or if this should be fixed.
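
The preferred fix for problem 1 (let the ALRM handler only set a flag and do the MARK logging in the main loop) can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the actual sysklogd change: the names mark_due, on_alarm, format_mark and run_mark_loop are hypothetical, and the exact text of the MARK line is only an example. The handler touches nothing but a sig_atomic_t flag, and the timestamp is produced outside the handler with ctime_r().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t mark_due = 0;

static void on_alarm(int signo)
{
    (void)signo;
    mark_due = 1;   /* no ctime(), no I/O, no locks taken in the handler */
}

/* Format a MARK line into buf using the re-entrant ctime_r(). */
static int format_mark(time_t now, char *buf, size_t len)
{
    char stamp[26];                       /* ctime_r() needs at least 26 bytes */
    if (ctime_r(&now, stamp) == NULL)
        return -1;
    stamp[strcspn(stamp, "\n")] = '\0';   /* drop the trailing newline */
    return snprintf(buf, len, "%s -- MARK --", stamp);
}

/* Hypothetical main loop: waits for `count` alarms, one every `interval`
 * seconds, and writes one MARK line per alarm to `out`.
 * Returns the number of lines written, or -1 on setup failure.
 * An interval of 0 disables marking, in the spirit of -m0. */
static int run_mark_loop(FILE *out, unsigned interval, int count)
{
    struct sigaction sa;
    sigset_t block, old, waitmask;
    int written = 0;

    if (interval == 0)
        return 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) < 0)
        return -1;

    /* Keep SIGALRM blocked except inside sigsuspend(), so the flag test
     * and the wait cannot race with the handler. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;
    waitmask = old;
    sigdelset(&waitmask, SIGALRM);

    alarm(interval);
    while (written < count) {
        while (!mark_due)
            sigsuspend(&waitmask);        /* atomically unblock and wait */
        mark_due = 0;

        char line[64];
        if (format_mark(time(NULL), line, sizeof line) > 0) {
            fprintf(out, "%s\n", line);   /* I/O happens in normal context */
            written++;
        }
        alarm(interval);                  /* re-arm for the next MARK */
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return written;
}
```

For example, run_mark_loop(stdout, 1, 3) writes three MARK lines about one second apart. A real daemon would check the flag as part of its existing select() loop rather than sitting in sigsuspend().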

Version-Release number of selected component (if applicable):

Comment 1 Jason Vas Dias 2005-03-28 15:53:06 UTC
Problem #1 (potential ctime() deadlock when the '-m' option is > 0) is
still an issue and will be fixed in the next release: syslogd will
no longer perform any I/O or call ctime() within the signal handler.
Note that '-m 0' is in the options installed in /etc/sysconfig/syslog
by default.

Problem #2 (blocking SOCK_DGRAM recv) should be fixed now, and is a
duplicate of bug #140983.


Comment 2 Jason Vas Dias 2005-03-28 20:46:40 UTC
This is now fixed with sysklogd-1.4.1-28+ in FC4.