Bug 4446 - klogd takes up 100% CPU time after unaligned trap
Summary: klogd takes up 100% CPU time after unaligned trap
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: sysklogd
Version: 6.0
Hardware: alpha
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: David Lawrence
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 1999-08-09 15:40 UTC by niles
Modified: 2008-05-01 15:37 UTC (History)
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 1999-08-09 18:59:17 UTC



Description niles 1999-08-09 15:40:05 UTC
Running emacs on my DP264 causes an unaligned trap message,
which in turn seems to put klogd into a bad state
where it uses 100% CPU time if the scheduler will let it.
I imagine this happens after any unaligned trap, but I
haven't verified that.  This is serious, since most people
will use this machine for its performance and this could
seriously affect it.

Comment 1 niles 1999-08-09 15:50:59 UTC
This may be related to, or a duplicate of, Bug#:4371

Comment 2 Bill Nottingham 1999-08-09 16:15:59 UTC
Can you do an strace of klogd to see what happens when it
gets the unaligned trap?

Comment 3 niles 1999-08-09 16:28:59 UTC
This may be related to, or a duplicate of, Bug#:4371

Comment 4 niles 1999-08-09 16:52:59 UTC
It's emacs, not klogd, that causes the unaligned trap.

When I try to strace emacs, it dies with a segmentation fault.

Here are the last few lines of the strace of emacs on alpha before the crash:
personality(0 /* PER_??? */)            = 0
getxpid()                               = 17010
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=8192*1024}) = 0
setrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=8192*1024}) = 0
getpgid(0)                              = 17009
getxpid()                               = 17010
setpgid(0, 17010)                       = 0

Program received signal SIGSEGV, Segmentation fault.
0x2000027cb1c in sigismember (set=0x1, signo=1) at sigismem.c:31
sigismem.c:31: No such file or directory.

Compared with the strace from a working emacs on i386:
personality(PER_LINUX)                  = 0
getpid()                                = 27104
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
setrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
getpgid(0)                              = 27102
getpid()                                = 27104
setpgid(0, 27104)                       = 0
SYS_175(0, 0xbffff48c, 0xbffff400, 0x8, 0) = 0

Notice the personality() function call.  I'm not sure what is calling
this, but my guess would be the loader (ld) itself.

Here's the last bit of the strace from klogd:
write(4, "<6>Aug  9 12:43:58 kernel: No mo"..., 53) = 53
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 75
gettimeofday({934217052, 7439}, NULL)   = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 100) = 100
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 75
gettimeofday({934217052, 8416}, NULL)   = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 100) = 100
read(3, "<4>emacs(17173): unaligned trap "..., 4095) = 141
gettimeofday({934217052, 313240}, NULL) = 0
write(4, "<4>Aug  9 12:44:12 kernel: emacs"..., 101) = 101

After this, klogd is caught in an infinite loop that it never
returns from.  Here's what running klogd under gdb shows:

Starting program: /sbin/klogd -n -d
Logging line:
	Line: klogd %s-%s, log source = %s started.
	Priority: 6
Searching for symbol map.
Trying /boot/System.map-2.2.10.
Logging line:
	Line: Inspecting %s
	Priority: 6
Version string = 131594, Major = 2, Minor = 2, Patch = 10.
Comparing kernel 2.2.10 with symbol table 2.2.10.
Found table with matching version number.
End of search list encountered.
Version string = 131594, Major = 2, Minor = 2, Patch = 10.
Comparing kernel 2.2.10 with symbol table 2.2.10.
Logging line:
	Line: Loaded %d symbols from %s.
	Priority: 6
Logging line:
	Line: Symbols match kernel version %s.
	Priority: 6
Loading kernel module symbols - Size of table: 789
Logging line:
	Line: No module symbols loaded.
	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0

	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac4f54: 00000001203db8a2 2c 0

	Priority: 6
Logging line:
	Line: <4>emacs(17182): unaligned trap at 0000020000ac8d60: 00000001203db8a2 28 16

	Priority: 6

After that klogd is hung.

What else can I try?

	Thanks, Rick Niles.
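
For reference, the "Version string = 131594" value in the klogd debug output above is consistent with the usual kernel version-code layout, (major << 16) | (minor << 8) | patch. The sketch below unpacks it; the function name is made up for illustration, and that klogd uses exactly this layout is an assumption based on the numbers matching.

```python
def decode_kernel_version(code):
    """Unpack a version code laid out as (major << 16) | (minor << 8) | patch.

    Assumption: this is the layout behind klogd's "Version string" value.
    """
    major = (code >> 16) & 0xFF
    minor = (code >> 8) & 0xFF
    patch = code & 0xFF
    return major, minor, patch


if __name__ == "__main__":
    # 131594 is the value printed in the debug output above.
    print(decode_kernel_version(131594))  # (2, 2, 10)
```

So the kernel and the System.map really do agree on 2.2.10, and the symbol table is probably not the cause of the hang.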

Comment 5 niles 1999-08-09 17:00:59 UTC
By looking at the 'dmesg' output I can see what klogd is actually
choking on:

emacs(17182): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0
emacs(17182): unaligned trap at 0000020000ac4f54: 00000001203db8a2 2c 0
emacs(17182): unaligned trap at 0000020000ac8d60: 00000001203db8a2 28 16
>emacs(17182): unaligned trap at 00000<4>emacs(17183): unaligned trap at 0000020000ac4fe8: 00000001203db8a2 2c 0

Notice how there was no line break before the next "<4>": the
previous message is cut off mid-line.  I think this is a race
condition, caused either by the fact that the DP264 machine is SMP
or by the machine just being so fast. :)
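
To illustrate the kind of input involved, here is a minimal sketch of a log splitter that starts a new record at every "<N>" priority tag, even when the previous record was cut off without a line break. It is not klogd's code and not the patch discussed in this bug (neither is shown here); the function name and the choice to mark untagged fragments with None are assumptions made for the example.

```python
import re

# Kernel log records begin with a priority tag such as "<4>".
PRIORITY_TAG = re.compile(r"<([0-7])>")


def split_records(buffer):
    """Split raw kernel log text into (priority, message) pairs.

    A new record starts at every priority tag, so a tag that shows up in
    the middle of a line ends the (truncated) record before it.
    Text that has no tag in front of it gets priority None.
    """
    records = []
    for line in buffer.split("\n"):
        pos = 0
        priority = None
        for match in PRIORITY_TAG.finditer(line):
            text = line[pos:match.start()]
            if text:
                records.append((priority, text))
            priority = int(match.group(1))
            pos = match.end()
        text = line[pos:]
        if text:
            records.append((priority, text))
    return records


if __name__ == "__main__":
    sample = (
        ">emacs(17182): unaligned trap at 00000"
        "<4>emacs(17183): unaligned trap at 0000020000ac4fe8: "
        "00000001203db8a2 2c 0\n"
    )
    for priority, message in split_records(sample):
        print(priority, message)
```

One limitation of this approach: a message whose text legitimately contains something like "<4>" would also be split, so a real parser may want stricter rules.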

Comment 6 Bill Nottingham 1999-08-09 18:11:59 UTC
Can you try the patch I'm attaching?
(Go to the 'e-mail' link on the bugzilla page.)

This may not solve the problem, but it does fix at least one bug. :)

Comment 7 niles 1999-08-09 18:51:59 UTC
Oh yes!  Problem is fixed with this patch.
Thank you.

Comment 8 Bill Nottingham 1999-08-09 18:59:59 UTC
OK. sysklogd with this patch will be in the next Raw Hide release.

