Bug 155124 - nscd segfaults
Summary: nscd segfaults
Keywords:
Status: CLOSED DUPLICATE of bug 154782
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: FC4Target
Reported: 2005-04-16 17:51 UTC by Enrico Scholz
Modified: 2007-11-30 22:11 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-07-08 07:21:49 UTC


Attachments
'catchsegv nscd -d' output (deleted)
2005-04-16 17:51 UTC, Enrico Scholz

Description Enrico Scholz 2005-04-16 17:51:55 UTC
Description of problem:

| # nscd -d
| ...
| Segmentation fault


Version-Release number of selected component (if applicable):

nscd-2.3.4-21
glibc-2.3.4-21 (i386 arch)


How reproducible:

100%


Additional information:

This can be reproduced only with the i386 version of glibc; the i686 version seems to work.

Comment 1 Enrico Scholz 2005-04-16 17:51:55 UTC
Created attachment 113272
'catchsegv nscd -d' output

Comment 2 Enrico Scholz 2005-04-16 18:05:08 UTC
The stack trace in gdb is:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1750770768 (LWP 9419)]
0x6aab7756 in gethostbyname2_r () from /usr/sbin/nscd
(gdb) bt
#0  0x6aab7756 in gethostbyname2_r () from /usr/sbin/nscd
#1  0x6aab263e in sighup_handler () from /usr/sbin/nscd
#2  0x97fc4943 in start_thread () from /lib/libpthread.so.0
#3  0x97f3ed4e in clone () from /lib/libc.so.6

Comment 3 Mark Goodman 2005-05-16 21:13:49 UTC
I can reproduce this with glibc i686 on FC4 test 3.

I got a similar backtrace before installing the debuginfo RPMs. After installing
glibc-debuginfo-common i386 and glibc-debuginfo i686, I get:

(gdb) bt full
#0  prune_cache (table=0x7bc040, now=1116277037) at cache.c:245
        runp = (struct hashentry *) 0xb72f64d9
        dh = (struct datahead *) 0x9b2f63b8
        run = Variable "run" is not available.
(gdb) bt
#0  prune_cache (table=0x7bc040, now=1116277037) at cache.c:245
#1  0x007ae63a in nscd_run (p=0x0) at connections.c:1179
#2  0x00547b80 in start_thread (arg=0xb72c0bb0) at pthread_create.c:261
#3  0x00c47b9e in ?? () from /lib/libc.so.6


Comment 4 James Bourne 2005-05-19 15:43:12 UTC
This is on Fedora Core 4 test 3 (the bug entry should be updated to reflect that).
I'm finding that this happens ONLY when ssl is set to start_tls.  If ssl is set to
on, authentication fails, and turning ssl off fixes the problem.

#0  0x00376f1e in ber_sockbuf_ctrl () from /lib/libnss_ldap.so.2
(gdb) bt
#0  0x00376f1e in ber_sockbuf_ctrl () from /lib/libnss_ldap.so.2
#1  0x0036bc1a in ldap_pvt_tls_inplace () from /lib/libnss_ldap.so.2
#2  0x0036d917 in ldap_start_tls_s () from /lib/libnss_ldap.so.2
#3  0x00347e3d in do_open () at ldap-nss.c:1273
#4  0x00348025 in do_init2 () at ldap-nss.c:959
#5  0x0034a8b5 in _nss_ldap_initgroups_dyn (
    user=0x3 <Address 0x3 out of bounds>, group=3, start=0x3, size=0x3, 
    groupsp=0x3, limit=3, errnop=0x3) at ldap-grp.c:912
#6  0x0028fbe4 in internal_getgrouplist (user=0x8d38cc8 "nscd", group=28, 
    size=0xbfac5b80, groupsp=0xbfac5b84, limit=-1) at initgroups.c:104
#7  0x0028fde1 in getgrouplist (user=0x8d38cc8 "nscd", group=28, groups=0x3, 
    ngroups=0xca1344) at initgroups.c:158
#8  0x00c91aed in nscd_init () at connections.c:1598
#9  0x00c910ad in main (argc=1, argv=0xbfac5ef4) at nscd.c:286

Hope that helps.

Regards
James

Comment 5 Jakub Jelinek 2005-05-19 16:15:36 UTC
Crash in /lib/libnss_ldap.so.2 is almost surely a bug in nss_ldap (until proven
otherwise), so please file that separately, under nss_ldap component.

Comment 6 Enrico Scholz 2005-06-01 20:09:09 UTC
Still happens with nscd-2.3.5-10.


Comment 7 Pierre Ossman 2005-06-20 17:50:19 UTC
Same problem here. It crashes in the garbage collector. Version 2.3.5-10.

Comment 8 Enrico Scholz 2005-06-21 10:07:32 UTC
Chances are high that this is related to bug #154782.

It would be nice to see an erratum soon...

Comment 9 James Bourne 2005-06-28 19:03:21 UTC
With ssl turned off (in this case) it is still happening.  Now nscd (FC4
release) is crashing.  Using catchsegv I get:
14140: Reloading "0" in password cache!
14140: Reloading "89" in password cache!
14140: Reloading "101" in password cache!
14140: remove INITGROUPS entry "mailman"
14140: remove INITGROUPS entry "cacti"
14140: remove GETHOSTBYADDR entry "198.161.98.242"
*** Segmentation fault
Register dump:

 EAX: b7f45708   EBX: 008c1cc0   ECX: b7465af0   EDX: 00000350
 ESI: b7465af0   EDI: 008c2140   EBP: b7d41ba0   ESP: b6b89ad4

 EIP: 008b9ece   EFLAGS: 00010282

 CS: 0073   DS: 007b   ES: 007b   FS: 0000   GS: 0033   SS: 007b

 Trap: 0000000e   Error: 00000006   OldMask: 00000000
 ESP/signal: b6b89ad4   CR2: b7465af0

Backtrace:
/lib/libSegFault.so[0x908115]
[0x53a420]
nscd[0x8b9948]
nscd[0x8b4616]
/lib/libpthread.so.0[0x685b80]
/lib/libc.so.6(__clone+0x5e)[0xc8bdee]

When I run nscd inside gdb I get:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208730704 (LWP 14254)]
0x00126ece in gc (db=0x12f040) at mem.c:143
143               he[cnt] = (struct hashentry *) (db->data + run);
(gdb) bt
#0  0x00126ece in gc (db=0x12f040) at mem.c:143
#1  0x00126948 in prune_cache (table=0x12f040, now=1119985124) at cache.c:429
#2  0x00121616 in nscd_run (p=0x0) at connections.c:1179
#3  0x00764b80 in start_thread (arg=0xb7f43bb0) at pthread_create.c:261
#4  0x001fadee in ?? () from /lib/libc.so.6

I personally now view this as critical as this is in a production system and
with or without ssl the problem occurs.  nscd at this point is completely unusable.
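
For readers of the register dump in the comment above, here is a minimal sketch of how the "Trap" and "Error" fields can be read. It assumes the standard x86 page-fault error-code bit layout (bit 0: present, bit 1: write, bit 2: user mode); the function and constant names are illustrative only and are not part of catchsegv or glibc. The values are copied from the dump.

```python
# Decode the x86 page-fault error code printed by catchsegv ("Error:" field).
# Bit meanings are the architectural ones for trap 0x0e (page fault).
PF_PRESENT = 0x1  # 0: page not present, 1: protection violation
PF_WRITE = 0x2    # 0: read access,      1: write access
PF_USER = 0x4     # 0: kernel mode,      1: user mode


def decode_page_fault(error_code: int) -> dict:
    """Return a readable description of a page-fault error code."""
    return {
        "cause": "protection violation" if error_code & PF_PRESENT else "page not present",
        "access": "write" if error_code & PF_WRITE else "read",
        "mode": "user" if error_code & PF_USER else "kernel",
    }


# Values copied from the register dump in this comment.
trap, error = 0x0000000E, 0x00000006
cr2, ecx, esi = 0xB7465AF0, 0xB7465AF0, 0xB7465AF0

info = decode_page_fault(error)
print(info)
print("faulting address (CR2) equals ECX and ESI:", cr2 == ecx == esi)
```

Read this way, the dump describes a user-mode write to an unmapped page, with the faulting address held in ECX and ESI. That is at least consistent with the crash being a store, as the assignment at mem.c:143 is, but it does not by itself identify the root cause.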



Comment 10 James Bourne 2005-06-29 06:46:38 UTC
I now get exactly the same backtrace on a second machine.  I've also discovered
two other things.  First, this only happens after shutting down nscd, removing
the contents of /var/db/nscd and then starting nscd.  Second, dropping back to
nscd from FC3 fixes the issue, even after deleting the cache in /var/db/nscd/.

I'm thinking this is not the same issue.  Comments?

Comment 11 Enrico Scholz 2005-06-29 07:59:56 UTC
You could try the valgrind command from

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=154782#c3

and see whether it reports the same uninitialized data. I would really like to see
an updated 'nscd' package; then it would be easy to check whether this bug
disappears as well.

Comment 12 Enrico Scholz 2005-07-03 08:55:24 UTC
I installed nscd-2.3.5-11 from rawhide (can be installed alone without
additional dependencies) and cleared the database with 'rm -f /var/db/nscd/*'
(do not forget that!!). 

'nscd' has now been running for nearly a day on several machines where it crashed before.
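
The cache-clearing step in the comment above can also be scripted. The following is only a sketch under stated assumptions: the helper name `clear_nscd_cache` is hypothetical, it mirrors 'rm -f /var/db/nscd/*' by removing regular files and symlinks matched by `*` (so dotfiles and subdirectories are left alone), it defaults to a dry run, and it assumes nscd has been stopped beforehand. The demonstration runs against a temporary directory rather than the real cache.

```python
import glob
import os
import tempfile


def clear_nscd_cache(db_dir="/var/db/nscd", dry_run=True):
    """Remove the files matched by db_dir/* and return their paths.

    With dry_run=True (the default) nothing is deleted; the list shows
    what would be removed. Stop nscd before running this for real.
    """
    removed = []
    for path in sorted(glob.glob(os.path.join(db_dir, "*"))):
        if os.path.isfile(path) or os.path.islink(path):
            if not dry_run:
                os.unlink(path)
            removed.append(path)
    return removed


# Demonstration on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("passwd", "group", "hosts"):
        open(os.path.join(demo_dir, name), "w").close()
    planned = clear_nscd_cache(demo_dir)                 # dry run, nothing deleted
    deleted = clear_nscd_cache(demo_dir, dry_run=False)  # actually deletes
    remaining = os.listdir(demo_dir)

print([os.path.basename(p) for p in planned])
print("remaining after delete:", remaining)
```

Keeping the dry run as the default makes it easy to check which database files would be removed before touching a production machine.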

Comment 13 Ulrich Drepper 2005-07-08 07:21:49 UTC
I think this is the same issue as bug 154782 (i.e., miscompiled code due to gcc
bug).  This bug can cause all kinds of problems.

*** This bug has been marked as a duplicate of 154782 ***

