Bug 1365812 - eu-stack killed by SIGABRT processing gcore created core file
Summary: eu-stack killed by SIGABRT processing gcore created core file
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: elfutils
Version: rawhide
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Mark Wielaard
QA Contact: Fedora Extras Quality Assurance
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:d5718852e850010617546bd78ad...
Depends On:
Blocks: 1371380 1371517
Reported: 2016-08-10 09:10 UTC by Jakub Filak
Modified: 2016-12-01 00:50 UTC
CC List: 15 users

Fixed In Version: elfutils-0.167-1.fc25 elfutils-0.167-1.fc24
Doc Text:
Clone Of:
Clones: 1371380 1371517
Environment:
Last Closed: 2016-09-06 22:20:51 UTC


Attachments
File: backtrace (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: cgroup (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: core_backtrace (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: dso_list (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: environ (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: limits (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: maps (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: mountinfo (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: namespaces (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: open_fds (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: proc_pid_status (deleted), 2016-08-10 09:11 UTC, Jakub Filak
File: var_log_messages (deleted), 2016-08-10 09:11 UTC, Jakub Filak

Description Jakub Filak 2016-08-10 09:10:50 UTC
Description of problem:
I ran eu-stack with a core dump file generated by gcore.

$ sleep 1000 &
$ SLEEP_PID=$!
$ gcore $SLEEP_PID
$ eu-stack --executable=/usr/bin/sleep --core=core.$SLEEP_PID
eu-stack: link_map.c:846: dwfl_link_map_report: Assertion `in.d_size == phnum * phent' failed.
Aborted (core dumped)

Version-Release number of selected component:
elfutils-0.166-2.fc25

Additional info:
reporter:       libreport-2.7.2.6.g6ac1
backtrace_rating: 3
cmdline:        eu-stack --executable=/usr/bin/sleep --core=core.6984
executable:     /usr/bin/eu-stack
global_pid:     7060
kernel:         4.7.0-0.rc7.git4.2.fc25.x86_64
pkg_vendor:     Fedora Project
runlevel:       N 5
type:           CCpp
uid:            18601

Truncated backtrace:
Thread no. 1 (7 frames)
 #25 ??
 #26 dwfl_link_map_report at link_map.c:846
 #27 dwfl_core_file_report at core-file.c:531
 #28 parse_opt at stack.c:595
 #29 parser_parse_arg at argp-parse.c:716
 #30 parser_parse_next at argp-parse.c:865
 #31 __argp_parse at argp-parse.c:921

Comment 1 Jakub Filak 2016-08-10 09:11:01 UTC
Created attachment 1189518
File: backtrace

Comment 2 Jakub Filak 2016-08-10 09:11:02 UTC
Created attachment 1189519
File: cgroup

Comment 3 Jakub Filak 2016-08-10 09:11:04 UTC
Created attachment 1189520
File: core_backtrace

Comment 4 Jakub Filak 2016-08-10 09:11:06 UTC
Created attachment 1189521
File: dso_list

Comment 5 Jakub Filak 2016-08-10 09:11:08 UTC
Created attachment 1189522
File: environ

Comment 6 Jakub Filak 2016-08-10 09:11:09 UTC
Created attachment 1189523
File: limits

Comment 7 Jakub Filak 2016-08-10 09:11:11 UTC
Created attachment 1189524
File: maps

Comment 8 Jakub Filak 2016-08-10 09:11:13 UTC
Created attachment 1189525
File: mountinfo

Comment 9 Jakub Filak 2016-08-10 09:11:14 UTC
Created attachment 1189526
File: namespaces

Comment 10 Jakub Filak 2016-08-10 09:11:16 UTC
Created attachment 1189527
File: open_fds

Comment 11 Jakub Filak 2016-08-10 09:11:18 UTC
Created attachment 1189528
File: proc_pid_status

Comment 12 Jakub Filak 2016-08-10 09:11:19 UTC
Created attachment 1189529
File: var_log_messages

Comment 13 Matej Habrnal 2016-08-10 09:30:56 UTC
It is possible to generate a backtrace from such a core dump using gdb:

$ gdb -batch -ex 'file /usr/bin/sleep' -ex "core-file core.$SLEEP_PID" -ex "bt"
[New LWP 22760]
Core was generated by `sleep'.
#0  0x00007fe59479c810 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
84	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
#0  0x00007fe59479c810 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1  0x000055d61b75845f in rpl_nanosleep ()
#2  0x000055d61b7582c0 in xnanosleep ()
#3  0x000055d61b75587d in main ()

Comment 14 Jan Kratochvil 2016-08-10 11:55:09 UTC
The core file generated by gcore is bogus, so this is rather a GDB bug:
3    AT_PHDR              Program headers for program    0x55848c2b5040
9    AT_ENTRY             Entry point of program         0x55848c2b6940
^^^ these addresses point nowhere; no LOAD segment in the core covers them:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000df8 0x000055848c4bb000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  R      1
  LOAD           0x0000000000001df8 0x000055848c4bc000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  RW     1
  LOAD           0x0000000000002df8 0x000055848e3b7000 0x0000000000000000
                 0x0000000000021000 0x0000000000021000  RW     1
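
The check behind that observation can be sketched in a few lines of C: an address can be read from the core only if some LOAD segment covers it. This is an illustration only, not elfutils or GDB code. The struct and function names are hypothetical, and the segment values are copied from the listing above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a PT_LOAD program header. */
struct load_segment {
    uint64_t vaddr;   /* VirtAddr column */
    uint64_t memsz;   /* MemSiz column */
};

/* The three LOAD segments from the readelf listing above. */
static const struct load_segment core_segments[] = {
    { 0x000055848c4bb000ULL, 0x1000 },
    { 0x000055848c4bc000ULL, 0x1000 },
    { 0x000055848e3b7000ULL, 0x21000 },
};

/* Return true if addr falls inside any of the n segments. */
static bool addr_is_mapped(const struct load_segment *segs, size_t n,
                           uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= segs[i].vaddr && addr - segs[i].vaddr < segs[i].memsz)
            return true;
    }
    return false;
}
```

With these values, both AT_PHDR (0x55848c2b5040) and AT_ENTRY (0x55848c2b6940) lie below the first LOAD segment, so `addr_is_mapped` reports them as unmapped.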

Comment 15 Jan Kratochvil 2016-08-10 19:50:35 UTC
A workaround is:
(gdb) shell cat /proc/20477/coredump_filter
00000033
(gdb) shell echo 0x37 >/proc/20477/coredump_filter
(gdb) shell cat /proc/20477/coredump_filter
00000037
(gdb) gcore /tmp/sleep.core
(gdb) shell echo 0x33 >/proc/20477/coredump_filter

This is not a regression: before 'set use-coredump-filter' (gdb >= 7.10), GDB also never dumped pages of binary code.
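
For context on the two filter values used in the workaround: per core(5), each bit of /proc/PID/coredump_filter selects one class of mappings, and 0x33 and 0x37 differ only in bit 2 (file-backed private mappings), the class that the pages of the executable belong to. A minimal sketch that makes the difference explicit; the enum and helper names are hypothetical, only the bit positions come from core(5):

```c
#include <stdint.h>

/* Bit positions of /proc/PID/coredump_filter as documented in core(5). */
enum coredump_filter_bit {
    CF_ANON_PRIVATE = 1u << 0,
    CF_ANON_SHARED  = 1u << 1,
    CF_FILE_PRIVATE = 1u << 2,
    CF_FILE_SHARED  = 1u << 3,
    CF_ELF_HEADERS  = 1u << 4,
    CF_HUGE_PRIVATE = 1u << 5,
    CF_HUGE_SHARED  = 1u << 6,
};

/* Return the bits that are set in `after` but not in `before`. */
static uint32_t filter_bits_added(uint32_t before, uint32_t after)
{
    return after & ~before;
}

/* Return nonzero if the mask asks for file-backed private mappings. */
static int dumps_file_private(uint32_t mask)
{
    return (mask & CF_FILE_PRIVATE) != 0;
}
```

Here `filter_bits_added(0x33, 0x37)` yields `CF_FILE_PRIVATE`, the single bit the workaround turns on before running gcore and turns off again afterwards.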

Comment 16 Mark Wielaard 2016-08-10 22:22:00 UTC
Please create a new bug, or clone this bug for gdb, if you want to fix the bogus core file creation by gcore. It would certainly be nice to fix that. But the original bug is real: eu-stack does crash, and it shouldn't, even on a bogus core file. The assert should be replaced by some other sanity check that doesn't cause eu-stack to abort.

Comment 17 Mark Wielaard 2016-08-11 21:19:03 UTC
So the assert is actually a good thing. It shows that our assumption that in.d_size == phnum * phent was wrong, even though we explicitly set in.d_size to that value earlier. The core reading code reset it (to zero) when it detected an error in the core file. That is why we now try to reread the phdrs from the executable, reusing the same buffer and size. So instead of asserting that the size is as expected, all we really need to do is set the expected size:

diff --git a/libdwfl/link_map.c b/libdwfl/link_map.c
index 28d7382..604be1b 100644
--- a/libdwfl/link_map.c
+++ b/libdwfl/link_map.c
@@ -843,7 +843,10 @@ dwfl_link_map_report (Dwfl *dwfl, const void *auxv, size_t auxv_size,
                }
              off_t off = ehdr->e_phoff;
              assert (in.d_buf == NULL);
-             assert (in.d_size == phnum * phent);
+             /* Note this is in the !in_ok path.  That means the memory_callback
+                failed.  But the callback might still have reset the in.d_size
+                value (to zero).  So explicitly set it here again.  */
+             in.d_size = phnum * phent;
              in.d_buf = malloc (in.d_size);
              if (unlikely (in.d_buf == NULL))
                {

And with that we get:

$ LD_LIBRARY_PATH=backends:libelf:libdw src/stack --core=core.$SLEEP_PID --exec=/bin/sleep
PID 9603 - core
TID 9603:
#0  0x00007f5185837810 __nanosleep
#1  0x000055bc65b7f45f rpl_nanosleep
#2  0x000055bc65b7f2c0 xnanosleep
#3  0x000055bc65b7c87d main
#4  0x00007f518578f731 __libc_start_main
#5  0x000055bc65b7c969 _start
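
The pattern in the patch can also be modelled in isolation. The following is a self-contained sketch, not the libdwfl code: `struct buf`, `failing_read` and `prepare_reread` are hypothetical stand-ins for `Elf_Data`, the memory callback and the fallback path in `dwfl_link_map_report`. It shows why restoring the size before allocating is more robust than asserting that a failed callback left it untouched.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the buffer descriptor (Elf_Data in libdwfl). */
struct buf {
    void *d_buf;
    size_t d_size;
};

/* Hypothetical callback that fails and, like the core reading code
   described above, clears the size it was handed. */
static bool failing_read(struct buf *in)
{
    in->d_buf = NULL;
    in->d_size = 0;
    return false;
}

/* Fallback path: do not assume d_size survived the failed callback.
   Set it again explicitly, then allocate.  Returns false if the
   allocation fails. */
static bool prepare_reread(struct buf *in, size_t phnum, size_t phent)
{
    in->d_size = phnum * phent;
    in->d_buf = malloc(in->d_size);
    return in->d_buf != NULL;
}
```

After `failing_read` has zeroed the size, an `assert (in->d_size == phnum * phent)` at the top of `prepare_reread` would abort, which is the crash reported here; assigning the size instead lets the fallback proceed.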

Comment 19 Fedora Update System 2016-08-26 14:59:55 UTC
elfutils-0.167-1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-de1f4e692b

Comment 20 Fedora Update System 2016-08-27 12:52:33 UTC
elfutils-0.167-1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-de1f4e692b

Comment 21 Fedora Update System 2016-09-03 17:38:37 UTC
elfutils-0.167-1.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 22 Fedora Update System 2016-09-04 21:44:48 UTC
elfutils-0.167-1.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-1bc61e8f20

Comment 23 Fedora Update System 2016-09-06 03:21:33 UTC
elfutils-0.167-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-1bc61e8f20

Comment 24 Fedora Update System 2016-09-06 22:20:38 UTC
elfutils-0.167-1.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

