Bug 156647 - (gcc -O2) elinks segfault on ppc & ia64
Summary: (gcc -O2) elinks segfault on ppc & ia64
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: elinks
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Karel Zak
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: FC4Blocker
 
Reported: 2005-05-02 20:58 UTC by Jeremy Katz
Modified: 2007-11-30 22:11 UTC (History)
CC List: 3 users

Fixed In Version: 0.10.3-2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-05-10 12:37:08 UTC


Attachments
Add debugging printouts to find_in_cache() (deleted)
2005-05-07 22:11 UTC, Miloslav Trmač
Log from -O0 (deleted)
2005-05-07 22:12 UTC, Miloslav Trmač
Log from -O2 (deleted)
2005-05-07 22:13 UTC, Miloslav Trmač

Description Jeremy Katz 2005-05-02 20:58:52 UTC
elinks seems to segfault on ppc when going to http://gate.crashing.org/~benh/xorg

#0  0x10058934 in doc_loading_callback ()
#1  0x100507c0 in connect_info ()
#2  0x100507c0 in connect_info ()
(gdb)

Comment 1 Karel Zak 2005-05-05 19:01:05 UTC
... and on ia64 too.

Comment 2 Miloslav Trmač 2005-05-07 22:11:00 UTC
Created attachment 114132 [details]
Add debugging printouts to find_in_cache()

The failure on ppc goes away after changing
CFLAGS="-O2 -g -W -Wall $(getconf LFS_CFLAGS)"
by s/-O2/-O0/.

Maybe the attached data will help somebody figure it out,
I don't know the code at all: applying the attached patch
shows that find_in_cache() starts returning NULL with -O2.

Comment 3 Miloslav Trmač 2005-05-07 22:12:03 UTC
Created attachment 114133 [details]
Log from -O0

Comment 4 Miloslav Trmač 2005-05-07 22:13:46 UTC
Created attachment 114134 [details]
Log from -O2

Both logs are from running 'elinks http://gate.crashing.org/~benh/xorg 2>$log'.

I wasn't able to test elinks on ia64.

Comment 5 Karel Zak 2005-05-08 00:10:09 UTC
Note: I think it's possible to test this on an arbitrary HTML page. I had the
problem with elinks from the current FC4 and with upstream version 0.10.5 on
pages like <html><body>foo</body></html>.

Comment 6 Warren Togami 2005-05-08 04:15:57 UTC
If this happens with -O2 and not -O0, shouldn't this be assigned to gcc?


Comment 7 Jakub Jelinek 2005-05-09 09:56:02 UTC
Generally, if something works with -O0 and does not with -O2, it is more often
an application bug than a GCC bug.  Only when you debug it and prove it is indeed
a GCC bug should it be reassigned to GCC.
Particularly in this case, the bug goes away with -O2 -fno-strict-aliasing,
and there are 94 places where GCC warns about aliasing problems:
grep warning.*type-punned elinks.log | sort -u | wc -l
94
Plus there are several places where the code violates the aliasing rules but
GCC does not warn.
Say in find_in_cache, all the lists.h macros used there are buggy.
And error.h even shows that the authors see the problems, but for some unknown
reason can't admit it is their bug and not a compiler bug:
/* This function does nothing, except making compiler not to optimize certains
 * spots of code --- this is useful when that particular optimization is buggy.
 * So we are just workarounding buggy compilers. */
/* This function should be always used only in context of compiler version
 * specific macros. */
void do_not_optimize_here(void *x);

#if defined(__GNUC__) && __GNUC__ == 2 && __GNUC_MINOR__ <= 7
#define do_not_optimize_here_gcc_2_7(x) do_not_optimize_here(x)
#else
#define do_not_optimize_here_gcc_2_7(x)
#endif

#if defined(__GNUC__) && __GNUC__ == 3
#define do_not_optimize_here_gcc_3_x(x) do_not_optimize_here(x)
#else
#define do_not_optimize_here_gcc_3_x(x)
#endif

#if defined(__GNUC__) && __GNUC__ == 3 && __GNUC_MINOR__ == 3
#define do_not_optimize_here_gcc_3_3(x) do_not_optimize_here(x)
#else
#define do_not_optimize_here_gcc_3_3(x)
#endif

The lists implementation is broken by design, it just can't work that way.
You can't access the same object through aliasing incompatible types.
But lists.h is doing that a lot, it sometimes accesses next/prev as void *,
sometimes as struct cache_entry *, etc.
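
To make the rule concrete, here is a minimal sketch of the kind of access being described. The struct node type and the helper names are hypothetical and are not taken from elinks. The problematic store is shown only in a comment, since its behavior is undefined. The code that actually runs uses the member's declared type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not the real elinks structure. */
struct node {
    struct node *next;
    int value;
};

/*
 * Problematic pattern (left as a comment on purpose): the object e->next has
 * type 'struct node *', but it is written through an lvalue of type 'void *'.
 * Under strict aliasing, GCC may assume the two accesses never refer to the
 * same object and reorder or drop them at -O2.
 *
 *     void **slot = (void **)&e->next;
 *     *slot = other;
 */

/* Well-defined alternative: store through the member's declared type. */
static void set_next(struct node *e, struct node *other)
{
    assert(e != NULL);
    e->next = other;
}

/* Small self-check: links two nodes and reads the value back through next. */
static int aliasing_demo(void)
{
    struct node a = { NULL, 1 };
    struct node b = { NULL, 2 };

    set_next(&a, &b);
    return a.next->value;
}
```

Compiling with -O2 -Wall is what produced the type-punned warnings counted above. Code written like set_next() should not need -fno-strict-aliasing.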
The cleanest fix, IMHO, would be to define a structure holding
void *next; void *prev; and put that structure as the first field of each
structure that is chained into lists, say:
struct cache_entry
{
  struct list_head_elinks head;
  ...
};
and then have the macros use cached->head.prev, etc.  What will also work
is to make the prev/next pointers void *, but directly in the structure, say
struct cache_entry
{
  void *next; void *prev;
  ...
};
and have
struct list_head_elinks
{
  void *next; void *prev;
};

But writing/reading through a void ** pointer and then writing/reading through
a struct cache_entry ** pointer is a violation of ISO C99 6.5 (paragraphs 6 and 7).
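
The first suggestion (a link structure embedded as the first field) could look roughly like the sketch below. This is an illustration under stated assumptions, not the elinks implementation: init_list(), add_to_list(), find_by_id(), list_demo() and the id field are hypothetical names. The actual lists.h uses macros and may differ. The point is that next and prev are only ever read and written as void * members of struct list_head_elinks.

```c
#include <assert.h>
#include <stddef.h>

/* Link fields, always accessed as void * members of this one struct type. */
struct list_head_elinks {
    void *next;
    void *prev;
};

/* Hypothetical entry; the real struct cache_entry has other fields. */
struct cache_entry {
    struct list_head_elinks head;   /* must stay the first member */
    int id;
};

/* An empty list is a head that points at itself in both directions. */
static void init_list(struct list_head_elinks *l)
{
    l->next = l;
    l->prev = l;
}

/* Insert e at the front of list l. */
static void add_to_list(struct list_head_elinks *l, struct cache_entry *e)
{
    struct list_head_elinks *first = l->next;

    e->head.next = first;
    e->head.prev = l;
    first->prev = e;
    l->next = e;
}

/* Walk the list; a pointer to an entry's first member converts back to
 * a pointer to the entry itself. */
static struct cache_entry *find_by_id(struct list_head_elinks *l, int id)
{
    struct list_head_elinks *p;

    for (p = l->next; p != l; p = p->next) {
        struct cache_entry *e = (struct cache_entry *)p;
        if (e->id == id)
            return e;
    }
    return NULL;
}

/* Self-check: returns the number of entries, or -1 if a lookup misbehaves. */
static int list_demo(void)
{
    struct list_head_elinks cache;
    struct cache_entry a = { { NULL, NULL }, 1 };
    struct cache_entry b = { { NULL, NULL }, 2 };
    struct list_head_elinks *p;
    int n = 0;

    init_list(&cache);
    add_to_list(&cache, &a);
    add_to_list(&cache, &b);

    if (find_by_id(&cache, 2) != &b || find_by_id(&cache, 3) != NULL)
        return -1;
    for (p = cache.next; p != &cache; p = p->next)
        n++;
    return n;
}
```

Because every link access goes through the same struct type and the same void * member type, the compiler has no differently typed views of one object to reason about.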

Comment 8 Miloslav Trmač 2005-05-10 12:37:08 UTC
Jakub, thanks again.

