|Summary:||mozilla hangs on futex(2)|
|Product:||[Retired] Red Hat Linux||Reporter:||Kjetil T. Homme <kjetilho>|
|Component:||glibc||Assignee:||Jakub Jelinek <jakub>|
|Status:||CLOSED CURRENTRELEASE||QA Contact:|
|Version:||9||CC:||fweimer, p, wtogami|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2004-05-26 11:00:06 UTC||Type:||---|
Description Kjetil T. Homme 2003-03-02 15:56:46 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030206

Description of problem:
Mozilla hangs randomly every 15 minutes or so when using the stock Phoebe5 kernel and glibc. Name resolution seems to be an aggravating factor. Mozilla works fine with the Red Hat 8.0 errata kernel. The problem was present in Phoebe3 as well, but then the entire machine would eventually hang -- Phoebe5 is a nice improvement in that respect :-)

Version-Release number of selected component (if applicable):

How reproducible:
Sometimes

Steps to Reproduce:
Go to a busy web page with external adverts. The combination of CPU load and name resolution seems to give the best odds of hanging Mozilla.

Actual Results:
The arguments to the hanging futex vary.

: [kjetilho@groucho ~]; strace -f -p 15964
futex(0x80d30f0, FUTEX_WAIT, 0, NULL) = -1 EINTR (Interrupted system call)
--- SIGTERM (Terminated) @ 0 (0) ---

: [kjetilho@groucho ~]; strace -f -p 11622
futex(0x41d3bf38, FUTEX_WAIT, 11629, NULL) = -1 EINTR (Interrupted system call)
--- SIGTERM (Terminated) @ 0 (0) ---

: [kjetilho@groucho ~]; strace -f -p 13161
futex(0x42131300, FUTEX_WAIT, -2, NULL <unfinished ...>

Additional info:
My system is a BP6 with 2x366 MHz Celeron (not overclocked).
kernel-smp-2.4.20-2.48
glibc-2.3.1-46
mozilla-1.2.1-20
Comment 1 Kjetil T. Homme 2003-03-07 00:52:50 UTC
I made a snapshot of a page which seems to make Mozilla either crash or hang consistently. http://heim.ifi.uio.no/~kjetilho/tmp/mozillabug/www.nettavisen.no/servlets/page%3fsection=3&item=257901
Comment 2 Kjetil T. Homme 2003-03-07 00:54:29 UTC
I guess I should mention, with LD_ASSUME_KERNEL=2.2.5, mozilla works fine.
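A minimal sketch of applying this workaround from a launcher script, written in Python. LD_ASSUME_KERNEL is an environment variable read by the glibc dynamic loader; on Red Hat Linux 9 a value of 2.2.5 is documented to select the older LinuxThreads library instead of NPTL. The helper names and the "mozilla" command in the usage comment are assumptions made for illustration, not part of the report.

```python
import os
import subprocess


def env_with_assumed_kernel(version, base_env=None):
    """Return a copy of the environment with LD_ASSUME_KERNEL set."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_ASSUME_KERNEL"] = version
    return env


def launch(command, version="2.2.5"):
    """Start `command` (a list of arguments) with the workaround applied."""
    return subprocess.Popen(command, env=env_with_assumed_kernel(version))


# Hypothetical usage on the affected system:
#   launch(["mozilla"])
```

Copying the environment rather than modifying os.environ keeps the setting confined to the child process, so other programs started from the same script are unaffected.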
Comment 3 Ulrich Drepper 2003-04-14 07:11:08 UTC
Try glibc 2.3.2-27.9.
Comment 4 Kjetil T. Homme 2003-04-15 23:33:00 UTC
Thanks, but it does not help. The snapshot I made still makes Mozilla hang rather consistently.

kernel-smp-2.4.20-2.48
glibc-2.3.2-27.9
mozilla-1.2.1-26

One thing I noticed is that this always happens:

# strace -p 10649
futex(0x42132320, FUTEX_WAIT, -3, NULL <unfinished ...>
# strace -p 10649
futex(0x42132320, FUTEX_WAIT, -5, NULL <unfinished ...>
# strace -p 10649
futex(0x42132320, FUTEX_WAIT, -7, NULL <unfinished ...>

i.e., the third argument is decremented by two each time I attach using strace. I have no idea if this is relevant at all :-)
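A minimal sketch, in Python, of how the strace lines above could be compared mechanically rather than by eye. It only extracts the futex address, operation and value from the text and reports the change in the value between consecutive calls on the same address. The function names are hypothetical, and the regular expression is an assumption that covers only the line shapes quoted in this report.

```python
import re

# Matches the futex lines quoted in this report, e.g.
#   futex(0x42132320, FUTEX_WAIT, -3, NULL <unfinished ...>
FUTEX_RE = re.compile(r"futex\((0x[0-9a-fA-F]+),\s*(\w+),\s*(-?\d+),")


def parse_futex_calls(strace_text):
    """Return (address, operation, value) for each futex() call found."""
    return [
        (int(addr, 16), op, int(val))
        for addr, op, val in FUTEX_RE.findall(strace_text)
    ]


def value_steps(calls):
    """Differences between consecutive futex values on the same address."""
    return [
        later[2] - earlier[2]
        for earlier, later in zip(calls, calls[1:])
        if earlier[0] == later[0]
    ]


sample = """\
futex(0x42132320, FUTEX_WAIT, -3, NULL <unfinished ...>
futex(0x42132320, FUTEX_WAIT, -5, NULL <unfinished ...>
futex(0x42132320, FUTEX_WAIT, -7, NULL <unfinished ...>
"""

print(value_steps(parse_futex_calls(sample)))  # [-2, -2]
```

Saving each strace capture to a file and feeding the concatenated text to parse_futex_calls would show whether the step of two holds over more attachments.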
Comment 5 Ihar Filipau 2003-09-10 17:45:35 UTC
I'm experiencing a similar problem. I'm running an up-to-date RHL9 (I have up2date in my crontab). The only non-standard component is Mozilla 1.4, installed from the RPMs available from www.mozilla.org.

The problem: about twice a week Mozilla stops resolving names. After restarting Mozilla, name resolution works again, but it turns out that the previous instance of Mozilla can still be hanging in memory. Previously I just killed it, but today I decided to see what the problem was. To my surprise it was futex(2) (again, and over again, as in the hanging rpm story):

[ifilipau@hera ~]$ strace -p 17876
futex(0x42934d78, FUTEX_WAIT, 17906, NULL <unfinished ...>
# ^C
[ifilipau@hera ~]$

PID 17906 is already gone, but Mozilla is still waiting for something. killall mozilla-bin helps, but this is not nice.

FYI.

P.S. BTW, RHL9 is missing the man pages for futex(2)/(4).
Comment 6 Pádraig Brady 2003-09-11 09:16:50 UTC
Hi, me too. The Mozilla DNS thread seems to hang at:

futex(0x42932c88, FUTEX_WAIT, 12418, NULL

mozilla-1.4-0
kernel-smp-2.4.20-20.9
glibc-2.3.2-11.9

Note the problem did NOT occur with kernel-smp-2.4.20-8. I'll upgrade glibc to see if it helps.
Comment 7 Pádraig Brady 2003-09-25 11:31:52 UTC
I upgraded glibc, which actually seemed to help, but the problem just happened again (first time in 2 weeks).

mozilla-1.4-0
kernel-smp-2.4.20-20.9
glibc-2.3.2-27.9
Comment 8 Pádraig Brady 2003-10-03 17:40:27 UTC
I've started maxing out my CPU now with 2 math calculation processes, and this mozilla/futex bug seems to trigger much more frequently.
Comment 9 Pádraig Brady 2003-11-18 09:43:47 UTC
mozilla-1.5-1
glibc-2.3.2-27.9
2.4.20-20.9smp

Hmm, I thought I resolved this a while ago, saying the new glibc didn't cause it? Anyway, it's much more difficult to reproduce now, but it happened again with the above combination.
Comment 10 Alessandro Suardi 2004-05-25 18:37:55 UTC
Ximian Mozilla 1.4.2 / Galeon 1.3.7 under kernel 2.6.6 and later hang on futex() on a page containing Java code after upgrading to Sun JRE 1.5.0-beta. JRE 1.4.2_03-fcs does not suffer from this issue. Mozilla survives with LD_ASSUME_KERNEL=2.4.1; Galeon has java_vm going into a CPU spin even with LD_ASSUME_KERNEL=2.4.1.

Before adding more detail, I'd like to know whether you're interested in it, given that I do have an RH9 base distro but, as you can see, I'm using XD2 and a beta JRE from Sun, so I'm unsure whether my environment is a candidate for your investigation.
Comment 11 Pádraig Brady 2004-05-26 08:45:11 UTC
Have to say I've updated to mozilla-1.6-0.rh90.dag as soon as it was available and have not noticed the problem since.
Comment 12 Kjetil T. Homme 2004-05-26 11:00:06 UTC
It hasn't happened to me in a long time, and never in RHEL WS3, FC1 or FC2. RHL9 is discontinued anyway. I'm taking the liberty of closing it.
Comment 13 Alessandro Suardi 2004-06-02 15:23:19 UTC
...and the problem went away for me upgrading to Sun JRE 1.5.0-beta2. So now Ximian Mozilla 1.4.2 works for me too under kernel 2.6.7-rc2 :)