|Summary:||firstboot fails "could not open display"|
|Product:||Red Hat Enterprise Linux 4||Reporter:||Daniel W. Ottey <daniel.ottey>|
|Component:||firstboot||Assignee:||Chris Lumens <clumens>|
|Status:||CLOSED ERRATA||QA Contact:||Alexander Todorov <atodorov>|
|Version:||4.0||CC:||andriusb, atodorov, charlotte.richardson, douglas.brown, jgranado, lawrence.newitt, shillman, smcgrath, tao, xgl-maint|
|Target Milestone:||---||Keywords:||OtherQA, Reopened|
|Fixed In Version:||Doc Type:||Bug Fix|
* on systems with two or more processors, a race condition existed between the X server starting and firstboot detecting it had started. If setxkbmap started and finished before metacity started, a "-terminate" command line argument sent to the X server by firstrun.py caused the X server to exit when the last running X client exited. By the time metacity was ready to start, no X server was available, so metacity would exit with an error message ("unable to open X display :1"). Firstboot no longer includes the "-terminate" command line argument, thus avoiding this race.
|Last Closed:||2009-05-18 20:29:09 UTC||Type:||---|
Description Daniel W. Ottey 2005-04-15 18:05:28 UTC
Description of problem: We have successfully run firstboot on our IA32 systems. In our first attempt at running it on our IA64 system, we have run into a problem. When firstboot runs on start, we receive a traceback error. I am attaching a screen capture of the error, since no log file was produced. We will be happy to produce any log files; if we only knew where the helpful information is! :-)
Version-Release number of selected component (if applicable): firstboot-1.3.39-2
Comment 1 Daniel W. Ottey 2005-04-15 18:05:28 UTC
Created attachment 113239 [details] Screen capture of traceback error
Comment 2 Suzanne Hillman 2005-05-12 19:29:34 UTC
Are you able to run X by hand? That is, after the machine has finished booting, does X start?
Comment 3 Mike A. Harris 2005-07-06 22:26:29 UTC
The "FBIOPAN" error in the screenshot is indicative of a kernel fbdev bug I believe. There used to be a similar bug in our ppc kernel if I remember correctly in RHEL3. I don't recall the details of the issue, but searching bugzilla for "FBIOPAN" might yield some matches. Hope this helps.
Comment 7 Chris Lumens 2005-10-12 14:18:30 UTC
Daniel - Do you remember if you were able to do a graphical install on this machine? Does X start up fine for you after firstboot has failed?
Comment 8 Chris Lumens 2006-06-05 17:41:26 UTC
Closing since no information has been provided for several months. If you are still seeing this problem on more recent versions of RHEL4, please feel free to reopen this bug.
Comment 9 Charlotte Richardson 2007-07-25 20:54:35 UTC
We have this bug also. We have to work around it by editing firstboot.py. What is happening is a race between the X clients that firstboot.py starts. That script starts the Xserver with the -terminate option, so the Xserver will exit when the last remaining X client ends. However, since the script kills the Xserver by PID when it is really done with it, there was never any need to pass this option to get X to close itself. The firstboot.py script starts up both setxkbmap and metacity. If setxkbmap starts and ends before metacity starts, the -terminate option will cause the Xserver to exit, and metacity will get the error shown in the earlier posting (and you will not see the firstboot screens). The fix is to take the -terminate option out of the line that starts up the Xserver in firstboot.py. That's what we did. Obviously, this doesn't come up if you are coming up to runlevel 3 instead of 5.
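The workaround described in this comment can be sketched roughly as follows. This is only an illustration, not the actual firstboot.py code: the helper names `strip_terminate` and `stop_server` and the example argument list are assumptions made for the sketch. Only the "-terminate" option and display ":1" come from this report.

```python
import os
import signal


def strip_terminate(x_args):
    """Return a copy of the X server argument list without "-terminate"."""
    return [arg for arg in x_args if arg != "-terminate"]


def stop_server(pid):
    """Shut the X server down explicitly by PID once firstboot is done."""
    os.kill(pid, signal.SIGTERM)


# Hypothetical argument list; the real one in firstboot.py may differ.
original_args = ["X", ":1", "-terminate"]
fixed_args = strip_terminate(original_args)
print(fixed_args)
```

Because the server is stopped by PID anyway, dropping the option should not change how the server is shut down. It only removes the window in which the server can exit between two clients.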
Comment 10 Andrius Benokraitis 2007-07-31 18:08:09 UTC
Reopening bug since Stratus is seeing this on x86_64.
Comment 11 Andrius Benokraitis 2007-07-31 18:09:38 UTC
Stratus has tested this on RHEL 4.5.
Comment 13 Chris Lumens 2007-08-06 17:51:38 UTC
We use these same flags in devel, so this should be fine for 4.x as well.
Comment 14 RHEL Product and Program Management 2008-02-01 19:14:07 UTC
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Comment 16 Alexander Todorov 2008-09-03 07:11:57 UTC
Daniel, Charlotte, can you guys report consistent steps to reproduce? I do understand this comment: <quote> If setxkbmap starts and ends before metacity starts, the -terminate will cause the Xserver to exit, and metacity will get the error shown in the earlier posting </quote> but I'm not clear how to reproduce it consistently. Can you please help?
Comment 17 Charlotte Richardson 2008-09-03 14:15:39 UTC
You need a multiprocessor system (I have eight CPUs) so that the various python pieces run simultaneously. Basically you need setxkbmap to start and finish before metacity is started (on some other processor). You might be able to force the timing on a uniprocessor system by introducing a delay before starting metacity, I don't know. When I first saw this problem the system I was working on had only two processors, and the problem happened 100% of the time, so two processors are enough. You might ask Daniel what his system has. What is happening is that the "-terminate" command line argument to the X server causes it to exit when the last running X client exits, rather than continuing to run if there are no clients connected to it. So if setxkbmap runs first to determine keyboard information and metacity hasn't started up yet, when setxkbmap exits, the X server exits too, and so isn't there for metacity to use, hence the error "unable to open X display :1". Since the X server gets shut down later by its PID anyhow, all you need to do is get rid of the "-terminate". That's how we fixed it here. The RHEL5 version of this mechanism doesn't have this particular bug.
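The ordering described above can be replayed deterministically with a toy model. This is a simulation only, assuming a simplified server that tracks connected clients. `ToyXServer` and `metacity_starts_ok` are hypothetical names, and no real X11 code is involved.

```python
class ToyXServer:
    """Toy model of an X server; not real X11 code."""

    def __init__(self, terminate):
        self.terminate = terminate  # models the "-terminate" flag
        self.running = True
        self.clients = set()

    def connect(self, name):
        if not self.running:
            raise RuntimeError("unable to open X display :1")
        self.clients.add(name)

    def disconnect(self, name):
        self.clients.discard(name)
        # With "-terminate", the server exits when the last client leaves.
        if self.terminate and not self.clients:
            self.running = False


def metacity_starts_ok(terminate):
    """Replay the losing ordering: setxkbmap finishes before metacity starts."""
    server = ToyXServer(terminate)
    server.connect("setxkbmap")
    server.disconnect("setxkbmap")
    try:
        server.connect("metacity")
    except RuntimeError:
        return False
    return True


print(metacity_starts_ok(terminate=True))   # False
print(metacity_starts_ok(terminate=False))  # True
```

In the model, the same client ordering fails only when the flag is set, which matches the observation that removing "-terminate" is sufficient.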
Comment 18 RHEL Product and Program Management 2008-09-18 19:15:52 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Comment 20 Chris Lumens 2008-11-19 18:13:39 UTC
This should be fixed in firstboot-1.3.39-7. Thanks for the patch.
Comment 22 Ruediger Landmann 2009-01-29 07:19:46 UTC
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: * on systems with two or more processors, a race condition existed between the X server starting and firstboot detecting it had started. If setxkbmap started and finished before metacity started, a "-terminate" command line argument sent to the X server by firstrun.py caused the X server to exit when the last running X client exited. By the time metacity was ready to start, no X server was available, so metacity would exit with an error message ("unable to open X display :1"). Firstboot no longer includes the "-terminate" command line argument, thus avoiding this race.
Comment 23 Chris Ward 2009-02-20 13:29:31 UTC
~~ Attention Partners! ~~ RHEL 4.8 Partner Alpha has been released on partners.redhat.com. There should be a fix present in the Beta, which addresses this bug. If you have already completed testing your other URGENT priority bugs, and you still haven't had a chance yet to test this bug, please do so at your earliest convenience, to ensure that only the highest possible quality bits are shipped in the upcoming public Beta drop. If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager. Thanks, more information about Beta testing to come. - Red Hat QE Partner Management
Comment 24 Charlotte Richardson 2009-02-20 17:03:54 UTC
I was able to test this on a system that had exhibited the original problem, and firstboot is now behaving correctly. A check of /usr/share/firstboot/firstboot.py shows that the "-terminate" command line option to the Xserver is now gone, which explains why it now works on a multiprocessor system (this particular one has 8 CPUs). Thanks!
Comment 25 Alexander Todorov 2009-02-23 08:47:12 UTC
changing status to VERIFIED based on comment #24
Comment 27 errata-xmlrpc 2009-05-18 20:29:09 UTC
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-1005.html