Bug 228511 - xen domain auto startup does not work reliably
Summary: xen domain auto startup does not work reliably
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.0
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Xen Maintainance List
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 492190
Reported: 2007-02-13 14:54 UTC by Markus Kremer
Modified: 2009-05-01 20:26 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-04-22 10:34:44 UTC
Target Upstream Version:


Attachments
tgz logs and config (deleted)
2007-03-28 07:46 UTC, Markus Kremer

Description Markus Kremer 2007-02-13 14:54:43 UTC
Description of problem:
When creating many domains and doing reboots, not all domains are started.
 
Messages from /var/log/xen/xend.log:
[2007-02-13 08:41:29 xend 2209] INFO (image:138) buildDomain os=linux dom=7 vcpus=1
[2007-02-13 08:41:35 xend 2209] INFO (image:214) configuring linux guest
[2007-02-13 08:41:37 xend 2209] INFO (image:138) buildDomain os=linux dom=8 vcpus=1
[2007-02-13 08:41:39 xend 2209] INFO (XendDomain:370) Domain vm_5 (8) unpaused.
[2007-02-13 08:41:39 xend.XendDomainInfo 2209] WARNING (XendDomainInfo:875) Domain has crashed: name=vm_4 id=7.
[2007-02-13 08:41:40 xend.XendDomainInfo 2209] ERROR (XendDomainInfo:1661) VM vm_4 restarting too fast (13.252752 seconds since the last restart).  Refusing to restart to avoid loops.
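
To pull the crashed domains and refused restarts out of a longer xend.log, something like the following could be used. This is only a sketch: the regular expressions are derived from the two message formats shown above, and the function name scan_xend_log is made up for illustration, not part of xend.

```python
import re

# Patterns derived from the WARNING and ERROR lines quoted in this report.
CRASH_RE = re.compile(r"Domain has crashed: name=(\S+) id=(\d+)")
REFUSED_RE = re.compile(
    r"VM (\S+) restarting too fast \(([\d.]+) seconds since the last restart\)"
)


def scan_xend_log(text):
    """Return (crashed, refused) lists found in xend.log text.

    crashed: list of (domain name, domain id)
    refused: list of (domain name, seconds since last restart)
    """
    # Collapse whitespace so a message wrapped over two lines still matches.
    flat = " ".join(text.split())
    crashed = [(name, int(domid)) for name, domid in CRASH_RE.findall(flat)]
    refused = [(name, float(secs)) for name, secs in REFUSED_RE.findall(flat)]
    return crashed, refused


if __name__ == "__main__":
    with open("/var/log/xen/xend.log") as log:
        crashed, refused = scan_xend_log(log.read())
    print("crashed:", crashed)
    print("restart refused:", refused)
```

Flattening the whitespace first means the scan works both on the raw log and on excerpts that were wrapped when pasted.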


Version-Release number of selected component (if applicable):
Version=5 beta 2
Hardware=ibmx306m
Memory=3GB
CPU=Intel(R) Pentium(R) 4 CPU 3.00GHz  (no HT/SMP enabled)
xen-libs-3.0.3-8.el5
xen-3.0.3-8.el5
kernel-xen-2.6.18-1.2747.el5

How reproducible:
Every time; some VMs are missing.


Steps to Reproduce: 
- ks install server using base + @virtualisation packages 
- ks install 9 guests using virt-install
- sed -ie 's/XENDOMAINS_SAVE=.*/XENDOMAINS_SAVE=/' /etc/sysconfig/xendomains  # this does a shutdown instead of suspend
- ln -s /etc/xen/MY_VMS_* /etc/xen/auto
- do reboot
- after reboot verify that all VMs are started
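
The last step (verifying that all VMs are started) can be automated roughly as below. This is a sketch, not the check used for this report: it assumes the usual `xm list` column layout with a "Name" header line and a "Domain-0" row, and the guest names vm_1 to vm_9 are taken from the results in this report. The helper names are hypothetical.

```python
import subprocess


def running_domains(xm_list_output):
    """Return the set of guest names listed in `xm list` output."""
    names = set()
    for line in xm_list_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and dom0 itself.
        if not fields or fields[0] in ("Name", "Domain-0"):
            continue
        names.add(fields[0])
    return names


def missing_domains(xm_list_output, expected):
    """Return the expected guests that are not listed, sorted by name."""
    return sorted(set(expected) - running_domains(xm_list_output))


if __name__ == "__main__":
    # Assumed guest names; adjust to the configs linked into /etc/xen/auto.
    expected = ["vm_%d" % i for i in range(1, 10)]
    output = subprocess.run(
        ["/usr/sbin/xm", "list"], capture_output=True, text=True, check=True
    ).stdout
    print("missing:", missing_domains(output, expected))
```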

Actual results:

I did 13 reboots. This is how often each VM came up automatically. 
vm_1 13
vm_2 13
vm_3 12
vm_4 11
vm_5 9
vm_6 6
vm_7 5
vm_8 6
vm_9 6
 
So vm_7 started on only 5 of 13 tries.
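
For reference, the counts above can be turned into per-VM start rates with a few lines. This is just arithmetic on the numbers listed in this report; the function name start_rates is made up for illustration.

```python
REBOOTS = 13

# Successful automatic starts per VM, copied from the table above.
STARTS = {
    "vm_1": 13, "vm_2": 13, "vm_3": 12, "vm_4": 11, "vm_5": 9,
    "vm_6": 6, "vm_7": 5, "vm_8": 6, "vm_9": 6,
}


def start_rates(starts, reboots):
    """Return the fraction of reboots on which each VM came up."""
    return {name: count / reboots for name, count in starts.items()}


if __name__ == "__main__":
    for name, rate in start_rates(STARTS, REBOOTS).items():
        print("%s %5.1f%%" % (name, 100 * rate))
    total = sum(STARTS.values())
    possible = REBOOTS * len(STARTS)
    print("overall: %d of %d starts" % (total, possible))
```

Overall that is 81 of 117 possible starts (about 69%), and the guests later in the start order fail far more often than the first ones.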


Expected results:
all machines are started at every reboot

Additional info:
fc6 with 2.6.19 kernel has similar behaviour.
The "Domain has crashed:" entries

Comment 1 Daniel Berrange 2007-03-27 15:39:03 UTC
Hmm, this is a little worrying - if it can't deal with multiple VMs starting in
very quick succession it sounds like there is some race condition/scalability
issue hiding in either HV or the XenD stack.

Can you reproduce this again & capture the output of 'xm dmesg' once booting has
completed - this will hopefully show if there are any hypervisor issues being
reported. Also can you attach the full /var/log/xen/xend.log,
/var/log/xen/xend-debug.log, /var/log/xen/xen-hotplug.log and finally if any are
HVM guests, also the qemu-dm-*.log files

Finally, can you attach the /etc/xen config file for at least one of the guests
- if they are all basically the same config one is sufficient - if every VM is
different upload a representative set.


Comment 3 Markus Kremer 2007-03-28 07:46:31 UTC
Created attachment 151099 [details]
tgz logs and config

I am using only RHEL5 xen-guests, no HVM. (see first post)

[root@rhrc1s1 x]# crontab -l
01,31 * * * * /usr/sbin/xm list| logger -t XEN1
14,44 * * * * /usr/sbin/xm list| logger -t XEN2
15,45 * * * * /sbin/reboot


[root@rhrc1s1 x]# uname -a
Linux rhrc1s1 2.6.18-8.el5xen #1 SMP Fri Jan 26 14:42:21 EST 2007 i686 i686 i386 GNU/Linux
xm dmesg >var/log/xen/xm.dmesg.out
dmesg >var/log/xen/dmesg.out
The tgz file contains
/var/log/xen/*
/etc/xen/*
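
The cron entries above write each `xm list` line to syslog with the tag XEN2 shortly before every reboot, so the per-VM tallies can be counted from the syslog afterwards. The sketch below assumes the common syslog line layout "date host XEN2: message" and that the log is /var/log/messages; both are assumptions, and count_appearances is a hypothetical helper.

```python
import re
from collections import Counter


def count_appearances(syslog_text, tag="XEN2"):
    """Count how many logged `xm list` snapshots each guest appears in.

    Each snapshot lists a running guest once, so one matching line
    per guest corresponds to one snapshot in which it was up.
    """
    pattern = re.compile(r"\s" + re.escape(tag) + r": (\S+)")
    counts = Counter()
    for line in syslog_text.splitlines():
        match = pattern.search(line)
        # Ignore the `xm list` header row and dom0.
        if match and match.group(1) not in ("Name", "Domain-0"):
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Assumed log location; adjust to the local syslog configuration.
    with open("/var/log/messages") as log:
        for name, count in sorted(count_appearances(log.read()).items()):
            print(name, count)
```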

Comment 4 Chris Lalancette 2008-03-27 05:09:56 UTC
We did some work in 5.1 to make this less likely to happen, but I'm not sure if
it is completely fixed.  Is this still a problem?

Thanks,
Chris Lalancette

Comment 5 Michal Novotny 2009-04-15 10:26:37 UTC
Well, I have tried it using my SRPMs, which can be found at http://people.redhat.com/minovotn/xen, and I found no problem. I booted 9 domains in total and all of them booted correctly when testing on my box. The configuration was 4 PV and 5 FV machines.

Comment 7 Markus Kremer 2009-04-20 16:06:49 UTC
Michal,
I am unable to reproduce the problem with RH 5.3 after setting dom0_mem=512M in grub.conf. My tests ran 20 256 RH5.3 64 bit udoms.
Please set the state to fixed.

Comment 8 Chris Lalancette 2009-04-22 10:34:44 UTC
OK, thanks for the testing!  Will close as FIXED.

Chris Lalancette

