Bug 1059129 - Resource lock split brain causes VM to get paused after migration
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.2.0
Hardware: All
OS: All
Target Milestone: ---
Target Release: 3.3.1
Assignee: Vinzenz Feenstra [evilissimo]
QA Contact: Pavel Novotny
Whiteboard: virt
Depends On: 1028917
Reported: 2014-01-29 08:55 UTC by rhev-integ
Modified: 2018-12-09 17:27 UTC
CC: 15 users

Fixed In Version: is34
Doc Type: Bug Fix
Doc Text:
Virtual machines are no longer paused after migrations; hosts now correctly acquire resource locks for recently migrated virtual machines.
Clone Of: 1028917
Last Closed: 2014-02-27 09:43:57 UTC
oVirt Team: ---
Target Upstream Version:

Attachments (Terms of Use)

System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:0219 normal SHIPPED_LIVE vdsm 3.3.1 bug fix update 2014-02-27 14:42:16 UTC
oVirt gerrit 21963 None None None Never
oVirt gerrit 23939 None None None Never

Comment 1 Vinzenz Feenstra [evilissimo] 2014-02-03 16:14:49 UTC
Merged upstream to the ovirt-3.3 branch as commit 9369b370369057832eff41793075fc1a63c42279.

Comment 3 Pavel Novotny 2014-02-11 21:53:10 UTC
Verified in vdsm-4.13.2-0.8.el6ev.x86_64 (is34).

Verification steps:
1. Preparation: On the destination migration host, set 'migration_destination_timeout' to '120' in the VDSM configuration (located under /usr/lib64/python2.6/site-packages/vdsm/).
   This shortens the verification time; the default timeout is 6 hours.
2. Have a running VM (F19 in my case) with an ongoing memory-stressing workload (I used the `memtester` utility). This keeps the migration running long enough to give us time in step 4 to simulate the error-prone environment.
3. Migrate the VM from source host1 to destination host2.
4. Immediately after the migration starts, block on the source host1:
  - the connection to the destination host's VDSM (simulating a connection loss to the destination VDSM):
  `iptables -I OUTPUT 1 -p tcp -d <host2> --dport 54321 -j DROP`
  - the connection to the storage (simulating a migration error):
  `iptables -I OUTPUT 1 -d <storage> -j DROP`
5. Wait `migration_destination_timeout` seconds (120).
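The blocking and wait steps above can be sketched as a small shell helper. This is only an illustrative sketch: the `HOST2` and `STORAGE` addresses are placeholders for the destination host and the storage server, and `DRY_RUN=1` (the default here) prints the iptables commands instead of executing them, since inserting the real rules requires root on the source host.

```shell
#!/bin/sh
# Sketch of the blocking steps from the verification procedure.
# HOST2 and STORAGE are placeholder addresses, not real hosts.
HOST2=${HOST2:-192.0.2.2}
STORAGE=${STORAGE:-192.0.2.3}
DRY_RUN=${DRY_RUN:-1}

# With DRY_RUN=1 the command is only printed; otherwise it is executed.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Block outgoing traffic to the destination VDSM (port 54321),
# simulating a connection loss to the destination host.
run iptables -I OUTPUT 1 -p tcp -d "$HOST2" --dport 54321 -j DROP

# Block the connection to the storage, simulating a migration error.
run iptables -I OUTPUT 1 -d "$STORAGE" -j DROP

# Wait out migration_destination_timeout (120 s as configured earlier).
run sleep 120
```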

The migration fails (due to our blocking of storage) and is aborted.
On destination host, the migrating VM is destroyed (the host shows 0 running VMs and no VM migrating).
The VM stays on the source host (paused due to inaccessible storage; after unblocking the storage the VM should run as if nothing happened). 
The source host shows 1 running VM and no VM migrating.

Comment 5 errata-xmlrpc 2014-02-27 09:43:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.
