Bug 1356027 - During cluster level upgrade - reconfig VMs to old cluster level compatibility level until they restart
Summary: During cluster level upgrade - reconfig VMs to old cluster level compatibilit...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.0.3
Target Release: 4.0.3
Assignee: Marek Libra
QA Contact: sefi litmanovich
URL:
Whiteboard:
Depends On: 1348907 1356194 1357513
Blocks:
Reported: 2016-07-13 09:53 UTC by Michal Skrivanek
Modified: 2016-08-31 09:34 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Changing the Cluster Compatibility Version was forbidden until all VMs in the cluster were down.
Consequence: Changing the cluster version was complicated to perform in a production environment.
Result: After the Cluster Compatibility Version is changed in the Cluster Edit dialog, the user is asked to shut down and restart all running or suspended VMs as soon as possible. To indicate this, all running or suspended VMs are marked with the Next-Run icon (triangle with '!'). Hosted Engine and external VMs are excluded from this Next-Run setting. The custom compatibility version of a running or suspended VM is temporarily set to the previous cluster version until restart. A Cluster Compatibility Version change is not allowed while a VM snapshot is in preview; the user has to either commit or undo the preview first.
Known issues: A Highly Available VM is missing the Next-Run mark after a crash and automatic restart.
Clone Of: 1348907
Environment:
Last Closed: 2016-08-31 09:34:25 UTC
oVirt Team: Virt
rule-engine: ovirt-4.0.z+
rule-engine: blocker+
mgoldboi: planning_ack+
michal.skrivanek: devel_ack+
mavital: testing_ack+


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 61079 master MERGED core: Temporal VM Custom Compat Version after Cluster Version change 2016-08-15 13:58:36 UTC
oVirt gerrit 61966 master MERGED webadmin: Count of affected VMs shown when CL change 2016-08-09 13:02:44 UTC
oVirt gerrit 61977 master MERGED core: Block cluster version change when Snapshot in Preview 2016-08-16 10:53:43 UTC
oVirt gerrit 62367 ovirt-engine-4.0 MERGED core: Block cluster version change when Snapshot in Preview 2016-08-22 12:18:25 UTC
oVirt gerrit 62378 ovirt-engine-4.0 MERGED core: Temporal VM CustCompatVer after Cluster Ver change 2016-08-24 08:58:58 UTC
oVirt gerrit 62379 ovirt-engine-4.0 MERGED webadmin: Count of affected VMs shown when CL change 2016-08-24 08:58:50 UTC
oVirt gerrit 62437 master MERGED webadmin: Rephrase Running VMs notification for Cluster Upgrade 2016-08-22 09:14:03 UTC
oVirt gerrit 62475 master MERGED core: SuspendedVMClusterEditChecker removed 2016-08-21 07:33:18 UTC
oVirt gerrit 62596 master MERGED webadmin: Rephrase Preview Snapshot warning 2016-08-21 12:12:30 UTC
oVirt gerrit 62598 ovirt-engine-4.0 MERGED core: SuspendedVMClusterEditChecker removed 2016-08-24 08:59:08 UTC
oVirt gerrit 62599 ovirt-engine-4.0 MERGED webadmin: Rephrase Running VMs notification for Cluster Upgrade 2016-08-24 08:59:22 UTC
oVirt gerrit 62600 ovirt-engine-4.0 MERGED webadmin: Rephrase Preview Snapshot warning 2016-08-24 08:58:44 UTC
oVirt gerrit 62601 ovirt-engine-4.0.3 MERGED core: Block cluster version change when Snapshot in Preview 2016-08-25 07:52:55 UTC
oVirt gerrit 62602 ovirt-engine-4.0.3 MERGED core: Temporal VM CustCompatVer after Cluster Ver change 2016-08-25 07:53:01 UTC
oVirt gerrit 62603 ovirt-engine-4.0.3 MERGED webadmin: Count of affected VMs shown when CL change 2016-08-25 07:52:49 UTC
oVirt gerrit 62605 ovirt-engine-4.0.3 MERGED core: SuspendedVMClusterEditChecker removed 2016-08-25 07:53:21 UTC
oVirt gerrit 62606 ovirt-engine-4.0.3 MERGED webadmin: Rephrase Running VMs notification for Cluster Upgr 2016-08-25 07:53:57 UTC
oVirt gerrit 62607 ovirt-engine-4.0.3 MERGED webadmin: Rephrase Preview Snapshot warning 2016-08-25 07:53:40 UTC

Description Michal Skrivanek 2016-07-13 09:53:58 UTC
+++ This bug was initially created as a clone of Bug #1348907 +++

--- Additional comment from Michal Skrivanek on 2016-07-13 11:46:45 CEST ---

We can take advantage of the VM custom compatibility override introduced in 4.0 and temporarily set the VM's compat level to that of the old cluster. We can use the next_run config to revert to the default (no override, inheriting the cluster's level) on VM shutdown.
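
A minimal sketch of the idea in the comment above, not the actual ovirt-engine code. All class, field and function names here are hypothetical, and the choice to leave VMs that already carry their own override untouched is an assumption:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vm:
    name: str
    running: bool
    # None means "no override": the VM inherits the cluster's level.
    custom_compat_version: Optional[str] = None
    # True while a next-run configuration is waiting to be applied.
    has_next_run_config: bool = False
    # Override to apply at the next start; None drops the override.
    next_run_custom_compat_version: Optional[str] = None


@dataclass
class Cluster:
    compat_version: str
    vms: List[Vm]


def effective_version(cluster: Cluster, vm: Vm) -> str:
    """Level the VM actually runs with: its override, else the cluster's."""
    return vm.custom_compat_version or cluster.compat_version


def upgrade_cluster(cluster: Cluster, new_version: str) -> None:
    """Raise the cluster level while keeping running VMs on the old one."""
    old_version = cluster.compat_version
    cluster.compat_version = new_version
    for vm in cluster.vms:
        # Assumption: VMs with an explicit override are left as they are.
        if vm.running and vm.custom_compat_version is None:
            vm.custom_compat_version = old_version  # temporary override
            vm.has_next_run_config = True
            vm.next_run_custom_compat_version = None  # revert on next run


def shut_down(vm: Vm) -> None:
    """Stop the VM and apply the pending next-run configuration, if any."""
    vm.running = False
    if vm.has_next_run_config:
        vm.custom_compat_version = vm.next_run_custom_compat_version
        vm.has_next_run_config = False
```

In this model a VM that was running during the upgrade keeps reporting the old level through `effective_version` until `shut_down` is called, after which it inherits the new cluster level again.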

Comment 1 Marina 2016-07-13 15:40:54 UTC
Not to confuse with the original bug:
<mskrivanek> mku: that's another improvement of the process, but that is not backportable to 3.6 as it depends on a different 4.0 feature

Comment 2 Michal Skrivanek 2016-08-04 10:00:51 UTC
The warnings around the cluster level (CL) upgrade should work as follows:
during CL update - warn on suspended VMs, VMs with snapshots with RAM, running VMs, paused VMs
after CL upgrade - reconfig icon for suspended VMs, running VMs, paused VMs
on snapshot preview - block restoring with RAM
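
The three rules above can be sketched as simple predicates. This is only an illustration with hypothetical names. The comment does not say which snapshots the last rule applies to, so the sketch assumes it concerns snapshots taken before the upgrade:

```python
from enum import Enum


class VmState(Enum):
    DOWN = "down"
    RUNNING = "running"
    PAUSED = "paused"
    SUSPENDED = "suspended"


# States that still carry runtime state created under the old cluster level.
LIVE_STATES = {VmState.RUNNING, VmState.PAUSED, VmState.SUSPENDED}


def warn_during_update(state: VmState, has_ram_snapshot: bool) -> bool:
    """During CL update: warn on live VMs and on VMs with RAM snapshots."""
    return state in LIVE_STATES or has_ram_snapshot


def reconfig_icon_after_upgrade(state: VmState) -> bool:
    """After CL upgrade: mark live VMs as needing a restart."""
    return state in LIVE_STATES


def restore_memory_allowed_on_preview(snapshot_predates_upgrade: bool) -> bool:
    """On snapshot preview: block restoring RAM.

    Assumption: the block applies only to snapshots taken before the upgrade.
    """
    return not snapshot_predates_upgrade
```

Note that a down VM with a RAM snapshot triggers the warning during the update but does not get the reconfig icon afterwards.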

Comment 6 Michal Skrivanek 2016-08-23 13:54:23 UTC
pending testing with HE
may still cause issues with many running VMs due to bug 1366786, which is targeted to 4.0.4

Comment 7 sefi litmanovich 2016-08-30 15:55:28 UTC
Tested upgrade from cluster 3.6 (2 hosts with vdsm-4.17.33-1) to cluster 4.0 (upgraded the hosts to vdsm-4.18.11-1). On 3.6 created and ran vms with various configurations and kept vms running during/after upgrade.

Tested flows/configurations:

1. Snapshot creation before upgrade (with and without memory) and restoring after upgrade:
without memory - the vm snapshot is restored and vm starts with 4.0 xml.
with memory - we get the expected warning message, after confirming the vm is restored and starts with 3.6 xml.
Both are expected behaviours.
2. Snapshot create-preview-undo-clone-remove after upgrade - pass.
3. Migration after upgrade - passed.
5. Run once + cloud init - Tested run once before upgrade and used cloud init to set hostname/username/password, custom cpu type, changed console from spice to vnc - all configurations work fine as expected. After upgrade, powered off the vm - the vm's configuration reverts and the vm starts with 4.0 xml. - pass
6. Consoles - both vnc and spice consoles had no regression after upgrade. On spice checked usb support, file transfer, copy paste support. - passed.
7. Memory hotplug after upgrade - passed.
8. Cpu hotplug after upgrade - passed. Tested twice: once when 'HotPlugCpuSupported' was set to 'false' for 3.6 with arch x86_64 - couldn't hot plug and got the expected error message. Once when it was set to 'true' - hot plug succeeded.
9. Nic hotplug - passed.
10. disk hotplug - passed.
11. HA vm - after upgrade, killed the vm's process and saw that it restarts right away - passed.
12. Hyperv enlightenment for windows vm - passed: checked configuration level only - vms created as windows vm had all the hyper-v flags enabled in xml, whereas linux vms did not.
13. I/O threads - set a vm with 4 I/O threads - configuration wasn't changed after upgrade. - passed.
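
The outcome of flow 1 (snapshot restore after upgrade) can be summarised in a small helper. This is a hypothetical sketch of the observed behaviour, not engine code. Returning None when the warning is not confirmed is an assumption, since that case was not part of the test:

```python
from typing import Optional


def expected_xml_level_after_restore(
    snapshot_level: str,
    cluster_level: str,
    with_memory: bool,
    warning_confirmed: bool = False,
) -> Optional[str]:
    """Level of the xml the vm is expected to start with after a restore."""
    if not with_memory:
        # Without memory the vm simply starts with the current cluster level.
        return cluster_level
    if not warning_confirmed:
        # Assumption: the restore does not proceed without confirmation.
        return None
    # With memory, the saved state requires the level the snapshot was taken at.
    return snapshot_level


# The two cases observed in flow 1 (3.6 snapshot, cluster upgraded to 4.0).
without_memory = expected_xml_level_after_restore("3.6", "4.0", with_memory=False)
with_memory = expected_xml_level_after_restore(
    "3.6", "4.0", with_memory=True, warning_confirmed=True
)
```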

Problems:
Both problems that were found in the pre-integration build persist:
1. HA VM isn't updated correctly when the process is killed at the qemu level and the engine invokes a restart - https://bugzilla.redhat.com/show_bug.cgi?id=1369521
2. When upgrading a cluster with the HE vm, the HE vm's xml doesn't change and no mark for restart appears - https://bugzilla.redhat.com/show_bug.cgi?id=1370120

Verifying as there are open bugs on the issue and in most flows the upgrade works as expected.

