Bug 697277 - Backend: wrong error message when migration fails
Summary: Backend: wrong error message when migration fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 2.2.7
Hardware: All
OS: Windows
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Francesco Romani
QA Contact: Pavel Novotny
URL:
Whiteboard: virt
Depends On:
Blocks: 860222 rhev3.5beta 1156165
Reported: 2011-04-17 12:17 UTC by Yaniv Kaul
Modified: 2015-02-17 08:30 UTC (History)

Fixed In Version: vt3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 08:30:16 UTC
oVirt Team: ---
Target Upstream Version:
sherold: Triaged+


Links
System        ID     Priority  Status     Summary                                          Last Updated
oVirt gerrit  23946  None      None       None                                             Never
oVirt gerrit  25090  master    ABANDONED  core: don't log VM on error if migration fails   Never

Description Yaniv Kaul 2011-04-17 12:17:18 UTC
Description of problem:
I've migrated a VM from host A to host B.
Regretfully, migration failed (due to timeout).
This has caused the VM on the destination to exit:
Thread-146826::DEBUG::2011-04-17 14:55:44,912::vm::1434::vds.vmlog.514e0257-3f28-4f39-9a44-d2b786146675::Changed state to Down: Migration failed

The event on the RHEVM is:
VM <vmname> is down. Exit message Migration failed.

While this is technically somewhat correct (the VM on the destination is down), in reality the VM should still be up (and hopefully running!) on the source.
The message, as is, is quite confusing and alarming.

Version-Release number of selected component (if applicable):
2.2.7

How reproducible:


Steps to Reproduce:
1. Cause migration to fail due to timeout (make the VM do a lot of IO, for example, or memory scanning).
  
Actual results:


Expected results:


Additional info:

Comment 1 lpeer 2011-04-21 10:41:13 UTC
While a VM runs on a host, RHEVM collects statistics from which it learns the status of the VM.

At the moment, for Down VMs VDSM returns one of two exit statuses, Normal or ERROR, and for ERROR it adds a string with the exit message.

To show a special message when migration fails, RHEVM and VDSM need either to support another exit code, or to guarantee that the message text cannot change so that RHEVM can base its logic on the content of the message.
I personally prefer a new exit code, ERROR_ON_MIGRATION.

Anyway, moving to RFE to support either of the two options.
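
To illustrate the exit-code option, here is a minimal sketch of how the engine side could pick an event text from the exit code instead of parsing the message string. It is not the actual RHEVM or VDSM code: the ExitStatus enum, the down_event_message function and the event wording are all hypothetical, and only the names Normal, ERROR and ERROR_ON_MIGRATION come from the comment above.

```python
from enum import Enum


class ExitStatus(Enum):
    """Hypothetical exit codes reported for a Down VM."""
    NORMAL = 0
    ERROR = 1
    ERROR_ON_MIGRATION = 2  # proposed addition; does not exist yet


def down_event_message(vm_name, status, exit_message=""):
    """Choose the event text from the exit code, never from the message content."""
    if status is ExitStatus.NORMAL:
        return f"VM {vm_name} is down."
    if status is ExitStatus.ERROR_ON_MIGRATION:
        # Dedicated wording, so a failed migration is not reported as a VM outage.
        return f"Migration of VM {vm_name} failed. Exit message: {exit_message}"
    return f"VM {vm_name} is down. Exit message: {exit_message}"


if __name__ == "__main__":
    print(down_event_message("vm1", ExitStatus.ERROR_ON_MIGRATION, "Migration failed"))
```

With a dedicated code the free-text exit message can keep changing on the VDSM side without breaking any engine logic, which is the main argument for this option over matching on the message string.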

Comment 2 Simon Grinberg 2011-05-11 13:56:05 UTC
(In reply to comment #0)

> while technically it is somewhat correct (the destination is down), the reality
> is that it should be still up (and hopefully running!) on the source!
> The message, as is, is quite confusing and alarming.
> 

Kaul, as I understand it this is the case, i.e. the source machine is up and running as it should. The problem is the event declaring that the machine went down, while the status in the GUI returns to Up, right?

(In reply to comment #1)
> When running a VM on a host RHEVM collects statistics from which it learns the
> status of the VM.
> 
> ATM VDSM returns on Down Vms one of two exit status: Normal / ERROR, and for
> error adding a String with exit message.
> 
> For having a special message on migration failed, RHEVM+VDSM need to support
> another exit code or have obligation that the message can not be changed and
> RHEVM can base logic on the content of the message. 
> I personally prefer new exit code ERROR_ON_MIGRATION
> 
> Anyway moving to RFE to support either of the two options.

Livnat, if Kaul's answer to my question is yes, then the problem is that RHEV Manager collects the status from both the destination and the source hosts, and when it encounters the Down message from the destination it logs it, even though the VM itself is up and running on the source. In that case this is a bug and not an RFE. The backend should cross-check this message from the destination against the status on the source and conclude that the migration failed. Unless you are not maintaining a thread that follows the migration from start to finish.
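
The cross-check described above could look roughly like the sketch below. It is only an illustration under assumptions: the VmReport type, the interpret_destination_down function, the status strings and the returned event names are all hypothetical and are not taken from the RHEVM backend.

```python
from dataclasses import dataclass


@dataclass
class VmReport:
    """Hypothetical per-host status report for a single VM."""
    host: str
    status: str  # illustrative values: "Up", "Down", "MigratingFrom"


def interpret_destination_down(source: VmReport, destination: VmReport) -> str:
    """Decide which event to log when the destination reports the VM as Down."""
    if destination.status != "Down":
        return "no_event"
    if source.status in ("Up", "MigratingFrom"):
        # The VM is still alive on the source: a failed migration, not an outage.
        return "migration_failed"
    # The VM is not running on the source either, so it really is down.
    return "vm_down"


if __name__ == "__main__":
    src = VmReport(host="A", status="Up")
    dst = VmReport(host="B", status="Down")
    print(interpret_destination_down(src, dst))
```

This only works if the backend can pair the two reports for the same migration, which is the point of the question about a thread that follows the migration from start to finish.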

Comment 8 Michal Skrivanek 2014-01-30 09:55:39 UTC
While waiting on the engine-side refactoring, comment #1 makes sense to implement independently. Once the VM's exit codes are differentiated, the engine-side change would be trivial to do.

Comment 9 Francesco Romani 2014-01-30 10:05:30 UTC
Seems related: https://bugzilla.redhat.com/show_bug.cgi?id=557125 , considering the amendments suggested by Federico and implemented in the last patchset:
http://gerrit.ovirt.org/#/c/22631/5/

Comment 10 Francesco Romani 2014-09-09 13:31:56 UTC
This independent fix: http://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=commit;h=68aba2b12b90a997cee0f1e0221eb6f48eb8fd35

for this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1104195

should have solved this problem as well. Moving to MODIFIED and abandoning my patch.

Comment 11 Eyal Edri 2014-09-10 20:21:58 UTC
Fixed in vt3, moving to ON_QA.
If you believe this fix isn't released in vt3, please report to rhev-integ@redhat.com

Comment 12 Pavel Novotny 2014-10-08 16:41:21 UTC
Verified in rhevm-3.5.0-0.14.beta.el6ev.noarch (vt5).

Verification steps follow the reproducer:
1. Have a VM with high CPU & IO load.
2. Start migration of the VM from host A to host B.

Result:
Migration is cancelled due to timeout. The VM event message says:
"Migration failed due to Error: Migration not in progress (VM: user-vm01, Source: A, Destination: B)."
The VM then remains running on the source host A.

Comment 13 Omer Frenkel 2015-02-17 08:30:16 UTC
RHEV-M 3.5.0 has been released

