Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 1122639 - Error complaining about missing images for exported VM
Summary: Error complaining about missing images for exported VM
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.0
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: ---
: 3.5.0
Assignee: Ravi Nori
QA Contact: Jiri Belka
URL:
Whiteboard: infra
Depends On:
Blocks: rhev3.5beta 1156165
TreeView+ depends on / blocked
 
Reported: 2014-07-23 16:47 UTC by Fabian Salamanca
Modified: 2016-02-10 19:28 UTC (History)
12 users (show)

Fixed In Version: ovirt-3.5.0_rc1.1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 17:13:10 UTC
oVirt Team: Infra
Target Upstream Version:


Attachments (Terms of Use)
engine.log, vdsm, messages (deleted)
2014-07-23 16:49 UTC, Fabian Salamanca
no flags Details


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 30935 master MERGED engine : Use java naming conventions Never
oVirt gerrit 31038 master MERGED engine : Zombie tasks should endWithFailure Never
oVirt gerrit 31219 ovirt-engine-3.5 MERGED engine : Use java naming conventions Never
oVirt gerrit 31220 ovirt-engine-3.5 MERGED engine : Zombie tasks should endWithFailure Never

Description Fabian Salamanca 2014-07-23 16:47:33 UTC
Description of problem: 
When I export any machine with more than 2 vdisks, any type, VDSM
complains about an ERROR and aborts de export of images, but the OVF is
left as a successful export.

RHEVM reports successful export and when I try to import de VM, I
receive a "missing image" error. Which is correct, the OVF is left
without any images.


Version-Release number of selected component (if applicable): rhevm-3.4.0-0.22


How reproducible: 100%


Steps to Reproduce:
1. Export VM with 3 vdisks or more
2. Verify RHEVM reports successful export
3. Import the exported VM
4. Error complaining about missing images for exported VM

Actual results:

Error complaining about missing images for exported VM

Expected results:

VM imported successfully

Additional info:

Comment 1 Fabian Salamanca 2014-07-23 16:49:27 UTC
Created attachment 920296 [details]
engine.log, vdsm, messages

Comment 3 Liron Aravot 2014-07-24 08:23:47 UTC
From brief look it seems that we have 2 issues here -

We can take for example task d9f0fba3-63ae-4a4c-a904-525907a0f16c [a]

1.  it can be seen that after a mintue the engine detects the task as zombie task and attempts to stop it, which is obviously wrong. this is either caused by wrong value of the config value AsyncTaskZombieTaskLifeInMinutes or by wrong comprasion.

-----------------------------------------------------------------------
2014-07-22 21:01:46,483 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (DefaultQuartzScheduler_Worker-96) SPMAsyncTask::PollTask: Polling task d9f0fba3-63ae-4
a4c-a904-525907a0f16c (Parent Command ExportVm, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status running.
2014-07-22 21:01:46,484 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager] (DefaultQuartzScheduler_Worker-96) Finished polling Tasks, will poll again in 10 se
conds.
2014-07-22 21:01:47,885 INFO  [org.ovirt.engine.core.bll.scheduling.PolicyUnitImpl] (DefaultQuartzScheduler_Worker-3) There is no over-utilized host in cluste
r Telcel_cluster_01
2014-07-22 21:02:38,719 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-16) Correlation ID: null, 
Call Stack: null, Custom Event ID: -1, Message: Stopping async task ExportVm that started at Tue Jul 22 21:01:36 CDT 2014
2014-07-22 21:02:38,719 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager] (DefaultQuartzScheduler_Worker-16) Cleaning zombie tasks: Stopping async task Expor
tVm that started at 7/22/14 9:01 PM
2014-07-22 21:02:38,720 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (DefaultQuartzScheduler_Worker-16) SPMAsyncTask::StopTask: Attempting to stop task ab27
ab6c-0877-4a99-b0c0-f3403247fb10 (Parent Command ExportVm, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters).
-----------------------------------------------------------------------

2. when a task is stopped by the engine when its detected as zombie, endSuccesfully shouldn't be called for the parent command, but endWithFailue. as endSuccesfully is being executed the engine marks the export as succesful.

2014-07-22 21:02:50,115 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-4-thread-11) Correlation ID: 15933d80, Job ID: 23623117-7370-4d7e-9e1c-d13c1cb66efa, Call Stack: null, Custom Event ID: -1, Message: Vm Backups_02 was exported successfully to EXPORT_01


[a] :

-----------------------------------------------------------------------
2014-07-22 21:01:35,906 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager] (org.ovirt.thread.pool-4-thread-1) [37d34d0d] Adding task d9f0fba3-63ae-4a4c-a904-5
25907a0f16c (Parent Command ExportVm, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters), polling hasn't started yet..

2014-07-22 21:01:36,872 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-4-thread-1) [5a1eea6a] Correlation ID: 15933d80, Job ID: 23623117-7370-4d7e-9e1c-d13c1cb66efa, Call Stack: null, Custom Event ID: -1, Message: Starting export Vm Backups_02 to EXPORT_01


2014-07-22 21:01:36,873 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (org.ovirt.thread.pool-4-thread-1) [5a1eea6a] BaseAsyncTask::startPollingTask: Starting to poll task d9f0fba3-63ae-4a4c-a904-525907a0f16c.
-----------------------------------------------------------------------

Comment 4 Liron Aravot 2014-07-24 08:27:21 UTC
Fabian, additionaly please run this query and report back the results so the issue will be better analyzed.

select * from vdc_options where option_name = 'AsyncTaskZombieTaskLifeInMinutes';

thanks.

Comment 5 Fabian Salamanca 2014-07-24 16:25:27 UTC
Sure!



engine=# select * from vdc_options where option_name =
engine-# 'AsyncTaskZombieTaskLifeInMinutes';
 option_id |           option_name            | option_value | version 
-----------+----------------------------------+--------------+---------
        28 | AsyncTaskZombieTaskLifeInMinutes | 1            | general
(1 row)

Comment 6 Liron Aravot 2014-07-24 16:33:22 UTC
Thanks Fabian,
Have you ever updated the config values? because the default value for that is 3000.

please run the following -
update vdc_options set option_value = 3000 where option_name = 'AsyncTaskZombieTaskLifeInMinutes';

and your problem should be solved (atleast for exports that take less than 50 hours).

regardless of that, what needs to be solved is what i specified in under #1 in my comment - https://bugzilla.redhat.com/show_bug.cgi?id=1122639#c3.

Comment 7 Fabian Salamanca 2014-07-24 16:45:25 UTC
Hi Liron!

This is an in place upgrade from a RHEV 3.3.2, in that version ZombieTasks where leaving me with locked disks, so I changed the value and decided to upgrade, forgot to take back that value to the default.

Now I have the default value:

[root@rhev-mgr dbscripts]# rhevm-config -g AsyncTaskZombieTaskLifeInMinutes
AsyncTaskZombieTaskLifeInMinutes: 3000 version: general
[root@rhev-mgr dbscripts]# 

And I still have the same issue, when I try to import the VM, it complains about missing images and indeed, there are none in the images directory in the export domain:

For a 2 vdisks VM:

[root@host images]# ls -lrt
total 4
drwxr-xr-x. 2 36 kvm 4096 Jul 24 13:17 7a1e8654-9cc0-4dc5-a9e6-ef413bb4c244
[root@host images]# 

From engine.log :

2014-07-24 13:19:18,725 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.GetImageInfoVDSCommand] (ajp-/127.0.0.1:8702-11) [545aa28b] IrsBroker::getImageInfo::Failed getting image info imageId = d03d5e87-659d-4cfa-a458-5aacbca09672 does not exist on domainName = EXPORT_01 , domainId = e211e3f5-0249-4d9b-a583-bbc03e5a4c74,  error code: ImagePathError, message: Image path does not exist or cannot be accessed/created: ('/rhev/data-center/mnt/10.202.130.103:_var_lib_export/e211e3f5-0249-4d9b-a583-bbc03e5a4c74/images/a5f46711-307f-47d4-a483-e1f7af4751a1',)


Thanks!

Comment 8 Liron Aravot 2014-07-30 08:48:29 UTC
Fabian, as the export operation was already mistakenly marked as succesfull (see #2 at https://bugzilla.redhat.com/show_bug.cgi?id=1122639#c3), this export is already broken.

try to export and import again and report if the problem persists.

thanks,
Liron

Comment 9 Allon Mureinik 2014-07-30 15:18:11 UTC
(In reply to Liron Aravot from comment #3)
> 2. when a task is stopped by the engine when its detected as zombie,
> endSuccesfully shouldn't be called for the parent command, but
> endWithFailue. as endSuccesfully is being executed the engine marks the
> export as succesful.
Oved, this seems like an issue in the tasks infrastructure (it isn't specific to this command).
Can one of your guys please take a look?

Comment 10 Oved Ourfali 2014-07-31 07:36:07 UTC
Asked Ravi to look at it.

Comment 11 Jiri Belka 2014-09-11 09:21:36 UTC
ok, vt3.

it can't be reproduced as "zombie" task does not end anymore as ended successfully thus the VM is not exported at all. tested with ' 'AsyncTaskZombieTaskLifeInMinutes' set to '1' to get zombie task. then i see 'Failed to export Vm...' in events.

Comment 13 Eyal Edri 2015-02-17 17:13:10 UTC
rhev 3.5.0 was released. closing.


Note You need to log in before you can comment on or make changes to this bug.