Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 982686 - engine [logger]: same exception is logged multiple times in engine log
Summary: engine [logger]: same exception is logged multiple times in engine log
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
unspecified
low
Target Milestone: ---
: 3.4.0
Assignee: Piotr Kliczewski
QA Contact: sefi litmanovich
URL:
Whiteboard: infra
Depends On: 960635
Blocks: rhev3.4beta 1142926
TreeView+ depends on / blocked
 
Reported: 2013-07-09 14:54 UTC by Alissa
Modified: 2016-02-10 19:40 UTC (History)
13 users (show)

Fixed In Version: ovirt-3.4.0-alpha1
Doc Type: Bug Fix
Doc Text:
Clone Of: 960635
Environment:
Last Closed:
oVirt Team: Infra
Target Upstream Version:


Attachments (Terms of Use)

Description Alissa 2013-07-09 14:54:33 UTC
+++ This bug was initially created as a clone of Bug #960635 +++

Description of problem:

I ran a vm on hsm host and restarted the vdsm on the hsm during the create volume step.
the live storage migration command fails and we report it 5 (not counting the INFO which also reports it)

(see more info)

Version-Release number of selected component (if applicable):

sf15
vdsm-4.10.2-17.0.el6ev.x86_64

How reproducible:

100%

Steps to Reproduce:
1. create and run a vm on iscsi pool with 2 hosts
2. live migrate the vm's disk
3. restart the vdsm on the hsm when task createVolume appears in spm
 
Actual results:

we fail the move and engine reports it 5 times in the log which would make debugging deficult in case we have multiple vms

Expected results:

we should diminish the amount of reported errors in the log


Additional info: logs

2013-05-07 16:14:10,043 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-46) [105ad651] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, e
rror = Drive image file %s could not be found

2013-05-07 16:14:10,044 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-46) [105ad651] Command VmReplicateDiskStartVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed
to VmReplicateDiskStartVDS, error = Drive image file %s could not be found

2013-05-07 16:14:10,044 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand] (pool-4-thread-46) [105ad651] Command org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand throw Vdc Bll exception. With error message VdcBLLException:
 Drive image file %s could not be found

2013-05-07 16:14:10,070 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-4-thread-46) [105ad651] Error code imageErr and error message VDSGenericException: VDSErrorException: Failed to VmReplicateDiskFinishVDS,
error = Drive image file %s could not be found

2013-05-07 16:14:10,070 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-4-thread-46) [105ad651] Command VmReplicateDiskFinishVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed
 to VmReplicateDiskFinishVDS, error = Drive image file %s could not be found

--- Additional comment from Alissa on 2013-07-09 08:04:05 EDT ---

First problem that needs a fix related to the specific error that the error doesn't contain the name of the image file, it's not properly replaced and will make troubleshooting more difficult: 
"Drive image file %s could not be found".

Second - regarding the multiple logging, it looks as this is not a storage related problem, but rather happens for many different commands.
Perhaps the bug should be splitted to handle the first issue in storage area, and the multiple logging to be handled on infra level - Ayal, your input on this?
The multiple logging is done in the following places:

BrokerCommandBase --> ProceedProxyReturnValue
VDSCommandBase --> setVdsRuntimeError

--- Additional comment from Ayal Baron on 2013-07-09 09:52:12 EDT ---

(In reply to Alissa from comment #1)
> First problem that needs a fix related to the specific error that the error
> doesn't contain the name of the image file, it's not properly replaced and
> will make troubleshooting more difficult: 
> "Drive image file %s could not be found".

This means that we're lacking this information in the log?

> 
> Second - regarding the multiple logging, it looks as this is not a storage
> related problem, but rather happens for many different commands.
> Perhaps the bug should be splitted to handle the first issue in storage
> area, and the multiple logging to be handled on infra level - Ayal, your
> input on this?
> The multiple logging is done in the following places:
> 
> BrokerCommandBase --> ProceedProxyReturnValue
> VDSCommandBase --> setVdsRuntimeError

Sounds reasonable to me.
Please clone it, rename both to reflect the relevant problems etc.

--- Additional comment from Alissa on 2013-07-09 10:43:06 EDT ---

(In reply to Ayal Baron from comment #2)
> (In reply to Alissa from comment #1)
> > First problem that needs a fix related to the specific error that the error
> > doesn't contain the name of the image file, it's not properly replaced and
> > will make troubleshooting more difficult: 
> > "Drive image file %s could not be found".
> 
> This means that we're lacking this information in the log?
yes. I think the incomplete message is originated like this in vdsm so the bug should be moved from engine to vdsm component - this is how it's declared in define.py (and it's not defined like this in engine code):
  'imageErr': {'status': {
        'code': 13,
        'message': 'Drive image file %s could not be found'}},

> > Second - regarding the multiple logging, it looks as this is not a storage
> > related problem, but rather happens for many different commands.
> > Perhaps the bug should be splitted to handle the first issue in storage
> > area, and the multiple logging to be handled on infra level - Ayal, your
> > input on this?
> > The multiple logging is done in the following places:
> > 
> > BrokerCommandBase --> ProceedProxyReturnValue
> > VDSCommandBase --> setVdsRuntimeError
> 
> Sounds reasonable to me.
> Please clone it, rename both to reflect the relevant problems etc.

Comment 1 Alissa 2013-07-09 14:56:59 UTC
The multiple logging is done in the following places:

BrokerCommandBase --> ProceedProxyReturnValue
VDSCommandBase --> setVdsRuntimeError

Comment 2 Barak 2013-08-13 11:56:50 UTC
Yaniv, don't try to reproduce with live storage migration.
Just fix/understand the issue described in comment #1

Comment 3 Yair Zaslavsky 2013-08-15 07:15:39 UTC
Yaniv,  following Comment #2 the log at BrokerCommandBase happens in the "default" part of the large switch.
error code 13 is imageErr at VdcBllErrors.
There are many other errors in VdcBllErrors which are handdled by the default part (so indeed you can use other flows to reproduce this).

ProccedProxyReturnValue is being called from VmReplicateDiskStartVdsCommand and
VmReplaicateDiskFinishVdsCommand  - both extend the BrokerCommndBase.
These VDS broker commands are called from the task handlers of LiveMigrateDisk, see
LiveMigrateDiskCommand.initTaskHandlers()
Since each one of these VDS commands also extend VdsCommandBase, you will see that setVdsRuntimeError(ex) is also being called (see VdsCommandBase.executeCommand

One of the problems is that all the exceptions generated by ProceedProxyReturnValue extend (directly or indirectly) RuntimeException,
and you can see what's going on in VDSCommandBase.executeCommand  - the catch for RuntimeException performs the setVdsRuntimeError

What i would suggest (of course other suggestions are posible :) is to wrap
the exceptions generated at ProceedProxyReturnValue with Let's say a ProxyReturnValueException which extends RuntimeException, and then at the VdsCommandBase.executeCommand code, add a catch to ProxyReturnValueException which does not log (in contrast to catch RuntimeException)

Comment 4 Yaniv Bronhaim 2013-09-02 13:36:35 UTC
Yair, I can catch VDSExceptionBase and call VDSCommandBase::setVdsRuntimeError with a parameter not to report to log. Isn't that enough?

Comment 5 Yair Zaslavsky 2013-09-16 04:27:46 UTC
Yaniv, did you check this out? How will you determine the value of the boolean flag?
Anyway, I'm postponing to 3.4

Comment 6 Eli Mesika 2013-11-20 09:53:43 UTC
fixed in commit : afc4d50

Comment 7 Sandro Bonazzola 2014-01-14 08:44:43 UTC
ovirt 3.4.0 alpha has been released

Comment 8 sefi litmanovich 2014-02-17 13:22:35 UTC
Verified on Ovirt 3.4.0-0.7beta2, vdsm-4.14.1-3.el6.x86_64.

engine.log doesn't have multiple error messages at the same time.

Comment 10 Itamar Heim 2014-06-12 14:07:21 UTC
Closing as part of 3.4.0


Note You need to log in before you can comment on or make changes to this bug.