Bug 1367557 - HA VMs are not restarted on different host if NonResponsive host is off and start action failed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ovirt-4.0.4
Target Release: 4.0.4
Assignee: Martin Perina
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On:
Blocks: 1368202
Reported: 2016-08-16 18:42 UTC by Martin Perina
Modified: 2016-09-26 12:36 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1368202
Environment:
Last Closed: 2016-09-26 12:36:09 UTC
oVirt Team: Infra
rule-engine: ovirt-4.0.z+
rule-engine: ovirt-4.1+
mgoldboi: planning_ack+
mperina: devel_ack+
pstehlik: testing_ack+


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 63224 master MERGED core: Always fix VMs status during fencing if host if off 2016-09-04 09:37:43 UTC
oVirt gerrit 63226 ovirt-engine-4.0 MERGED core: Always fix VMs status during fencing if host if off 2016-09-04 10:58:30 UTC
oVirt gerrit 63227 ovirt-engine-4.0.4 MERGED core: Always fix VMs status during fencing if host if off 2016-09-04 11:04:42 UTC

Description Martin Perina 2016-08-16 18:42:09 UTC
Description of problem:

HA VMs are not restarted on different host(s) if a NonResponsive host is off (as detected by a successful power management status action) and the power management start action fails.

Version-Release number of selected component (if applicable):

3.0 and later

How reproducible:

100%

Steps to Reproduce:
1. Create a cluster of 2 or more hosts with power management properly configured
2. Run HA VMs on host1 and make this host non-responsive by turning the power off
3. Make sure that the power management status of host1 can be properly detected, but the power management start action fails
4. Check that the fence of host1 fails and HA VMs are not restarted on different hosts, although power management reports that host1 is turned off

Actual results:

HA VMs are not restarted on a different host even though host1 is powered off

Expected results:

HA VMs are restarted on different hosts

Additional info:
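The expected behaviour can be sketched as a small decision function. This is only an illustration of the rule described in this report, not the ovirt-engine implementation; `PowerStatus`, `FenceOutcome` and `should_restart_ha_vms` are hypothetical names, and the sketch covers only the scenario above (a NonResponsive host whose power management status can still be queried).

```python
from dataclasses import dataclass
from enum import Enum


class PowerStatus(Enum):
    """Hypothetical result of a power management status action."""
    ON = "on"
    OFF = "off"
    UNKNOWN = "unknown"


@dataclass
class FenceOutcome:
    """Hypothetical record of one fencing attempt on a NonResponsive host."""
    status_check: PowerStatus  # what the power management status action reported
    start_succeeded: bool      # whether the power management start action worked


def should_restart_ha_vms(outcome: FenceOutcome) -> bool:
    """Return True when HA VMs may be restarted on other hosts.

    Once power management confirms the host is off, its VMs cannot still be
    running there, so a failed start action should not block the restart.
    """
    return outcome.status_check is PowerStatus.OFF


# The scenario from this report: host confirmed off, start action failed.
print(should_restart_ha_vms(FenceOutcome(PowerStatus.OFF, start_succeeded=False)))
```

The point of the sketch is that `start_succeeded` does not influence the result: only a confirmed "off" status decides whether it is safe to restart the HA VMs elsewhere.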

Comment 1 Israel Pinto 2016-09-11 13:14:51 UTC
Verified with:
rhevm-4.0.4.2-0.1.el7ev.noarch
Hosts with PM:
OS Version:RHEL - 7.2 - 9.el7
OS Description:Red Hat Enterprise Linux Server 7.2 (Maipo)
Kernel Version:3.10.0 - 327.28.3.el7.x86_64
KVM Version:2.3.0 - 31.el7_2.21
LIBVIRT Version:libvirt-1.2.17-13.el7_2.5
VDSM Version:vdsm-4.18.13-1.el7ev

Steps:
1. Create 2 hosts in a cluster with power management properly configured
2. Run 2 HA VMs and 1 non-HA VM on host_1
3. Power off host_1 via power management and set the host up not to power on again (from the remote PM interface, iLO4 in this case)
4. Check that the HA VMs are started on host_2

Results:
HA VMs run on host_2 - PASS

Comment 2 Donny Davis 2016-09-23 19:12:15 UTC
My customer is experiencing the same issue. He is on the latest RHV 4.0 release. His environment is secure, so providing logs will not be possible.

His environment was just upgraded from RHEV 3.6 to RHV 4.0 in the hope that the issue would be resolved, but it still exists.

The only difference from the reported bug is that his hardware is running iDRAC 7.

Comment 3 Martin Perina 2016-09-23 20:08:45 UTC
(In reply to Donny Davis from comment #2)
> My customer is experiencing the same issue. He is on the latest RHV 4.0
> release. His environment is secure, so logs will not be possible. 
> 
> His environment was just upgraded from RHEV 3.6 to RHV 4.0 in hopes that the
> issue would be resolved, however it still exists.
> 
> His hardware is running iDrac 7 as the only difference from the reported bug.

As you can see, this bug is targeted to 4.0.4, which is not yet officially released. Also, the fix for this bug was backported into 3.6.9 in BZ1372812, which was released a few days ago ...
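
For anyone checking whether a given engine version should already contain the fix, the version information in this report can be sketched as a simple comparison. This is an illustrative helper only: `FIRST_FIXED`, `parse_version` and `has_fix` are hypothetical names, the function expects a plain dotted version string rather than a full package name, and treating branches newer than 4.0 as fixed is an assumption based on the change being merged to master.

```python
# First releases containing the fix, per this report:
# 4.0.4 (target milestone) and 3.6.9 (backport tracked in BZ1372812).
FIRST_FIXED = {(3, 6): (3, 6, 9), (4, 0): (4, 0, 4)}


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '4.0.4.2' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def has_fix(version: str) -> bool:
    """Return True if the given version is expected to include the fix."""
    parsed = parse_version(version)
    branch = parsed[:2]
    if branch in FIRST_FIXED:
        return parsed >= FIRST_FIXED[branch]
    # Assumption: branches newer than 4.0 include the fix (merged to master);
    # branches without a listed backport do not.
    return branch > (4, 0)


for candidate in ("3.6.8", "3.6.9", "4.0.3", "4.0.4.2"):
    print(candidate, has_fix(candidate))
```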

