Bug 988339 - [scale] race - sometimes VM and VDS statuses is not being updated (host stuck in unassigned)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.4.0
Assignee: Roy Golan
QA Contact: Yuri Obshansky
URL:
Whiteboard: infra
Depends On:
Blocks: 1060700 rhev3.4beta 1142926
 
Reported: 2013-07-25 11:15 UTC by Pavel Zhukov
Modified: 2018-12-03 19:25 UTC
CC List: 21 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when a host was stuck in an unassigned state, it could also cause virtual machines on other hosts to stop updating their status. This update adds a concurrent hash map for the internal event queue, which fixes this issue.
Clone Of:
: 1060700
Environment:
Last Closed:
oVirt Team: Infra
Target Upstream Version:
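
The Doc Text above attributes the fix to a concurrent hash map backing the internal event queue. The actual ovirt-engine patch is not quoted in this report, so the following is only a minimal sketch of the general technique, assuming one queue of pending events per key. The class name `EventQueueSketch`, its methods, and the `host-N` keys are hypothetical and are not taken from the engine code.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration only; not the ovirt-engine implementation.
public class EventQueueSketch {
    // One queue of pending events per key (for example, a host or pool ID).
    // ConcurrentHashMap lets many threads register and look up queues
    // safely without a single global lock.
    private final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    // Adds an event, atomically creating the per-key queue on first use.
    public void submit(String key, String event) {
        queues.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(event);
    }

    // Removes and returns the oldest event for the key, or null if none.
    public String poll(String key) {
        Queue<String> q = queues.get(key);
        return q == null ? null : q.poll();
    }

    // Number of events still waiting for the key.
    public int pending(String key) {
        Queue<String> q = queues.get(key);
        return q == null ? 0 : q.size();
    }

    public static void main(String[] args) throws InterruptedException {
        EventQueueSketch eq = new EventQueueSketch();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            // Each worker submits 1000 events spread over four keys.
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    eq.submit("host-" + (j % 4), "status-update");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        int total = 0;
        for (int k = 0; k < 4; k++) {
            total += eq.pending("host-" + k);
        }
        System.out.println("pending events: " + total);
    }
}
```

With a plain `HashMap` in place of the concurrent map, simultaneous `submit` calls from several threads could lose entries or corrupt the map, which is one way a queue can stop delivering status updates. Whether this matches the exact failure in this bug is an assumption.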


Attachments
eventq.btm (deleted)
2013-07-28 12:12 UTC, Roy Golan


Links
System ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 438093 None None None Never

Description Pavel Zhukov 2013-07-25 11:15:43 UTC
Description of problem:
After some manipulation of the hosts, one of them went into the "Unassigned" state for a long time (more than 20 hours). The statuses of the VMs on all _other_ hosts are not being updated: a VM can be launched from the host/engine without errors, but its status is 9 ("waiting for launch"). VMs can be powered off and launched again (the status changes from 0 to 9 and vice versa, and run_on_vds changes as well). The host's free memory is not being updated either.

Version-Release number of selected component (if applicable):
rhevm-3.2.1-0.39.el6ev.noarch

How reproducible:
Unknown. 2 systems are affected


Actual results:
One host is in the Unassigned state.
Newly started VMs are shown in "Waiting for launch" status but are actually up and running.

Comment 10 Roy Golan 2013-07-28 12:12:09 UTC
Created attachment 779325
eventq.btm

Comment 23 Yair Zaslavsky 2013-08-20 12:05:13 UTC
Still needs to be investigated, postponing to 3.2.4

Comment 24 Barak 2013-09-16 11:41:43 UTC
This bug is about patch

Comment 26 Barak 2013-09-17 12:12:16 UTC
The patch was accepted upstream a long time ago and is already in 3.3.
I would like to test this scenario as part of the scale testing for 3.3,
hence moving to ON_QA.

Comment 27 Charlie 2013-11-28 00:13:59 UTC
This bug is currently attached to errata RHEA-2013:15231. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 28 Shai Revivo 2014-01-15 12:14:22 UTC
QE is unable to verify this scale bug for 3.3.
Will verify in 3.4.

Comment 30 Barak 2014-02-03 11:58:28 UTC
Added a 3.3.z flag to test it for 3.3.zstream

Comment 33 Eldad Marciano 2014-05-13 14:19:00 UTC
How can the bug be reproduced?

Comment 34 Eldad Marciano 2014-06-05 13:18:27 UTC
Tested on 3.4 (latest): 3.4.0-0.21.el6ev

- Created 37 hosts.
- Ran deactivate and activate at high frequency.
- Hosts stayed unassigned for 2-3 minutes and then returned to status OK.

Comment 35 Itamar Heim 2014-06-12 14:06:42 UTC
Closing as part of 3.4.0

