Bug 1687032 - Bad error handling when writing storage domain metadata may corrupt metadata [NEEDINFO]
Summary: Bad error handling when writing storage domain metadata may corrupt metadata
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.30.8
Hardware: Unspecified
OS: Unspecified
unspecified
urgent
Target Milestone: ovirt-4.3.3
: ---
Assignee: Nir Soffer
QA Contact: Shir Fishbain
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-08 23:12 UTC by Nir Soffer
Modified: 2019-04-16 13:58 UTC (History)

Fixed In Version: vdsm-4.30.12
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-16 13:58:22 UTC
oVirt Team: Storage
aefrat: needinfo? (nsoffer)
sbonazzo: ovirt-4.3?
sbonazzo: planning_ack?
sbonazzo: devel_ack?
sbonazzo: testing_ack?




Links
System | ID | Priority | Status | Summary | Last Updated
oVirt gerrit | 98384 | master | MERGED | storage: make flush and refresh methods of PersistentDict private | 2019-03-13 23:11:23 UTC
oVirt gerrit | 98385 | master | MERGED | storage: fix PersistentDict transaction rollback | 2019-03-19 09:27:20 UTC
oVirt gerrit | 98387 | master | MERGED | tests: Add more failing tests for persistent dict | 2019-03-12 14:36:37 UTC
oVirt gerrit | 98388 | master | MERGED | persistent: Fix transaction cleanup after errors | 2019-03-12 15:31:50 UTC
oVirt gerrit | 98603 | ovirt-4.3 | MERGED | tests: Add more failing tests for persistent dict | 2019-03-19 17:04:41 UTC
oVirt gerrit | 98604 | ovirt-4.3 | MERGED | persistent: Fix transaction cleanup after errors | 2019-03-19 17:04:44 UTC
oVirt gerrit | 98626 | ovirt-4.3 | MERGED | storage: make flush and refresh methods of PersistentDict private | 2019-03-19 17:04:46 UTC
oVirt gerrit | 98663 | ovirt-4.3 | MERGED | storage: fix PersistentDict transaction rollback | 2019-03-19 17:04:49 UTC

Description Nir Soffer 2019-03-08 23:12:31 UTC
Description of problem:

A storage read or write error while modifying storage domain metadata may
leave the storage domain metadata in an inconsistent state.

There are 2 issues:

- A read error may leave the metadata object in transaction state. After that,
  no data will be written to storage until the storage domain is refreshed
  (usually every 5 minutes).

- A write error does not roll back the changes, so the in-memory metadata keeps
  the values modified during the transaction, while the state on storage was
  not modified.

Both issues can cause different hosts to see different metadata at the same
time.
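
To illustrate the behavior expected after both issues are fixed, here is a
minimal sketch. It is not the vdsm PersistentDict code: MemoryStorage,
TransactionalDict and all method names are hypothetical, and the key=value
line format is an assumption made only for this example. The transaction
always leaves transaction state, and restores the in-memory values when the
write fails.

```python
import contextlib


class MemoryStorage:
    """Hypothetical stand-in for the storage backend; can be told to fail."""

    def __init__(self, lines=()):
        self.lines = list(lines)
        self.fail_read = False
        self.fail_write = False

    def read(self):
        if self.fail_read:
            raise IOError("injected read error")
        return list(self.lines)

    def write(self, lines):
        if self.fail_write:
            raise IOError("injected write error")
        self.lines = list(lines)


class TransactionalDict:
    """Illustrative only; not the vdsm PersistentDict implementation."""

    def __init__(self, storage):
        self._storage = storage
        self._data = {}
        self._in_transaction = False

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        with self.transaction():
            self._data[key] = value

    @contextlib.contextmanager
    def transaction(self):
        if self._in_transaction:
            # Nested use joins the outer transaction.
            yield
            return
        self._in_transaction = True
        try:
            self._refresh()  # may raise a read error
            backup = dict(self._data)
            try:
                yield
                self._flush()  # may raise a write error
            except Exception:
                # Roll back so memory matches what is on storage.
                self._data = backup
                raise
        finally:
            # Always leave transaction state, even after errors.
            self._in_transaction = False

    def _refresh(self):
        # Assumed format: one "key=value" pair per line.
        self._data = dict(line.split("=", 1) for line in self._storage.read())

    def _flush(self):
        self._storage.write(sorted("%s=%s" % kv for kv in self._data.items()))
```

With this sketch, a failed write leaves both the in-memory values and the
storage contents as they were before the transaction, and a failed read does
not block later writes.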

I don't know how to reproduce this on a real system, but it is easy
to reproduce in the vdsm automated tests, where we can inject both read and
write errors.

Version-Release number of selected component (if applicable):
Any

How reproducible:
Always in vdsm tests; should be very hard on a real system.

Steps to Reproduce:
Inject read or write errors when accessing storage domain metadata.
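
A rough sketch of this kind of error injection, reproducing the missing
rollback. This is not the vdsm test code: FailingStorage, NaiveDict,
reproduce() and the "role" key are hypothetical names used only for
illustration, and NaiveDict is deliberately buggy.

```python
class FailingStorage:
    """Hypothetical in-memory storage that can fail writes on demand."""

    def __init__(self):
        self.lines = ["role=regular"]
        self.fail_write = False

    def write(self, lines):
        if self.fail_write:
            raise IOError("injected write error")
        self.lines = list(lines)


class NaiveDict:
    """Deliberately buggy: updates memory first and never rolls back."""

    def __init__(self, storage):
        self._storage = storage
        self._data = dict(line.split("=", 1) for line in storage.lines)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        self._storage.write(sorted("%s=%s" % kv for kv in self._data.items()))


def reproduce():
    storage = FailingStorage()
    md = NaiveDict(storage)
    storage.fail_write = True  # inject the write error
    try:
        md["role"] = "master"
    except IOError:
        pass
    # What this host sees in memory vs. what another host would read.
    return md["role"], storage.lines
```

After the injected failure, the in-memory value and the value on storage
differ, which is the inconsistency described above.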

Comment 1 Avihai 2019-04-02 08:30:23 UTC
Hi Nir,

As this is already verified on vdsm tests and there is no real system scenario, can you please verify this bug?

Comment 2 Tal Nisan 2019-04-02 14:11:07 UTC
Verified in VDSM tests, as it is (almost) impossible to reproduce in a running system.

Comment 3 Sandro Bonazzola 2019-04-16 13:58:22 UTC
This bugzilla is included in the oVirt 4.3.3 release, published on April 16th 2019.

Since the problem described in this bug report should be resolved in the
oVirt 4.3.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

