Bug 1511127 - scary messages when creating multiple vdo devices
Summary: scary messages when creating multiple vdo devices
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kmod-kvdo
Version: 7.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Thomas Jaskiewicz
QA Contact: Jakub Krysl
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-08 18:06 UTC by Corey Marthaler
Modified: 2018-10-30 09:39 UTC
CC: 6 users

Fixed In Version: 6.1.0.168
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-30 09:38:49 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:3094 None None None 2018-10-30 09:39:26 UTC

Description Corey Marthaler 2017-11-08 18:06:26 UTC
Description of problem:

Is there a real corruption problem here? If not, customers aren't going to want to see "error" and "corrupt" messages.

vdo create --name vdo1 --device /dev/mapper/mpathb
[...]
vdo create --name vdo7 --device /dev/mapper/mpathh
Creating VDO vdo7
Starting VDO vdo7
Starting compression on VDO vdo7
VDO instance 6 volume is ready at /dev/mapper/vdo7
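The seven create commands elided above can be scripted; a minimal sketch (the device names and count are taken from this report, the loop itself is my illustration):

```shell
# Create vdo1..vdo7, one per multipath device, as in the report.
i=1
for dev in /dev/mapper/mpathb /dev/mapper/mpathc /dev/mapper/mpathd \
           /dev/mapper/mpathe /dev/mapper/mpathf /dev/mapper/mpathg \
           /dev/mapper/mpathh; do
    vdo create --name "vdo$i" --device "$dev"
    i=$((i + 1))
done
```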

[root@harding-02 ~]# vdo list
vdo3
vdo2
vdo1
vdo7
vdo6
vdo5
vdo4

[root@harding-02 ~]# grep -i corrupt /var/log/messages
Nov  8 12:00:39 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:39 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:39 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:51 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:51 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:51 harding-02 kernel: : UDS Error: Corrupt saved component (1030)

[root@harding-02 ~]# grep -i error /var/log/messages
Nov  8 12:00:38 harding-02 UDS/vdoformat[4572]: ERROR  (vdoformat/4572) loadVolumeGeometry ID mismatch, expected 5, got 268482810: VDO Status: Component id mismatch in decoder (2059)
Nov  8 12:00:38 harding-02 UDS/vdoformat[4572]: ERROR  (vdoformat/4572) decodeSuperBlock ID mismatch, expected 0, got 268482810: VDO Status: Component id mismatch in decoder (2059)
Nov  8 12:00:39 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:39 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:39 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:49 harding-02 UDS/vdoformat[4624]: ERROR  (vdoformat/4624) loadVolumeGeometry ID mismatch, expected 5, got 268482810: VDO Status: Component id mismatch in decoder (2059)
Nov  8 12:00:49 harding-02 UDS/vdoformat[4624]: ERROR  (vdoformat/4624) decodeSuperBlock ID mismatch, expected 0, got 268482810: VDO Status: Component id mismatch in decoder (2059)
Nov  8 12:00:51 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:51 harding-02 kernel: : UDS Error: Corrupt saved component (1030)
Nov  8 12:00:51 harding-02 kernel: : UDS Error: Corrupt saved component (1030)


Version-Release number of selected component (if applicable):
3.10.0-772.el7.x86_64

lvm2-2.02.176-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
lvm2-libs-2.02.176-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
lvm2-cluster-2.02.176-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
lvm2-lockd-2.02.176-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
lvm2-python-boom-0.8-2.el7    BUILT: Fri Nov  3 07:48:54 CDT 2017
cmirror-2.02.176-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
device-mapper-1.02.145-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
device-mapper-libs-1.02.145-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
device-mapper-event-1.02.145-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
device-mapper-event-libs-1.02.145-2.el7    BUILT: Fri Nov  3 07:46:53 CDT 2017
device-mapper-persistent-data-0.7.3-2.el7    BUILT: Tue Oct 10 04:00:07 CDT 2017
sanlock-3.5.0-1.el7    BUILT: Wed Apr 26 09:37:30 CDT 2017
sanlock-lib-3.5.0-1.el7    BUILT: Wed Apr 26 09:37:30 CDT 2017
vdo-6.1.0.34-8    BUILT: Fri Nov  3 06:58:45 CDT 2017
kmod-kvdo-6.1.0.34-7.el7    BUILT: Fri Nov  3 06:44:06 CDT 2017


How reproducible:
Often

Comment 2 Bryan Gurney 2017-11-08 19:02:22 UTC
This is reproducible if you create just one VDO volume.  Here is one example from a test VM, creating "vdo2" on /dev/sdc (a 20 GB device that I zeroed prior to running "vdo create"):

Nov  8 13:54:03 rhel75test-20171023 UDS/vdoformat[13914]: ERROR  (vdoformat/13914) loadVolumeGeometry ID mismatch, expected 5, got 0: VDO Status: Component id mismatch in decoder (2059)
Nov  8 13:54:03 rhel75test-20171023 UDS/vdoformat[13914]: ERROR  (vdoformat/13914) decodeSuperBlock version mismatch, expected 12.0, got 0.0: VDO Status: Unsupported component version (2058)
Nov  8 13:54:04 rhel75test-20171023 kernel: kvdo19:dmsetup: starting device 'vdo2' device instantiation 19 (ti=ffff9c78c0e91040) write policy sync
Nov  8 13:54:04 rhel75test-20171023 kernel: kvdo19:dmsetup: underlying device, REQ_FLUSH: supported, REQ_FUA: not supported
Nov  8 13:54:04 rhel75test-20171023 kernel: kvdo19:dmsetup: zones: 1 logical, 1 physical, 1 hash; base threads: 5
Nov  8 13:54:04 rhel75test-20171023 kernel: kvdo19:dmsetup: Setting UDS index target state to online
Nov  8 13:54:04 rhel75test-20171023 kernel: kvdo19:dmsetup: device 'vdo2' started
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Initializing index session for index: dev=/dev/sdc offset=4096 size=2781704192
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Loading index from saved state and rebuilding if necessary.
Nov  8 13:54:04 rhel75test-20171023 kernel: kvdo19:dmsetup: resuming device 'vdo2'
Nov  8 13:54:04 rhel75test-20171023 kernel: kvdo19:dmsetup: device 'vdo2' resumed
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: bad region table magic number: UDS Error: Corrupt saved component (1030)
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: cannot read superblock header: UDS Error: Corrupt saved component (1030)
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Failed to initialize grid: UDS Error: Corrupt saved component (1030)
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Closed index session (0, ffff8c95f7557960)
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Initializing index session for index: dev=/dev/sdc offset=4096 size=2781704192
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Creating new index.
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Using 2 indexing zones for concurrency.
Nov  8 13:54:04 rhel75test-20171023 kernel: kvdo19:packerQ: compression is enabled
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Created index session (34)
Nov  8 13:54:04 rhel75test-20171023 kernel: uds: kvdo19:dedupeQ: Opened block context (25)

The VDO volume was successfully created.
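A quick way to confirm that outcome despite the log noise (a sketch; the device-mapper path is the one shown in this comment):

```shell
# After "vdo create", the mapped device node should exist even though
# the corrupt/error lines were logged.
if [ -e /dev/mapper/vdo2 ]; then
    echo "vdo2 present: create succeeded despite log noise"
fi
```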

Comment 3 Andy Walsh 2017-11-08 19:22:13 UTC
I believe this is a byproduct of fixing rhbz#1507996.  I believe that fix tries to read a geometry block; if the read fails (producing the errors above), then it is safe to create a new VDO volume.

We will need to investigate whether we can squelch these messages in that situation.
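The distinction can also be made from userspace before creating (a sketch under assumptions: vdodumpconfig is the tool shown later in this bug's verbose output, and treating its failure as "blank device, safe to create" is my interpretation of the fix described above):

```shell
# If the device already carries VDO metadata, vdodumpconfig succeeds;
# on a blank device it fails, which is exactly the "safe to create"
# case that currently gets logged with scary error messages.
if vdodumpconfig /dev/sdc >/dev/null 2>&1; then
    echo "existing VDO volume found; not overwriting"
else
    vdo create --name vdo2 --device /dev/sdc
fi
```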

Comment 4 Bryan Gurney 2017-11-08 19:32:48 UTC
If you then try to create a VDO volume after having removed the previous one, you see a remnant of what's covered in BZ 1510558, where vdoformat leaves the previous metadata alone:

Nov  8 14:08:08 rhel75test-20171023 kernel: kvdo20:dmsetup: starting device 'vdo2' device instantiation 20 (ti=ffff9c78c0a7a040) write policy sync
Nov  8 14:08:08 rhel75test-20171023 kernel: kvdo20:dmsetup: underlying device, REQ_FLUSH: supported, REQ_FUA: not supported
Nov  8 14:08:08 rhel75test-20171023 kernel: kvdo20:dmsetup: zones: 1 logical, 1 physical, 1 hash; base threads: 5
Nov  8 14:08:08 rhel75test-20171023 kernel: kvdo20:dmsetup: Setting UDS index target state to online
Nov  8 14:08:08 rhel75test-20171023 kernel: kvdo20:dmsetup: device 'vdo2' started
Nov  8 14:08:08 rhel75test-20171023 kernel: uds: kvdo20:dedupeQ: Initializing index session for index: dev=/dev/sdc offset=4096 size=2781704192
Nov  8 14:08:08 rhel75test-20171023 kernel: uds: kvdo20:dedupeQ: Loading index from saved state and rebuilding if necessary.
Nov  8 14:08:08 rhel75test-20171023 kernel: kvdo20:dmsetup: resuming device 'vdo2'
Nov  8 14:08:08 rhel75test-20171023 kernel: kvdo20:dmsetup: device 'vdo2' resumed
Nov  8 14:08:08 rhel75test-20171023 kernel: uds: kvdo20:dedupeQ: Using 2 indexing zones for concurrency.
Nov  8 14:08:08 rhel75test-20171023 kernel: kvdo20:packerQ: compression is enabled
Nov  8 14:08:09 rhel75test-20171023 kernel: uds: kvdo20:dedupeQ: read index page map, last update 0
Nov  8 14:08:09 rhel75test-20171023 kernel: uds: kvdo20:dedupeQ: index_0: loaded index from chapter 0 through chapter 0
Nov  8 14:08:09 rhel75test-20171023 kernel: uds: kvdo20:dedupeQ: Created index session (35)
Nov  8 14:08:09 rhel75test-20171023 kernel: uds: kvdo20:dedupeQ: Opened block context (26)

Comment 5 Thomas Jaskiewicz 2017-11-10 05:11:17 UTC
There is indeed no corruption in this case.  The first time we set up a new index, we try to load the index, notice that it has not been initialized yet, and then proceed to create a new index.  The "corrupt saved component" message is logged 3 times when this happens.

We will change this to a single message, and "Corrupt saved component (1030)" will be replaced by "No index found (1061)".


Comment 8 Andy Walsh 2017-11-17 22:41:49 UTC
The QA contact got reset to vdo-qe instead of jkrysl@redhat.com.  Jakub, can you please take assignment and re-ack the qa_ack?

Thank You,
-Andy

Comment 9 Andy Walsh 2017-11-17 22:43:29 UTC
The pm_ack got reset as a result of moving the component from vdo to kmod-kvdo.  Louis, can you please re-ack the pm_ack?

Thank You,
-Andy

Comment 12 Jakub Krysl 2017-11-21 11:47:41 UTC
kmod-kvdo-6.1.0.55-9.el7

[ 1207.270696] uds: loading out-of-tree module taints kernel. 
[ 1207.278022] uds: modprobe: loaded version 6.1.0.31 
[ 1207.284636] kvdo: modprobe: loaded version 6.1.0.55 
[ 1211.534732] kvdo0:dmsetup: starting device 'vdo' device instantiation 0 (ti=ffffa71806aaa040) write policy sync 
[ 1211.544928] kvdo0:dmsetup: zones: 1 logical, 1 physical, 1 hash; base threads: 5 
[ 1211.852678] kvdo0:dmsetup: Setting UDS index target state to online 
[ 1211.859003] kvdo0:dmsetup: uds: kvdo0:dedupeQ: loading or rebuilding index: dev=/dev/sdc offset=4096 size=2781704192 
[ 1211.859269] uds: kvdo0:dedupeQ: Failed loading or rebuilding index: UDS Error: No index found (1061) 
uds: kvdo0:dedupeQ: creating index: dev=/dev/sdc offset=4096 size=2781704192 
uds: kvdo0:dedupeQ: Using 12 indexing zones for concurrency. 
 
[ 1211.891255] device 'vdo' started 
[ 1211.894816] kvdo0:dmsetup: resuming device 'vdo' 
[ 1211.899486] kvdo0:dmsetup: device 'vdo' resumed 
[ 1211.920910] kvdo0:packerQ: compression is enabled 


# grep -i corrupt /var/log/messages
# grep -i error /var/log/messages
Nov 21 11:26:35 storageqe-24 UDS/vdoformat[14044]: ERROR  (vdoformat/14044) release version 0 requires an upgrade: VDO Status: Unsupported component version (2058)
Nov 21 11:26:38 storageqe-24 UDS/vdoformat[14044]: ERROR  (vdoformat/14044) decodeSuperBlock ID mismatch, expected 0, got 5: VDO Status: Component id mismatch in decoder (2059)
Nov 21 11:26:40 storageqe-24 kernel: : UDS Error: No index found (1061)

So there is no more corruption message, but each VDO creation still produces 3 error messages in this version. AFAIK they are all products of the missing UDS index, which is expected in this situation. So is there a way to get rid of these too?
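The two grep checks above can be rolled into one pass/fail step; a minimal sketch (log path and message text as shown in this comment):

```shell
# Fail if any corruption-style message remains; otherwise report how many
# of the expected "No index found" lines were logged.
if grep -qi 'corrupt' /var/log/messages; then
    echo "FAIL: corruption messages still logged"
else
    echo "PASS: no corruption messages"
    grep -c 'No index found' /var/log/messages
fi
```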

Comment 13 Andy Walsh 2017-11-21 14:04:06 UTC
Clearing the 'vdo-internal@redhat.com' needinfo, and adding it for Tom (the new assignee)

Tom, can you please review and respond with your thoughts?

Comment 14 Thomas Jaskiewicz 2017-11-27 21:42:05 UTC
The 2 vdoformat ERROR messages are produced by a different component than the kernel UDS Error message, so the kernel message will be hard to deal with.

But the 2 vdoformat messages should be condensed into a single message.  Maybe even both of them can be eliminated.  I'll ask sweettea.

Comment 15 Jakub Krysl 2017-11-28 08:31:01 UTC
Thanks Tom, returning to be further fixed.

Comment 18 sclafani 2018-03-29 20:18:24 UTC
Everyone who sees even the reduced messages seems to get scared or sidetracked by them, so it's probably worth fixing this sooner rather than later.

Comment 19 sclafani 2018-05-01 23:42:29 UTC
None of these should be logged anymore when starting up a fresh VDO.

Comment 21 Jakub Krysl 2018-07-04 09:22:20 UTC
Tested on:
RHEL-7.6-20180626.0
kernel-3.10.0-915.el7
kmod-vdo-6.1.1.99-1.el7
vdo-6.1.1.99-2.el7

# vdo create --name vdo --device /dev/sdb --verbose
Creating VDO vdo
    grep MemAvailable /proc/meminfo
    pvcreate -qq --test /dev/sdb
    modprobe kvdo
    vdoformat --uds-checkpoint-frequency=0 --uds-memory-size=0.25 /dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7
    vdodumpconfig /dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7
Starting VDO vdo
    dmsetup status vdo
    grep MemAvailable /proc/meminfo
    modprobe kvdo
    vdodumpconfig /dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7
    dmsetup create vdo --uuid VDO-0e71c26d-c718-46a7-8ebd-56c5b3237fc3 --table '0 4278400280 vdo /dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7 4096 disabled 0 32768 16380 on auto vdo ack=1,bio=4,bioRotationInterval=64,cpu=2,hash=1,logical=1,physical=1'
    dmsetup status vdo
Starting compression on VDO vdo
    dmsetup message vdo 0 compression on
    dmsetup status vdo
VDO instance 18 volume is ready at /dev/mapper/vdo

/var/log/messages:
[159630.984103] kvdo18:dmsetup: starting device 'vdo'
[159631.007087] kvdo18:dmsetup: underlying device, REQ_FLUSH: not supported, REQ_FUA: not supported
[159631.050807] kvdo18:dmsetup: Using write policy sync automatically.
[159631.081297] kvdo18:dmsetup: zones: 1 logical, 1 physical, 1 hash; base threads: 5
[159631.295523] kvdo18:journalQ: VDO commencing normal operation
[159631.322357] kvdo18:dmsetup: uds: kvdo18:dedupeQ: creating index: dev=/dev/disk/by-id/scsi-360fff19abdd9b56dfb9bf59625f4c9f7 offset=4096 size=2781704192
uds: kvdo18:dedupeQ: Using 6 indexing zones for concurrency.

[159631.411527] Setting UDS index target state to online
[159631.436282] kvdo18:dmsetup: device 'vdo' started
[159631.458530] kvdo18:dmsetup: resuming device 'vdo'
[159631.481002] kvdo18:dmsetup: device 'vdo' resumed
[159631.524222] kvdo18:packerQ: compression is enabled

The messages are the same when creating multiple VDOs.

Comment 23 errata-xmlrpc 2018-10-30 09:38:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3094

