
Bug 1518230

Summary: vdo kernel modules don't work with kernel 3.10.0-797
Product: Red Hat Enterprise Linux 7
Component: kmod-kvdo
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Status: CLOSED INSUFFICIENT_DATA
Severity: high
Priority: high
Target Milestone: rc
Reporter: Marius Vollmer <mvollmer>
Assignee: Ken Raeburn <raeburn>
QA Contact: Jakub Krysl <jkrysl>
CC: awalsh, bgurney, bjohnsto, cmarthal, jkrysl, limershe
Type: Bug
Last Closed: 2018-02-20 18:05:25 UTC

Attachments:
pastebin output from Corey Marthaler for 3.10.0-799.el7 kernel stalling on VDO create

Description Marius Vollmer 2017-11-28 13:19:09 UTC
Description of problem:

Upgrading the kernel made creation of vdo volumes hang forever.

Version-Release number of selected component (if applicable):
kmod-kvdo-6.1.0.0-5.el7.x86_64
kernel-3.10.0-797.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
# vdo create --name vdo0 --device /dev/sda --indexMem=0.25 

Actual results:
The VM has 100% CPU and is unresponsive at the console.  Pings still get answered.

Expected results:
Volume gets created.

Additional info:
It works as expected with kernel-3.10.0-768.el7.x86_64.  Upgrading kmod-kvdo to kmod-kvdo-6.1.0.55-9.el7.x86_64 doesn't help.

Comment 2 Jakub Krysl 2017-11-28 13:48:25 UTC
# uname -r                                                                             
3.10.0-797.el7.x86_64 
# vdo create --name vdo --device /dev/mapper/rhel_storageqe-74-lv
Creating VDO vdo
[  332.259143] uds: loading out-of-tree module taints kernel.
[  332.291305] uds: modprobe: loaded version 6.1.0.31
[  332.344706] kvdo: modprobe: loaded version 6.1.0.55
Starting VDO vdo
[  333.807213] kvdo0:dmsetup: starting device 'vdo' device instantiation 0 (ti=ffffa20880f55040) write policy sync
[  333.860275] kvdo0:dmsetup: zones: 1 logical, 1 physical, 1 hash; base threads: 5
[  360.170761] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kvdo0:journalQ:1400]
[  360.211092] Modules linked in: kvdo(OE) uds(OE) sunrpc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt hpwdt iTCO_vendor_support hpilo ipmi_ssif pcspkr sg i2c_i801 wmi ipmi_si lpc_ich ipmi_devintf ipmi_msghandler acpi_power_meter pcc_cpufreq ioatdma shpchp dm_multipath ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 ahci i2c_algo_bit drm_kms_helper libahci ixgbe syscopyarea sysfillrect sysimgblt libata fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm tg3 serio_raw mdio dca ptp i2c_core pps_core dm_mirror dm_region_hash dm_log dm_mod
[  360.519283] CPU: 3 PID: 1400 Comm: kvdo0:journalQ Tainted: G           OE  ------------   3.10.0-797.el7.x86_64 #1
[  360.567294] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
[  360.604749] task: ffff937270bc0fd0 ti: ffff9372743cc000 task.ti: ffff9372743cc000
[  360.638850] RIP: 0010:[<ffffffffb4101d73>]  [<ffffffffb4101d73>] native_queued_spin_lock_slowpath+0x1c3/0x1e0
[  360.683979] RSP: 0018:ffff9372743cfdb8  EFLAGS: 00000202
[  360.708080] RAX: 0000000000000001 RBX: ffff93727fc56d80 RCX: 0000000000000001
[  360.743999] RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffffffffc07ad388
[  360.777969] RBP: ffff9372743cfdb8 R08: 0000000000000101 R09: 0000000000000001
[  360.810171] R10: 00000000000007c6 R11: 7fffffffffffffff R12: ffff937035962fd8
[  360.842397] R13: ffff93727fc16d80 R14: ffff93727fc16e00 R15: ffffffffb40d3e8c
[  360.874708] FS:  0000000000000000(0000) GS:ffff93727fcc0000(0000) knlGS:0000000000000000
[  360.911407] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  360.937455] CR2: 00007fa38acfd470 CR3: 000000013700e000 CR4: 00000000001407e0
[  360.969776] Call Trace:
[  360.980811] Code: 00 00 74 f4 e9 95 fe ff ff 0f 1f 80 00 00 00 00 83 fa 01 75 11 0f 1f 00 e9 69 fe ff ff 0f 1f 00 85 c0 74 0c f3 90 8b 07 0f b6 c0 <83> f8 03 75 f0 b8 01 00 00 00 66 89 07 5d c3 f3 90 4d 8b 08 4d

This was the first call trace; they keep coming after that (with the ~22s delay)

Comment 4 Andy Walsh 2017-11-28 14:27:50 UTC
I was able to modprobe and create a VDO volume without any issues.  Perhaps we can get more information about the underlying hardware (available memory, storage, etc.)?

[root@RHEL75-20171128 ~]# uname -r
3.10.0-797.el7.x86_64
[root@RHEL75-20171128 ~]# modprobe kvdo
[root@RHEL75-20171128 ~]# lsblk
NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0                  11:0    1 1024M  0 rom  
vda                 252:0    0   20G  0 disk 
├─vda1              252:1    0    1G  0 part /boot
└─vda2              252:2    0   19G  0 part 
  ├─rhel-root       253:0    0   17G  0 lvm  /
  └─rhel-swap       253:1    0    2G  0 lvm  [SWAP]
vdb                 252:16   0   20G  0 disk 
└─vdo_base-vdo_base 253:2    0   10G  0 lvm  
[root@RHEL75-20171128 ~]# vdo create --name=vdo1 --device=/dev/vdo_base/vdo_base
Creating VDO vdo1
Starting VDO vdo1
Starting compression on VDO vdo1
VDO instance 0 volume is ready at /dev/mapper/vdo1


And the tail of the /var/log/messages file:
Nov 28 09:17:17 localhost kernel: uds: loading out-of-tree module taints kernel.
Nov 28 09:17:17 localhost kernel: uds: module verification failed: signature and/or required key missing - tainting kernel
Nov 28 09:17:17 localhost kernel: uds: modprobe: loaded version 6.1.0.31
Nov 28 09:17:17 localhost kernel: kvdo: modprobe: loaded version 6.1.0.55
Nov 28 09:18:49 localhost UDS/vdoformat[1570]: ERROR  (vdoformat/1570) loadVolumeGeometry ID mismatch, expected 5, got 0: VDO Status: Component id mismatch in decoder (2059)
Nov 28 09:18:49 localhost UDS/vdoformat[1570]: ERROR  (vdoformat/1570) decodeSuperBlock version mismatch, expected 12.0, got 0.0: VDO Status: Unsupported component version (2058)
Nov 28 09:18:55 localhost kernel: kvdo0:dmsetup: starting device 'vdo1' device instantiation 0 (ti=ffffa79ec040c040) write policy sync
Nov 28 09:18:55 localhost kernel: kvdo0:dmsetup: zones: 1 logical, 1 physical, 1 hash; base threads: 5
Nov 28 09:18:56 localhost kernel: kvdo0:dmsetup: Setting UDS index target state to online
Nov 28 09:18:56 localhost kernel: kvdo0:dmsetup: device 'vdo1' started
Nov 28 09:18:56 localhost kernel: uds: kvdo0:dedupeQ: loading or rebuilding index: dev=/dev/vdo_base/vdo_base offset=4096 size=2781704192
Nov 28 09:18:56 localhost kernel: kvdo0:dmsetup: resuming device 'vdo1'
Nov 28 09:18:56 localhost kernel: kvdo0:dmsetup: device 'vdo1' resumed
Nov 28 09:18:56 localhost kernel: uds: kvdo0:dedupeQ: Failed loading or rebuilding index: UDS Error: No index found (1061)
Nov 28 09:18:56 localhost kernel: uds: kvdo0:dedupeQ: creating index: dev=/dev/vdo_base/vdo_base offset=4096 size=2781704192
Nov 28 09:18:56 localhost kernel: uds: kvdo0:dedupeQ: Using 2 indexing zones for concurrency.
Nov 28 09:18:56 localhost kernel: kvdo0:packerQ: compression is enabled

I ran top in batch mode for a bit, and here is the final output (I included the top 6 processes sorted by CPU):
top - 09:25:15 up 12 min,  2 users,  load average: 0.00, 0.01, 0.03
Tasks: 140 total,   1 running, 139 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.2 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem :  2047788 total,  1287432 free,   497780 used,   262576 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used.  1326624 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1582 root      20   0       0      0      0 S   0.3  0.0   0:00.60 kvdo0:dedupeQ
 1583 root      20   0       0      0      0 S   0.3  0.0   0:00.66 kvdo0:journalQ
 1584 root      20   0       0      0      0 S   0.3  0.0   0:00.66 kvdo0:packerQ
 1587 root      20   0       0      0      0 S   0.3  0.0   0:00.66 kvdo0:hashQ0
 1588 root      20   0       0      0      0 S   0.3  0.0   0:00.55 kvdo0:bioQ0
 1591 root      20   0       0      0      0 S   0.3  0.0   0:00.55 kvdo0:bioQ3

Comment 5 Marius Vollmer 2017-11-28 14:39:24 UTC
(In reply to Jakub Krysl from comment #2)

> This was the first call trace; they keep coming after that (with the ~22s
> delay)

Ah, good that you can reproduce this.  A couple of times I thought it was just me...

In my case, I had a libvirt VM with a 10G emulated SCSI block device, created via virt-manager.  I can dig out the XML for the virtual machine, if that helps, and share a link to an image.

Comment 6 Bryan Gurney 2017-11-28 16:05:52 UTC
Created attachment 1359959 [details]
pastebin output from Corey Marthaler for 3.10.0-799.el7 kernel stalling on VDO create

Corey Marthaler encountered something similar in 3.10.0-799.el7.x86_64; I've copied the command output and dmesg output into cmarthal_20171128_vdocreatestall.txt.

Comment 7 Marius Vollmer 2017-11-29 14:38:30 UTC
For what it's worth, I have now seen a similar lockup with kernel-3.10.0-768.el7.x86_64 while writing 10G of zeros to a vdo volume.  I have not been able to reproduce this a second time.  I have no kernel backtrace, unfortunately.

Comment 8 Ken Raeburn 2017-11-29 16:16:24 UTC
(In reply to Marius Vollmer from comment #7)
> For what it's worth, I have now seen a similar lockup with
> kernel-3.10.0-768.el7.x86_64 while writing 10G of zeros to a vdo volume.  I
> have not been able to reproduce this a second time.  I have no kernel
> backtrace, unfortunately.

That doesn't sound like exactly the same problem, since the stack traces I've seen for this problem indicate a hang while starting up the VDO device in the first place; we don't get to the point of being able to write to anything. It's possible the hang is for similar reasons, of course. Did you happen to get a stack trace?

Comment 9 Marius Vollmer 2017-11-30 11:48:58 UTC
(In reply to Ken Raeburn from comment #8)
> Did you happen to get a stack trace?

No, unfortunately not.

Comment 10 Ken Raeburn 2017-11-30 22:57:04 UTC
Thanks to Dennis running a test of the "bad" binary module with a SystemTap script for some debugging, it looks like we're somehow returning from initializeWorkQueueStackHandle with the spin lock still held, though I see no code path that should allow that. I'm still looking at the compiled code to see if I can spot something funny going on there.

Comment 11 Ken Raeburn 2017-12-02 03:20:32 UTC
Here's my status at the moment with the problem as reproduced on Bryan's machine:

The function initializeWorkQueueStackHandle, which is called as each worker thread starts up (the calls are serialized elsewhere, so in this test only one thread will even be running the function at any one time), is a small function that includes this code:

  spin_lock(&workQueueStackHandleGlobals.offsetLock);
  if (workQueueStackHandleGlobals.offset == 0) {
    workQueueStackHandleGlobals.offset = offset;
    spin_unlock(&workQueueStackHandleGlobals.offsetLock);
  } else {
    long foundOffset = workQueueStackHandleGlobals.offset;
    spin_unlock(&workQueueStackHandleGlobals.offsetLock);
    BUG_ON(foundOffset != offset);
  }

The first time it's called, the .offset field will be zero, so we take the first branch and save a nonzero value; on the following calls we should take the "else" branch. However, on some of these machines, we return from the function with the lock still held.

Modifying the code to add debugging statements made the problem disappear. Even just rebuilding from the same SRPM made the problem disappear, though reproducibility of the binary image has been a problem for a couple of reasons, so it's the same source code but not identical machine code, symbol locations, and so on.

I've disassembled the kvdo driver and looked at the initialization function. Both paths appear to load the same address and pass it to the unlock function. It's the same address as was used to lock it.

I've used SystemTap to probe the entry and return of the function and examine the lock structure. (Remember the higher-level serialization of the calls; concurrent execution shouldn't be an issue.) On the first call, the .counter field is 0, and on return it's 1; on the second call it's still 1 and we sit and spin.
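
As a minimal illustration of why "still 1" means we sit and spin, here is a user-space sketch, not the kernel's qspinlock code (all names here are made up): if the previous holder never stores 0 back into the lock word, the next spin_lock() attempt loops forever, which is the picture behind the native_queued_spin_lock_slowpath soft lockup in comment 2.

  #include <stdatomic.h>
  #include <stdint.h>

  /* Toy spinlock: 0 = free, nonzero = held.  Not the kernel's qspinlock. */
  struct toy_spinlock {
    _Atomic uint8_t locked;
  };

  static void toy_spin_lock(struct toy_spinlock *l)
  {
    uint8_t expected = 0;
    /* Retry until we observe 0 and swap in 1.  If the previous holder never
     * wrote 0 back, this loop never terminates -- the equivalent of the
     * "stuck for 22s" soft lockup reported against kvdo0:journalQ. */
    while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
      expected = 0;
    }
  }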

I've used the hardware watchpoint support in SystemTap to track when changes are made to the lock. On my RHEL 7.4 test VM, lock and unlock operations are recorded. On Bryan's machine with 7.5 and the problematic kvdo build, the lock is recorded, but there's no unlock change reported before the function .return probe is triggered. This seems to rule out some other thread making a bad reference and scribbling over the lock after we've released it.

I've even put probes on and immediately after the "call" instruction to print out the state; %rdi is loaded with the correct address for the lock, and pv_lock_ops+0x10 holds the address of the function __raw_callee_save___native_queued_spin_unlock, but the value in the lock doesn't change across the call. I've looked at __raw_callee_save___native_queued_spin_unlock; it saves registers and calls __native_queued_spin_unlock, which stores a zero byte at the location given in %rdi. And yet...
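
For reference, the unlock side described above boils down to a release-ordered store of a zero byte at the lock's address (the address we see in %rdi). As a rough user-space sketch, again not the kernel source:

  #include <stdatomic.h>
  #include <stdint.h>

  /* Sketch of the unlock fast path: release the lock by storing a zero byte
   * at the lock address.  On the failing machine, no such store was ever
   * observed by the watchpoint before the function returned. */
  static void toy_spin_unlock(_Atomic uint8_t *locked)
  {
    atomic_store_explicit(locked, 0, memory_order_release);
  }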

Comment 12 Ken Raeburn 2017-12-05 22:33:15 UTC
Does anyone still see problems with the newer build of kvdo?

Comment 13 Jakub Krysl 2017-12-06 09:25:55 UTC
kvdo-6.1.0.72-10 works for me on this kernel.

Comment 14 Andy Walsh 2017-12-07 13:33:14 UTC
Since this problem seems to have gone away with the new build, we can't block on it, but we're not comfortable without an answer as to why it happened.  If this happens again for anyone, please be sure to update this ticket.

Comment 15 Marius Vollmer 2017-12-11 08:39:43 UTC
kvdo-6.1.0.72-10 works for me on kernel 3.10.0-797.

Comment 16 Corey Marthaler 2017-12-14 17:33:05 UTC
Just a note that this continues to work with the latest kernels and vdo builds. Removing the proposed blocker flag as well as the TestBlocker keyword.

3.10.0-823.el7.x86_64
vdo-6.1.0.98-13    BUILT: Tue Dec 12 07:35:03 CST 2017
kmod-kvdo-6.1.0.98-11.el7    BUILT: Tue Dec 12 08:08:52 CST 2017


[root@harding-03 ~]# vdo list
PV1
PV3
PV2
PV5
PV4
PV7
PV6
PV8

Comment 17 Andy Walsh 2018-02-20 18:05:25 UTC
Closing this bug as we have not been able to reproduce this problem on a more modern kernel.  If this problem comes back up, this bug can be reopened.