Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 1596565 - libdm hangs waiting on udev notification semaphore
Summary: libdm hangs waiting on udev notification semaphore
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: LVM and device-mapper
Classification: Community
Component: lvm2
Version: 2.02.177
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
: ---
Assignee: LVM and device-mapper development team
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-06-29 09:18 UTC by agusdallalba
Modified: 2018-06-29 10:26 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-29 10:16:18 UTC
rule-engine: lvm-technical-solution?
rule-engine: lvm-test-coverage?


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Github systemd systemd issues 9469 None None None 2018-06-29 09:44:31 UTC

Description agusdallalba 2018-06-29 09:18:45 UTC
I reported this first to systemd (https://github.com/systemd/systemd/issues/9469) and was told this is an issue with the DM client side code. I'll copy and paste the relevant parts of the report below.

How reproducible:
Always.

Steps to Reproduce:

A consequence of this bug is that cryptsetup open hangs after creating the dm device. I discovered this problem because my system got stuck at boot.

0. Fist let's test that cryptsetup works properly:

# fallocate -l 1M test0_file && cryptsetup --debug plainOpen test0_file test0

# cryptsetup 2.0.3 processing "cryptsetup --debug plainOpen test0_file test0"
# Running command open.
# Locking memory.
# Installing SIGINT/SIGTERM handler.
# Unblocking interruption on signal.
# Allocating context for crypt device test0_file.
# Trying to open and read device test0_file with direct-io.
# Initialising device-mapper backend library.
# Formatting device test0_file as type PLAIN.
# Crypto backend (gcrypt 1.8.3) initialized in cryptsetup library version 2.0.3.
# Detected kernel Linux 4.17.2-1-ARCH x86_64.
# Interactive passphrase entry requested.
Enter passphrase for test0_file: 
# Activating volume test0 [keyslot -1] using passphrase.
# dm version   [ opencount flush ]   [16384] (*1)
# dm versions   [ opencount flush ]   [16384] (*1)
# Detected dm-ioctl version 4.39.0.
# Detected dm-crypt version 1.18.1.
# Device-mapper backend running with UDEV support enabled.
# dm status test0  [ opencount flush ]   [16384] (*1)
# Plain: hashing passphrase using ripemd160.
# Allocating a free loop device.
# Trying to open and read device /dev/loop0 with direct-io.
# Calculated device size is 20480 sectors (RW), offset 0.
# Trying to activate PLAIN device test0 using cipher aes-cbc-essiv:sha256.
# DM-UUID is CRYPT-PLAIN-test0
# Udev cookie 0xd4d5ac1 (semid 2359325) created
# Udev cookie 0xd4d5ac1 (semid 2359325) incremented to 1
# Udev cookie 0xd4d5ac1 (semid 2359325) incremented to 2
# Udev cookie 0xd4d5ac1 (semid 2359325) assigned to CREATE task(0) with flags DISABLE_LIBRARY_FALLBACK         (0x20)
# dm create test0 CRYPT-PLAIN-test0 [ opencount flush ]   [16384] (*1)
# dm reload test0  [ opencount flush securedata ]   [16384] (*1)
# dm resume test0  [ opencount flush securedata ]   [16384] (*1)
# test0: Stacking NODE_ADD (254,1) 0:0 0600 [trust_udev]
# test0: Stacking NODE_READ_AHEAD 256 (flags=1)
# Udev cookie 0xd4d5ac1 (semid 2359325) decremented to 1
# Udev cookie 0xd4d5ac1 (semid 2359325) waiting for zero
# Udev cookie 0xd4d5ac1 (semid 2359325) destroyed
# test0: Skipping NODE_ADD (254,1) 0:0 0600 [trust_udev]
# test0: Processing NODE_READ_AHEAD 256 (flags=1)
# test0 (254:1): read ahead is 256
# test0: retaining kernel read ahead of 256 (requested 256)
# Releasing crypt device test0_file context.
# Releasing device-mapper backend.
# Closed loop /dev/loop0 (test0_file).
# Unlocking memory.

Everything works for now.

1. Now, as a minimal example, I create an udev rule with a RUN directive that is triggered many times in quick succession:

# mkdir /run/udev/rules.d && echo 'RUN="/bin/true"' > /etc/udev/rules.d/test.rules

I'm not sure if this only happens with RUN directives, but this is how I discovered the issue.

2. If we try again with cryptsetup:

# fallocate -l 1M test_file && cryptsetup --debug plainOpen test_file test

# cryptsetup 2.0.3 processing "cryptsetup --debug plainOpen test_file test"
# Running command open.
# Locking memory.
# Installing SIGINT/SIGTERM handler.
# Unblocking interruption on signal.
# Allocating context for crypt device test_file.
# Trying to open and read device test_file with direct-io.
# Initialising device-mapper backend library.
# Formatting device test_file as type PLAIN.
# Crypto backend (gcrypt 1.8.3) initialized in cryptsetup library version 2.0.3.
# Detected kernel Linux 4.17.2-1-ARCH x86_64.
# Interactive passphrase entry requested.
Enter passphrase for test_file: 
# Activating volume test [keyslot -1] using passphrase.
# dm version   [ opencount flush ]   [16384] (*1)
# dm versions   [ opencount flush ]   [16384] (*1)
# Detected dm-ioctl version 4.39.0.
# Detected dm-crypt version 1.18.1.
# Device-mapper backend running with UDEV support enabled.
# dm status test  [ opencount flush ]   [16384] (*1)
# Plain: hashing passphrase using ripemd160.
# Allocating a free loop device.
# Trying to open and read device /dev/loop1 with direct-io.
# Calculated device size is 2048 sectors (RW), offset 0.
# Trying to activate PLAIN device test using cipher aes-cbc-essiv:sha256.
# DM-UUID is CRYPT-PLAIN-test
# Udev cookie 0xd4d7151 (semid 2654239) created
# Udev cookie 0xd4d7151 (semid 2654239) incremented to 1
# Udev cookie 0xd4d7151 (semid 2654239) incremented to 2
# Udev cookie 0xd4d7151 (semid 2654239) assigned to CREATE task(0) with flags DISABLE_LIBRARY_FALLBACK         (0x20)
# dm create test CRYPT-PLAIN-test [ opencount flush ]   [16384] (*1)
# dm reload test  [ opencount flush securedata ]   [16384] (*1)
# dm resume test  [ opencount flush securedata ]   [16384] (*1)
# test: Stacking NODE_ADD (254,2) 0:0 0600 [trust_udev]
# test: Stacking NODE_READ_AHEAD 256 (flags=1)
# Udev cookie 0xd4d7151 (semid 2654239) decremented to 1
# Udev cookie 0xd4d7151 (semid 2654239) waiting for zero

Now cryptsetup is stuck in the `semop` syscall.

3. # gdb -p $(pgrep cryptsetup)

[snip]
(gdb) where
#0  0x00007ff43af5a297 in semop () from /usr/lib/libc.so.6
#1  0x00007ff43ac19e73 in _udev_wait (cookie=223179089, nowait=nowait@entry=0x7ffc3cc74ee4) at libdm-common.c:2610
#2  0x00007ff43ac1b577 in dm_udev_wait (cookie=<optimized out>) at libdm-common.c:2629
#3  0x00007ff43b64a8fa in _dm_create_device (name=name@entry=0x7ffc3cc7761c "test", type=type@entry=0x7ff43b66a940 "PLAIN", 
    device=0x564250661eb0, flags=flags@entry=0, uuid=0x0, size=2048, target=DM_CRYPT, 
    params=0x564250664718 "aes-cbc-essiv:sha256 9c1185a5c5e9fc54612808977ee8f548b2258d31ddadef707ba62c166051b9e3 0 /dev/loop1 0", 
    reload=0) at lib/libdevmapper.c:1127
#4  0x00007ff43b64ad5c in dm_create_device (cd=cd@entry=0x5642506623d0, name=name@entry=0x7ffc3cc7761c "test", 
    type=type@entry=0x7ff43b66a940 "PLAIN", dmd=dmd@entry=0x7ffc3cc75100, reload=reload@entry=0) at lib/libdevmapper.c:1207
#5  0x00007ff43b63ca7b in PLAIN_activate (cd=cd@entry=0x5642506623d0, name=name@entry=0x7ffc3cc7761c "test", vk=<optimized out>, 
    size=<optimized out>, flags=flags@entry=0) at lib/setup.c:512
#6  0x00007ff43b63f7f8 in _activate_by_passphrase (cd=0x5642506623d0, name=0x7ffc3cc7761c "test", keyslot=<optimized out>, 
    passphrase=<optimized out>, passphrase_size=<optimized out>, flags=0) at lib/setup.c:2938
#7  0x000056424ece45be in action_open_plain () at src/cryptsetup.c:240
#8  action_open () at src/cryptsetup.c:1766
#9  0x000056424ecdb393 in run_action (action=<optimized out>) at src/cryptsetup.c:2119
#10 main (argc=5, argv=0x7ffc3cc75518) at src/cryptsetup.c:2553
(gdb) up
#1  0x00007ff43ac19e73 in _udev_wait (cookie=223179089, nowait=nowait@entry=0x7ffc3cc74ee4) at libdm-common.c:2610
2610		if (semop(semid, &sb, 1) < 0) {
(gdb) print (int)semctl(semid, 0, 12)
$1 = 1

The semaphore is `1`, and udev doesn't decrement it.

This is as far as I'm able to debug this. I hope it helps.

Thank you.

Comment 1 Peter Rajnoha 2018-06-29 10:16:18 UTC
(In reply to agusdallalba from comment #0)
> 1. Now, as a minimal example, I create an udev rule with a RUN directive
> that is triggered many times in quick succession:
> 
> # mkdir /run/udev/rules.d && echo 'RUN="/bin/true"' >
> /etc/udev/rules.d/test.rules
> 
...
> 2. If we try again with cryptsetup:
> #1  0x00007ff43ac19e73 in _udev_wait (cookie=223179089,
> nowait=nowait@entry=0x7ffc3cc74ee4) at libdm-common.c:2610
> 2610		if (semop(semid, &sb, 1) < 0) {
> (gdb) print (int)semctl(semid, 0, 12)
> $1 = 1
> 
> The semaphore is `1`, and udev doesn't decrement it.
> 

You have used 'RUN=/bin/true' in your udev rule which means that you have replaced the run queue with a single command '/bin/true' and hence you also replaced the 'RUN+="/usr/sbin/dmsetup udevcomplete $env{DM_COOKIE}"' udev rule from /usr/lib/udev/rules.d/95-dm-notify.rules which is supposed to unlock the waiting libdm instance.

You have to use 'RUN+=' to add to the run queue instead of overwriting it with 'RUN='.

Otherwise, you override all the RUN rules which are set before - you're not using a number prefix for your rule file name so it goes after all number-prefixed rules, which, in practice, are actually all rules because adding a number prefix is non-written standard to help make a clear order while executing rules.

Comment 2 agusdallalba 2018-06-29 10:26:11 UTC
I see. Thank you for the explanation. :)


Note You need to log in before you can comment on or make changes to this bug.