Bug 986084 - pool meta device resizing can lead to problems [NEEDINFO]
Summary: pool meta device resizing can lead to problems
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Zdenek Kabelac
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 1055944
Blocks:
 
Reported: 2013-07-18 23:07 UTC by Corey Marthaler
Modified: 2014-06-18 01:19 UTC

Fixed In Version: lvm2-2.02.99-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-13 12:07:31 UTC
Target Upstream Version:
zkabelac: needinfo? (msnitzer)



Description Corey Marthaler 2013-07-18 23:07:16 UTC
Description of problem:
SCENARIO - [resize_pool_meta_device]
Create an XFS filesystem, mount it, snapshot it, and attempt to resize it's pool meta device while online
Making origin volume
lvcreate --thinpool POOL -L 2G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy
Sanity checking pool device metadata
(thin_check /dev/mapper/snapper_thinp-POOL_tmeta)
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n origin
lvcreate -V 1G -T snapper_thinp/POOL -n other1
lvcreate -V 1G -T snapper_thinp/POOL -n other2
lvcreate -V 1G -T snapper_thinp/POOL -n other3
lvcreate -V 1G -T snapper_thinp/POOL -n other4
lvcreate -V 1G -T snapper_thinp/POOL -n other5
Placing an XFS filesystem on origin volume
Mounting origin volume

Making snapshot of origin volume
lvcreate -s /dev/snapper_thinp/origin -n meta_resize

Attempt to resize the open snapshoted filesystem multiple times with lvextend/fsadm on qalvm-01
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 12.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 20.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 28.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 36.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 44.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 52.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 60.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Problem reactivating POOL
  libdevmapper exiting with 2 device(s) still suspended.
Online meta device resize failed
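
The "2 device(s) still suspended" message above can be followed up by checking which device-mapper devices were actually left suspended. A minimal sketch, assuming the usual `Name:` / `State:` layout of `dmsetup info` output; the helper name `suspended_devices` is made up for illustration and is not part of lvm2 or the test suite:

```shell
# List device-mapper devices whose state is SUSPENDED.
# Reads `dmsetup info` output on stdin, so it also works on saved output.
# (Hypothetical helper; not part of lvm2 or device-mapper.)
suspended_devices() {
    awk '
        $1 == "Name:"  { name = $2 }                        # remember the current device
        $1 == "State:" && $2 == "SUSPENDED" { print name }  # report it if suspended
    '
}

# Usage (as root):
#   dmsetup info | suspended_devices
```

Reading from stdin keeps the helper usable on output captured from a hung machine, where running more commands against the pool may block.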

[  138.058126] device-mapper: thin: failed to resize metadata device

[root@qalvm-01 ~]# df -h
/dev/mapper/snapper_thinp-origin 1014M   33M  982M   4% /mnt/origin

[root@qalvm-01 ~]# umount /mnt/origin
[DEADLOCK]

kernel: [  576.274035] umount          D ffff88007fd14600     0  1580    808 0x00000080
kernel: [  576.274035]  ffff88007c015d88 0000000000000086 ffff88007c015fd8 0000000000014600
kernel: [  576.274035]  ffff88007c015fd8 0000000000014600 ffff880070ea0000 ffff880070163800
kernel: [  576.274035]  ffff880070ea0000 0000000000000001 0000000000000000 ffff880070163928
kernel: [  576.274035] Call Trace:
kernel: [  576.274035]  [<ffffffff81602969>] schedule+0x29/0x70
kernel: [  576.274035]  [<ffffffffa01a5109>] _xfs_log_force+0x1e9/0x2a0 [xfs]
kernel: [  576.274035]  [<ffffffff81094290>] ? wake_up_state+0x20/0x20
kernel: [  576.274035]  [<ffffffffa01a51e6>] xfs_log_force+0x26/0x80 [xfs]
kernel: [  576.274035]  [<ffffffffa015aced>] xfs_fs_sync_fs+0x2d/0x50 [xfs]
kernel: [  576.274035]  [<ffffffff811c74f2>] sync_filesystem+0x72/0xa0
kernel: [  576.274035]  [<ffffffff8119cb00>] generic_shutdown_super+0x30/0xd0
kernel: [  576.274035]  [<ffffffff8119cdb7>] kill_block_super+0x27/0x70
kernel: [  576.274035]  [<ffffffff8119d12d>] deactivate_locked_super+0x3d/0x60
kernel: [  576.274035]  [<ffffffff8119d196>] deactivate_super+0x46/0x60
kernel: [  576.274035]  [<ffffffff811b8155>] mntput_no_expire+0xc5/0x120
kernel: [  576.274035]  [<ffffffff811b8f21>] SyS_umount+0x91/0x3a0
kernel: [  576.274035]  [<ffffffff8160c919>] system_call_fastpath+0x16/0x1b


Version-Release number of selected component (if applicable):
3.10.0-0.rc5.61.el7.x86_64
lvm2-2.02.99-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
lvm2-libs-2.02.99-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
lvm2-cluster-2.02.99-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
device-mapper-1.02.78-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
device-mapper-libs-1.02.78-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
device-mapper-event-1.02.78-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
device-mapper-event-libs-1.02.78-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
cmirror-2.02.99-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013


How reproducible:
Every time
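
The scenario from the description can be condensed into a script along these lines. This is only a sketch: the volume group `snapper_thinp` is assumed to exist already, the `mkfs.xfs` and `mount` invocations are assumptions (the description names the steps but does not show the commands), and `run` / `repro_pool_meta_resize` are hypothetical helper names. It prints the commands by default instead of executing them.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root, on a disposable test VG only) to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the steps shown in the description.
repro_pool_meta_resize() {
    vg=${1:-snapper_thinp}
    pool=${2:-POOL}
    attempts=${3:-8}      # the 8th +8M resize was the failing one in the description

    run lvcreate --thinpool "$pool" -L 2G "$vg" || return 1
    for name in origin other1 other2 other3 other4 other5; do
        run lvcreate -V 1G -T "$vg/$pool" -n "$name" || return 1
    done

    # Assumed commands: the report only says "Placing an XFS filesystem"
    # and "Mounting origin volume".
    run mkfs.xfs "/dev/$vg/origin" || return 1
    run mount "/dev/$vg/origin" /mnt/origin || return 1

    run lvcreate -s "/dev/$vg/origin" -n meta_resize || return 1

    i=1
    while [ "$i" -le "$attempts" ]; do
        run lvresize --poolmetadatasize +8M "/dev/$vg/$pool" || return 1
        i=$((i + 1))
    done
}

# Usage:
#   repro_pool_meta_resize              # print the command sequence
#   DRY_RUN=0 repro_pool_meta_resize    # execute it (can hang the machine on affected kernels)
```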

Comment 1 Corey Marthaler 2013-07-18 23:13:28 UTC
# AFTER REBOOT 

# Which device exactly is full here? I don't get it.

[root@qalvm-01 ~]# pvscan
  PV /dev/vdh1   VG snapper_thinp   lvm2 [2.00 GiB / 0    free]
  PV /dev/vdg1   VG snapper_thinp   lvm2 [2.00 GiB / 1.99 GiB free]
  PV /dev/vdf1   VG snapper_thinp   lvm2 [2.00 GiB / 2.00 GiB free]
  PV /dev/vde1   VG snapper_thinp   lvm2 [2.00 GiB / 2.00 GiB free]
  PV /dev/vdd1   VG snapper_thinp   lvm2 [2.00 GiB / 1.93 GiB free]


[root@qalvm-01 ~]# lvs -a -o +devices
  LV           VG            Attr      LSize  Pool Origin Devices
  POOL         snapper_thinp twi---tz-  2.00g             POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi-a----  2.00g             /dev/vdh1(0)
  [POOL_tdata] snapper_thinp Twi-a----  2.00g             /dev/vdg1(0)
  [POOL_tmeta] snapper_thinp ewi-a---- 68.00m             /dev/vdd1(0)
  meta_resize  snapper_thinp Vwi---tz-  1.00g POOL origin
  origin       snapper_thinp Vwi---tz-  1.00g POOL
  other1       snapper_thinp Vwi---tz-  1.00g POOL
  other2       snapper_thinp Vwi---tz-  1.00g POOL
  other3       snapper_thinp Vwi---tz-  1.00g POOL
  other4       snapper_thinp Vwi---tz-  1.00g POOL
  other5       snapper_thinp Vwi---tz-  1.00g POOL

[root@qalvm-01 ~]# vgchange -an snapper_thinp
  0 logical volume(s) in volume group "snapper_thinp" now active

[root@qalvm-01 ~]# lvremove snapper_thinp
Removing pool "POOL" will remove 7 dependent volume(s). Proceed? [y/n]: y
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.

Comment 2 Corey Marthaler 2013-07-18 23:36:10 UTC
I attempted to swap in a new meta device (knowing that it probably wouldn't work due to bug 973419/973432).

[root@qalvm-01 ~]# lvcreate -n new_meta -L 68M snapper_thinp
  Logical volume "new_meta" created

[root@qalvm-01 ~]# lvconvert --poolmetadata snapper_thinp/new_meta --thinpool snapper_thinp/POOL
  Attempted to decrement suspended device counter below zero.
Do you want to swap metadata of snapper_thinp/POOL pool with volume snapper_thinp/new_meta? [y/n]: y
  device-mapper: create ioctl on snapper_thinp-POOL_tmeta failed: Device or resource busy
  Failed to activate pool logical volume snapper_thinp/POOL.
  Device snapper_thinp-POOL_tdata (253:3) is used by another device.
  Failed to deactivate pool data logical volume.

Comment 3 Heinz Mauelshagen 2013-07-19 08:10:26 UTC
Corey,

It seems this snippet is responsible:

/* Release unneeded blocks in thin pool */
/* TODO: defer when multiple LVs relased at once */
if (pool_lv && !update_pool_lv(pool_lv, 1)) {
        log_error("Failed to update thin pool %s.", pool_lv->name);
        return 0;
}

If so, are you able to remove the thin volumes individually?
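
One way to try the individual removal is to pick the thin volumes out of `lvs` output by their attribute string, whose first character is `V` for a thin volume (as in the listing in comment 1). A rough sketch; `thin_volumes` is a hypothetical helper name, and it assumes the columns arrive in the order LV, VG, Attr:

```shell
# Print "VG/LV" for every thin volume in `lvs` output read from stdin.
# Assumes column order: LV name, VG name, attributes.
# (Hypothetical helper; not part of lvm2.)
thin_volumes() {
    awk '$3 ~ /^V/ { print $2 "/" $1 }'
}

# Usage (as root); lvremove still asks for confirmation on each volume:
#   lvs --noheadings -o lv_name,vg_name,lv_attr snapper_thinp | thin_volumes |
#   while read -r lv; do
#       lvremove "$lv" </dev/tty
#   done
```

Pool, data and metadata volumes (`t`, `T`, `e` in the first attribute position) are skipped, so only the dependent thin volumes are attempted.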

Comment 4 Zdenek Kabelac 2013-07-19 09:04:35 UTC
Unfortunately, pool metadata resize is currently waiting on Joe's kernel patch.
There is a bug that limits the maximum size the metadata device can be resized to.
For example, for 2MB the max is <64MB. There are some rules about that, but with the current kernel version the feature should be marked as unavailable.

It's currently enabled for debugging and testing purposes.
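
The sizes in the description look consistent with a bound of this kind: the first resize reported 12.00m, so the metadata device started at 4M, and every +8M step succeeded up to 60M. The arithmetic can be sketched as below, assuming (this is not confirmed in the report) that the <64MB figure also applies to this pool; `first_resize_at_or_over` is a hypothetical helper name.

```shell
# Print "<attempt> <size>" for the first resize that reaches or exceeds LIMIT.
# Arguments: START STEP LIMIT (all in MiB, integers).
# (Hypothetical helper for illustration only.)
first_resize_at_or_over() {
    size=$1
    step=$2
    limit=$3
    [ "$step" -gt 0 ] || return 1   # a non-positive step would never reach the limit
    attempt=0
    while [ "$size" -lt "$limit" ]; do
        size=$((size + step))
        attempt=$((attempt + 1))
    done
    echo "$attempt $size"
}

# Usage: 4M start, +8M per resize, assumed 64M bound
#   first_resize_at_or_over 4 8 64
```

With those inputs the first size at or over the bound is reached on the eighth resize, which is the attempt that failed in the description.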

Comment 5 Zdenek Kabelac 2013-08-02 09:39:55 UTC
This upstream patch adds a requirement for version 1.9:

http://www.redhat.com/archives/lvm-devel/2013-July/msg00264.html

Comment 6 Peter Rajnoha 2013-08-05 08:28:43 UTC
(In reply to Zdenek Kabelac from comment #5)
> This upstream patch puts requirement for version 1.9:
> 
> http://www.redhat.com/archives/lvm-devel/2013-July/msg00264.html

(this is already in latest RHEL7 package - lvm2-2.02.99-1.el7)

Comment 8 Corey Marthaler 2014-01-10 21:08:31 UTC
This still exists in the latest rpms.

3.10.0-64.el7.x86_64
lvm2-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
lvm2-libs-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
lvm2-cluster-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-libs-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-event-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-event-libs-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-persistent-data-0.2.8-2.el7    BUILT: Wed Oct 30 10:20:48 CDT 2013
cmirror-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014


(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:5)
  Problem reactivating POOL
  libdevmapper exiting with 2 device(s) still suspended.
Online meta device resize failed

/dev/mapper/snapper_thinp-origin on /mnt/origin type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
[root@host-049 ~]# umount /mnt/origin
[DEADLOCK]

Comment 9 Zdenek Kabelac 2014-01-28 13:06:22 UTC
Online metadata resize requires the latest thin-pool kernel target, with the fixes mentioned in Bug 1056647 and the thin-related ones mainly in Bug 1055944.

A temporarily usable brew kernel build can be found in
Bug 1056647 comment 27.

Related upstream lvm2 thin commits, which enable use of thin pool target 1.10 for online metadata resize:

https://www.redhat.com/archives/lvm-devel/2014-January/msg00036.html
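
Before testing online metadata resize, the running kernel's thin-pool target version can be compared against 1.10. A minimal sketch, assuming the `name  vMAJOR.MINOR.PATCH` layout that `dmsetup targets` prints; `thin_pool_target_at_least` is a hypothetical helper name:

```shell
# Succeed if the thin-pool target listed on stdin (`dmsetup targets` output)
# has version >= MAJOR.MINOR. Fails if no thin-pool target is listed.
# (Hypothetical helper; not part of lvm2 or device-mapper.)
thin_pool_target_at_least() {
    want_major=$1
    want_minor=$2
    awk -v wmaj="$want_major" -v wmin="$want_minor" '
        $1 == "thin-pool" {
            v = $2
            sub(/^v/, "", v)        # strip the leading "v"
            split(v, p, ".")        # p[1]=major, p[2]=minor
            found = 1
            ok = (p[1] + 0 > wmaj + 0) ||
                 (p[1] + 0 == wmaj + 0 && p[2] + 0 >= wmin + 0)
        }
        END { exit ((found && ok) ? 0 : 1) }
    '
}

# Usage (as root):
#   if dmsetup targets | thin_pool_target_at_least 1 10; then
#       echo "thin-pool target is new enough"
#   fi
```

The components are compared numerically rather than as strings, so 1.10 is correctly treated as newer than 1.9.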

Comment 14 Corey Marthaler 2014-02-10 19:40:19 UTC
This is now fixed in the latest kernel.

3.10.0-84.el7.x86_64
lvm2-2.02.105-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
lvm2-libs-2.02.105-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
lvm2-cluster-2.02.105-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-1.02.84-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-libs-1.02.84-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-event-1.02.84-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-event-libs-1.02.84-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-persistent-data-0.2.8-4.el7    BUILT: Fri Jan 24 14:28:55 CST 2014
cmirror-2.02.105-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014

Comment 15 Ludek Smid 2014-06-13 12:07:31 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

