Bug 1596973 - Gluster-block resize to be mindful of the underlying block-hosting volume size
Summary: Gluster-block resize to be mindful of the underlying block-hosting volume size
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gluster-block
Version: cns-3.10
Hardware: Unspecified
OS: Unspecified
medium
high
Target Milestone: ---
: ---
Assignee: Prasanna Kumar Kalever
QA Contact: Sweta Anandpara
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-01 01:27 UTC by Sweta Anandpara
Modified: 2018-11-19 08:40 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-19 08:40:17 UTC
Target Upstream Version:



Description Sweta Anandpara 2018-07-01 01:27:12 UTC
Description of problem:
=======================
Per BZ 1467528, 'gluster-block create' errors out at the outset if the requested block size is larger than the underlying block-hosting volume. 'gluster-block modify' does not have the same check.

If we try to increase the size of an existing block with 'gluster-block modify <volname>/<blockname> size <newsize>' and the new size is greater than the space available in the volume, gluster-block attempts the resize without any check. It eventually runs out of space, fills up the entire volume, and finally times out without succeeding.

The feasibility of the new size has to be checked at the very beginning, before any attempt to grow the block.
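
The kind of up-front check being asked for can be sketched as below. This is only an illustration in Python, not gluster-block's code: the function names `available_bytes` and `check_resize` are hypothetical, and reading free space with `os.statvfs` on a mount of the block-hosting volume is an assumption about where the number would come from.

```python
import os


def available_bytes(path):
    """Bytes available to the caller on the filesystem that holds `path`."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def check_resize(current_size, new_size, free_bytes):
    """Decide whether a resize is feasible before touching the backing file.

    Returns (ok, reason). Only growth needs extra space; shrinking is left
    to the existing 'force' handling and always passes this check.
    """
    if new_size <= current_size:
        return True, "no additional space needed"
    needed = new_size - current_size
    if needed > free_bytes:
        return False, "need %d more bytes, only %d available" % (needed, free_bytes)
    return True, "enough space available"


if __name__ == "__main__":
    # Hypothetical numbers modelled on this report: a 512-byte block,
    # a 1 TiB request, and about 50 GiB free in the hosting volume.
    print(check_resize(512, 1024 ** 4, 50 * 1024 ** 3))
    # Against a real mount, free space would come from the mount point,
    # for example available_bytes("/mnt/<volname>").
```

Rejecting the request when the check fails would leave the backing file untouched, so the volume would not be filled before the error is reported.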


Version-Release number of selected component (if applicable):
=============================================================
[root@dhcp46-50 ~]# rpm -qa | grep gluster
glusterfs-server-3.8.4-54.12.el7rhgs.x86_64
glusterfs-cli-3.8.4-54.12.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.6.x86_64
glusterfs-rdma-3.8.4-54.12.el7rhgs.x86_64
python-gluster-3.8.4-54.12.el7rhgs.noarch
glusterfs-3.8.4-54.12.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.2.el7rhgs.noarch
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-libs-3.8.4-54.12.el7rhgs.x86_64
glusterfs-api-3.8.4-54.12.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-client-xlators-3.8.4-54.12.el7rhgs.x86_64
glusterfs-fuse-3.8.4-54.12.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-54.12.el7rhgs.x86_64
gluster-block-0.2.1-20.el7rhgs.x86_64
[root@dhcp46-50 ~]# rpm -qa | grep tcmu
libtcmu-1.2.0-20.el7rhgs.x86_64
tcmu-runner-1.2.0-20.el7rhgs.x86_64
[root@dhcp46-50 ~]# rpm -qa | grep configshell
python-configshell-1.1.fb23-4.el7_5.noarch
[root@dhcp46-50 ~]# rpm -qa | grep rtslib
python-rtslib-2.1.fb63-12.el7_5.noarch
[root@dhcp46-50 ~]# rpm -qa | grep targetcli
targetcli-2.1.fb46-6.el7_5.noarch
[root@dhcp46-50 ~]# 


How reproducible:
==================
Always

Steps to Reproduce:
=====================
1. Have a replica 3 volume on a cluster with brick-mux enabled, and apply the 'gluster-block' option group to the volume.
2. Create a block of (say) 10 MiB.
3. Note the size of the underlying block-hosting volume; in other words, the size of the LVM volume hosting the brick.
4. Increase the size of the block beyond the size noted in step 3, using the command 'gluster-block modify <volname>/<blockname> size <size>'.

Actual results:
===============
Step 4 times out. The logs report 'No space left on device'. The /mnt/<volname>/block-store/<block_gbid> file is grown to the maximum size possible, and 'df -h' shows both the mountpoint and the LVM partition as 100% full.
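
To put numbers on the mismatch, the size strings used in the session under "Additional info" can be converted to bytes. The sketch below is a hypothetical helper, not gluster-block's parser: `parse_size` and the `UNITS` table are names made up for illustration, the binary multipliers are inferred from the transcript (12M is reported as 12582912 bytes, 1T as 1099511627776 bytes), and the K and G suffixes are assumed by analogy.

```python
# Binary multipliers inferred from the sizes reported in this bug's transcript.
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a size such as '512', '12M' or '1T' into a byte count."""
    text = text.strip().upper()
    suffix = text[-1] if text and text[-1] in "KMGT" else ""
    number = text[:-1] if suffix else text
    return int(number) * UNITS[suffix]


requested = parse_size("1T")
# 'df -h' reports the hosting volume as 50G, so treat it as roughly 50 GiB.
volume = 50 * 1024 ** 3

print(requested)            # bytes requested by 'size 1T'
print(requested // volume)  # how many times larger than the volume
```

With these figures the request is about twenty times the size of the hosting volume, so it could have been rejected from the sizes alone, before any allocation started.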

Expected results:
=================
The size check has to be done at the outset, when the 'gluster-block modify' command is issued.


Additional info:
================

[root@dhcp46-50 ~]# gluster-block modify ozone/ob2 size 15M
IQN: iqn.2016-12.org.gluster-block:8d83c3cf-7230-4bee-a9f6-310e61b8bd00
SIZE: 15.0 MiB
SUCCESSFUL ON:  10.70.46.50 10.70.46.102 10.70.46.176
RESULT: SUCCESS
[root@dhcp46-50 ~]# gluster-block info ozone/ob2
NAME: ob2
VOLUME: ozone
GBID: 8d83c3cf-7230-4bee-a9f6-310e61b8bd00
SIZE: 15.0 MiB
HA: 3
PASSWORD: 5827a5c2-6eaf-4154-9ce5-fa0398f20c13
EXPORTED ON: 10.70.46.50 10.70.46.102 10.70.46.176
[root@dhcp46-50 ~]# gluster-block modify ozone/ob2 size 12M
Shrink size ?
use 'force' option [current size 15.0 MiB, request size 12.0 MiB]
RESULT:FAIL
[root@dhcp46-50 ~]# gluster-block modify ozone/ob2 size 12M force
IQN: iqn.2016-12.org.gluster-block:8d83c3cf-7230-4bee-a9f6-310e61b8bd00
SIZE: 12.0 MiB
SUCCESSFUL ON:  10.70.46.50 10.70.46.102 10.70.46.176
RESULT: SUCCESS
[root@dhcp46-50 ~]# gluster-block info ozone/ob2
NAME: ob2
VOLUME: ozone
GBID: 8d83c3cf-7230-4bee-a9f6-310e61b8bd00
SIZE: 12.0 MiB
HA: 3
PASSWORD: 5827a5c2-6eaf-4154-9ce5-fa0398f20c13
EXPORTED ON: 10.70.46.50 10.70.46.102 10.70.46.176
[root@dhcp46-50 ~]# targetcli ls /backstores/user:glfs/ob2
o- ob2 .................................... [ozone@10.70.46.50/block-store/8d83c3cf-7230-4bee-a9f6-310e61b8bd00 (12.0MiB) activated]
  o- alua ......................................................................................................... [ALUA Groups: 3]
    o- default_tg_pt_gp ............................................................................. [ALUA state: Active/optimized]
    o- glfs_tg_pt_gp_ano ........................................................................ [ALUA state: Active/non-optimized]
    o- glfs_tg_pt_gp_ao ............................................................................. [ALUA state: Active/optimized]
[root@dhcp46-50 ~]# 
[root@dhcp46-50 ~]# 
[root@dhcp46-50 ~]# targetcli  /backstores/user:glfs/ob2 get attribute dev_size
dev_size=12582912 
[root@dhcp46-50 ~]# mkdir /mnt/ozone
[root@dhcp46-50 ~]# mount -t glusterfs 10.70.46.50:ozone /mnt/ozone
[root@dhcp46-50 ~]# cd /mnt/ozone
[root@dhcp46-50 ozone]# 
[root@dhcp46-50 ozone]# cd block-store/
[root@dhcp46-50 block-store]# ll
total 12288
-rw-------. 1 root root 12582912 Jun 30 20:49 8d83c3cf-7230-4bee-a9f6-310e61b8bd00
-rw-------. 1 root root 10485760 Jun 30 19:26 a55295bf-844a-4e92-9a85-dbf470a2b75c
[root@dhcp46-50 block-store]# gluster-block modify ozone/ob2 size 500
minimum acceptable block size is 512 bytes
[root@dhcp46-50 block-store]# gluster-block modify ozone/ob2 size 500 force
minimum acceptable block size is 512 bytes
[root@dhcp46-50 block-store]# gluster-block modify ozone/ob2 size 512 force
IQN: iqn.2016-12.org.gluster-block:8d83c3cf-7230-4bee-a9f6-310e61b8bd00
SIZE: 512.0 B
SUCCESSFUL ON:  10.70.46.50 10.70.46.102 10.70.46.176
RESULT: SUCCESS
[root@dhcp46-50 block-store]# gluster-block info  ozone/ob2
NAME: ob2
VOLUME: ozone
GBID: 8d83c3cf-7230-4bee-a9f6-310e61b8bd00
SIZE: 512.0 B
HA: 3
PASSWORD: 5827a5c2-6eaf-4154-9ce5-fa0398f20c13
EXPORTED ON: 10.70.46.50 10.70.46.102 10.70.46.176
[root@dhcp46-50 block-store]# ll
total 1
-rw-------. 1 root root      512 Jun 30 20:52 8d83c3cf-7230-4bee-a9f6-310e61b8bd00
-rw-------. 1 root root 10485760 Jun 30 19:26 a55295bf-844a-4e92-9a85-dbf470a2b75c
[root@dhcp46-50 block-store]# gluster-block modify ozone/ob2 size 1T



Did not receive any response from gluster-block daemon. Please check log files to find the reason
[root@dhcp46-50 block-store]# 
[root@dhcp46-50 block-store]# 
[root@dhcp46-50 block-store]# 
[root@dhcp46-50 block-store]# vim /var/log/gluster-block/gluster-blockd.log 
[root@dhcp46-50 block-store]# vim /var/log/gluster-block/gluster-blockd.log ^C
[root@dhcp46-50 block-store]# systemctl status gluster-blockd
● gluster-blockd.service - Gluster block storage utility
   Loaded: loaded (/usr/lib/systemd/system/gluster-blockd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2018-06-30 20:33:03 EDT; 26min ago
 Main PID: 14149 (gluster-blockd)
    Tasks: 18
   CGroup: /system.slice/gluster-blockd.service
           └─14149 /usr/sbin/gluster-blockd --glfs-lru-count 5 --log-level INFO

Jun 30 20:33:03 dhcp46-50.lab.eng.blr.redhat.com systemd[1]: Started Gluster block storage utility.
Jun 30 20:33:03 dhcp46-50.lab.eng.blr.redhat.com systemd[1]: Starting Gluster block storage utility...
Jun 30 20:33:04 dhcp46-50.lab.eng.blr.redhat.com gluster-blockd[14149]: Parameter auto_save_on_exit is now 'false'.
Jun 30 20:33:04 dhcp46-50.lab.eng.blr.redhat.com gluster-blockd[14149]: Parameter logfile is now '/var/log/gluster-block/gluster-block-configshell.log'.
Jun 30 20:33:04 dhcp46-50.lab.eng.blr.redhat.com gluster-blockd[14149]: Parameter loglevel_file is now 'info'.
Jun 30 20:33:04 dhcp46-50.lab.eng.blr.redhat.com gluster-blockd[14149]: Parameter auto_enable_tpgt is now 'false'.
Jun 30 20:33:04 dhcp46-50.lab.eng.blr.redhat.com gluster-blockd[14149]: Parameter auto_add_default_portal is now 'false'.
[root@dhcp46-50 block-store]# systemctl status ^C
[root@dhcp46-50 block-store]# vim /var/^C
[root@dhcp46-50 block-store]# grep -R "trace" /var/log/gluster-block/*.log
[root@dhcp46-50 block-store]# vim /var/log/gluster-block/gluster-block-c
[root@dhcp46-50 block-store]# vim /var/log/gluster-block/gluster-block-cli.log 
[root@dhcp46-50 block-store]# vim /var/log/gluster-block/gluster-block
gluster-block-cli.log          gluster-block-configshell.log  gluster-blockd.log             gluster-block-gfapi.log        
[root@dhcp46-50 block-store]# vim /var/log/gluster-block/gluster-block-gfapi.log 
[root@dhcp46-50 block-store]# tail /var/log/gluster-block/gluster-block-gfapi.log
The message "W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-2: remote operation failed [No space left on device]" repeated 6499 times between [2018-07-01 00:57:08.794557] and [2018-07-01 00:59:08.952428]
The message "W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-0: remote operation failed [No space left on device]" repeated 13048 times between [2018-07-01 00:57:34.643151] and [2018-07-01 00:59:08.957384]
The message "W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-1: remote operation failed [No space left on device]" repeated 13052 times between [2018-07-01 00:57:11.478349] and [2018-07-01 00:59:08.957499]
[2018-07-01 00:59:08.968947] W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-0: remote operation failed [No space left on device]
[2018-07-01 00:59:08.969538] W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-1: remote operation failed [No space left on device]
[2018-07-01 00:59:08.986766] W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-2: remote operation failed [No space left on device]
[2018-07-01 00:59:41.180779] ERROR: glfs_zerofill(8d83c3cf-7230-4bee-a9f6-310e61b8bd00): on volume ozone for block ob2 of size 1099511627776 failed[No space left on device] [at glfs-operations.c+291 :<glusterBlockResizeEntry>]
The message "W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-1: remote operation failed [No space left on device]" repeated 2528 times between [2018-07-01 00:59:08.969538] and [2018-07-01 00:59:18.696171]
The message "W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-0: remote operation failed [No space left on device]" repeated 2528 times between [2018-07-01 00:59:08.968947] and [2018-07-01 00:59:18.713178]
The message "W [MSGID: 114031] [client-rpc-fops.c:2109:client3_3_zerofill_cbk] 0-ozone-client-2: remote operation failed [No space left on device]" repeated 4566 times between [2018-07-01 00:59:08.986766] and [2018-07-01 00:59:27.066719]
[root@dhcp46-50 block-store]# 
[root@dhcp46-50 block-store]# 
[root@dhcp46-50 block-store]# vim /var/log/gluster-block/gluster-block
gluster-block-cli.log          gluster-block-configshell.log  gluster-blockd.log             gluster-block-gfapi.log        
[root@dhcp46-50 block-store]# vim /var/log/gluster-block/gluster-block-configshell.log 
[root@dhcp46-50 block-store]# gluster-block info ozone/ob2
NAME: ob2
VOLUME: ozone
GBID: 8d83c3cf-7230-4bee-a9f6-310e61b8bd00
SIZE: 512.0 B
HA: 3
PASSWORD: 5827a5c2-6eaf-4154-9ce5-fa0398f20c13
EXPORTED ON: 10.70.46.50 10.70.46.102 10.70.46.176
[root@dhcp46-50 block-store]# 
[root@dhcp46-50 block-store]# ll
total 1
-rw-------. 1 root root 1099511627776 Jun 30 20:56 8d83c3cf-7230-4bee-a9f6-310e61b8bd00
-rw-------. 1 root root      10485760 Jun 30 19:26 a55295bf-844a-4e92-9a85-dbf470a2b75c
[root@dhcp46-50 block-store]# pwd
/mnt/ozone/block-store
[root@dhcp46-50 block-store]# df -h
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/rhgs-root         44G  4.8G   40G  11% /
devtmpfs                     7.8G     0  7.8G   0% /dev
tmpfs                        7.8G     0  7.8G   0% /dev/shm
tmpfs                        7.8G  338M  7.5G   5% /run
tmpfs                        7.8G     0  7.8G   0% /sys/fs/cgroup
/dev/mapper/RHS_vg1-RHS_lv1   50G   34M   50G   1% /bricks/brick1
/dev/mapper/RHS_vg2-RHS_lv2   50G   33M   50G   1% /bricks/brick2
/dev/mapper/RHS_vg3-RHS_lv3   50G   33M   50G   1% /bricks/brick3
/dev/vda1                   1014M  174M  841M  18% /boot
/dev/mapper/RHS_vg0-RHS_lv0   50G   50G  3.4M 100% /bricks/brick0
tmpfs                        1.6G     0  1.6G   0% /run/user/0
10.70.46.50:ozone             50G   50G     0 100% /mnt/ozone
[root@dhcp46-50 block-store]#

Comment 7 Amar Tumballi 2018-11-19 08:40:17 UTC
Considering the discussion that this is more of a 'resize' bug, and resize is not part of OCS, we should take this upstream and fix it there, so that we get the fix when we backport.

