Bug 1367806 - [RFE] - Add "blkdiscard" as a new method to zero volumes
Summary: [RFE] - Add "blkdiscard" as a new method to zero volumes
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: RFEs
Version: 4.18.15
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high (1 vote)
Target Milestone: ovirt-4.2.0
Target Release: 4.20.9
Assignee: Idan Shaby
QA Contact: Elad
URL:
Whiteboard:
Depends On: 1327886
Blocks: wipe_disk_using_blkdiscard 1314382 1475780 1487151
 
Reported: 2016-08-17 14:16 UTC by Idan Shaby
Modified: 2017-12-20 10:53 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
A new VDSM parameter enables a host to remove a disk/snapshot on block storage, where "Wipe After Delete" is enabled, in much less time than the "dd" command, especially if the storage supports "Write same."
Clone Of:
Clones: 1475780
Environment:
Last Closed: 2017-12-20 10:53:54 UTC
oVirt Team: Storage
rule-engine: ovirt-4.2+
ratamir: testing_plan_complete-
ylavi: planning_ack+
rule-engine: devel_ack+
ratamir: testing_ack+


Attachments


Links
System ID Branch Status Summary Last Updated
oVirt gerrit 68804 master MERGED storagetestlib: introduce the Aborting tests tool 2016-12-26 08:17:29 UTC
oVirt gerrit 68805 master MERGED storage: add an option to zero volumes using "blkdiscard --zeroout" 2017-07-27 08:56:03 UTC
oVirt gerrit 79449 master POST storage: change blockdev tests to use loop device 2017-07-16 09:20:28 UTC

Description Idan Shaby 2016-08-17 14:16:49 UTC
Description of problem:
Today, vdsm uses the "dd" Linux command to wipe volumes.

The problem with using "dd" to wipe volumes is that it is very slow (~7 minutes to wipe a 10GB volume on NetApp in my environment).
To zero volumes more efficiently, vdsm can use the "blkdiscard" command from the util-linux package, which can run up to ~10 times faster.

Version-Release number of selected component (if applicable):
7cf1dbe1b669e9dab203b33baae34192bf01e114

Steps to Reproduce:
1. Create a disk on a block storage domain.
2. Set its "Wipe After Delete" property to true.
3. Remove the disk and observe in the vdsm log that the operation is very slow.
You can measure this by calculating the time that passes between the log message "Zero volume thread started for volume <volume_id>" and the log message "Zero volume <volume_id> task <task_id> completed" (see the example below).
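For example (assuming the default vdsm log path of /var/log/vdsm/vdsm.log; the exact message text may vary between versions), the two timestamps can be pulled out with a simple grep, and the wipe duration is their difference:

$ grep "Zero volume" /var/log/vdsm/vdsm.log | grep <volume_id>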

Actual results:
In my environment it takes ~ 7 minutes to wipe a 10GB disk.

Expected results:
Should be quicker, if it's possible.

Additional info:
Calling "blkdiscard -z <block_device>" should work at least fast as "dd", and up to ~ 10 times faster, as it calls "write same" if the device supports it.

Comment 3 Idan Shaby 2016-11-07 06:12:42 UTC
No, I tried to figure out what we should do to prevent that, but we decided that we should finish bug 1241106 first and then come back to this one.
The last thing I saw was that the command failed with timeouts only when I ran it from vdsm. I guess that we should run it with a higher priority, or maybe run it in a different way (not the way we run dd today).
Anyway, right now there's no need for a bug.

Comment 4 Idan Shaby 2017-07-27 10:44:59 UTC
Bug 1475780 was opened to switch the default zero method to "blkdiscard".
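If the new option follows the usual vdsm configuration pattern, selecting the zero method on a host would look roughly like the sketch below. The option name and section are assumptions based on the linked gerrit patches, not a verified excerpt of the shipped configuration:

# /etc/vdsm/vdsm.conf (sketch; "zero_method" is an assumed option name)
[irs]
# how volumes with "Wipe After Delete" enabled are zeroed: "blkdiscard" (new) or "dd" (legacy)
zero_method = blkdiscard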

Comment 5 Raz Tamir 2017-08-31 09:52:58 UTC
Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos

wipe after delete took 1.5 minutes for a 10GB disk on a block domain

Is there any chance to move the messages
Zero volume thread started for volume <VOL_ID>
and
Zero volume thread finished for volume <VOL_ID>

to be on INFO logger level or should I open a new bug for that request?

Comment 6 Allon Mureinik 2017-08-31 09:58:50 UTC
(In reply to Raz Tamir from comment #5)
> Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos
> 
> wipe after delete took 1.5 minutes for a 10GB disk on a block domain
On the same setup/storage, how long does it take to delete a 10GB disk with the "old" method?

> 
> Is there any chance to move the messages
> Zero volume thread started for volume <VOL_ID>
> and
> Zero volume thread finished for volume <VOL_ID>
> 
> to be on INFO logger level or should I open a new bug for that request?
Let's have a new BZ for this please

Comment 7 Raz Tamir 2017-08-31 10:36:00 UTC
(In reply to Allon Mureinik from comment #6)
> (In reply to Raz Tamir from comment #5)
> > Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos
> > 
> > wipe after delete took 1.5 minutes for a 10GB disk on a block domain
> On the same setup/storage, how long does it take to delete a 10GB disk with
> the "old" method?
~ 5 minutes
> 
> > 
> > Is there any chance to move the messages
> > Zero volume thread started for volume <VOL_ID>
> > and
> > Zero volume thread finished for volume <VOL_ID>
> > 
> > to be on INFO logger level or should I open a new bug for that request?
> Let's have a new BZ for this please
https://bugzilla.redhat.com/show_bug.cgi?id=1487151

Comment 8 Allon Mureinik 2017-08-31 11:45:10 UTC
(In reply to Raz Tamir from comment #7)
> (In reply to Allon Mureinik from comment #6)
> > (In reply to Raz Tamir from comment #5)
> > > Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos
> > > 
> > > wipe after delete took 1.5 minutes for a 10GB disk on a block domain
> > On the same setup/storage, how long does it take to delete a 10GB disk with
> > the "old" method?
> ~ 5 minutes
A 3x improvement for a 10GB disk? I'll take that!

> > 
> > > 
> > > Is there any chance to move the messages
> > > Zero volume thread started for volume <VOL_ID>
> > > and
> > > Zero volume thread finished for volume <VOL_ID>
> > > 
> > > to be on INFO logger level or should I open a new bug for that request?
> > Let's have a new BZ for this please
> https://bugzilla.redhat.com/show_bug.cgi?id=1487151
Thanks!

Comment 9 Yaniv Kaul 2017-08-31 20:49:28 UTC
(In reply to Raz Tamir from comment #7)
> (In reply to Allon Mureinik from comment #6)
> > (In reply to Raz Tamir from comment #5)
> > > Tested on ovirt-engine-4.2.0-0.0.master.20170828065003.git0619c76.el7.centos
> > > 
> > > wipe after delete took 1.5 minutes for a 10GB disk on a block domain

That's a bit slow - ~113MBps for discard?

> > On the same setup/storage, how long does it take to delete a 10GB disk with
> > the "old" method?
> ~ 5 minutes

And that's VERY slow! 34MBps?!?!

Something wrong with that storage. Or my math.
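For reference, the arithmetic behind those figures, taking 10 GB as 10240 MB:

10240 MB / 90 s  ≈ 113.8 MB/s  (blkdiscard-based wipe, 1.5 minutes)
10240 MB / 300 s ≈ 34.1 MB/s   (dd-based wipe, ~5 minutes)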

My laptop (SSD, 16.5G):
[ykaul@ykaul sosreport-dvrhvm01.cbec.gov.in-20170831074327]$ time sudo blkdiscard --zero /dev/sda3

real	1m47.121s
user	0m0.005s
sys	0m0.167s


But:
[ykaul@ykaul sosreport-dvrhvm01.cbec.gov.in-20170831074327]$ time sudo blkdiscard  /dev/sda3
[sudo] password for ykaul: 

real	0m4.236s
user	0m0.019s
sys	0m0.017s

So perhaps my SSD doesn't support write_same? 

[ykaul@ykaul sosreport-dvrhvm01.cbec.gov.in-20170831074327]$ sudo sg_inq -p 0xb0 /dev/sda
VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 1 blocks
  Maximum transfer length: 0 blocks
  Optimal transfer length: 0 blocks
  Maximum prefetch transfer length: 0 blocks
  Maximum unmap LBA count: 0
  Maximum unmap block descriptor count: 0
  Optimal unmap granularity: 1
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0x3fffc0 blocks
  Maximum atomic transfer length: 0
  Atomic alignment: 0
  Atomic transfer length granularity: 0


> > 
> > > 
> > > Is there any chance to move the messages
> > > Zero volume thread started for volume <VOL_ID>
> > > and
> > > Zero volume thread finished for volume <VOL_ID>
> > > 
> > > to be on INFO logger level or should I open a new bug for that request?
> > Let's have a new BZ for this please
> https://bugzilla.redhat.com/show_bug.cgi?id=1487151

Comment 10 Sandro Bonazzola 2017-12-20 10:53:54 UTC
This bug is included in the oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in that release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

