Bug 1685502 - RHEL.8 guest needs long time at 'Booting from Hard Disk' on some specific machine with virtio-scsi driver
Summary: RHEL.8 guest needs long time at 'Booting from Hard Disk' on some specific mac...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: seabios
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Sergio Lopez
QA Contact: FuXiangChun
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-05 10:50 UTC by CongLi
Modified: 2019-04-11 02:34 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-11 02:34:57 UTC
Type: Bug
Target Upstream Version:


Attachments
seabios log (deleted), 2019-03-05 10:50 UTC, CongLi

Description CongLi 2019-03-05 10:50:25 UTC
Created attachment 1540894 [details]
seabios log

Description of problem:
On one specific machine, a RHEL.8 guest using the virtio-scsi driver spends a long time at 'Booting from Hard Disk'. With virtio-blk it needs around 3s, which is much faster than with virtio-scsi.
2019-03-04 05:59:44: Booting from Hard Disk...
2019-03-04 05:59:44: Booting from 0000:7c00
2019-03-04 06:00:12: VBE mode info request: 140
2019-03-04 06:00:12: VBE mode info request: 141
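
As a rough cross-check of the excerpt above, the length of the stall can be read off the timestamps. The sketch below is illustrative only: it assumes every line starts with a `YYYY-MM-DD HH:MM:SS: ` prefix as in the excerpt, and the helper names are made up for this example.

```python
from datetime import datetime

# Log lines copied from the excerpt above.
LOG = """\
2019-03-04 05:59:44: Booting from Hard Disk...
2019-03-04 05:59:44: Booting from 0000:7c00
2019-03-04 06:00:12: VBE mode info request: 140
2019-03-04 06:00:12: VBE mode info request: 141
"""


def parse_line(line):
    """Split 'YYYY-MM-DD HH:MM:SS: message' into (datetime, message)."""
    stamp, message = line[:19], line[21:]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), message


def largest_gap(log):
    """Return (seconds, message_before, message_after) for the longest pause."""
    entries = [parse_line(line) for line in log.splitlines() if line.strip()]
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    return max(gaps)


if __name__ == "__main__":
    seconds, before, after = largest_gap(LOG)
    print(f"{seconds:.0f}s between {before!r} and {after!r}")
```

For this excerpt the longest pause is 28 seconds, between 'Booting from 0000:7c00' and the first VBE request, which matches the "over 20s" reported under Actual results.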

Version-Release number of selected component (if applicable):
qemu-kvm-2.12.0-63.module+el8+2833+c7d6d092.x86_64
seabios-1.11.1-3.module+el8+2776+42eaaa59.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot up a RHEL.8 guest:
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1' \
    -machine q35  \
    -nodefaults \
    -vga std \
    -chardev file,path=/home/seabios.log,id=seabios \
    -device isa-debugcon,chardev=seabios,iobase=0x402 \
    -chardev file,path=/home/serial.log,id=serial \
    -device isa-serial,chardev=serial \
    -device pcie-root-port,id=pcie.0-root-port-6,slot=6,chassis=6,addr=0x6,bus=pcie.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0-root-port-6,addr=0x0 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device pcie-root-port,id=pcie.0-root-port-7,slot=7,chassis=7,addr=0x7,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:c0:c1:c2:c3:c4,id=idQanWPO,vectors=4,netdev=idksf3Wv,bus=pcie.0-root-port-7,addr=0x0  \
    -netdev tap,id=idksf3Wv,vhost=on \
    -m 15360  \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
    -cpu 'Haswell-noTSX',+kvm_pv_unhalt \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \
    -vnc :0


Actual results:
The guest spends over 20s at
"Booting from Hard Disk...
Booting from 0000:7c00".

Expected results:
The guest boots from the local disk quickly, in no more than 10s.

Additional info:
1. Same result with the pc machine type.
2. virtio-scsi is slower than virtio-blk.
3. CPU info:
processor	: 31
vendor_id	: GenuineIntel
cpu family	: 6
model		: 63
model name	: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
stepping	: 2
microcode	: 0x3d
cpu MHz		: 1203.452
cache size	: 20480 KB
physical id	: 1
siblings	: 16
core id		: 7
cpu cores	: 8
apicid		: 31
initial apicid	: 31
fpu		: yes
fpu_exception	: yes
cpuid level	: 15
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts flush_l1d
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips	: 4804.88
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:
4. Info on the disk the guest is installed on:
03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02)
5. The issue only occurs on this specific machine.

Comment 3 Sergio Lopez 2019-03-12 17:07:15 UTC
Thanks for giving me access to the server where the issue can be reproduced.

The root cause seems to be a problem with the underlying storage that makes some read ops *extremely* slow:

[root@dell-per630-02 home]# dd if=/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2 of=/dev/null iflag=direct bs=16k count=10240
10240+0 records in
10240+0 records out
167772160 bytes (168 MB, 160 MiB) copied, 73.8531 s, 2.3 MB/s

Note the 128 ms average read latency (the r_await column):

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sda              8.00    0.00    128.00      0.00     0.00     0.00   0.00   0.00  128.00    0.00   1.02    16.00     0.00   1.00   0.80
dm-0             0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
dm-1             0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
dm-2             8.00    0.00    128.00      0.00     0.00     0.00   0.00   0.00  128.12    0.00   1.02    16.00     0.00   1.00   0.80

Looks like a HW issue to me.
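
The figures in this comment can be checked against each other with a little arithmetic. The sketch below only re-derives numbers from the dd and device statistics output above. The constant and function names are made up for illustration, and the queue-depth estimate uses Little's law (requests in flight = arrival rate x time per request), which is an added assumption and not something stated in the original comment.

```python
# Figures copied from the dd output above.
DD_BYTES = 167772160
DD_SECONDS = 73.8531
DD_REQUESTS = 10240          # count=10240 reads of bs=16k

# Figures copied from the sda row of the device statistics above.
READS_PER_S = 8.00           # r/s
R_AWAIT_MS = 128.00          # r_await
RAREQ_KB = 16.00             # rareq-sz


def dd_throughput_mb_s():
    """Throughput in MB/s (10^6 bytes), the unit dd prints."""
    return DD_BYTES / DD_SECONDS / 1e6


def dd_mean_latency_ms():
    """Mean time per 16k read over the whole dd run."""
    return DD_SECONDS / DD_REQUESTS * 1000


def sampled_throughput_kb_s():
    """Reads per second times request size; should match rkB/s."""
    return READS_PER_S * RAREQ_KB


def sampled_queue_depth():
    """Little's law estimate; should be close to aqu-sz."""
    return READS_PER_S * R_AWAIT_MS / 1000


if __name__ == "__main__":
    print(f"dd throughput:      {dd_throughput_mb_s():.2f} MB/s")
    print(f"dd mean latency:    {dd_mean_latency_ms():.2f} ms per read")
    print(f"sampled throughput: {sampled_throughput_kb_s():.2f} kB/s")
    print(f"sampled queue size: {sampled_queue_depth():.2f}")
```

The sampled interval is internally consistent: 8 reads/s of 16 kB gives the 128 kB/s shown in rkB/s, and 8 reads/s at 128 ms each gives a queue size of about 1.02, as in aqu-sz. Averaged over the whole dd run, each read took about 7.2 ms, well below the 128 ms in the sample, so one plausible reading is that the sample captured an especially slow stretch rather than the typical case.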

Comment 4 CongLi 2019-03-13 02:31:21 UTC
(In reply to Sergio Lopez from comment #3)

Hi Sergio,

QE would like to confirm one thing: according to QE tests, virtio-blk is faster
than virtio-scsi and only needs 2~3 seconds. Is that also an issue, or is it
expected?

script: /home/virtio-blk.sh

Thanks.

Comment 5 Sergio Lopez 2019-03-13 08:20:09 UTC
(In reply to CongLi from comment #4)

It seems that the unexpected storage latency only manifests while reading certain blocks (which reinforces the idea of this being a HW issue).
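
One way to look for such blocks is to time each read individually instead of relying on an average. The following is a minimal sketch under stated assumptions, not the method used in this bug: the function names, the 16 KiB block size (chosen to mirror the dd test in comment 3) and the 100 ms threshold are all illustrative. With `direct=True` it opens the file with O_DIRECT (Linux only) so that the page cache does not hide slow reads.

```python
import mmap
import os
import time


def time_reads(path, block_size=16 * 1024, max_blocks=None, direct=False):
    """Read path sequentially; return a list of (offset, seconds) per read."""
    flags = os.O_RDONLY
    if direct:
        # O_DIRECT bypasses the page cache (Linux only) and needs an
        # aligned buffer and block size.
        flags |= os.O_DIRECT
    # An anonymous mmap is page-aligned, which satisfies that requirement.
    buf = mmap.mmap(-1, block_size)
    fd = os.open(path, flags)
    samples = []
    try:
        offset = 0
        while max_blocks is None or len(samples) < max_blocks:
            start = time.perf_counter()
            n = os.preadv(fd, [buf], offset)
            elapsed = time.perf_counter() - start
            if n == 0:  # end of file
                break
            samples.append((offset, elapsed))
            offset += n
    finally:
        os.close(fd)
        buf.close()
    return samples


def slow_reads(samples, threshold_s=0.1):
    """Keep only the reads that took at least threshold_s seconds."""
    return [(offset, secs) for offset, secs in samples if secs >= threshold_s]
```

Running `slow_reads(time_reads(image_path, direct=True))` against the qcow2 file would list the offsets whose reads stalled. Offsets that stay slow across repeated runs would point at specific regions of the storage.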

If we change "/home/virtio-blk.sh" to access the same qcow2 file as "/home/qemu.sh" ("/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2"), the issue manifests in the same way as with virtio-scsi. I created the script "/home/virtio-blk-slp.sh" with this minor modification.
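
For illustration, the only difference between the two setups is which device exposes the drive to the guest. The contents of the scripts mentioned above are not shown in this bug, so the following is a hypothetical sketch: the `disk_args` helper is made up, and the virtio-blk device line is an assumption modeled on the virtio-scsi command line in the description.

```python
# Image path and -drive options copied from the command line in the description.
IMAGE = "/home/kvm_autotest_root/images/rhel80-64-virtio-scsi.qcow2"
DRIVE = (
    "id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,"
    "format=qcow2,file=" + IMAGE
)


def disk_args(kind, port="pcie.0-root-port-6"):
    """Return the qemu-kvm disk arguments for 'virtio-scsi' or 'virtio-blk'."""
    if kind == "virtio-scsi":
        # Controller plus a scsi-hd disk, as in the original report.
        return [
            "-device", f"virtio-scsi-pci,id=virtio_scsi_pci0,bus={port},addr=0x0",
            "-drive", DRIVE,
            "-device", "scsi-hd,id=image1,drive=drive_image1",
        ]
    if kind == "virtio-blk":
        # Assumed equivalent: the same drive attached through virtio-blk-pci.
        return [
            "-drive", DRIVE,
            "-device", f"virtio-blk-pci,id=image1,drive=drive_image1,bus={port},addr=0x0",
        ]
    raise ValueError(f"unknown disk kind: {kind}")


if __name__ == "__main__":
    for kind in ("virtio-scsi", "virtio-blk"):
        print(kind, " ".join(disk_args(kind)))
```

Because both variants share the same `-drive` line, any difference in boot time between them comes from the device model and not from the image being read. That is the comparison the modified script makes.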


I think this particular server should be decommissioned until its local storage is repaired.

Thanks.

Comment 6 CongLi 2019-03-13 08:34:20 UTC
(In reply to Sergio Lopez from comment #5)

Thanks Sergio, I will file a ticket to get the HW issue fixed and update the
status of this bug later.

Thanks.

Comment 7 Sergio Lopez 2019-04-10 13:38:25 UTC
Hi CongLi,

Any news about the issue? Can we close this BZ?

Thanks.

Comment 8 CongLi 2019-04-11 02:34:57 UTC
Thanks Sergio, it's a HW issue. There is no performance issue now that
the storage has been fixed, so I'm closing this bug.

Thanks.

