Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 1281713 - system_reset should clear pending request for error (IDE)
Summary: system_reset should clear pending request for error (IDE)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.7
Hardware: x86_64
OS: Windows
unspecified
medium
Target Milestone: rc
: ---
Assignee: John Snow
QA Contact: aihua liang
URL:
Whiteboard:
Depends On:
Blocks: 1299875 1299876
TreeView+ depends on / blocked
 
Reported: 2015-11-13 09:03 UTC by Guo, Zhiyi
Modified: 2017-03-21 09:35 UTC (History)
10 users (show)

Fixed In Version: qemu-kvm-0.12.1.2-2.494.el6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1299875 (view as bug list)
Environment:
Last Closed: 2017-03-21 09:35:12 UTC
Target Upstream Version:


Attachments (Terms of Use)
valgrind log on intel broadwell host (deleted)
2015-12-01 08:32 UTC, Guo, Zhiyi
no flags Details
valgrind log on amd Opteron 6376 (deleted)
2015-12-01 09:18 UTC, Guo, Zhiyi
no flags Details
valgrind log with option --error-limit=no (deleted)
2015-12-02 08:09 UTC, Guo, Zhiyi
no flags Details
correct valgrind log including rhel6.7 and rhel7.2 (deleted)
2015-12-02 08:27 UTC, Guo, Zhiyi
no flags Details


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0621 normal SHIPPED_LIVE Moderate: qemu-kvm security and bug fix update 2017-03-21 12:28:31 UTC

Description Guo, Zhiyi 2015-11-13 09:03:11 UTC
Description of problem:
qemu-kvm quit with Segmentation fault after execute system_reset when no space left on host.

Version-Release number of selected component (if applicable):
qemu-img-0.12.1.2-2.481.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.481.el6.x86_64
qemu-kvm-0.12.1.2-2.481.el6.x86_64
qemu-guest-agent-0.12.1.2-2.481.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.481.el6.x86_64
2.6.32-583.el6.x86_64

How reproducible:
70%

Steps to Reproduce:
1.Create a 25G win2012.qcow2 image and install a windows2012r2 guest.
2.In guest located filesystem, make it out of space by copy guest image several times until no space left on device prompt. Launch guest by qemu-kvm command:
/usr/libexec/qemu-kvm -name win2012 -m 2048 \
	-cpu Opteron_G4 \
	-smp 1,cores=1,threads=2,sockets=2,maxcpus=4 \
	 -vga qxl\
	-serial unix:/tmp/m,server,nowait \
	-drive file=win2012-64r2-virtio-scsi.qcow2,if=none,id=drive-scsi-disk0,format=qcow2,cache=none,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0 -device scsi-hd,drive=drive-scsi-disk0,bus=scsi0.0,scsi-id=0,lun=0,id=scsi-disk0,bootindex=1 \
	-monitor stdio \
	-usb -device usb-kbd,id=input0 \
	-vnc :1

3. Interact with guest by browsing internet or other things until you see "block I/O error in device 'ide0-hd0': No space left on device (28)" prompt from qemu-kvm monitor(Prompt usually happen within 5 minutes), input system_reset in qemu monitor. And Segmentation fault will happen.

Actual results:
qemu-kvm quit with Segmentation fault after execute system_reset

Expected results:
qemu-kvm process should still alive and guest system can be reset without error

Additional info:
Stack info:
Core was generated by `/usr/libexec/qemu-kvm -name win2012 -m 2048 -cpu SandyBridge -smp 2,cores=1,thr'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f85d1b04a90 in ?? ()
(gdb) bt
#0  0x00007f85d1b04a90 in ?? ()
#1  0x00007f85d03f5aee in bdrv_aio_cancel (acb=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:3842
#2  0x00007f85d052d46a in ide_dma_cancel (bm=0x7f85d26e1160)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:2395
#3  0x00007f85d052d499 in ide_dma_reset (bm=0x7f85d26e1160)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:2408
#4  0x00007f85d05335ad in piix3_reset (opaque=0x7f85d26e0010)
    at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/piix.c:124
#5  0x00007f85d03b71d2 in qemu_system_reset (report=true)
    at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:3417
#6  0x00007f85d03dd050 in qemu_kvm_system_reset (report=true)
    at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1992
#7  0x00007f85d03dd253 in kvm_main_loop ()
    at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2272
#8  0x00007f85d03be317 in main_loop (argc=<value optimized out>, 
    argv=<value optimized out>, envp=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4273
#9  main (argc=<value optimized out>, argv=<value optimized out>, 
    envp=<value optimized out>)
    at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6731

Qemu-kvm won't quit with Segmentation fault on Opteron_G5 host but windows guest cannot be reset after system_reset.

Comment 2 Markus Armbruster 2015-11-24 16:37:15 UTC
Can you reproduce this with a qemu-kvm built with --enable-debug?

Comment 3 Guo, Zhiyi 2015-11-27 02:00:55 UTC
Hi,
I guess you may want to see the function call ?? or argument value has been optimized. 
Stack trace still the same as reported in description.
I have enabled --enable-debug option and rebuild the qemu-kvm. -g option has been added to compile procedure from configure file:
+ ../configure --target-list=x86_64-softmmu '--extra-ldflags=-Wl,--build-id -pie -Wl,-z,relro -Wl,-z,now' '--extra-cflags=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIE -DPIE' --with-pkgversion=qemu-kvm-0.12.1.2-2.479.el6.2 --prefix=/usr --localstatedir=/var --sysconfdir=/etc --disable-strip --disable-xen --block-drv-rw-whitelist=qcow2,raw,file,host_device,host_cdrom,qed,gluster,rbd --block-drv-ro-whitelist=vmdk,vpc --disable-debug-tcg --disable-sparse --disable-sdl --disable-curses --disable-curl --disable-check-utests --disable-bluez --enable-docs --disable-vde --disable-spice --trace-backend=nop --enable-smartcard --disable-smartcard-nss --enable-mixemu
Install prefix    /usr
BIOS directory    /usr/share/qemu
binary directory  /usr/bin
local state directory   /var
Manual directory  /usr/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /root/rpmbuild/BUILD/qemu-kvm-0.12.1.2
C compiler        gcc
Host C compiler   gcc
CFLAGS            -O2 -g

BR/
Zhiyi

Comment 4 Markus Armbruster 2015-11-27 07:17:53 UTC
I can't see --enable-debug in your configure line.  I can see -O2.  You need to get one roughly like this:

../configure --target-list=x86_64-softmmu '--extra-ldflags=-Wl,--build-id -pie -Wl,-z,relro -Wl,-z,now' '--extra-cflags=-g -pipe -Wall -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIE -DPIE' --with-pkgversion=qemu-kvm-0.12.1.2-2.479.el6.2 --prefix=/usr --localstatedir=/var --sysconfdir=/etc --disable-strip --disable-xen --block-drv-rw-whitelist=qcow2,raw,file,host_device,host_cdrom,qed,gluster,rbd --block-drv-ro-whitelist=vmdk,vpc --disable-debug-tcg --disable-sparse --disable-sdl --disable-curses --disable-curl --disable-check-utests --disable-bluez --enable-docs --disable-vde --disable-spice --trace-backend=nop --enable-smartcard --disable-smartcard-nss --enable-mixemu

Please try again :)

Comment 5 Guo, Zhiyi 2015-11-27 11:40:59 UTC
(In reply to Markus Armbruster from comment #4)
> I can't see --enable-debug in your configure line.  I can see -O2.  You need
> to get one roughly like this:
> 
> ../configure --target-list=x86_64-softmmu '--extra-ldflags=-Wl,--build-id
> -pie -Wl,-z,relro -Wl,-z,now' '--extra-cflags=-g -pipe -Wall -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIE -DPIE'
> --with-pkgversion=qemu-kvm-0.12.1.2-2.479.el6.2 --prefix=/usr
> --localstatedir=/var --sysconfdir=/etc --disable-strip --disable-xen
> --block-drv-rw-whitelist=qcow2,raw,file,host_device,host_cdrom,qed,gluster,
> rbd --block-drv-ro-whitelist=vmdk,vpc --disable-debug-tcg --disable-sparse
> --disable-sdl --disable-curses --disable-curl --disable-check-utests
> --disable-bluez --enable-docs --disable-vde --disable-spice
> --trace-backend=nop --enable-smartcard --disable-smartcard-nss
> --enable-mixemu
> 
> Please try again :)

Stack trace with none optimized code: 
(gdb) bt
#0  0x00007f1e571cfb10 in ?? ()
#1  0x00007f1e55ebd5ed in bdrv_aio_cancel_async (acb=0x7f1e571cfc10) at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:3876
#2  0x00007f1e55ebd499 in bdrv_aio_cancel (acb=0x7f1e571cfc10) at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:3842
#3  0x00007f1e56008f37 in ide_dma_cancel (bm=0x7f1e57dac160) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:2395
#4  0x00007f1e56008f5d in ide_dma_reset (bm=0x7f1e57dac160) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/core.c:2408
#5  0x00007f1e5600c755 in piix3_reset (opaque=0x7f1e57dab010) at /usr/src/debug/qemu-kvm-0.12.1.2/hw/ide/piix.c:124
#6  0x00007f1e55e6765b in qemu_system_reset (report=true) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:3417
#7  0x00007f1e55e990ad in qemu_kvm_system_reset (report=true) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:1992
#8  0x00007f1e55e9997d in kvm_main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-kvm.c:2272
#9  0x00007f1e55e683ba in main_loop () at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:4273
#10 0x00007f1e55e6d451 in main (argc=24, argv=0x7fff7f33ac18, envp=0x7fff7f33ace0) at /usr/src/debug/qemu-kvm-0.12.1.2/vl.c:6731

Comment 6 Markus Armbruster 2015-11-27 13:17:42 UTC
Aha!  acb->aiocb_info->cancel_async seems to be garbage.  Hunch: use after free?  Chimes with your report that Opteron_G5 fails differently...  Please reproduce with your debug build of qemu-kvm under valgrind, and capture valgrind's report.

Comment 7 Guo, Zhiyi 2015-12-01 08:32:23 UTC
Created attachment 1100766 [details]
valgrind log on intel broadwell host

Log generated on Valgrind 3.11.0, Valgrind 3.8.1 will core dump under same steps

Comment 8 Guo, Zhiyi 2015-12-01 09:18:13 UTC
Created attachment 1100779 [details]
valgrind log on amd Opteron 6376

Comment 9 Markus Armbruster 2015-12-01 10:01:26 UTC
valgrind is reporting a huge number of unrelated issues, probably in part because we lack upstream patches to suppress false positives.  It hits a cutoff and stops reporting some time before the crash.  Please try again with --error-limit=no.

Additional question: is qemu-kvm-rhev affected as well?

Comment 10 Guo, Zhiyi 2015-12-02 08:09:12 UTC
Created attachment 1101353 [details]
valgrind log with option --error-limit=no

Issue also can be reproduced on rhel7.2 intel skylake host with rhev:
kernel:3.10.0-334.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7.x86_64
qemu-kvm-rhev-debuginfo-2.3.0-31.el7.x86_64
qemu-img-rhev-2.3.0-31.el7.x86_64
qemu-kvm-tools-rhev-2.3.0-31.el7.x86_64
qemu-kvm-common-rhev-2.3.0-31.el7.x86_64

Attachment include valgrind log reproduced on rhel6.7 and rhel7.2. rhev packages have been compiled with -g and without -O2 optimize. valgrind log generate with option --error-limit=no

Comment 11 Guo, Zhiyi 2015-12-02 08:12:38 UTC
Command used to reproduce the issue and capture valgrind log:
valgrind --log-file=valgrind.txt --error-limit=no /usr/libexec/qemu-kvm -name win2012 -m 2048 -smp 4 -cpu host -vga qxl -vnc :1 -monitor stdio -hda win2012.qcow2

Comment 12 Guo, Zhiyi 2015-12-02 08:27:53 UTC
Created attachment 1101356 [details]
correct valgrind log including rhel6.7 and rhel7.2

Mistake valgrind log on rhel6.7 please ignore attachment in comment 10 and use log in this comment.

Comment 13 Jeff Nelson 2016-09-22 13:49:08 UTC
Fix included in qemu-kvm-0.12.1.2-2.494.el6

Comment 14 Ademar Reis 2016-09-28 01:48:28 UTC
For reference, this is the cluster of BZ related to this issue: bug 1281713, bug 1299876, bug 1299875, bug 1361487, bug 1361490, bug 1361488, bug 1375520

Comment 16 aihua liang 2016-12-28 03:16:33 UTC
Have verified, the problem still exist.

Test Version:
 kernel version: 2.6.32-680.el6.x86_64
 qemu-kvm-rhev version: qemu-kvm-rhev-0.12.1.2-2.499.el6.x86_64

Test Step:
1.Create 25G qcow2 image, install win2012r2 on it.

2.Full fill disk by copying the image, until prompt "No space left on device" appear.

3.Start guest by qemu cmd:
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-machine rhel6.6.0  \
-nodefaults  \
-vga std \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20161227-002429-zWDOQysC,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control  \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20161227-002429-zWDOQysC,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idBpoMGZ  \
-chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20161227-002429-zWDOQysC,server,nowait \
-device isa-serial,chardev=serial_id_serial0  \
-chardev socket,id=seabioslog_id_20161227-002429-zWDOQysC,path=/var/tmp/seabios-20161227-002429-zWDOQysC,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20161227-002429-zWDOQysC,iobase=0x402 \
-device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
-drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/win2012-64r2-virtio.qcow2 \
-device ide-drive,id=image1,drive=drive_image1,bus=ide.0,unit=0 \
-device virtio-net-pci,mac=9a:b1:b2:b3:b4:b5,id=idFb0lIj,vectors=4,netdev=idNlTvHu,bus=pci.0,addr=04  \
-netdev tap,id=idNlTvHu,vhost=on \
-m 4096  \
-smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
-cpu 'SandyBridge' \
-drive id=drive_cd1,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/usr/share/avocado/data/avocado-vt/isos/ISO/Win2012R2/en_windows_server_2012_r2_with_update_x64_dvd_6052708.iso \
-device ide-drive,id=cd1,drive=drive_cd1,bus=ide.0,unit=1 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=05 \
-drive id=drive_winutils,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/usr/share/avocado/data/avocado-vt/isos/windows/winutils.iso \
-device scsi-cd,id=winutils,drive=drive_winutils \
-vnc :0  \
-rtc base=localtime,clock=host,driftfix=slew  \
-boot order=cdn,menu=off,strict=off \
-enable-kvm \
-monitor stdio \

4.Wait until error info "block I/O error in device 'drive_image1': No space left on device(28)"appear in hmp monitor, check vm status:
  (qemu)info status                   -----> VM status:paused(io-error)

5.Reset vm
  (qemu)system_reset                  -----> VM no response

6.Check vm status
  (qemu)info status                   -----> VM status:paused(io-error)


From step4,5,6, we can see that the problem hasn't been resolved, so change the bug status to Assigned.

Comment 18 aihua liang 2017-01-10 08:55:36 UTC
According to fam's suggestion in bz1361490, i lack test step "cont" after "system_reset". so retest it, details as bellow.

Test Version:
 kernel version: 2.6.32-681.el6.x86_64
 qemu-kvm-rhev version: qemu-kvm-rhev-0.12.1.2-2.499.el6.x86_64

Test Step:
1.Create 25G qcow2 image, install win2012r2 on it.

2.Full fill disk by copying the image, until prompt "No space left on device" appear.

3.Start guest by qemu cmd:
MALLOC_PERTURB_=1  /usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-machine rhel6.6.0  \
-nodefaults  \
-vga std \
-chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20161227-002429-zWDOQysC,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control  \
-chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20161227-002429-zWDOQysC,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idBpoMGZ  \
-chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20161227-002429-zWDOQysC,server,nowait \
-device isa-serial,chardev=serial_id_serial0  \
-chardev socket,id=seabioslog_id_20161227-002429-zWDOQysC,path=/var/tmp/seabios-20161227-002429-zWDOQysC,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20161227-002429-zWDOQysC,iobase=0x402 \
-device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
-device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
-device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
-device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
-drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/win2012-64r2-virtio.qcow2 \
-device ide-drive,id=image1,drive=drive_image1,bus=ide.0,unit=0 \
-device virtio-net-pci,mac=9a:b1:b2:b3:b4:b5,id=idFb0lIj,vectors=4,netdev=idNlTvHu,bus=pci.0,addr=04  \
-netdev tap,id=idNlTvHu,vhost=on \
-m 4096  \
-smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
-cpu 'SandyBridge' \
-drive id=drive_cd1,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/usr/share/avocado/data/avocado-vt/isos/ISO/Win2012R2/en_windows_server_2012_r2_with_update_x64_dvd_6052708.iso \
-device ide-drive,id=cd1,drive=drive_cd1,bus=ide.0,unit=1 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=05 \
-drive id=drive_winutils,if=none,snapshot=off,aio=native,cache=none,media=cdrom,file=/usr/share/avocado/data/avocado-vt/isos/windows/winutils.iso \
-device scsi-cd,id=winutils,drive=drive_winutils \
-vnc :0  \
-rtc base=localtime,clock=host,driftfix=slew  \
-boot order=cdn,menu=off,strict=off \
-enable-kvm \
-monitor stdio \

4.Wait until error info "block I/O error in device 'drive_image1': No space left on device(28)"appear in hmp monitor, check vm status:
  (qemu)info status                   -----> VM status:paused(io-error)

5.Reset vm
  (qemu)system_reset                  -----> VM no response

6.Cont vm
  (qemu)cont                          -----> VM restart and load win2012r2 then appear error msg "block I/O error in device 'drive_image1': No space left on device (28)"

7.Check vm status
  (qemu)info status                   -----> VM status:paused(io-error)


8.Release some disk space, repeat step5~6.

Test Result:
After step8, vm works normally with status "running".


So, the problem has been resolved, change its status to "Verified".

Comment 21 errata-xmlrpc 2017-03-21 09:35:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0621.html


Note You need to log in before you can comment on or make changes to this bug.