Bug 1600599 - [RHHI] qemu-kvm crash seen while removing a storage domain which had VMs on it [NEEDINFO]
Summary: [RHHI] qemu-kvm crash seen while removing a storage domain which had VMs on it
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: glusterfs
Version: 7.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: sankarshan
QA Contact: Sweta Anandpara
URL:
Whiteboard:
Duplicates: 1609561 (view as bug list)
Depends On:
Blocks: 1481022 1600598
 
Reported: 2018-07-12 14:56 UTC by bipin
Modified: 2018-11-19 05:25 UTC (History)
12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1600598
Environment:
Last Closed:
Target Upstream Version:
sabose: needinfo? (bshetty)



Description bipin 2018-07-12 14:56:54 UTC
+++ This bug was initially created as a clone of Bug #1600598 +++

Description of problem:
-----------------------
qemu-kvm crashed while removing the storage domain via RHV-M. The storage domain had a few VMs running on it. This was seen on a deduplication- and compression-enabled (VDO) storage domain.


Version-Release number of selected component (if applicable):
-----------------------------------------------------------
qemu-guest-agent-2.8.0-2.el7.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64
qemu-kvm-common-rhev-2.10.0-21.el7_5.4.x86_64
libvirt-daemon-driver-qemu-3.9.0-14.el7_5.6.x86_64
qemu-img-rhev-2.10.0-21.el7_5.4.x86_64
glusterfs-3.8.4-54.13.el7rhgs.x86_64

How reproducible:
----------------

Steps to Reproduce:
-------------------
1. Have the HE deployed on a hyperconverged infrastructure (RHHI)
2. Create a storage domain with VDO-enabled volumes
3. Create multiple VMs and pump data using FIO
4. Stop the gluster volumes and delete them
5. Remove the storage domain via RHV-M
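The gluster-side teardown in steps 4-5 can be sketched as follows. This is a minimal sketch, not taken verbatim from the report: the volume name `vdo` is inferred from the qemu-kvm command line below, and the `--mode=script` flag (which suppresses the interactive confirmation prompt) is an assumption about how the step was run.

```shell
# Hedged sketch of the volume teardown (steps 4-5), run on a gluster server node.
# "vdo" is the volume name inferred from the mount path in the crash cmdline.
gluster volume stop vdo --mode=script    # stop the volume while VMs still hold open images on it
gluster volume delete vdo --mode=script  # delete the stopped volume
# The storage domain itself is then removed through the RHV-M web UI (step 5).
```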

Actual results:
--------------
qemu-kvm crashed

Expected results:
----------------
qemu-kvm shouldn't crash

Additional info:
----------------
1. This was seen on a gluster replica 3 (1x3) volume
2. Had VDO-enabled bricks
3. 3-node cluster

--- Additional comment from bipin on 2018-07-12 10:55:04 EDT ---

id 7545f33c3237624d89ff870354c7d8fa238bcedb
reason:         qemu-kvm killed by SIGABRT
time:           Tue 10 Jul 2018 11:58:56 AM IST
cmdline:        /usr/libexec/qemu-kvm -name guest=vdo_vm1,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-38-vdo_vm1/master-key.aes -machine pc-i440fx-rhel7.5.0,accel=kvm,usb=off,dump-guest-core=off -cpu Haswell-noTSX,spec-ctrl=on,ssbd=on -m size=2097152k,slots=16,maxmem=8388608k -realtime mlock=off -smp 2,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0-1,mem=2048 -uuid f44c3d16-d521-4b20-a02e-cca52070bfda -smbios 'type=1,manufacturer=oVirt,product=RHEV Hypervisor,version=7.5-5.0.el7,serial=00000000-0000-0000-0000-AC1F6B400622,uuid=f44c3d16-d521-4b20-a02e-cca52070bfda' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-38-vdo_vm1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2018-07-10T04:02:26,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=ua-0f32076f-ed6c-4cdf-b1f5-6bbdfce63727,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=ua-1928334c-76c9-4947-868b-187e6f287ff5,max_ports=16,bus=pci.0,addr=0x5 -drive if=none,id=drive-ua-a35d33bd-5ad6-4a61-a70d-53ea611fd546,readonly=on,werror=report,rerror=report -device ide-cd,bus=ide.1,unit=0,drive=drive-ua-a35d33bd-5ad6-4a61-a70d-53ea611fd546,id=ua-a35d33bd-5ad6-4a61-a70d-53ea611fd546 -drive file=/rhev/data-center/mnt/glusterSD/rhsqa-grafton7-nic2.lab.eng.blr.redhat.com:vdo/7c5d831d-c723-4003-b0df-b5e16f5f2320/images/a25daf4d-df1d-4ab2-95a0-fc49b1d47527/f628b286-281f-4166-9824-f095483697ba,format=raw,if=none,id=drive-ua-a25daf4d-df1d-4ab2-95a0-fc49b1d47527,serial=a25daf4d-df1d-4ab2-95a0-fc49b1d47527,cache=none,werror=stop,rerror=stop,aio=threads -device 
scsi-hd,bus=ua-0f32076f-ed6c-4cdf-b1f5-6bbdfce63727.0,channel=0,scsi-id=0,lun=0,drive=drive-ua-a25daf4d-df1d-4ab2-95a0-fc49b1d47527,id=ua-a25daf4d-df1d-4ab2-95a0-fc49b1d47527,bootindex=2 -drive file=/rhev/data-center/mnt/glusterSD/rhsqa-grafton7-nic2.lab.eng.blr.redhat.com:vdo/7c5d831d-c723-4003-b0df-b5e16f5f2320/images/21c04a77-1a84-4743-93da-2a5878369f20/7ca00a4d-11f9-49aa-a1bc-508ea7098f47,format=raw,if=none,id=drive-ua-21c04a77-1a84-4743-93da-2a5878369f20,serial=21c04a77-1a84-4743-93da-2a5878369f20,cache=none,werror=stop,rerror=stop,aio=threads -device scsi-hd,bus=ua-0f32076f-ed6c-4cdf-b1f5-6bbdfce63727.0,channel=0,scsi-id=0,lun=2,drive=drive-ua-21c04a77-1a84-4743-93da-2a5878369f20,id=ua-21c04a77-1a84-4743-93da-2a5878369f20 -netdev tap,fd=50,id=hostua-94cdc487-7d03-428b-a328-5d9aa400d30b,vhost=on,vhostfd=52 -device virtio-net-pci,netdev=hostua-94cdc487-7d03-428b-a328-5d9aa400d30b,id=ua-94cdc487-7d03-428b-a328-5d9aa400d30b,mac=00:1a:4a:16:01:15,bus=pci.0,addr=0x3,bootindex=1 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/f44c3d16-d521-4b20-a02e-cca52070bfda.ovirt-guest-agent.0,server,nowait -device virtserialport,bus=ua-1928334c-76c9-4947-868b-187e6f287ff5.0,nr=1,chardev=charchannel0,id=channel0,name=ovirt-guest-agent.0 -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/f44c3d16-d521-4b20-a02e-cca52070bfda.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=ua-1928334c-76c9-4947-868b-187e6f287ff5.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=ua-1928334c-76c9-4947-868b-187e6f287ff5.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice port=5921,tls-port=5922,addr=10.70.36.241,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on -device 
qxl-vga,id=ua-fff2892c-61a6-4837-afa0-916c1e6e0f21,ram_size=67108864,vram_size=8388608,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=ua-fd858673f66-ea73-49e2-9985-ac4590adbfee,filename=/dev/urandom -device virtio-rng-pci,rng=objua-05164f66-ea73-49e2-9985-ac4590adbfee,id=ua-05164f66-ea73-49e2-9985-ac4590adbfee,bus=pci.0,addr=0x7 -device vmcoreinfo -msg timestamp=on
package:        qemu-kvm-rhev-2.10.0-21.el7_5.4
uid:            107 (qemu)
count:          1
Directory:      /var/tmp/abrt/ccpp-2018-07-10-11:58:56-52702
Run 'abrt-cli report /var/tmp/abrt/ccpp-2018-07-10-11:58:56-52702' for creating a case in Red Hat Customer Portal

Comment 5 bipin 2018-07-12 15:10:09 UTC
Similar qemu-kvm crash bugs were raised previously: bug 1575872 and bug 1561324.

Comment 7 Jeff Cody 2018-08-15 17:34:26 UTC
Just a note: from the description, this is occurring on a gluster fuse mount, and not with the QEMU native libgfapi driver.  I thought it worth mentioning since the guest name was "libgfapi", and I wanted to avoid confusion.

From the trace, it appears the fcntl operation F_UNLCK is failing on the image file, which is located on the glusterfs fuse mount.

I wonder if this is related to BZ #1598025. Like that bug, I suspect the problem may be in the glusterfs library used for FUSE.

If you use a later glusterfs version (such as 4.0.2-1) on the qemu host machine (i.e. the machine mounting the fuse mount, not the gluster server) does this problem go away?

Comment 9 Jeff Cody 2018-08-15 17:43:08 UTC
*** Bug 1609561 has been marked as a duplicate of this bug. ***

Comment 10 bipin 2018-11-08 07:42:56 UTC
(In reply to Jeff Cody from comment #7)
> Just a note: from the description, this is occurring on a gluster fuse
> mount, and not with the QEMU native libgfapi driver.  I thought it worth
> mentioning since the guest name was "libgfapi", and I wanted to avoid
> confusion.
> 
> From the trace, it appears the fcntl operation F_UNLCK is failing on the
> image file, which is located on the glusterfs fuse mount.
> 
> I wonder if this is related to BZ #1598025.  Like that bug, I am suspecting
> the bug may be in the glusterfs library used for fuse.
> 
> If you use a later glusterfs version (such as 4.0.2-1) on the qemu host
> machine (i.e. the machine mounting the fuse mount, not the gluster server)
> does this problem go away?

Apologies for the delayed reply. With the later versions of gluster (RHGS 3.4.0 & 3.4.1), I could not reproduce the above issue.

Comment 11 Sahina Bose 2018-11-19 05:25:56 UTC
Can we close this bug, as it's not reproducible with the latest version?

