Bug 1511934 - libvirt doesn't refuse to run EPYC CPU model if rdtscp support is missing
Summary: libvirt doesn't refuse to run EPYC CPU model if rdtscp support is missing
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: chhu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-10 13:18 UTC by Eduardo Habkost
Modified: 2017-11-23 15:31 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-23 15:31:53 UTC



Description Eduardo Habkost 2017-11-10 13:18:43 UTC
Description of problem:
libvirt won't refuse to start a VM using the "EPYC" CPU model on a host that doesn't support virtualizing the rdtscp feature.


Version-Release number of selected component (if applicable):
libvirt-3.9.0-1.el7.x86_64
qemu-kvm-rhev-2.10.0-5.el7.x86_64
virt-install-1.4.3-1.el7.noarch
kernel-3.10.0-768.el7.x86_64

How reproducible:
Always.


Steps to Reproduce:
# /usr/libexec/qemu-kvm -cpu EPYC,enforce
warning: host doesn't support requested feature: CPUID.80000001H:EDX.rdtscp [bit 27]
qemu-kvm: Host doesn't support requested features
# virt-install --pxe --cpu EPYC --name test --memory 1024 --disk none
[...]
Starting install...
Domain installation still in progress. Waiting for installation to complete.



Actual results:
VM is started.

Expected results:
ERROR    the CPU is incompatible with host CPU: Host CPU does not provide required features: rdtscp.


Additional info:
This bug can be triggered because we didn't backport rdtscp support to the kernel.  It won't be reproducible anymore once bug 1511805 is fixed, but it's still a libvirt bug.

Comment 2 Jiri Denemark 2017-11-10 14:47:08 UTC
This is expected as libvirt does not strictly check CPU features when starting
a new domain to keep compatibility with older libvirt. The strict checking is
only enabled on migration, snapshot revert, or save/restore. However, you can
explicitly request this strict checking even when starting a fresh domain
(just like you need to explicitly add the "enforce" option to QEMU) by setting
check='full' (the default is check='partial'):

    <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>EPYC</model>
    </cpu>

However, I don't know if this is supported by virt-install.
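
For setups where the domain XML is generated or edited by a script, the attribute described above can be added programmatically. The following is only a minimal sketch using Python's standard library; the helper name `require_full_cpu_check` and the sample domain definition are hypothetical and are not part of libvirt or virt-install.

```python
import xml.etree.ElementTree as ET


def require_full_cpu_check(domain_xml: str) -> str:
    """Return the domain XML with check='full' set on its <cpu> element.

    Hypothetical helper for illustration only; it edits the XML text and
    does not talk to libvirt.
    """
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        raise ValueError("domain XML has no <cpu> element")
    # Overwrites any existing value (for example check='partial').
    cpu.set("check", "full")
    return ET.tostring(root, encoding="unicode")


# Sample definition (assumed, trimmed to the parts relevant here).
example = """<domain type='kvm'>
  <name>test</name>
  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>EPYC</model>
  </cpu>
</domain>"""

updated = require_full_cpu_check(example)
print(updated)
```

The resulting XML could then be saved to a file and passed to `virsh define`, assuming the rest of the definition is complete.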

Comment 3 Eduardo Habkost 2017-11-10 15:57:01 UTC
Isn't it possible to change libvirt's default to check='full'?  If not, why?

(I think I have heard of other cases where it wasn't possible to change libvirt defaults before, and I'm trying to understand where this limitation comes from.)

Comment 4 Jiri Denemark 2017-11-10 16:38:41 UTC
I believe it's similar to why enforce is not enabled by default in QEMU. That is, existing domains which were running just fine even if they used a CPU which could not be provided without disabling some features would not start anymore.

Comment 5 Eduardo Habkost 2017-11-14 21:33:09 UTC
(In reply to Jiri Denemark from comment #4)
> I believe it's similar to why enforce is not enabled by default in QEMU.
> That is, existing domains which were running just fine even if they used a
> CPU which could not be provided without disabling some features would not
> start anymore.

In QEMU we could enable enforce by default, because we can change the defaults in newer machine-types and keep compatibility on the older ones.  We just didn't do that because we're afraid of breaking libvirt's expectations.

The problem for libvirt seems to be that you are unable to differentiate "importing or migrating an existing VM" from "creating a new VM".  Is that correct, or am I missing something?

Comment 6 Jiri Denemark 2017-11-23 13:05:16 UTC
We could change it for newly defined domains, but that would mean a freshly created persistent domain and a new transient domain would get different defaults, which would IMHO be much worse than keeping the old default for all of them. And even a freshly defined domain doesn't have to be a completely new one, as one can just transfer the definition to another node in a cluster. We don't have anything like a machine type on which we could base the defaults. If we had had a crystal ball and added the check='...' attribute when introducing guest CPU configuration, we could change the default for domains which did not specify this attribute explicitly. But we added it too late, which means a missing attribute has to be mapped to check='partial'.
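
The mapping described in this comment (a missing attribute is treated as check='partial') can be pictured with a short sketch. This only illustrates the rule as stated here and is not libvirt's actual code; the function name `effective_check` is hypothetical.

```python
import xml.etree.ElementTree as ET


def effective_check(cpu_xml: str) -> str:
    """Return the check mode implied by a <cpu> element.

    Illustration only: an explicit check='...' value is used as-is, and a
    missing attribute falls back to 'partial', as described in the comment.
    """
    cpu = ET.fromstring(cpu_xml)
    return cpu.get("check", "partial")


# Older definition without the attribute: keeps the lenient behaviour.
legacy = "<cpu mode='custom' match='exact'><model>EPYC</model></cpu>"
# Definition that explicitly opts in to strict checking.
strict = "<cpu mode='custom' match='exact' check='full'><model>EPYC</model></cpu>"

print(effective_check(legacy))
print(effective_check(strict))
```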

