Bug 1691097 - NUMA node memory sizes in the guest OS don't match the sizes of the NUMA nodes created via the REST API.
Keywords:
Status: NEW
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.3.2.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.0
Assignee: Michal Skrivanek
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-20 20:48 UTC by Polina
Modified: 2019-04-09 03:10 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Virt
pm-rhel: ovirt-4.4+


Attachments
engine vdsm (deleted)
2019-03-20 20:48 UTC, Polina

Description Polina 2019-03-20 20:48:44 UTC
Created attachment 1546252 [details]
engine vdsm

Description of problem: The problem was found during a tier2 regression automation run (https://polarion.engineering.redhat.com/polarion/redirect/project/RHEVM3/workitem?id=RHEVM3-9571). It is observed with a RHEL 8 guest OS; the same test passes with a RHEL 7.6 guest OS.

Version-Release number of selected component (if applicable):
vdsm-4.30.11-1.el7ev.x86_64
ovirt-engine-4.3.2.1-0.1.el7.noarch

How reproducible: 100% with a RHEL 8.0 guest OS.

Steps to Reproduce:

1. Configure a VM with 4 CPUs (System tab), pinned to a host with two NUMA nodes (Host tab).
Host NUMA topology:
[root@lynx21 ~]# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10
node 0 size: 16287 MB
node 0 free: 2039 MB
node 1 cpus: 1 3 5 7 9 11
node 1 size: 16384 MB
node 1 free: 9861 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 

2. Send REST API requests creating two NUMA nodes for the VM so that the nodes' memory sums to the VM memory (1024 MB); a request sketch follows the two payloads below.

POST https://{{host}}/ovirt-engine/api/vms/8ccf3fd3-9549-4831-b43e-926c93d9c9ca/numanodes
<vm_numa_node>
    <cpu>
        <cores>
            <core>
                <index>0</index>
            </core>
        </cores>
    </cpu>
    <index>0</index>
    <memory>768</memory>
</vm_numa_node>

POST https://{{host}}/ovirt-engine/api/vms/8ccf3fd3-9549-4831-b43e-926c93d9c9ca/numanodes
<vm_numa_node>
    <cpu>
        <cores>
            <core>
                <index>1</index>
            </core>
        </cores>
    </cpu>
    <index>1</index>
    <memory>256</memory>
</vm_numa_node>
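
A minimal sketch of how these two requests might be sent with Python and the requests library (the engine URL, credentials, and TLS handling below are placeholders, not values from this environment):

import requests

ENGINE = "https://{{host}}/ovirt-engine/api"  # placeholder engine URL
VM_ID = "8ccf3fd3-9549-4831-b43e-926c93d9c9ca"
AUTH = ("admin@internal", "password")  # placeholder credentials
HEADERS = {"Content-Type": "application/xml", "Accept": "application/xml"}

# Node memory is given in MB; the two nodes sum to the VM memory (1024 MB).
nodes = [(0, 768, 0), (1, 256, 1)]  # (node index, memory in MB, pinned core index)

for index, memory_mb, core in nodes:
    body = (
        "<vm_numa_node>"
        f"<cpu><cores><core><index>{core}</index></core></cores></cpu>"
        f"<index>{index}</index>"
        f"<memory>{memory_mb}</memory>"
        "</vm_numa_node>"
    )
    resp = requests.post(f"{ENGINE}/vms/{VM_ID}/numanodes",
                         data=body, headers=HEADERS, auth=AUTH,
                         verify=False)  # assumes a self-signed engine certificate
    resp.raise_for_status()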

3. GET https://{{host}}/ovirt-engine/api/vms/8ccf3fd3-9549-4831-b43e-926c93d9c9ca/numanodes
(a small verification sketch follows the response below)

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vm_numa_nodes>
    <vm_numa_node href="/ovirt-engine/api/vms/8ccf3fd3-9549-4831-b43e-926c93d9c9ca/numanodes/8fdc5457-790e-41fd-8e9d-27bee594e0c4" id="8fdc5457-790e-41fd-8e9d-27bee594e0c4">
        <cpu>
            <cores>
                <core>
                    <index>0</index>
                </core>
            </cores>
        </cpu>
        <index>0</index>
        <memory>768</memory>
        <vm href="/ovirt-engine/api/vms/8ccf3fd3-9549-4831-b43e-926c93d9c9ca" id="8ccf3fd3-9549-4831-b43e-926c93d9c9ca"/>
    </vm_numa_node>
    <vm_numa_node href="/ovirt-engine/api/vms/8ccf3fd3-9549-4831-b43e-926c93d9c9ca/numanodes/dbac709b-7dad-4a8b-b377-717a73046255" id="dbac709b-7dad-4a8b-b377-717a73046255">
        <cpu>
            <cores>
                <core>
                    <index>1</index>
                </core>
            </cores>
        </cpu>
        <index>1</index>
        <memory>256</memory>
        <vm href="/ovirt-engine/api/vms/8ccf3fd3-9549-4831-b43e-926c93d9c9ca" id="8ccf3fd3-9549-4831-b43e-926c93d9c9ca"/>
    </vm_numa_node>
</vm_numa_nodes>
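
Continuing the sketch above, a quick (hypothetical) sanity check that the returned node sizes sum to the VM memory:

import xml.etree.ElementTree as ET

resp = requests.get(f"{ENGINE}/vms/{VM_ID}/numanodes",
                    headers={"Accept": "application/xml"}, auth=AUTH, verify=False)
root = ET.fromstring(resp.content)
sizes = [int(node.findtext("memory")) for node in root.findall("vm_numa_node")]
print(sizes)               # expected: [768, 256]
assert sum(sizes) == 1024  # must equal the VM memory in MB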

4. Run the VM on the NUMA host.

5. Log in to the VM and run 'numactl -H'.
numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 3
node 0 size: 569 MB
node 0 free: 236 MB
node 1 cpus: 1
node 1 size: 184 MB
node 1 free: 22 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

Actual results: node 0 size: 569 MB and node 1 size: 184 MB

Expected results:
Approximately the values that were sent in the REST API numanodes creation requests.
The same test with a RHEL 7.6 guest OS in the same environment returns:
numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 3
node 0 size: 767 MB
node 0 free: 530 MB
node 1 cpus: 1
node 1 size: 255 MB
node 1 free: 37 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

Attached are the engine and vdsm logs extracted from automation.
Please let me know if more logs are needed - I'll reproduce the bug and provide them.

Additional info: for QE - test TestTotalVmMemoryEqualToNumaNodesMemory in rhevmtests/compute/sla/numa/numa_test.py; it can be reproduced on infra env 5. A sketch of such a check is below.
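
A rough sketch of what such a guest-side check might look like (a hypothetical helper, not the actual rhevmtests code); it parses 'numactl -H' output and allows some slack for memory reserved by the kernel:

import re
import subprocess

expected_mb = {0: 768, 1: 256}  # node sizes sent via the REST API
tolerance_mb = 50               # hypothetical allowance for kernel-reserved memory

# Assumes this runs inside the guest (or over ssh), with numactl installed.
out = subprocess.run(["numactl", "-H"], capture_output=True, text=True, check=True).stdout
actual_mb = {int(m.group(1)): int(m.group(2))
             for m in re.finditer(r"node (\d+) size: (\d+) MB", out)}

for node, size in expected_mb.items():
    assert abs(actual_mb[node] - size) <= tolerance_mb, \
        f"node {node}: got {actual_mb[node]} MB, expected ~{size} MB"

With a check like this, the RHEL 7.6 guest (767/255 MB) passes, while the RHEL 8.0 guest (569/184 MB) fails.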

Comment 1 Ryan Barry 2019-03-21 00:13:49 UTC
Like the bug from yesterday, can you please provide the template? The 7.6 template would also be helpful.

Comment 2 Michal Skrivanek 2019-03-21 05:24:06 UTC
I'm also not sure what you are saying. Is it that the memory size allocated on the host is smaller than you expected? Is the guest memory smaller than 1 GB? Is the guest NUMA topology as requested?

Comment 3 Polina 2019-03-21 07:57:25 UTC
The NUMA node sizes are smaller than expected.
The test creates numa node sizes: 
    node 0 size: 768 MB and node 1 size: 256 MB.
actually got: 
    node 0 size: 569 MB and node 1 size: 184 MB

Comment 4 Polina 2019-03-21 08:02:03 UTC
I used the same template: http://pastebin.test.redhat.com/741534

Comment 5 Michal Skrivanek 2019-03-21 08:27:49 UTC
You have several runs with different sizes in the log; which one are you referring to? The one at 2019-03-15 11:25:54,992 has 4 CPUs and
        <numa>
            <cell cpus="0" id="0" memory="786432"/>
            <cell cpus="1" id="1" memory="262144"/>
        </numa>
which sounds invalid, as CPUs 2 and 3 are not allocated to any node.

and the one at 2019-03-15 11:28:21,707+0200 has
        <numa>
            <cell cpus="0" id="0" memory="524288"/>
            <cell cpus="1,2,3" id="1" memory="524288"/>
        </numa>

which doesn't match your description.
Please also attach the qemu log for that VM to double-check the exact command line arguments.
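
For reference, a quick arithmetic check on the <cell> values quoted above (the libvirt memory attribute is in KiB):

# libvirt <cell memory="..."> is in KiB; divide by 1024 to get MiB
for kib in (786432, 262144, 524288):
    print(kib, "KiB =", kib // 1024, "MiB")  # 768, 256 and 512 MiB

So the first run matches the requested 768/256 MB split, while the second run is an even 512/512 MB split.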

