Bug 1511876 - Statistics output can not be obtained with gnocchi from an instance running Windows OS
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-gnocchi
Version: 10.0 (Newton)
Hardware: All
OS: Linux
low
medium
Target Milestone: zstream
: 10.0 (Newton)
Assignee: Mehdi ABAAKOUK
QA Contact: Sasha Smolyak
URL:
Whiteboard:
Depends On:
Blocks: 1525372
Reported: 2017-11-10 10:48 UTC by Masaki Furuta
Modified: 2018-04-09 07:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1525372 (view as bug list)
Environment:
Last Closed: 2018-04-09 07:24:38 UTC



Description Masaki Furuta 2017-11-10 10:48:32 UTC
Description of problem:

  - The types of statistics returned by the gnocchi command differ depending on the guest OS.
  - Almost no statistics could be obtained from a Windows 2008 R2 instance.
  - Is this behavior expected, is it a bug, or is it caused by OpenStack settings or by settings on the Windows OS side?


Version-Release number of selected component (if applicable):

  Red Hat OpenStack Platform 10.0

How reproducible:

  Always

Steps to Reproduce:

  Execute the gnocchi command from the director to output statistical information:   gnocchi measures show XXXX-XXXX

Actual results:

  Please see attached Statistical_info_list.pdf

Expected results:

  If this behavior is expected, it should be explained in the documentation.
  Otherwise, it is a bug.

Additional info:

Comment 2 Mehdi ABAAKOUK 2017-11-16 15:32:54 UTC
Did you install all the virtio drivers on Windows and enable all the optional services?

Ceilometer just retrieves metrics from libvirt. If you see a difference between
Windows and CentOS, it is only because the virtio drivers provide more metrics on Linux than they do on Windows.

If you need more metrics, that has to be done on the virtio side.

Comment 4 Mehdi ABAAKOUK 2017-11-22 15:44:47 UTC
We can test for "memory.usage" for example, on a compute node:

virsh dommemstat <window-instance-name>

Ceilometer/Gnocchi needs "unused" and "available" to be returned. If one of them is missing, the virtio memory balloon driver is missing inside the VM.


virsh has equivalent commands for the other I/O statistics:

virtio-scsi/virtio-blk are also needed for disk I/O,
and virtio-net for network I/O.

Comment 6 Mehdi ABAAKOUK 2017-11-30 07:12:14 UTC
The installation of the virtio driver looks good, but I'm far from being a Windows expert.

I don't think we have a bug here: each operating system generates a different set of metrics, and that's fine.

If a metric that you really need is missing for your use case,
first check whether you get it with the "virsh XXXXstat" commands:

* if no: open another RFE on virtio-win to implement it.
* if yes: this is maybe a Ceilometer bug.

About documentation, I'm not sure about creating a matrix of the metrics you can expect. It's very complicated from Ceilometer's point of view.
A metric can be present or absent for a ton of reasons:

* virtio driver missing
* virtio driver too old
* virtio driver can generate the metric for one OS, but not for another
* old libvirt version
* not all disk/net drivers support all the I/O metrics
* not all disk/net KVM backends support all the I/O metrics
* libvirt xml doesn't contain the required hardware/configuration to generate the metric.
...

The matrix would have too many parameters.

Comment 10 Mehdi ABAAKOUK 2017-11-30 10:39:31 UTC
I'm currently digging into the details and will report per-metric information soon.

Can I also have the output of the following commands, to clarify something?

* virsh domstats <instance>
* virsh domblklist <instance>
* virsh domblkinfo <instance> <disk>

Comment 11 Mehdi ABAAKOUK 2017-11-30 10:49:17 UTC
I have looked into the details.

Note that some metrics are not gathered from libvirt; nova sends them to Ceilometer every hour by default:

* memory
* vcpus
* disk.root.size
* disk.ephemeral.size
* instance

If some of these metrics are missing, it means nova hasn't sent them yet.

About cpu:

* cpu: the CPU time consumed since the instance booted
* cpu.delta: the delta between two retrieved "cpu" samples
* cpu_util: the "cpu.delta" expressed as a percentage

So, if you have cpu, you must also have the other two; you may just need to wait a bit (usually ~20 minutes) so that there are at least two samples from which to compute the delta.

The same applies to all "*.rate" metrics. If you have, for example, "disk.read.requests" but not "disk.read.requests.rate", that just means you didn't wait long enough: Ceilometer hasn't been able to compute the rate yet.

Sometimes you have the reverse: "disk.write.requests.rate" but not "disk.write.requests". This should be impossible, since *.rate is computed from the non-rate metric.
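
To illustrate why a rate or cpu_util value cannot appear until two raw samples exist, here is a minimal Python sketch. It is not Ceilometer's code: the function names, the (timestamp, value) sample shape, and the scale factor for cpu_util (cumulative CPU time assumed to be in nanoseconds, divided across the vCPUs) are assumptions made for the example.

```python
from typing import Optional, Tuple

# Hypothetical sample shape: (timestamp in seconds, cumulative counter value).
Sample = Tuple[float, float]


def rate_of_change(prev: Optional[Sample], curr: Sample,
                   scale: float = 1.0) -> Optional[float]:
    """Return the scaled per-second rate between two samples.

    Returns None when no earlier sample exists yet, which is why a
    "*.rate" metric shows up later than the raw metric it is derived from.
    """
    if prev is None:
        return None  # only one sample so far, nothing to compare against
    elapsed = curr[0] - prev[0]
    if elapsed <= 0:
        return None  # duplicate or out-of-order sample
    return (curr[1] - prev[1]) / elapsed * scale


def cpu_util(prev: Optional[Sample], curr: Sample,
             vcpus: int = 1) -> Optional[float]:
    """CPU utilisation in percent, assuming "cpu" is cumulative nanoseconds."""
    return rate_of_change(prev, curr, scale=100.0 / (1e9 * vcpus))


if __name__ == "__main__":
    first = (0.0, 1000.0)     # made-up disk.read.requests sample
    second = (600.0, 4000.0)  # same counter, ten minutes later
    print(rate_of_change(None, first))    # no rate after a single sample
    print(rate_of_change(first, second))  # requests per second
    print(cpu_util((0.0, 0.0), (600.0, 3e11)))  # percent of one vCPU
```

With a ten-minute polling interval, the first rate can only be produced after the second poll, which is consistent with the ~20 minute wait mentioned above.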

Some metrics apply to other hypervisors (i.e. not libvirt instances), so it's OK not to have measures for them:

* disk.iops
* disk.latency

For memory.usage, libvirt should report the required values in "virsh dommemstat": memory.usage = "available" - "unused". That's not the case for your Windows VMs, which means libvirt/KVM can't talk to the virtio-balloon driver inside the VM. You can open another BZ on virtio-win if that is the case.
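
The memory.usage check can be sketched in Python as follows. This is only an illustration: the helper names and the sample outputs are made up, and it assumes "virsh dommemstat" prints one "name value" pair per line, with the values left in whatever unit libvirt reports.

```python
from typing import Dict, Optional


def parse_dommemstat(output: str) -> Dict[str, int]:
    """Parse "name value" lines, as printed by virsh dommemstat, into a dict."""
    stats: Dict[str, int] = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def memory_usage(stats: Dict[str, int]) -> Optional[int]:
    """Return available - unused, or None if either field is missing.

    A missing field suggests the balloon driver is not reporting
    statistics from inside the guest.
    """
    if "available" not in stats or "unused" not in stats:
        return None
    return stats["available"] - stats["unused"]


# Made-up outputs for illustration only, not taken from this bug report.
WITH_BALLOON = """actual 4194304
unused 3000000
available 4000000
rss 1200000
"""

WITHOUT_BALLOON = """actual 4194304
rss 1200000
"""

if __name__ == "__main__":
    print(memory_usage(parse_dommemstat(WITH_BALLOON)))
    print(memory_usage(parse_dommemstat(WITHOUT_BALLOON)))
```

If the second case matches what the Windows instance returns, the metric cannot be computed no matter how long you wait.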

Some metrics depend on the KVM storage driver used (ceph/file/...) and on whether virtio-blk/scsi is enabled:

* disk.capacity
* disk.allocation

I also see an instance where you have disk.write.requests but not disk.read.requests. That seems impossible, as these metrics are built at the same time.

Comment 14 Mehdi ABAAKOUK 2017-12-13 10:45:59 UTC
As I said, gathering metrics takes a long time, especially for cpu_util and the *.rate metrics, which need an extra transformation by Ceilometer. Are you sure you waited long enough?

Also, when you have multiple controllers, these rate metrics can be inaccurate or missing (see: https://bugzilla.redhat.com/show_bug.cgi?id=1520694).

If you can provide an sosreport from the controller and compute nodes, we can look at the Ceilometer logs to see whether everything is working well.


Otherwise, I have asked for some additional virsh output; can you provide it?

