Bug 1693196 - [OSP15][Undercloud][healthcheck] failed healthcheck for ceilometer_agent_compute
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Target Milestone: ga
Target Release: 15.0 (Stein)
Assignee: Cédric Jeanneret
QA Contact: Nataf Sharabi
Depends On:
Reported: 2019-03-27 10:28 UTC by Artem Hrechanychenko
Modified: 2019-04-10 11:18 UTC (History)

Fixed In Version: openstack-tripleo-common-10.6.1-0.20190328200349.45cd562.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed:
Target Upstream Version:


System ID Priority Status Summary Last Updated
OpenStack gerrit 648027 None None None 2019-03-27 12:17:35 UTC
Red Hat Bugzilla 1689671 None MODIFIED Undercloud: neutron containers healthcheck failed 2019-04-10 11:17:43 UTC

Description Artem Hrechanychenko 2019-03-27 10:28:00 UTC
Description of problem:
After Overcloud installation, the healthcheck for the ceilometer_agent_compute container fails on the Compute node:

[heat-admin@compute-0 ~]$ sudo systemctl status tripleo_ceilometer_agent_compute_healthcheck.service 
● tripleo_ceilometer_agent_compute_healthcheck.service - ceilometer_agent_compute healthcheck
   Loaded: loaded (/etc/systemd/system/tripleo_ceilometer_agent_compute_healthcheck.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2019-03-27 10:21:44 UTC; 1min 13s ago
  Process: 136470 ExecStart=/usr/bin/podman exec ceilometer_agent_compute /openstack/healthcheck (code=exited, status=1/FAILURE)
 Main PID: 136470 (code=exited, status=1/FAILURE)

Mar 27 10:21:44 compute-0 systemd[1]: Starting ceilometer_agent_compute healthcheck...
Mar 27 10:21:44 compute-0 podman[136470]: There is no ceilometer-poll process with opened RabbitMQ ports (5671,5672) running in the container
Mar 27 10:21:44 compute-0 podman[136470]: exit status 1
Mar 27 10:21:44 compute-0 systemd[1]: tripleo_ceilometer_agent_compute_healthcheck.service: Main process exited, code=exited, status=1/FAILURE
Mar 27 10:21:44 compute-0 systemd[1]: tripleo_ceilometer_agent_compute_healthcheck.service: Failed with result 'exit-code'.
Mar 27 10:21:44 compute-0 systemd[1]: Failed to start ceilometer_agent_compute healthcheck.
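
The error message suggests the healthcheck looks for a process named ceilometer-poll that holds a connection to port 5671 or 5672. The following is a minimal Python sketch of that idea, not the actual /openstack/healthcheck script. It assumes input in the column layout printed by `ss -ntp`; the function name, sample addresses, and PIDs are made up for illustration. The name ceilometer-poll may simply be ceilometer-polling truncated to the 15-character command name the kernel keeps per process.

```python
import re

# Ports named in the healthcheck error message.
RABBITMQ_PORTS = {5671, 5672}


def has_connection(ss_output, process, ports=RABBITMQ_PORTS):
    """Return True if `process` owns an established socket whose peer port is in `ports`.

    `ss_output` is assumed to follow the `ss -ntp` column layout:
    State Recv-Q Send-Q Local:Port Peer:Port Process
    """
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header line, short lines, and non-established sockets.
        if len(fields) < 6 or fields[0] != "ESTAB":
            continue
        peer_port = fields[4].rsplit(":", 1)[-1]
        # The process column looks like users:(("name",pid=8,fd=12)).
        owners = re.findall(r'\("([^"]+)"', fields[5])
        if peer_port.isdigit() and int(peer_port) in ports and process in owners:
            return True
    return False


# Hypothetical sample data, not taken from the affected node.
SAMPLE = """\
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 0 0 192.0.2.10:51234 192.0.2.5:5672 users:(("ceilometer-poll",pid=8,fd=12))
ESTAB 0 0 192.0.2.10:51300 192.0.2.5:3306 users:(("httpd",pid=20,fd=7))
"""

if __name__ == "__main__":
    print(has_connection(SAMPLE, "ceilometer-poll"))
```

A check of this shape fails whenever the process has no open connection to the listed ports at the moment it runs, even if the process itself is up.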

The container itself is running:
f7624d8ba4f0          kolla_start  13 hours ago  Up 13 hours ago         ceilometer_agent_compute

[heat-admin@compute-0 ~]$ sudo podman logs ceilometer_agent_compute
+ sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/ceilometer/ceilometer.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/ceilometer/ceilometer.conf to /etc/ceilometer/ceilometer.conf
INFO:__main__:Writing out command to execute
++ cat /run_command
+ CMD='/usr/bin/ceilometer-polling --polling-namespaces compute --logfile /var/log/ceilometer/compute.log'
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ CEILOMETER_LOG_DIR=/var/log/kolla/ceilometer
++ [[ ! -d /var/log/kolla/ceilometer ]]
++ mkdir -p /var/log/kolla/ceilometer
+++ stat -c %U:%G /var/log/kolla/ceilometer
++ [[ root:kolla != \c\e\i\l\o\m\e\t\e\r\:\k\o\l\l\a ]]
++ chown ceilometer:kolla /var/log/kolla/ceilometer
+++ stat -c %a /var/log/kolla/ceilometer
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/ceilometer
++ . /usr/local/bin/kolla_ceilometer_extend_start
+ echo 'Running command: '\''/usr/bin/ceilometer-polling --polling-namespaces compute --logfile /var/log/ceilometer/compute.log'\'''
Running command: '/usr/bin/ceilometer-polling --polling-namespaces compute --logfile /var/log/ceilometer/compute.log'
+ exec /usr/bin/ceilometer-polling --polling-namespaces compute --logfile /var/log/ceilometer/compute.log

Version-Release number of selected component (if applicable):
OSP15 compose RHOS_TRUNK-15.0-RHEL-8-20190326.n.0

container image openstack-ceilometer-compute:20190325.1 

How reproducible:

Steps to Reproduce:
1. Deploy the OSP15 undercloud.
2. Deploy the OSP15 overcloud.
3. Check the healthcheck status for the container on an overcloud compute node.
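
Step 3 can also be done by invoking the healthcheck directly instead of reading the systemd unit status. Below is a rough Python sketch under stated assumptions: the command mirrors the ExecStart line shown in the unit output above, prefixed with sudo as in the other commands in this report; the helper names and the injectable `runner` parameter are hypothetical and exist only to make the sketch testable without podman.

```python
import subprocess


def healthcheck_command(container):
    """Build the command that runs the healthcheck inside `container`."""
    # Mirrors the ExecStart of the tripleo_*_healthcheck.service unit.
    return ["sudo", "podman", "exec", container, "/openstack/healthcheck"]


def run_healthcheck(container, runner=subprocess.run):
    """Run the healthcheck and return (passed, combined_output).

    `runner` defaults to subprocess.run; a stand-in can be passed for testing.
    """
    result = runner(healthcheck_command(container), capture_output=True, text=True)
    output = (result.stdout or "") + (result.stderr or "")
    # Exit code 0 means the check passed; any other value is a failure.
    return result.returncode == 0, output.strip()


if __name__ == "__main__":
    passed, message = run_healthcheck("ceilometer_agent_compute")
    print("healthy" if passed else "unhealthy: " + message)
```

On the affected compute node this would be expected to report the same message that systemd logged, since both paths execute the same script.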

Actual results:
There is no ceilometer-poll process with opened RabbitMQ ports (5671,5672) running in the container

Expected results:
The healthcheck service exits with exit code 0.

Additional info:

Comment 2 Cédric Jeanneret 2019-03-27 12:17:06 UTC

pretty sure this one is linked to
The following patch will probably solve this issue:

I'm taking this BZ.


