Bug 1514733 - Cluster import fails because of collectd service not having started on one of the nodes
Summary: Cluster import fails because of collectd service not having started on one of the nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-monitoring-integration
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Nishanth Thomas
QA Contact: Sweta Anandpara
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-18 06:39 UTC by Sweta Anandpara
Modified: 2017-12-18 04:37 UTC
CC List: 4 users

Fixed In Version: tendrl-node-agent-1.5.4-4.el7rhgs.noarch.rpm
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-18 04:37:04 UTC




Links:
Red Hat Product Errata RHEA-2017:3478 (normal, SHIPPED_LIVE): RHGS Web Administration packages, last updated 2017-12-18 09:34:49 UTC

Description Sweta Anandpara 2017-11-18 06:39:21 UTC
Description of problem:
=======================
While importing a 3-node cluster into a tendrl server that already had one successfully imported cluster, the import failed with the message 'Could not find atom tendrl.objects.Cluster.atoms.ConfigureMonitoring'.

Everything seemed to be going well with the cluster import until the 'Service collectd running on node <>' message turned out to be missing for one of the nodes of the 3-node cluster. When checked on the backend, the 'collectd' service was dead (inactive) on one of the nodes, but running on the other 2 nodes.
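(Not part of the original report.) A minimal diagnostic sketch of the backend check described above, assuming the nodes are reachable over passwordless ssh and that collectd is managed by systemd; the first two host names come from the task log below, the third is a placeholder since that node's name is not recorded in this bug:

#!/usr/bin/env python3
# Minimal sketch: report the systemd state of collectd on each storage node.
# Assumes passwordless ssh from the tendrl server to the nodes.
import subprocess

NODES = [
    "dhcp42-206.lab.eng.blr.redhat.com",
    "dhcp42-243.lab.eng.blr.redhat.com",
    "third-node.example.com",  # placeholder; the third node's name is not in the log
]

def collectd_state(host):
    # `systemctl is-active` prints "active", "inactive", "failed", etc. on stdout.
    result = subprocess.run(
        ["ssh", host, "systemctl", "is-active", "collectd"],
        capture_output=True, text=True,
    )
    return result.stdout.strip() or "unknown"

for node in NODES:
    print("%s: collectd is %s" % (node, collectd_state(node)))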

error    Failed post-run: tendrl.objects.Cluster.atoms.ConfigureMonitoring for flow: Import existing Gluster Cluster    18 Nov 2017 06:59:23    
info    Running Flow monitoring.flows.NewClusterDashboard    18 Nov 2017 06:59:22    
info    Processing Job 90885685-5b3f-4fc2-9074-280294a47d57    18 Nov 2017 06:59:22
error    Could not find atom tendrl.objects.Cluster.atoms.ConfigureMonitoring    18 Nov 2017 06:59:22
info    Released lock (e6a30215-3261-475a-bcd9-78071d7ff3ae) for Node (dec9e261-e263-41c1-9324-fc9ced7d4d62)    18 Nov 2017 06:59:22
info    Job (4b7d2892-fdce-41a2-8cbe-a34a5720c466): Finished Flow tendrl.flows.ImportCluster    18 Nov 2017 06:59:14
info    Service collectd running on node dhcp42-243.lab.eng.blr.redhat.com    18 Nov 2017 06:59:14
info    Job (85e39abf-c114-4f83-8cb5-0e44835a7f0d): Finished Flow tendrl.flows.ImportCluster    18 Nov 2017 06:59:14
info    Service collectd running on node dhcp42-206.lab.eng.blr.redhat.com    18 Nov 2017 06:59:14


A screenshot of the tasks and /var/log/messages have been copied to http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>

Version-Release number of selected component (if applicable):
=============================================================
tendrl-grafana-plugins-1.5.4-3.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-node-agent-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-3.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-notifier-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.5.4-2.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-ui-1.5.4-2.el7rhgs.noarch


On storage node: 
tendrl-collectd-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-2.el7rhgs.noarch
tendrl-node-agent-1.5.4-2.el7rhgs.noarch
tendrl-gluster-integration-1.5.4-2.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch


How reproducible:
=================
1:1


Additional info:
================
The setup has been left in the same state, in case it needs to be looked at.

Comment 2 Petr Penicka 2017-11-20 13:25:36 UTC
Giving pm_ack and 3.3.z+ since both qa_ack and dev_ack are already given.

Comment 4 Sweta Anandpara 2017-11-22 08:16:25 UTC
Validated the same on build tendrl-node-agent-1.5.4-5.el7rhgs.noarch

Cluster import succeeded without any issues. Collectd service on every storage node is up and running. 
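(Side note, not part of the original verification.) A tiny sanity check of this kind, run locally on each storage node before an import, would catch the condition that triggered this bug; it only assumes collectd is managed by systemd:

#!/usr/bin/env python3
# Exit non-zero if collectd is not active on this node, so the check can be
# scripted ahead of a cluster import.
import subprocess
import sys

state = subprocess.run(
    ["systemctl", "is-active", "collectd"],
    capture_output=True, text=True,
).stdout.strip()

if state != "active":
    sys.exit("collectd is %s on this node" % (state or "in an unknown state"))
print("collectd is active")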

As discussed with the developer (Nishanth), a fix that went into gluster/monitoring integration ended up fixing this issue as well. In other words, the issue mentioned in this bugzilla is a one-off case and will not be easy to reproduce, and hence to verify.

Moving the bug to its final state for RHGS 3.3.1.

Comment 6 errata-xmlrpc 2017-12-18 04:37:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3478

