Bug 1688630 - When tendrl-monitoring-integration is not running, Import flow failure related log messages should be more specific about what went wrong
Keywords:
Status: POST
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-commons
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: gowtham
QA Contact: sds-qe-bugs
URL:
Whiteboard:
Depends On: 1686888
Blocks:
Reported: 2019-03-14 06:40 UTC by gowtham
Modified: 2019-04-11 07:57 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Github Tendrl commons issues 1080 None None None 2019-04-01 08:32:41 UTC
Github Tendrl monitoring-integration issues 593 None None None 2019-04-01 16:47:52 UTC

Description gowtham 2019-03-14 06:40:45 UTC
Description of problem:

Most of the error log messages in the import flow are very generic. They display a large traceback with "atom failed" messages, but do not specify why the atom failed. With such an error message, the user is unable to pinpoint the cause of the failure.

## import failed

Failure in Job 0845ffb1-4d53-4a6c-9e18-3ed0a72c1ce5 Flow tendrl.flows.ImportCluster with error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py", line 240, in process_job
    the_flow.run()
  File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py", line 131, in run
    exc_traceback)
FlowExecutionFailedError: ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py", line 98, in run\n super(ImportCluster, self).run()\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/__init__.py", line 227, in run\n "Error executing post run function: %s" % atom_fqn\n', 'AtomExecutionFailedError: Atom Execution failed. Error: Error executing post run function: tendrl.objects.Cluster.atoms.SetupClusterAlias\n'

Failed post-run: tendrl.objects.Cluster.atoms.SetupClusterAlias for flow: Import existing Gluster Cluster

Version-Release number of selected component (if applicable):
tendrl-commons-1.6.3-17.el7rhgs.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a Gluster cluster
2. Install RHGSWA via tendrl-ansible
3. Stop the tendrl-monitoring-integration service on the server
4. Try to import the cluster

Actual results:
The import fails with a huge traceback.

Expected results:
A specific log message that shows why the import failed.
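
A minimal sketch of what a more specific message could look like. This is not the Tendrl implementation: `run_post_atom` and the `(ok, reason)` return convention are assumptions made up for illustration, and `AtomExecutionFailedError` here is a local stand-in for the exception named in the traceback above. The idea is only that the atom reports a reason and the flow includes it in the error text.

```python
class AtomExecutionFailedError(Exception):
    """Local stand-in for the Tendrl exception of the same name."""


def run_post_atom(atom_fqn, atom_callable):
    """Run one post-run atom and report why it failed, not only that it failed.

    Assumption: the atom returns an (ok, reason) pair. This convention is
    hypothetical and is not taken from the Tendrl code base.
    """
    try:
        ok, reason = atom_callable()
    except Exception as exc:
        # Turn an unexpected exception into a short, readable reason.
        ok, reason = False, "unexpected %s: %s" % (type(exc).__name__, exc)
    if not ok:
        raise AtomExecutionFailedError(
            "Error executing post run function: %s. Reason: %s"
            % (atom_fqn, reason)
        )
```

With an atom that returns `(False, "tendrl-monitoring-integration is not running")`, the raised message names both the failing atom and the cause, so the user does not have to read the traceback.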

Additional info:

Comment 2 Martin Bukatovic 2019-03-14 09:19:32 UTC
Quick list of related bugs (this is not a complete list) based on the sheer
title of this bug (Import flow failure related log messages should be more
specific about what went wrong):

#1647322 WA should detect and report problems with carbon initialization
#1647909 Import fails when WA is not updated
#1616005 Repeated Import (and Unmanage) fails: Timing out import job, Cluster data still not fully updated
#1612096 Import cluster with bricks down failed
#1602858 Root cause of problem with import cluster job failure needs to be identified
#1599375 Error executing pre run function: tendrl.objects.Cluster.atoms.Check Cluster Nodes Up
#1589820 Non descriptive Import Cluster failure: Atom Execution failed
#1589801 no error reported by WA ui when importing cluster without free disk space on /var/lib/carbon partition
#1583713 No dashboards when cluster is imported on second attempt
#1686888 import cluster fails after timeout without clear indication what went wrong
#1686855 Task messages are not informative

Comment 3 Martin Bukatovic 2019-03-14 09:20:44 UTC
The reproducer in this BZ is the same as in linked BZ 1686888. What is the purpose of this BZ?

Comment 6 gowtham 2019-04-01 16:47:52 UTC
Added a pre-atom to the import and unmanage cluster flows to check that all required services are running:
    https://github.com/Tendrl/commons/pull/1081
    https://github.com/Tendrl/commons/pull/1083
    https://github.com/Tendrl/monitoring-integration/pull/594
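
A rough sketch of the kind of check such a pre-atom might perform. It is not the code from the pull requests above: the function names, the message wording, and the contents of `REQUIRED_SERVICES` are assumptions for illustration. It relies on `systemctl is-active --quiet`, which exits with status 0 only when the unit is active.

```python
import subprocess

# Assumption: the real list of required services is defined by Tendrl;
# only the service named in this bug is listed here.
REQUIRED_SERVICES = ["tendrl-monitoring-integration"]


def is_active(service, run=subprocess.call):
    """Return True if systemd reports the unit as active."""
    return run(["systemctl", "is-active", "--quiet", service]) == 0


def find_stopped_services(services, check=is_active):
    """Return the subset of services that are not running."""
    return [name for name in services if not check(name)]


def precheck_message(services, check=is_active):
    """Return a specific error message, or None if every service is running."""
    stopped = find_stopped_services(services, check)
    if stopped:
        return ("Cannot import cluster: required service(s) not running: %s"
                % ", ".join(stopped))
    return None
```

Running the check before the import starts lets the flow fail early with one readable line instead of a traceback from a later atom.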

Assigning ownership to the carbon user when creating an alias:
   PR: https://github.com/Tendrl/monitoring-integration/pull/596
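
For illustration only, changing the ownership of a path to a named user can be sketched as below. What the pull request actually changes is not shown in this bug; the helper name `chown_to_user` and the idea that the alias is a filesystem path are assumptions.

```python
import os
import pwd


def chown_to_user(path, username, lookup=pwd.getpwnam, chown=os.chown):
    """Give `path` to `username`, using that user's uid and primary gid.

    `lookup` and `chown` are injectable so the helper can be exercised
    without root privileges or an existing "carbon" account.
    """
    entry = lookup(username)  # raises KeyError if the user does not exist
    chown(path, entry.pw_uid, entry.pw_gid)
```

A caller would invoke it as `chown_to_user(alias_path, "carbon")`, where `alias_path` is a hypothetical variable holding the newly created alias location.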

