Bug 1516845 - host up when glusterd not running
Summary: host up when glusterd not running
Keywords:
Status: NEW
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-monitoring-integration
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Nishanth Thomas
QA Contact: Filip Balák
URL:
Whiteboard:
Duplicates: 1517091
Depends On:
Blocks: 1519201
Reported: 2017-11-23 12:42 UTC by Lubos Trilety
Modified: 2018-08-12 15:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:


Attachments
gluster at glance snapshot (deleted), 2017-11-23 12:42 UTC, Lubos Trilety
gluster at glance snapshot (deleted), 2017-11-23 12:43 UTC, Lubos Trilety

Description Lubos Trilety 2017-11-23 12:42:19 UTC
Description of problem:
The host and all of its bricks are displayed as UP in Grafana and in the RHGSWA UI when the glusterd service is not running on the host. Moreover, the cluster status is still HEALTHY.
Bricks surely cannot be used when the glusterd service is not running on the machine.


Version-Release number of selected component (if applicable):
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-ui-1.5.4-4.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-5.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-4.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-5.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-node-agent-1.5.4-5.el7rhgs.noarch
tendrl-notifier-1.5.4-3.el7rhgs.noarch
tendrl-gluster-integration-1.5.4-4.el7rhgs.noarch

How reproducible:
100%

Steps to Reproduce:
1. Stop the glusterd service on one of the gluster nodes.
2. Wait a while, then check the Hosts and Bricks panels of the Grafana "Gluster at Glance" dashboard.
3.
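
To confirm the node state before checking the dashboard, something like the following Python sketch could be run on the gluster node. It is only an illustration, not part of the original report: the function names are made up, and it assumes a systemd host where `systemctl` and `pgrep` are available. It reports glusterd and the glusterfsd brick processes separately, since the two can be stopped independently.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the command's exit code.
# It is a parameter so the checks can be exercised without a real node.
Runner = Callable[[Sequence[str]], int]


def run_quiet(argv: Sequence[str]) -> int:
    """Run a command, discard its output, and return its exit code."""
    return subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def glusterd_active(run: Runner = run_quiet) -> bool:
    # `systemctl is-active --quiet UNIT` exits 0 only when the unit is active.
    return run(["systemctl", "is-active", "--quiet", "glusterd"]) == 0


def brick_processes_running(run: Runner = run_quiet) -> bool:
    # `pgrep -x NAME` exits 0 when at least one process has exactly that name.
    return run(["pgrep", "-x", "glusterfsd"]) == 0


def describe_node(run: Runner = run_quiet) -> str:
    """Summarise the management daemon and the brick processes separately."""
    glusterd = "running" if glusterd_active(run) else "stopped"
    glusterfsd = "running" if brick_processes_running(run) else "stopped"
    return f"glusterd: {glusterd}, glusterfsd: {glusterfsd}"
```

Calling `print(describe_node())` on the node after step 1 would be expected to show glusterd as stopped while glusterfsd may still be running.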

Actual results:
Cluster status is HEALTHY and all hosts and bricks are UP.

Expected results:
Cluster should be UNHEALTHY. The host with stopped glusterd and all bricks on that host should be DOWN in Grafana.

Additional info:

Comment 1 Lubos Trilety 2017-11-23 12:43:14 UTC
Created attachment 1358194
gluster at glance snapshot

Comment 3 Lubos Trilety 2017-11-23 13:13:39 UTC
Actually, the bricks can still be used, so the brick status is accurate, at least as long as the related glusterfsd process is running.
However, the bricks still appear as UP even after killing all glusterfsd processes on the machine where glusterd is not running.

Comment 5 Martin Kudlej 2017-11-24 07:51:19 UTC
*** Bug 1517091 has been marked as a duplicate of this bug. ***

Comment 7 Nishanth Thomas 2017-11-24 10:39:15 UTC
* Glusterd is a management daemon. Even when glusterd is down, data flow continues and the bricks are still part of the volume, so setting the cluster state to unhealthy is not right.

* You get a `Peer disconnected` alert, which tells the admin that there is a break in communication and that action can be taken. A screenshot of the alert in the UI is attached.

* Service management is not part of tendrl now. This is an RFE, and the glusterd failure case will be reported as part of that.

One thing to note is that once glusterd is down, further updates will not be available outside. For example, if a brick goes down on a node where glusterd is down, that change won't be communicated outside. Hence tendrl will retain the last updated values.
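
The last point can be illustrated with a small Python model. This is a hypothetical sketch, not Tendrl's actual code: `StatusStore`, `simulate`, and the brick name are invented for illustration. It assumes a brick status is stored only when the node can publish an update, and that the dashboard shows whatever was stored last.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StatusStore:
    """Hypothetical stand-in for a monitoring store; not Tendrl's real code."""

    bricks: Dict[str, str] = field(default_factory=dict)

    def report(self, brick: str, status: str) -> None:
        # Called only while the node is able to publish updates.
        self.bricks[brick] = status

    def displayed(self, brick: str) -> str:
        # The dashboard shows whatever value was reported last.
        return self.bricks.get(brick, "UNKNOWN")


def simulate() -> str:
    """Replay the scenario from this bug and return the displayed status."""
    brick = "node1:/bricks/b1"  # hypothetical brick name
    store = StatusStore()

    # While glusterd runs, the brick status reaches the store.
    store.report(brick, "UP")

    # glusterd is stopped, then the brick process is killed.
    glusterd_running = False
    actual_status = "DOWN"

    # Without glusterd, the change is never published.
    if glusterd_running:
        store.report(brick, actual_status)

    return store.displayed(brick)
```

In this model `simulate()` returns `"UP"` even though the brick is actually down, which matches the stale status observed in comment 3.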

Comment 8 Filip Balák 2018-05-14 08:37:08 UTC
This issue also happens when the entire node is shut down, not only when glusterd is stopped.

Tested with:
tendrl-ansible-1.6.3-3.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch
tendrl-commons-1.6.3-4.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-2.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-2.el7rhgs.noarch
tendrl-node-agent-1.6.3-4.el7rhgs.noarch
tendrl-notifier-1.6.3-2.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-1.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-2.el7rhgs.noarch

