Bug 1684566 - back to normal alerts don't contain enough information
Summary: back to normal alerts don't contain enough information
Keywords:
Status: NEW
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: web-admin-tendrl-notifier
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: gowtham
QA Contact: sds-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-01 14:46 UTC by Filip Balák
Modified: 2019-03-04 07:01 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description Filip Balák 2019-03-01 14:46:29 UTC
Description of problem:
Clearing ("back to normal") alerts for utilization sent by Tendrl do not contain the current utilization value. They only state that utilization is back to normal.

This is confusing when, for example, CPU utilization drops from 85% to 75% (80% is the threshold for generating a warning alert): the user is notified that CPU utilization is back to normal but does not know what the current value is. 75% CPU utilization can still be considered high, yet the alert gives no indication of this. From the alerting point of view CPU utilization appears to be fine, which may not be true.

Version-Release number of selected component (if applicable):
tendrl-ansible-1.6.3-11.el7rhgs.noarch
tendrl-api-1.6.3-13.el7rhgs.noarch
tendrl-api-httpd-1.6.3-13.el7rhgs.noarch
tendrl-commons-1.6.3-17.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-21.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-3.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-21.el7rhgs.noarch
tendrl-node-agent-1.6.3-18.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-3.el7rhgs.noarch
tendrl-ui-1.6.3-15.el7rhgs.noarch

How reproducible:
100%

Steps to Reproduce:
1. Import cluster with volumes into Tendrl.
2. Drive memory, swap, CPU, and brick utilization to at least 90% on one machine (a minimal CPU-load sketch follows this list).
3. Stop the load and let utilization drop back to the values from before the test.
4. Check the alerts delivered via email (SMTP) and in the alerts API.
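One possible way to cover the CPU part of step 2 is a simple busy-loop per core; this is only an illustrative sketch and not part of the original reproducer, and memory, swap, and brick utilization need separate tooling (e.g. writing files into the brick directory):

# cpu_burn.py -- illustrative only: keeps every core busy until interrupted,
# pushing CPU utilization well above the 80%/90% alert thresholds.
import multiprocessing

def burn():
    while True:
        pass  # busy loop, ~100% of one core

if __name__ == "__main__":
    workers = [multiprocessing.Process(target=burn)
               for _ in range(multiprocessing.cpu_count())]
    for w in workers:
        w.start()
    try:
        for w in workers:
            w.join()
    except KeyboardInterrupt:
        for w in workers:
            w.terminate()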

Actual results:
There are utilization alerts in the format:
<subject> utilization on node <node> in <cluster> back to normal
where <subject> is Cpu, Memory, Swap or Brick.
These alerts don't contain any information about the actual value.

Expected results:
Alerts should contain the current value.
Alerts could look something like:
<subject> utilization on node <node> in <cluster> is back to normal at <value>
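A minimal sketch of how such a message could be built (a hypothetical helper, not the actual tendrl-notifier code; the node and cluster names in the usage example are only illustrative):

def format_clearing_alert(subject, node, cluster, current_value):
    # subject is "Cpu", "Memory", "Swap" or "Brick"; current_value is in percent
    return ("{0} utilization on node {1} in {2} is back to normal "
            "at {3:.1f}%".format(subject, node, cluster, current_value))

# e.g. format_clearing_alert("Cpu", "tendrl-node-1", "my-cluster", 75.0)
# -> "Cpu utilization on node tendrl-node-1 in my-cluster is back to normal at 75.0%"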

Additional info:

Comment 2 gowtham 2019-03-04 07:01:01 UTC
We receive all utilization alerts from Grafana and convert each Grafana alert into a meaningful Tendrl alert message. The problem is that for the clearing alert Grafana does not pass any current value, which is why no current value is mentioned. For example:

{
  "Id": 2,
  "Version": 0,
  "OrgId": 2,
  "DashboardId": 5,
  "PanelId": 2,
  "Name": "bricks Capacity Utilization-Critical Alert",
  "Message": "",
  "Severity": "",
  "State": "ok",
  "Handler": 1,
  "Silenced": false,
  "ExecutionError": " ",
  "Frequency": 75,
  "EvalData": {
    
  },
  "NewStateDate": "2019-02-28T14:20:03Z",
  "StateChanges": 1,
  "Created": "2019-02-28T14:18:43Z",
  "Updated": "2019-02-28T14:18:43Z",
  "Settings": {
    "conditions": [
      {
        "evaluator": {
          "params": [
            90
          ],
          "type": "gt"
        },
        "operator": {
          "type": "and"
        },
        "query": {
          "datasourceId": 2,
          "model": {
            "refId": "A",
            "target": "tendrl.clusters.e1d3a650-f796-4b94-bbce-70d3bb6bb01a.nodes.tendrl-node-1.bricks.:gluster:brick2:brick2.utilization.percent-percent_bytes",
            "textEditor": false
          },
          "params": [
            "A",
            "4m",
            "now"
          ]
        },
        "reducer": {
          "params": [
            
          ],
          "type": "last"
        },
        "type": "query"
      }
    ],
    "executionErrorState": "keep_state",
    "frequency": "75s",
    "handler": 1,
    "name": "bricks Capacity Utilization-Critical Alert",
    "noDataState": "keep_state",
    "notifications": [
      
    ]
  }
}
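One possible direction (a sketch only, not the existing Tendrl code): when the alert arrives in the "ok" state, monitoring-integration or the notifier could query the latest value for the alert's Graphite target itself and append it to the clearing message. The Graphite URL and function names below are assumptions for illustration; the sketch uses the standard Graphite /render API and the requests library.

import requests

# Hypothetical Graphite endpoint used by the Tendrl monitoring stack; adjust as needed.
GRAPHITE_URL = "http://localhost:10080"

def current_value_from_graphite(target):
    # Query Graphite's render API for the metric the Grafana alert rule is
    # defined on (Settings.conditions[0].query.model.target in the JSON above)
    # and return the most recent non-null datapoint.
    resp = requests.get("%s/render" % GRAPHITE_URL,
                        params={"target": target, "format": "json", "from": "-5min"})
    series = resp.json()
    if not series:
        return None
    values = [v for v, _ in series[0]["datapoints"] if v is not None]
    return values[-1] if values else None

def enrich_clearing_message(grafana_alert, message):
    # Only the clearing ("ok") alert arrives without a current value.
    if grafana_alert.get("State") != "ok":
        return message
    try:
        target = grafana_alert["Settings"]["conditions"][0]["query"]["model"]["target"]
    except (KeyError, IndexError):
        return message
    value = current_value_from_graphite(target)
    if value is None:
        return message
    return "%s at %.1f%%" % (message, value)

This keeps the Grafana alert definition unchanged and only enriches the message text on the Tendrl side, at the cost of one extra Graphite query per clearing alert.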

