Bug 1359499 - [Docs][SHE][Admin] Add how to flow for switching HE maintenance modes in the WebAdmin.
Keywords:
Status: RELEASE_PENDING
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: Documentation
Version: 4.3.0
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ovirt-4.3.5
Target Release: 4.3.1
Assignee: Tahlia Richardson
QA Contact: Eli Marcus
URL:
Whiteboard:
Depends On:
Blocks: 1469143
 
Reported: 2016-07-24 10:58 UTC by Nikolai Sednev
Modified: 2019-04-14 12:59 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Integration


Attachments
Screenshot from 2016-07-24 13:59:02.png (deleted)
2016-07-24 10:59 UTC, Nikolai Sednev
screenshot from the engine's WEBUI of alma03 not in maintenance, while it is in CLI. (deleted)
2016-07-25 10:10 UTC, Nikolai Sednev

Description Nikolai Sednev 2016-07-24 10:58:19 UTC
Description of problem:
Putting a host into local maintenance, and removing it from local maintenance, does not work in Cockpit on NGN.

Version-Release number of selected component (if applicable):
sanlock-3.2.4-2.el7_2.x86_64
ovirt-hosted-engine-ha-2.0.1-1.el7ev.noarch
ovirt-imageio-daemon-0.3.0-0.el7ev.noarch
ovirt-host-deploy-1.5.1-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.7.0-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.16.x86_64
mom-0.5.5-1.el7ev.noarch
ovirt-setup-lib-1.0.2-1.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.5.x86_64
vdsm-4.18.6-1.el7ev.x86_64
ovirt-hosted-engine-setup-2.0.1-1.el7ev.noarch
ovirt-imageio-common-0.3.0-0.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
Linux version 3.10.0-327.22.2.el7.x86_64 (mockbuild@x86-030.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Thu Jun 9 10:09:10 EDT 2016
Linux 3.10.0-327.22.2.el7.x86_64 #1 SMP Thu Jun 9 10:09:10 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 7.2


How reproducible:
100%

Steps to Reproduce:
1. Deploy hosted engine on a pair of RHEVH NGN hosts over NFS.
2. Add a data storage domain to get the HE storage domain imported.
3. Try placing one of the hosts into local maintenance via Cockpit (Virtualization->Hosted Engine->Put this host into local maintenance).
4. Put the host into maintenance via the engine's WEBUI.
5. Try re-activating the host via Cockpit with Virtualization->Hosted Engine->Remove this host from maintenance.

Actual results:
Putting a host into local maintenance, and removing it from local maintenance, does not work in Cockpit on NGN.

Expected results:
Both options should work properly.

Additional info:
Screenshots attached.

Comment 1 Nikolai Sednev 2016-07-24 10:59:29 UTC
Created attachment 1183325 [details]
Screenshot from 2016-07-24 13:59:02.png

Comment 2 Ryan Barry 2016-07-24 23:55:34 UTC
I was upgrading my lab to 4.0 (finally) today when this came in, and I tested it -- I can't reproduce.

We're calling hosted-engine directly. It seems that global maintenance triggers almost immediately, but local has some lag time (even using a shell on the host). How long have you waited? 15-30 seconds seems to be about the average.

Comment 3 Ryan Barry 2016-07-25 02:02:38 UTC
Also, I can add a flag which shows some kind of visual indicator that a state update has been triggered.

The difficulty here is that "hosted-engine --set-maintenance --mode=local" returns immediately, but the status takes some time to update. We'd need to figure out a reasonable timeout after which a spinner could be replaced with a warning icon (because the state did not change).
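
The timeout idea described here could be sketched roughly as follows. This is only an illustration, not the Cockpit plugin's code: `wait_for_state`, `read_state`, and the 60-second default are hypothetical names and values chosen for the example. `read_state` stands in for whatever reads the current "Local maintenance" value (for example by parsing `hosted-engine --vm-status`).

```python
import time
from typing import Callable


def wait_for_state(
    read_state: Callable[[], bool],
    expected: bool,
    timeout: float = 60.0,    # assumption: the thread reports 15-30 s on average
    interval: float = 5.0,    # assumption: polling period, not taken from the source
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll read_state() until it equals `expected` or the timeout expires.

    Returns True if the state changed in time (the spinner can be cleared)
    and False otherwise (the spinner can be replaced with a warning icon).
    """
    deadline = clock() + timeout
    while True:
        if read_state() == expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would trigger the maintenance command first, then call `wait_for_state(...)` and choose the icon based on the boolean result. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.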

Comment 4 Nikolai Sednev 2016-07-25 10:09:33 UTC
(In reply to Ryan Barry from comment #2)
> I was upgrading my lab to 4.0 (finally) today when this came in, and I
> tested it -- I can't reproduce.
> 
> We're calling hosted-engine directly. It seems that global maintenance
> triggers almost immediately, but local has some lag time (even using a shell
> on the host). How long have you waited? 15-30 seconds seems to be about the
> average.
I waited about 1-3 minutes, and host alma03 was not shown in local maintenance in the WEBUI, but it was in the CLI:
[root@alma04 ~]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : alma03.qa.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : True
crc32                              : 945c00aa
Host timestamp                     : 345027
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=345027 (Mon Jul 25 13:00:44 2016)
        host-id=1
        score=0
        maintenance=True
        state=LocalMaintenance
        stopped=False


--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : alma04.qa.lab.tlv.redhat.com
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : cdf2fe5d
Host timestamp                     : 80794
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=80794 (Mon Jul 25 13:00:38 2016)
        host-id=2
        score=3400
        maintenance=False
        state=EngineUp
        stopped=False
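
For comparing the CLI state with what Cockpit or the WEBUI shows, output in the format above can be parsed into per-host dictionaries. The following is a minimal sketch that assumes only the text layout seen in this comment (a `--== Host N status ==--` header followed by `key : value` lines); `parse_vm_status` is a hypothetical helper name, and the "Extra metadata" lines are deliberately skipped.

```python
import re

# Matches section headers such as "--== Host 1 status ==--".
HOST_HEADER = re.compile(r"^--== Host (\d+) status ==--$")


def parse_vm_status(text: str) -> dict:
    """Parse `hosted-engine --vm-status` text into {host_id: {field: value}}.

    Only "key : value" lines are kept; "key=value" metadata lines are ignored.
    All values are returned as strings, exactly as printed.
    """
    hosts = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HOST_HEADER.match(line)
        if header:
            current = hosts.setdefault(int(header.group(1)), {})
            continue
        if current is None or " : " not in line:
            continue
        # Split on the first " : " only, so JSON values stay intact.
        key, value = line.split(" : ", 1)
        current[key.strip()] = value.strip()
    return hosts
```

With the output quoted in this comment, `parse_vm_status(output)[1]["Local maintenance"]` would be the string `"True"` and the same field for host 2 would be `"False"`.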


When I tried to remove it from local maintenance using Cockpit, host alma03 was removed from it successfully. So the issue here is that the engine's WEBUI does not get any changes from the hosts at all when using Cockpit Virtualization->Hosted Engine->Put/Remove this host into/from local maintenance, although when doing the same via Cockpit Virtualization->Virtual Machines->Host to Maintenance, it works in both the CLI and the engine's WEBUI. I suspect this is because I'm logged in from Cockpit to the engine's WEBUI, and the latter flow also updates the engine's DB, whereas the first flow does not.


(In reply to Ryan Barry from comment #3)
> Also, I can add a flag which shows some kind of visual indicator that a
> stage update has been triggered.
> 
> The difficulty here is that "hosted-engine --set-maintenance --mode=local"
> returns immediately, but it takes some time to update. We'd need to figure
> out a reasonable timeout after which a spinner could be replaced with a
> warning icon (because it did not change)
I see the command being sent in the browser console (Ctrl+Shift+I), but I'm not sure it has any effect on my environment, whereas, for example, moving to maintenance via Virtualization->Virtual Machines->Host to Maintenance takes effect almost immediately.

Speaking of the latter, IMHO we don't need to have the same option duplicated in more than one place. If possible, I'd like to manage hosts from one place.

Comment 5 Nikolai Sednev 2016-07-25 10:10:33 UTC
Created attachment 1183672 [details]
screenshot from the engine's WEBUI of alma03 not in maintenance, while it is in CLI.

Comment 6 Ryan Barry 2016-07-25 12:48:26 UTC
(In reply to Nikolai Sednev from comment #4)
> Waited about 1-3 minutes and host alma03 was not set in to local maintenance
> within the WEBUI, but it was in CLI:

Ok, so this is unclear, I suppose.

Two questions:

First -- Is the correct value reflected in cockpit?

Second -- I haven't actually done much with Engine/WEBUI in 4.0. hosted-engine maintenance does not set vdsm maintenance. I'm not sure if there's an indicator in the webui about hosted engine maintenance. Probably yes. 
> When I tried to remove it from local maintenance using Cockpit, it host
> alma03 was removed from it successfully. So issue here is that engine's
> WEBUI not getting any changes from hosts at all when using Cockpit
> Virtualization->Hosted Engine->Put/Remove this host into/from local
> maintenance, although if doing the same via Cockpit Virtualization->Virtual
> Machines->Host to Maintenance, then it's working in both CLI and WEBUI of
> the engine. I suspect that this is due to the fact, that I'm logged in from
> Cockpit to the engine's WEBUI and in latest flow it also updates the
> engine's DB, whereas if first example it's not doing so.

Is that status updated when using "hosted-engine --set-maintenance --mode=local" from the CLI?



> I see that command being transferred in ctrl+shift+i in console of the WEB,
> but not sure it has any affect on my environment as for example moving to
> maintenance via Virtualization->Virtual Machines->Host to Maintenance takes
> affect almost immediately.

I believe that this sets VDSM maintenance.

> 
> Speaking of latest, IMHO we don't need to have the same option duplicated in
> more than one place. If possible, I'd like to manage hosts from one place.

That's an ongoing discussion, I think. At present, the goal of the cockpit plugin is to provide a way to manage the functionality which could normally be reached over the shell/TUI of a single host, which has some overlap with engine, but the scope is limited -- engine manages clusters/datacenters. Cockpit manages one host.

Comment 7 Nikolai Sednev 2016-07-25 16:09:11 UTC
I see that these functions do work, with a bit of a delay. I probably looked at them expecting the changes to happen on the fly, but they are a bit delayed and also do not behave the same way at all.


After using Cockpit "Virtualization->Hosted Engine->Put this host into local maintenance", the host's status changes after some time (less than a minute) to "Local maintenance: True" in the CLI and "Local Maintenance: true" in Cockpit, but its active symbol in the engine's WEBUI does not change to the wrench symbol, although its status is shown as "Hosted Engine HA: Local Maintenance Enabled".

When the host is set into maintenance from Cockpit via "Virtualization->Virtual Machines->Host to Maintenance", its status changes properly everywhere: "Local Maintenance: true" in Cockpit, "Local maintenance: True" in the CLI, and "Hosted Engine HA: Local Maintenance Enabled" in the engine's WEBUI, also with a wrench symbol.

When the host is then activated again from Cockpit via "Virtualization->Hosted Engine->Remove this host from maintenance", its status returns to active after some time (less than a minute) in the CLI and in Cockpit ("Local Maintenance: false"), but this is not synchronized with the engine's WEBUI, where the host stays in "Hosted Engine HA: Local Maintenance Enabled" with a wrench symbol.


When the host is set into local maintenance from the CLI, e.g. "hosted-engine --set-maintenance --mode=local", its status is shown correctly as "Local maintenance: True" in the CLI and as "Local Maintenance: true" in Cockpit, but in the engine's WEBUI the status is only partially correct: the host appears without the wrench symbol, but with the correct status of "Hosted Engine HA: Local Maintenance Enabled".


I see an inconsistency in how the host's status is shown in the engine's WEBUI between:
1) From Cockpit "Virtualization->Virtual Machines->Host to Maintenance" ====> Shown in the engine's WEBUI with a wrench symbol and in local maintenance.
2) From Cockpit "Virtualization->Hosted Engine->Put this host into local maintenance" ====> Shown in the engine's WEBUI without a wrench symbol, but in local maintenance.

If step 1 was done and Cockpit "Virtualization->Hosted Engine->Remove this host from maintenance" is then used, the engine's WEBUI is not synchronized with these changes at all, and the host still appears in the engine's WEBUI in local maintenance, with a wrench symbol and the "Hosted Engine HA: Local Maintenance Enabled" status.


The CLI's "hosted-engine --set-maintenance --mode=local" is equivalent to Cockpit's "Virtualization->Hosted Engine->Put this host into local maintenance" ====> Shown in the engine's WEBUI without a wrench symbol, but in local maintenance.

The CLI's "hosted-engine --set-maintenance --mode=local" is not the same as Cockpit's "Virtualization->Virtual Machines->Host to Maintenance" ====> Shown in the engine's WEBUI with a wrench symbol and in local maintenance.

Comment 8 Ryan Barry 2016-07-26 13:23:31 UTC
(In reply to Nikolai Sednev from comment #7)
> I see these functionalities are working with a bit of delay now, I've
> probably looked at them and thought they would make changes on the fly, but
> they're a bit delayed and also not functioning the same way at all.
> 
> 
> In Cockpit "Virtualization->Hosted Engine->Put this host into local
> maintenance", then after some time (less than a minute) host's status
> returns to "Local maintenance: True" in CLI, "Local Maintenance: true" in
> Cockpit, but not changes it's symbol active symbol in engine's WEBUI to
> wrench symbol, although it's status shown as "Hosted Engine HA:Local
> Maintenance Enabled".
> 
> If setting host via Cockpit into local maintenance via
> "Virtualization->Virtual Machines->Host to Maintenance", then it changing
> it's status everywhere properly, in Cockpit "Local Maintenance: true", in
> CLI "Local maintenance: True" and in engine's WEBUI "Hosted Engine HA: Local
> Maintenance Enabled" and also with a symbol of a wrench.

This is expected -- VDSM maintenance also sets hosted-engine maintenance.

> 
> If then trying to activate the host back via Cockpit by
> "Virtualization->Hosted Engine->Remove this host from maintenance", then
> after some time (less than a minute) host's status returns to active in CLI,
> Cockpit "Local Maintenance:false", but not being synchronized with engine's
> WEBUI in which it stays in "Hosted Engine HA:Local Maintenance Enabled" with
> a symbol of a wrench.

I think the question here is how engine polls/communicates with hosted-engine. I imagine that it connects to ovirt-ha-agent, but I don't know on what intervals.

Simone?

> 
> 
> If setting host in to local maintenance from CLI, e.g. "hosted-engine
> --set-maintenance --mode=local", then in CLI host's status shown correctly
> as "Local maintenance: True", in Cockpit it's status also shown correctly as
> "Local Maintenance: true", but in engine's WEBUI it's status partially
> correct as it appears without wrench symbol, but in correct status of
> "Hosted Engine HA:Local Maintenance Enabled". 

Is this partially correct? This seems entirely correct. If we expect the engine's WEBUI to show a wrench for hosted-engine maintenance (which does not set VDSM maintenance), a separate bug should be filed.


> I see inconsistency of how host's status being shown in engine's WEBUI
> between:
> 1)From Cockpit "Virtualization->Virtual Machines->Host to
> Maintenance"====>Shown in engine's WEBUI with symbol of a wrench and in
> local maintenance.
> 2)From Cockpit "Virtualization->Hosted Engine->Put this host into local
> maintenance"===========>Shown in engine's WEBUI without symbol of a wrench,
> but in local maintenance.

See above -- VDSM maintenance and hosted-engine maintenance are not the same.

> 
> If step 1 was done, then in Cockpit "Virtualization->Hosted Engine->Remove
> this host from maintenance", then engine's WEBUI not being synchronized with
> changes these changes at all and host appears in engine's WEBUI in local
> maintenance with a wrench symbol and "Hosted Engine HA: Local Maintenance
> Enabled" status.

Is it possible to remove a host from VDSM maintenance through hosted-engine? I don't think so... "Hosted Engine->Remove this host from maintenance" calls "hosted-engine --set-maintenance --mode=none".
  
> CLI's "hosted-engine --set-maintenance --mode=local" equals to Cockpit's
> "Virtualization->Hosted Engine->Put this host into local
> maintenance"===========>Shown in engine's WEBUI without symbol of a wrench,
> but in local maintenance.
> 
> CLI's "hosted-engine --set-maintenance --mode=local" is not the same as
> Cockpit's "Virtualization->Virtual Machines->Host to Maintenance"====>Shown
> in engine's WEBUI with symbol of a wrench and in local maintenance.

So, it's clear that the terminology here is confusing, since hosted-engine and VDSM mean different things when they refer to "Maintenance". Any suggestions here?
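
To make the distinction concrete, here is a toy model of the two maintenance notions as described in this comment. It is a sketch under stated assumptions, not how the engine, VDSM, or ovirt-ha-agent are implemented: `HostState` and its method names are hypothetical, the rule that the wrench icon follows VDSM maintenance is inferred from the observations in comment 7, and the model ignores any refresh delay in the WEBUI.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Toy model: two independent maintenance flags for one host."""

    vdsm_maintenance: bool = False       # host maintenance as seen by the engine/VDSM
    he_local_maintenance: bool = False   # hosted-engine local maintenance

    def host_to_maintenance(self) -> None:
        # Per this comment: VDSM maintenance also sets hosted-engine maintenance.
        self.vdsm_maintenance = True
        self.he_local_maintenance = True

    def he_set_maintenance(self, mode: str) -> None:
        # Models "hosted-engine --set-maintenance --mode=local|none" only.
        if mode not in ("local", "none"):
            raise ValueError("only 'local' and 'none' are modeled here")
        self.he_local_maintenance = mode == "local"
        # VDSM maintenance is deliberately left untouched.

    def shows_wrench(self) -> bool:
        # Assumption: the WEBUI wrench icon reflects VDSM maintenance only.
        return self.vdsm_maintenance
```

In this model, `he_set_maintenance("local")` alone never produces a wrench, and `he_set_maintenance("none")` after `host_to_maintenance()` clears the hosted-engine flag but leaves the wrench in place, which matches the wrench behavior reported in comment 7.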

Comment 9 Simone Tiraboschi 2016-07-26 14:47:02 UTC
(In reply to Ryan Barry from comment #8)
> (In reply to Nikolai Sednev from comment #7)
> > I see these functionalities are working with a bit of delay now, I've
> > probably looked at them and thought they would make changes on the fly, but
> > they're a bit delayed and also not functioning the same way at all.
> > 
> > 
> > In Cockpit "Virtualization->Hosted Engine->Put this host into local
> > maintenance", then after some time (less than a minute) host's status
> > returns to "Local maintenance: True" in CLI, "Local Maintenance: true" in
> > Cockpit, but not changes it's symbol active symbol in engine's WEBUI to
> > wrench symbol, although it's status shown as "Hosted Engine HA:Local
> > Maintenance Enabled".
> > 
> > If setting host via Cockpit into local maintenance via
> > "Virtualization->Virtual Machines->Host to Maintenance", then it changing
> > it's status everywhere properly, in Cockpit "Local Maintenance: true", in
> > CLI "Local maintenance: True" and in engine's WEBUI "Hosted Engine HA: Local
> > Maintenance Enabled" and also with a symbol of a wrench.
> 
> This is expected -- VDSM maintenance also sets hosted-engine maintenance.

We have an open bug about the same flow in the opposite direction:
https://bugzilla.redhat.com/show_bug.cgi?id=1353600

> > If then trying to activate the host back via Cockpit by
> > "Virtualization->Hosted Engine->Remove this host from maintenance", then
> > after some time (less than a minute) host's status returns to active in CLI,
> > Cockpit "Local Maintenance:false", but not being synchronized with engine's
> > WEBUI in which it stays in "Hosted Engine HA:Local Maintenance Enabled" with
> > a symbol of a wrench.
> 
> I think the question here is how engine polls/communicates with
> hosted-engine. I imagine that it connects to ovirt-ha-agent, but I don't
> know on what intervals.
> 
> Simone?

The engine simply talks with VDSM as usual; on HE hosts, VDSM also knows the hosted-engine HA status from the HA agent.

> > If setting host in to local maintenance from CLI, e.g. "hosted-engine
> > --set-maintenance --mode=local", then in CLI host's status shown correctly
> > as "Local maintenance: True", in Cockpit it's status also shown correctly as
> > "Local Maintenance: true", but in engine's WEBUI it's status partially
> > correct as it appears without wrench symbol, but in correct status of
> > "Hosted Engine HA:Local Maintenance Enabled". 
> 
> Is this partially correct? This seems entirely correct. If we expect engines
> WEBUI to show a wrench for hosted-engine maintenance (which does not set
> VDSM maintenance), a separate bug should be filed.
> 
> 
> > I see inconsistency of how host's status being shown in engine's WEBUI
> > between:
> > 1)From Cockpit "Virtualization->Virtual Machines->Host to
> > Maintenance"====>Shown in engine's WEBUI with symbol of a wrench and in
> > local maintenance.
> > 2)From Cockpit "Virtualization->Hosted Engine->Put this host into local
> > maintenance"===========>Shown in engine's WEBUI without symbol of a wrench,
> > but in local maintenance.
> 
> See above -- VDSM maintenance and hosted-engine maintenance are not the same.
> 
> > 
> > If step 1 was done, then in Cockpit "Virtualization->Hosted Engine->Remove
> > this host from maintenance", then engine's WEBUI not being synchronized with
> > changes these changes at all and host appears in engine's WEBUI in local
> > maintenance with a wrench symbol and "Hosted Engine HA: Local Maintenance
> > Enabled" status.
> 
> Is it possible to remove a host from VDSM maintenance through hosted-engine?
> I don't think so... "Hosted Engine->Remove this host from maintenance" calls
> "hosted-engine --set-maintenance --mode=none".
>   
> > CLI's "hosted-engine --set-maintenance --mode=local" equals to Cockpit's
> > "Virtualization->Hosted Engine->Put this host into local
> > maintenance"===========>Shown in engine's WEBUI without symbol of a wrench,
> > but in local maintenance.
> > 
> > CLI's "hosted-engine --set-maintenance --mode=local" is not the same as
> > Cockpit's "Virtualization->Virtual Machines->Host to Maintenance"====>Shown
> > in engine's WEBUI with symbol of a wrench and in local maintenance.
> 
> So, it's clear that the terminology here is confusing, since hosted-engine
> and VDSM both mean different things when they refer to "Maintenance". Any
> suggestions here?

I'd suggest calling it hosted-engine local maintenance; then we also have the hosted-engine global maintenance mode.

Comment 10 Fabian Deutsch 2016-08-24 09:38:38 UTC
Moving this to hosted-engine setup, as this is more about the right names for the maintenance modes.

The cockpit UI will follow the names that he-setup suggests, so once those names are updated, the cockpit UI will be updated to match.

Comment 11 Sandro Bonazzola 2016-09-16 14:06:34 UTC
(In reply to Fabian Deutsch from comment #10)
> Moving this to hosted-engine setup, as this is more about the right names
> for the maintenance modes.
> 
> The cockpit UI will follow the names the he-setup suggests, thus once those
> names are updated, the cockpit UI will follow.

Can you please split this bug into two, one on HE and one on oVirt Cockpit?

Comment 12 Sandro Bonazzola 2016-09-16 14:09:17 UTC
Moving to Roy/SLA: maintenance modes are SLA domain.

Comment 13 Doron Fediuck 2016-09-21 10:26:34 UTC
This is about semantics of maintenance.
Currently working as designed.

Comment 14 Yaniv Lavi 2016-11-29 12:36:38 UTC
What is the action item on docs?

Comment 15 Doron Fediuck 2016-11-30 14:09:42 UTC
(In reply to Yaniv Dary from comment #14)
> What is the action item on docs?

Properly describe each maintenance mode and the interactions between them.
The initial explanation is available in [1] and we should add standard host maintenance (unrelated to HE).

[1] https://www.ovirt.org/documentation/how-to/hosted-engine/#maintaining-the-setup

Comment 16 Yaniv Lavi 2016-12-14 16:22:45 UTC
This bug had requires_doc_text flag, yet no documentation text was provided. Please add the documentation text and only then set this flag.

Comment 17 Lucy Bopf 2017-09-06 00:22:32 UTC
Hi Nikolai,

To which versions of the product does this request apply? Can you please set the correct version in the 'Version' field?

Comment 18 Nikolai Sednev 2017-09-06 06:14:59 UTC
AFAIK it was 4.0.

Comment 24 Ido Rosenzwig 2018-06-25 11:44:17 UTC
Please also add how to switch back to None mode from Local and Global modes in the GUI.

