Bug 1517507 - [BUG TRACKER] - - Active/standby Amphoras- Amphora boot failure while creating LB should put the LB into error state
Keywords:
Status: POST
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-octavia
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 13.0 (Queens)
Assignee: Nir Magnezi
QA Contact: Alexander Stafeyev
URL:
Whiteboard:
Depends On:
Blocks: 1698576
Reported: 2017-11-26 10:15 UTC by Alexander Stafeyev
Modified: 2019-04-10 16:35 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:


Links
System ID Priority Status Summary Last Updated
OpenStack Storyboard 2001315 None None None 2017-12-22 22:25:53 UTC

Description Alexander Stafeyev 2017-11-26 10:15:54 UTC
When Octavia is configured with the "active/standby" topology and we create an LB, at least 2 amphoras are booted.
If 1 of those 2 amphoras fails to boot, we should get info/error messages in the Octavia logs, and a message should be printed that alerts us to the amphora failure.

Storyboard - https://storyboard.openstack.org/#!/story/2001315


* Octavia upstream is tracked in StoryBoard, not in Launchpad.

Comment 2 Nir Magnezi 2017-12-05 10:46:47 UTC
Hi Alex,

Currently (at least until we implement flavors), the loadbalancer topology is a system-wide configuration. Moreover, that configuration is not exposed to the end user in any way.

When a user creates a loadbalancer, he has no notion of what is happening behind the scenes. That user can only get an indication of whether or not his loadbalancer is operational by looking at the operating_status and provisioning_status.

It is important to take the above into account since:
1. The user can't actually see the amphoras, those reside on the admin tenant.
2. The user normally does not have access to the deployment logs.
3. The loadbalancer creation is an asynchronous process, thus we cannot block and wait for the outcome in order to print an error.
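
To illustrate point 3, here is a minimal sketch of how a client could follow an asynchronous creation by polling provisioning_status. It is not Octavia or client code: `wait_for_provisioning` and its `get_status` callable are hypothetical names, and the caller is assumed to supply a function that returns the loadbalancer's current provisioning_status string.

```python
import time

# For a create request, the status is assumed to settle on one of these values.
TERMINAL_STATES = {"ACTIVE", "ERROR"}


def wait_for_provisioning(get_status, timeout=300.0, interval=5.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns ACTIVE or ERROR.

    get_status is a hypothetical caller-supplied callable that returns the
    current provisioning_status of the loadbalancer as a string.
    Raises TimeoutError if no terminal state is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError("still %s after %.0f seconds" % (status, timeout))
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.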

Additionally, in the scenario you mentioned, in which there is no capacity to boot two amphoras, you expect a warning. I beg to differ. I would expect that at the end of the process the loadbalancer creation fails and the loadbalancer goes into an ERROR state (rather than being left working with a single amphora).
The idea is that the OpenStack deployment simply does not have enough resources to properly create the loadbalancer, so we cannot guarantee the SLA for it. A similar case is when you try to boot an instance with a large flavor and you simply don't have the capacity for it.
This is, by the way, in contrast to a highly available loadbalancer that was successfully created with two amphoras, where somewhere along the way one of the amphora instances died.
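
The expected outcome described above can be written down as a toy model. This is only a sketch of the rule being proposed, not the Octavia controller logic; `expected_provisioning_status` is a hypothetical helper, and the required amphora counts per topology are assumptions taken from this discussion.

```python
# Number of amphoras that must boot for creation to count as successful
# (assumed for this sketch: one for SINGLE, two for ACTIVE_STANDBY).
REQUIRED_AMPHORAS = {"SINGLE": 1, "ACTIVE_STANDBY": 2}


def expected_provisioning_status(topology, amphora_boot_ok):
    """Return the provisioning_status expected at the end of creation.

    amphora_boot_ok is a list of booleans, one per amphora boot attempt.
    """
    required = REQUIRED_AMPHORAS[topology]
    booted = sum(1 for ok in amphora_boot_ok if ok)
    return "ACTIVE" if booted >= required else "ERROR"
```

Under this rule, an ACTIVE_STANDBY loadbalancer with one failed boot ends in ERROR instead of running on a single amphora.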

P.S.
I do think we warn the operator when we fail to boot active/standby amphoras: https://github.com/openstack/octavia/blob/7bf8804177d3b7a9a4384c2b6d349228ecdced23/octavia/controller/worker/tasks/compute_tasks.py#L228-L232

Thoughts?

Comment 3 Alexander Stafeyev 2017-12-10 08:57:55 UTC
(In reply to Nir Magnezi from comment #2)

Hi Nir, 
I agree with your logic. If the user expects HA and does not get it, it would be better to put the LB in ERROR state, as you mentioned.
I will edit the topic.

Comment 4 Nir Magnezi 2017-12-25 10:01:26 UTC
Since there is no development currently needed here, I'm setting this as TestOnly.

Comment 6 Nir Magnezi 2017-12-25 10:58:40 UTC
Since this is TestOnly, moving to POST.

Comment 11 Nir Magnezi 2018-08-30 20:30:48 UTC
Alex,

This is a TestOnly bug.
Are you planning to test it as a part of OSP13z or OSP14?
Del-Rel wants to know when this can be moved to a Modified state.

Thanks,
Nir

Comment 12 Alexander Stafeyev 2018-09-02 08:08:06 UTC
(In reply to Nir Magnezi from comment #11)

Hi Nir, 
I will test it in OSP14. Thanks

