Bug 1516415 - RHOS 10 (newton): overcloud deployment -> Galera unable to detect last known write sequence number
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: mariadb-galera
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Damien Ciabrini
QA Contact: Udi Shkalim
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-22 15:10 UTC by Francisco Javier Lopez Y Grueber
Modified: 2017-11-27 13:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-27 13:44:08 UTC


Attachments (Terms of Use)
Screenshot of the journal messages (deleted), 2017-11-22 15:10 UTC, Francisco Javier Lopez Y Grueber
SOSREPORT MYSQL,PACEMAKER,COROSYNC (deleted), 2017-11-22 15:48 UTC, Francisco Javier Lopez Y Grueber
output mysqld_safe --wsrep-recover (deleted), 2017-11-22 16:04 UTC, Francisco Javier Lopez Y Grueber
oc2 (deleted), 2017-11-22 16:19 UTC, Francisco Javier Lopez Y Grueber
oc0 (deleted), 2017-11-22 16:20 UTC, Francisco Javier Lopez Y Grueber
oc1 (deleted), 2017-11-22 16:24 UTC, Francisco Javier Lopez Y Grueber
hosts (deleted), 2017-11-22 16:54 UTC, Francisco Javier Lopez Y Grueber
oc2 (deleted), 2017-11-23 11:21 UTC, Francisco Javier Lopez Y Grueber
oc1 (deleted), 2017-11-23 11:21 UTC, Francisco Javier Lopez Y Grueber
oc0 (deleted), 2017-11-23 11:22 UTC, Francisco Javier Lopez Y Grueber
templates in use (deleted), 2017-11-23 11:23 UTC, Francisco Javier Lopez Y Grueber
Verification Undercloud Domain Settings (deleted), 2017-11-23 11:26 UTC, Francisco Javier Lopez Y Grueber


Links
Red Hat Bugzilla 1464114 (Priority: None, Status: None, Summary: None, Last Updated: 2017-11-23 13:05:54 UTC)
Red Hat Knowledge Base (Article) 2089051 (Priority: None, Status: None, Summary: None, Last Updated: 2017-11-23 13:04:26 UTC)

Description Francisco Javier Lopez Y Grueber 2017-11-22 15:10:44 UTC
Created attachment 1357598 [details]
Screenshot of the journal messages

Description of problem:

Overcloud deployment fails due to Galera cluster problems.

Main message: 

 Galera unable to detect last known write sequence number
~
crmd: Result of start operation of galera on ${node} (unknown error) 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Delete the stack and redeploy.
2. The error occurs after 1.5 hours.

Actual results:

Deployment fails as none of the OpenStack services can be installed properly.

Expected results:

Deployment succeeds

Additional info:

Comment 1 Fabio Massimo Di Nitto 2017-11-22 15:29:49 UTC
Please provide full sosreports.

Comment 2 Francisco Javier Lopez Y Grueber 2017-11-22 15:48:06 UTC
Created attachment 1357643 [details]
SOSREPORT MYSQL,PACEMAKER,COROSYNC

Comment 3 Francisco Javier Lopez Y Grueber 2017-11-22 16:04:41 UTC
Created attachment 1357648 [details]
output mysqld_safe --wsrep-recover
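
For readers without access to the deleted attachment: a recovery run of this kind is expected to log a line of the form `WSREP: Recovered position: <uuid>:<seqno>`, and the sequence number in that line is what the cluster tooling needs. The following is only a minimal sketch of pulling that value out of a log, not the resource agent's actual logic. The function name `last_recovered_position` and the sample log text (including its UUID) are hypothetical and not taken from this bug.

```python
import re
from typing import Optional, Tuple

# Expected shape of the log line written during a recovery run, e.g.
#   ... [Note] WSREP: Recovered position: <uuid>:<seqno>
RECOVERED_RE = re.compile(
    r"WSREP: Recovered position:\s*([0-9a-fA-F-]{36}):(-?\d+)"
)


def last_recovered_position(log_text: str) -> Optional[Tuple[str, int]]:
    """Return (uuid, seqno) from the last 'Recovered position' line, or None.

    Hypothetical helper for illustration only.
    """
    matches = RECOVERED_RE.findall(log_text)
    if not matches:
        # No such line: the sequence number cannot be determined from this log.
        return None
    uuid, seqno = matches[-1]
    return uuid, int(seqno)


# Made-up sample log, not the content of the attachment.
sample_log = (
    "171122 16:01:02 [Note] WSREP: Recovered position: "
    "6b2f1c3e-cf8a-11e7-9d2c-5254000a1b2c:1234\n"
)
position = last_recovered_position(sample_log)
```

If the function returns `None`, the log contains no recovered position, which would be consistent with the "unable to detect last known write sequence number" message.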

Comment 4 Francisco Javier Lopez Y Grueber 2017-11-22 16:19:15 UTC
Created attachment 1357662 [details]
oc2

Comment 5 Francisco Javier Lopez Y Grueber 2017-11-22 16:20:32 UTC
Created attachment 1357665 [details]
oc0

Comment 6 Francisco Javier Lopez Y Grueber 2017-11-22 16:24:44 UTC
Created attachment 1357666 [details]
oc1

Comment 7 Francisco Javier Lopez Y Grueber 2017-11-22 16:54:47 UTC
Created attachment 1357678 [details]
hosts

Comment 8 Michael Bayer 2017-11-22 20:43:13 UTC
We would need the complete sosreports uploaded via the customer portal (including ps commands, all installed rpms, etc.) so that we can pull them into collab-shell. We also need to see the full overcloud deploy command, as well as all configurations and heat templates used to create the stack. Additionally, please provide a directory listing of everything under /var/lib/mysql.

Comment 9 Francisco Javier Lopez Y Grueber 2017-11-23 11:21:05 UTC
Created attachment 1358142 [details]
oc2

Comment 10 Francisco Javier Lopez Y Grueber 2017-11-23 11:21:44 UTC
Created attachment 1358143 [details]
oc1

Comment 11 Francisco Javier Lopez Y Grueber 2017-11-23 11:22:23 UTC
Created attachment 1358144 [details]
oc0

Comment 12 Francisco Javier Lopez Y Grueber 2017-11-23 11:23:09 UTC
Created attachment 1358145 [details]
templates in use

Comment 13 Francisco Javier Lopez Y Grueber 2017-11-23 11:26:51 UTC
Created attachment 1358148 [details]
Verification Undercloud Domain Settings

domain relevant settings verified on undercloud

Comment 14 Damien Ciabrini 2017-11-23 12:51:15 UTC
Francisco, the last sosreports that you uploaded lack important files for the investigation; they only include galera/haproxy/cluster logs.

We need _all_ logs that sosreports can provide, e.g. running processes, network settings, etc. Could you get those uploaded?

Comment 15 Francisco Javier Lopez Y Grueber 2017-11-23 13:02:53 UTC
Hi, please see connected for full sosreports. These only contain rpm, yum, corosync, pacemaker and system.

Comment 18 Martin Schuppert 2017-11-23 13:58:10 UTC
Most likely the issue is that the CloudDomain does not match what is configured for the dhcp_domain in nova.conf of the undercloud.
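
A minimal sketch of how such a mismatch could be checked, assuming the overcloud environment file sets `CloudDomain` on a single line and that `dhcp_domain` lives in the `[DEFAULT]` section of the undercloud's nova.conf. The helper names and the sample values are hypothetical, and the `CloudDomain` lookup is a simple line match rather than a full YAML parse.

```python
import configparser
import re
from typing import Optional


def cloud_domain(env_yaml_text: str) -> Optional[str]:
    """Find a 'CloudDomain: <value>' line in a heat environment file.

    Simple line-based match (assumption); it does not parse YAML.
    """
    match = re.search(
        r"^\s*CloudDomain:\s*['\"]?([^'\"\s#]+)", env_yaml_text, re.MULTILINE
    )
    return match.group(1) if match else None


def nova_dhcp_domain(nova_conf_text: str) -> Optional[str]:
    """Read dhcp_domain from the [DEFAULT] section of nova.conf text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf_text)
    return parser.get("DEFAULT", "dhcp_domain", fallback=None)


def domains_match(env_yaml_text: str, nova_conf_text: str) -> bool:
    """True only if both values are present and identical."""
    overcloud = cloud_domain(env_yaml_text)
    undercloud = nova_dhcp_domain(nova_conf_text)
    return overcloud is not None and overcloud == undercloud


# Made-up sample inputs, not the customer's configuration.
env_text = "parameter_defaults:\n  CloudDomain: example.com\n"
nova_text = "[DEFAULT]\ndhcp_domain = localdomain\n"
mismatch = not domains_match(env_text, nova_text)
```

With the sample values above the two domains differ, which is the kind of misconfiguration suspected here.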

Comment 19 Damien Ciabrini 2017-11-27 13:44:08 UTC
Closing this particular bz: as per comment #18, it appears the deployment error was due to a misconfiguration.

