Bug 1367777 - For Self Hosted RHV Deployment changing data center and cluster name causes deployment to fail
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat Quickstart Cloud Installer
Classification: Red Hat
Component: Installation - RHEV
Version: 1.0
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: John Matthews
QA Contact: Tasos Papaioannou
Docs Contact: Dan Macpherson
URL:
Whiteboard:
Depends On:
Blocks: 1367897
 
Reported: 2016-08-17 12:52 UTC by James Olin Oden
Modified: 2018-02-26 19:58 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1367897
Environment:
Last Closed: 2018-02-26 19:58:09 UTC



Description James Olin Oden 2016-08-17 12:52:08 UTC
Description of problem:
I have hit this three times now (well, Fabian hit it once and I twice). What originally happened was that I was doing a RHV self-hosted deployment with four hosts, and I had changed the data center and cluster name to:

   0123456789112345678921234567893123456789

That name is exactly 40 characters long, the maximum length of a data center or cluster name. As it was deploying the first host for the engine to run on, the deployment died with the following error:

===== Puppet run for the host hyperviso14.b.b status reported as Error ======

On the host that had the puppet failure, /var/log/messages had the following concerning puppet:

Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ ERROR ] Failed to execute stage 'Closing up': Specified cluster does not exist: 1111
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ INFO  ] Stage: Clean up
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160816211614.conf'
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ INFO  ] Stage: Pre-termination
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ INFO  ] Stage: Termination
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) [ ERROR ] Hosted Engine deployment failed: this system is not reliable, please check the issue,fix and redeploy
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns)           Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160816204743-iylgq9.log
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: hosted-engine --deploy --config-append=/etc/qci/answers returned 1 instead of one of [0]
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Setup/Exec[hosted-engine-setup]/returns) change from notrun to 0 failed: hosted-engine --deploy --config-append=/etc/qci/answers returned 1 instead of one of [0]
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[oVirt Configuration stage- Done]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[oVirt Configuration stage- Done]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[Datacenter is not in upstatus, going over configuration]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[Datacenter is not in upstatus, going over configuration]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/File[/etc/qci/engine-DC-config.py]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/File[/etc/qci/engine-DC-config.py]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Exec[engine_dc_config]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Exec[engine_dc_config]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[oVirt Configuration stage- Starting]) Dependency Exec[hosted-engine-setup] has failures: true
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: (/Stage[main]/Ovirt::Self_hosted::Config/Notify[oVirt Configuration stage- Starting]) Skipping because of failed dependencies
Aug 16 21:16:15 hypervisor14 puppet-agent[3811]: Finished catalog run in 2140.07 seconds
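
The failing resource is just Puppet shelling out to hosted-engine and accepting only a zero exit status. As a minimal sketch of what Exec[hosted-engine-setup] effectively does (the command line and the /etc/qci/answers path are verbatim from the log; the Python wrapper itself is illustrative, not actual QCI code):

    # Illustrative sketch only: mirrors what Puppet's
    # Exec[hosted-engine-setup] runs, per the log above.
    import subprocess

    def run_hosted_engine_setup(answers="/etc/qci/answers"):
        rc = subprocess.call(["hosted-engine", "--deploy",
                              "--config-append=" + answers])
        # Puppet accepts only exit status 0, which is why any failure
        # surfaces as "returned 1 instead of one of [0]" above.
        if rc != 0:
            raise RuntimeError("hosted-engine --deploy returned %d "
                               "instead of one of [0]" % rc)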

When you look in the log pointed to by the error above, /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160817004633-g16j41.log, you find this:

Aug 17 00:45:25 hypervisor14.b.b vdsm[13818]: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Failed to connect to broker, the number of errors has exceeded the limit (1)
Aug 17 00:45:25 hypervisor14.b.b vdsm[13818]: vdsm root ERROR failed to retrieve Hosted Engine HA info
                                              Traceback (most recent call last):
                                                File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 232, in _getHaInfo
                                                  stats = instance.get_all_stats()
                                                File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
                                                  with broker.connection(self._retries, self._wait):
                                                File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
                                                  return self.gen.next()
                                                File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
                                                  self.connect(retries, wait)
                                                File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
                                                  raise BrokerConnectionError(error_msg)
                                              BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)

This error is listed several times in that log.
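
The shape of that failure matches the traceback: the brokerlink client retries the broker connection and gives up once the error count passes a limit. A rough sketch of that retry pattern (class and function names only mimic the messages above; this is not the real ovirt_hosted_engine_ha/lib/brokerlink.py code):

    # Rough, assumed sketch of the retry pattern in the traceback above;
    # not the actual ovirt_hosted_engine_ha implementation.
    import socket
    import time

    class BrokerConnectionError(Exception):
        pass

    def connect(address, retries=1, wait=5):
        errors = 0
        while True:
            try:
                return socket.create_connection(address)
            except socket.error:
                errors += 1
                if errors > retries:
                    raise BrokerConnectionError(
                        "Failed to connect to broker, the number of "
                        "errors has exceeded the limit (%d)" % retries)
                time.sleep(wait)

With retries=1 this raises after the second failed attempt, matching the "(1)" in the log above.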

Version-Release number of selected component (if applicable):
QCI-1.0-RHEL-7-20160815.t.0

How reproducible:
every time

Steps to Reproduce:
1. Start a RHV self-hosted deployment.
2. Change the data center and cluster names.
3. Continue with the deployment.

Actual results:
The deployment fails with the puppet error above, which seems to be due to vdsm not running.

Expected results:
No errors.
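
For what it is worth, the name used here is exactly at the documented 40-character maximum, so a simple pre-flight length check would accept it; that is what makes this a bug rather than user error. A hypothetical check for illustration (the 40-character limit is from the description above; the character-set rule is an assumption, not confirmed RHV validation):

    # Hypothetical pre-flight check, for illustration only.
    import re

    MAX_NAME_LEN = 40                          # limit stated in this report
    NAME_RE = re.compile(r'^[A-Za-z0-9_-]+$')  # assumed character set

    def check_rhv_name(name):
        if len(name) > MAX_NAME_LEN:
            raise ValueError("name longer than %d chars: %r"
                             % (MAX_NAME_LEN, name))
        if not NAME_RE.match(name):
            raise ValueError("name has unsupported characters: %r" % name)

    # The name from this report is exactly 40 characters: it passes the
    # documented limit yet still broke the deployment.
    check_rhv_name("0123456789112345678921234567893123456789")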

Comment 2 John Matthews 2016-08-17 17:59:14 UTC
This is outside the scope of GA; for GA we will disable the ability to configure the data center/cluster name with self-hosted.

Comment 3 Fabian von Feilitzsch 2016-08-17 19:05:29 UTC
Disabled datacenter/cluster configuration for self-hosted: https://github.com/fusor/fusor/pull/1165

Comment 4 Tasos Papaioannou 2016-08-23 13:55:53 UTC
Verified on QCI-1.0-RHEL-7-20160819.t.0.

