Bug 1356777 - rhel-osp-director: scale down of computes fails after upgrade 8.0->9.0, some resources are unmanaged/stopped on controllers.
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: ga
Target Release: 9.0 (Mitaka)
Assignee: Ben Nemec
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-07-15 03:19 UTC by Alexander Chuzhoy
Modified: 2016-09-15 21:18 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-02 13:19:24 UTC



Description Alexander Chuzhoy 2016-07-15 03:19:07 UTC
rhel-osp-director:  scale down of computes fails after upgrade 8.0->9.0


Environment:
openstack-tripleo-heat-templates-2.0.0-15.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-15.el7ost.noarch
openstack-tripleo-heat-templates-kilo-2.0.0-15.el7ost.noarch
instack-undercloud-4.0.0-7.el7ost.noarch
openstack-puppet-modules-8.1.2-1.el7ost.noarch


Steps to reproduce:
1. Deploy 8.0 with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --ceph-storage-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml

2. Upgrade to 9.0 (including updating the images for the overcloud nodes).



3. Try to scale down the computes with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --ceph-storage-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml
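
The two deploy commands in steps 1 and 3 are long, and it is easy to miss that they are meant to differ only in --compute-scale. A minimal Python sketch for checking that is shown here. The helper names parse_options and diff_options are hypothetical and not part of any OpenStack tooling, and the sample commands are shortened copies of the ones above.

```python
import shlex


def parse_options(command):
    """Map each option in a command line to the list of values it was given.

    A bare flag (no following value) is recorded as True. Repeated options
    such as -e are collected in order.
    """
    tokens = shlex.split(command)
    opts = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("-"):
            # The next token is this option's value unless it is itself an option.
            if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
                value = tokens[i + 1]
                i += 1
            else:
                value = True
            opts.setdefault(tok, []).append(value)
        i += 1
    return opts


def diff_options(before, after):
    """Return {option: (values_before, values_after)} for options that differ."""
    a, b = parse_options(before), parse_options(after)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Shortened versions of the commands from steps 1 and 3.
DEPLOY_8 = ("openstack overcloud deploy --templates --control-scale 3 "
            "--compute-scale 2 --ceph-storage-scale 3 --timeout 90 "
            "-e network-environment.yaml")
SCALE_DOWN = ("openstack overcloud deploy --templates --control-scale 3 "
              "--compute-scale 1 --ceph-storage-scale 3 --timeout 90 "
              "-e network-environment.yaml")

print(diff_options(DEPLOY_8, SCALE_DOWN))
```

If the diff reports anything other than --compute-scale, the scale-down run is also changing some other deployment parameter, which would complicate the reproduction.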





Result:
2016-06-30 02:22:13 [overcloud-ComputeAllNodesValidationDeployment-rxtdysawxz72]: UPDATE_COMPLETE Stack UPDATE completed successfully                                                                   
2016-06-30 02:22:14 [ComputeAllNodesValidationDeployment]: UPDATE_COMPLETE state changed                                                                                                                
2016-06-30 02:53:15 [2]: SIGNAL_COMPLETE Unknown                                                                                                                                                        
2016-06-30 02:53:22 [1]: SIGNAL_COMPLETE Unknown                                                                                                                                                        
Stack overcloud UPDATE_FAILED                                                                                                                                                                           
Deployment failed:  Heat Stack update failed. 




pcs status outputs the following:
[root@overcloud-controller-0 ~]# pcs status  
Cluster name: tripleo_cluster                
Last updated: Fri Jul 15 03:18:01 2016          Last change: Fri Jul 15 01:50:34 2016 by root via cibadmin on overcloud-controller-0
Stack: corosync                                                                                                                     
Current DC: overcloud-controller-2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum                                      
3 nodes and 127 resources configured                                                                                                

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Full list of resources:

 ip-192.168.200.10      (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0 (unmanaged)
 ip-10.19.94.10 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1 (unmanaged)        
 ip-10.19.95.10 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2 (unmanaged)        
 Clone Set: haproxy-clone [haproxy] (unmanaged)                                                   
     haproxy    (systemd:haproxy):      Started overcloud-controller-0 (unmanaged)                
     haproxy    (systemd:haproxy):      Started overcloud-controller-2 (unmanaged)                
     haproxy    (systemd:haproxy):      Started overcloud-controller-1 (unmanaged)                
 ip-192.168.0.6 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0 (unmanaged)        
 Master/Slave Set: galera-master [galera] (unmanaged)                                             
     galera     (ocf::heartbeat:galera):        FAILED Master overcloud-controller-0 (unmanaged)  
     galera     (ocf::heartbeat:galera):        Started overcloud-controller-2 (unmanaged)        
     galera     (ocf::heartbeat:galera):        Started overcloud-controller-1 (unmanaged)        
 Clone Set: memcached-clone [memcached] (unmanaged)                                               
     memcached  (systemd:memcached):    Started overcloud-controller-0 (unmanaged)                
     memcached  (systemd:memcached):    Started overcloud-controller-2 (unmanaged)                
     memcached  (systemd:memcached):    Started overcloud-controller-1 (unmanaged)                
 ip-10.19.94.11 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1 (unmanaged)        
 ip-10.19.184.180       (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2 (unmanaged)
 Clone Set: rabbitmq-clone [rabbitmq] (unmanaged)                                                 
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-controller-0 (unmanaged)
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-controller-2 (unmanaged)
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-controller-1 (unmanaged)
 Clone Set: openstack-core-clone [openstack-core] (unmanaged)                                     
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]            
 Master/Slave Set: redis-master [redis] (unmanaged)                                               
     redis      (ocf::heartbeat:redis): Master overcloud-controller-0 (unmanaged)                 
     redis      (ocf::heartbeat:redis): Started overcloud-controller-2 (unmanaged)                
     redis      (ocf::heartbeat:redis): Started overcloud-controller-1 (unmanaged)                
 Clone Set: mongod-clone [mongod] (unmanaged)                                                     
     mongod     (systemd:mongod):       Started overcloud-controller-0 (unmanaged)                
     mongod     (systemd:mongod):       Started overcloud-controller-2 (unmanaged)                
     mongod     (systemd:mongod):       Started overcloud-controller-1 (unmanaged)                
 Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator] (unmanaged)                 
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]            
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler] (unmanaged)                 
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]            
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent] (unmanaged)                                 
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]            
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup] (unmanaged)                       
     neutron-netns-cleanup      (ocf::neutron:NetnsCleanup):    Started overcloud-controller-0 (unmanaged)
     neutron-netns-cleanup      (ocf::neutron:NetnsCleanup):    Started overcloud-controller-2 (unmanaged)
     neutron-netns-cleanup      (ocf::neutron:NetnsCleanup):    Started overcloud-controller-1 (unmanaged)
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup] (unmanaged)                                   
     neutron-ovs-cleanup        (ocf::neutron:OVSCleanup):      Started overcloud-controller-0 (unmanaged)
     neutron-ovs-cleanup        (ocf::neutron:OVSCleanup):      Started overcloud-controller-2 (unmanaged)
     neutron-ovs-cleanup        (ocf::neutron:OVSCleanup):      Started overcloud-controller-1 (unmanaged)
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Stopped (unmanaged)               
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine] (unmanaged)                               
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api] (unmanaged)                         
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener] (unmanaged)                           
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent] (unmanaged)                             
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd] (unmanaged)                       
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier] (unmanaged)                           
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-heat-api-clone [openstack-heat-api] (unmanaged)                                     
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector] (unmanaged)             
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-glance-api-clone [openstack-glance-api] (unmanaged)                                 
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler] (unmanaged)                     
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-nova-api-clone [openstack-nova-api] (unmanaged)                                     
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth] (unmanaged)                     
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-sahara-api-clone [openstack-sahara-api] (unmanaged)                                 
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch] (unmanaged)               
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine] (unmanaged)                           
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry] (unmanaged)                       
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd] (unmanaged)                         
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]                    
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: delay-clone [delay] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: httpd-clone [httpd] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-server-clone [neutron-server] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Failed Actions:
* galera_promote_0 on overcloud-controller-0 'unknown error' (1): call=35, status=complete, exitreason='Failed initial monitor action',
    last-rc-change='Thu Jul 14 23:09:32 2016', queued=0ms, exec=8811ms
* openstack-nova-scheduler_start_0 on overcloud-controller-0 'OCF_TIMEOUT' (198): call=102, status=Timed Out, exitreason='none',
    last-rc-change='Fri Jul 15 00:00:41 2016', queued=0ms, exec=199981ms
* openstack-nova-scheduler_start_0 on overcloud-controller-2 'OCF_TIMEOUT' (198): call=101, status=Timed Out, exitreason='none',
    last-rc-change='Fri Jul 15 00:00:41 2016', queued=0ms, exec=199993ms
* openstack-nova-scheduler_start_0 on overcloud-controller-1 'OCF_TIMEOUT' (198): call=102, status=Timed Out, exitreason='none',
    last-rc-change='Fri Jul 15 00:00:41 2016', queued=0ms, exec=199992ms


PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
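
The resource list above is long enough that the stopped clone sets are hard to pick out by eye. A rough Python sketch for summarizing saved `pcs status` text follows. It assumes the exact line layout shown in this report ("Clone Set: name [resource]" followed by "Stopped: [ nodes ]"), the function name stopped_sets is hypothetical, and the sample is a small excerpt of the output above. Standalone resources such as openstack-cinder-volume are not covered by this sketch.

```python
import re

# Header line of a clone or master/slave set, e.g.
# " Clone Set: httpd-clone [httpd] (unmanaged)"
SET_RE = re.compile(r"^\s*(?:Clone Set|Master/Slave Set): (\S+) \[(\S+)\]")
# Line listing the nodes on which the set is stopped.
STOPPED_RE = re.compile(r"^\s*Stopped: \[ (.*?) \]")


def stopped_sets(pcs_status_text):
    """Return {set_name: [nodes]} for every set that reports a 'Stopped:' line."""
    result = {}
    current = None
    for line in pcs_status_text.splitlines():
        match = SET_RE.match(line)
        if match:
            current = match.group(1)
            continue
        match = STOPPED_RE.match(line)
        if match and current:
            result[current] = match.group(1).split()
    return result


# Excerpt copied from the pcs status output in this report.
SAMPLE_STATUS = """\
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     haproxy    (systemd:haproxy):      Started overcloud-controller-0 (unmanaged)
 Clone Set: openstack-core-clone [openstack-core] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: httpd-clone [httpd] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
"""

for name, nodes in stopped_sets(SAMPLE_STATUS).items():
    print(f"{name}: stopped on {len(nodes)} node(s)")
```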

Comment 5 Ben Nemec 2016-07-22 22:25:57 UTC
Was this deployment in a sane state before the upgrade/scale attempt?  I'm seeing issues starting galera right away in the logs, which makes me think the initial deployment failed.  At that point I wouldn't expect anything else to work.
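
One way to approximate the "sane state" question before attempting an upgrade or scale operation is to check saved `pcs status` output for entries under "Failed Actions". The Python sketch below does only that. The names failed_actions and looks_sane are hypothetical, the regular expression assumes the line format seen in this report, and an empty Failed Actions list is an assumed stand-in for "sane", not the check that was actually performed against the logs.

```python
import re

# Matches the first line of a failed action entry, e.g.
# "* galera_promote_0 on overcloud-controller-0 'unknown error' (1): call=35, ..."
FAILED_RE = re.compile(
    r"^\* (?P<op>\S+) on (?P<node>\S+) '(?P<reason>[^']*)' \((?P<rc>\d+)\)"
)


def failed_actions(pcs_status_text):
    """Return (operation, node, reason, return_code) for each failed action line."""
    actions = []
    for line in pcs_status_text.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            actions.append(
                (match["op"], match["node"], match["reason"], int(match["rc"]))
            )
    return actions


def looks_sane(pcs_status_text):
    """Assumed definition of 'sane': pcs status reports no failed actions."""
    return not failed_actions(pcs_status_text)


# Two entries copied from the Failed Actions section of this report.
SAMPLE_FAILED = """\
Failed Actions:
* galera_promote_0 on overcloud-controller-0 'unknown error' (1): call=35, status=complete, exitreason='Failed initial monitor action',
    last-rc-change='Thu Jul 14 23:09:32 2016', queued=0ms, exec=8811ms
* openstack-nova-scheduler_start_0 on overcloud-controller-0 'OCF_TIMEOUT' (198): call=102, status=Timed Out, exitreason='none',
    last-rc-change='Fri Jul 15 00:00:41 2016', queued=0ms, exec=199981ms
"""

for op, node, reason, rc in failed_actions(SAMPLE_FAILED):
    print(f"{node}: {op} failed with {reason!r} (rc={rc})")
```

Run against the output in the description, this would flag the galera promote failure from Thu Jul 14 23:09:32, which predates the nova-scheduler start timeouts.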

Comment 9 Jay Dobies 2016-08-02 13:19:24 UTC
Based on Ben's comment and the inability to reproduce, closing this out. Please feel free to reopen if this issue arises again.

Comment 10 Alexander Chuzhoy 2016-09-15 21:18:09 UTC
Clearing the needinfo for now.

