Bug 1354002 - osp-director-8 GA: After new deployment of 8.0 GA, failed resource for 'my-stonith-xvm-controller'.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: async
Target Release: 8.0 (Liberty)
Assignee: Angus Thomas
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-07-08 16:44 UTC by mlammon
Modified: 2017-10-24 13:54 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-24 13:54:36 UTC
Target Upstream Version:



Description mlammon 2016-07-08 16:44:59 UTC
osp-director-8 GA: After new deployment of 8.0 GA, failed resource for 'my-stonith-xvm-controller'.

Environment:
openstack-heat-api-5.0.1-5.el7ost.noarch
openstack-heat-api-cloudwatch-5.0.1-5.el7ost.noarch
python-heatclient-1.0.0-1.el7ost.noarch
openstack-heat-common-5.0.1-5.el7ost.noarch
openstack-heat-engine-5.0.1-5.el7ost.noarch
openstack-heat-api-cfn-5.0.1-5.el7ost.noarch
pcs-0.9.143-15.el7.x86_64

Description:
After a successful deployment from the heat-list perspective, pcs status shows a failed action like:
my-stonith-xvm-controller0_start_0 on overcloud-controller-1 'unknown error' (1): call=332, status=Timed Out, exitreason='none',
    last-rc-change='Fri Jul  8 15:43:39 2016', queued=0ms, exec=21010ms
This is reproducible and has also been seen in 8.0 async deployments.

Workaround:
sudo pcs stonith cleanup my-stonith-xvm-controller0
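
A minimal sketch of how the workaround command could be derived from the "Failed Actions" section of pcs status, rather than typed by hand. It assumes the entry format quoted in this report and that every failed resource is a stonith device (true here, not in general). The names FAILED_ACTION, failed_actions and cleanup_commands are hypothetical helpers, not part of pcs.

```python
import re

# One "Failed Actions" entry as printed by `pcs status` in this report, e.g.
# * my-stonith-xvm-controller0_start_0 on overcloud-controller-1 'unknown error' (1): call=332, status=Timed Out, ...
FAILED_ACTION = re.compile(
    r"^\*\s+(?P<resource>\S+?)_(?P<op>[a-z]+)_(?P<interval>\d+)"
    r" on (?P<node>\S+) '(?P<error>[^']*)' \((?P<rc>\d+)\):"
    r" call=(?P<call>\d+), status=(?P<status>[^,]+),"
)


def failed_actions(pcs_status_text):
    """Return one dict per failed action found in `pcs status` output."""
    found = []
    for line in pcs_status_text.splitlines():
        match = FAILED_ACTION.match(line.strip())
        if match:
            found.append(match.groupdict())
    return found


def cleanup_commands(pcs_status_text):
    """Build one cleanup command per distinct failed resource.

    Assumption: the failed resources are stonith devices, as in this report.
    """
    names = []
    for action in failed_actions(pcs_status_text):
        if action["resource"] not in names:
            names.append(action["resource"])
    return ["sudo pcs stonith cleanup " + name for name in names]
```

The commands are only built as strings here, not executed, so the result can be reviewed before running anything against the cluster.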


Deploy command:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1   --neutron-network-type vxlan --neutron-tunnel-types vxlan  --ntp-server clock.redhat.com --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml --ceph-storage-scale 1

[heat-admin@overcloud-controller-1 ~]$ sudo pcs status
Cluster name: tripleo_cluster
Last updated: Fri Jul  8 15:52:57 2016          Last change: Fri Jul  8 15:43:50 2016 by root via cibadmin on overcloud-controller-2
Stack: corosync
Current DC: overcloud-controller-0 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
3 nodes and 115 resources configured

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Full list of resources:

 ip-192.0.2.6   (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 Clone Set: haproxy-clone [haproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 ip-192.168.200.180     (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
 ip-192.168.100.10      (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2
 ip-192.168.110.10      (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 ip-192.168.100.11      (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
 ip-192.168.120.10      (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2
 Master/Slave Set: redis-master [redis]
     Masters: [ overcloud-controller-1 ]
     Slaves: [ overcloud-controller-0 overcloud-controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: mongod-clone [mongod]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: memcached-clone [memcached]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-alarm-notifier-clone [openstack-ceilometer-alarm-notifier]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-clone [openstack-heat-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-keystone-clone [openstack-keystone]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: delay-clone [delay]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-server-clone [neutron-server]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: httpd-clone [httpd]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-alarm-evaluator-clone [openstack-ceilometer-alarm-evaluator]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started overcloud-controller-0
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 my-stonith-xvm-controller0     (stonith:fence_xvm):    Started overcloud-controller-2
 my-stonith-xvm-controller1     (stonith:fence_xvm):    Started overcloud-controller-1
 my-stonith-xvm-controller2     (stonith:fence_xvm):    Started overcloud-controller-0

Failed Actions:
* my-stonith-xvm-controller0_start_0 on overcloud-controller-1 'unknown error' (1): call=332, status=Timed Out, exitreason='none',
    last-rc-change='Fri Jul  8 15:43:39 2016', queued=0ms, exec=21010ms


PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled


[heat-admin@overcloud-controller-1 ~]$ grep my-stonith /var/log/messages
grep: /var/log/messages: Permission denied
[heat-admin@overcloud-controller-1 ~]$ sudo grep my-stonith /var/log/messages
Jul  8 11:43:39 localhost crmd[2156]:  notice: Operation my-stonith-xvm-controller0_monitor_0: not running (node=overcloud-controller-1, call=331, rc=7, cib-update=168, confirmed=true)
Jul  8 11:43:40 localhost stonith-ng[2152]:  notice: Added 'my-stonith-xvm-controller0' to the device list (1 active devices)
Jul  8 11:43:50 localhost stonith-ng[2152]:  notice: Added 'my-stonith-xvm-controller1' to the device list (2 active devices)
Jul  8 11:43:50 localhost stonith-ng[2152]:  notice: Added 'my-stonith-xvm-controller2' to the device list (3 active devices)
Jul  8 11:44:00 localhost stonith-ng[2152]:  notice: Operation 'monitor' [22228] for device 'my-stonith-xvm-controller0' returned: -62 (Timer expired)
Jul  8 11:44:01 localhost crmd[2156]:   error: Operation my-stonith-xvm-controller0_start_0: Timed Out (node=overcloud-controller-1, call=332, timeout=20000ms)
Jul  8 11:44:01 localhost crmd[2156]:  notice: Operation my-stonith-xvm-controller1_monitor_0: not running (node=overcloud-controller-1, call=336, rc=7, cib-update=171, confirmed=true)
Jul  8 11:44:01 localhost crmd[2156]:  notice: Operation my-stonith-xvm-controller2_monitor_0: not running (node=overcloud-controller-1, call=340, rc=7, cib-update=172, confirmed=true)
Jul  8 11:44:01 localhost crmd[2156]:  notice: Operation my-stonith-xvm-controller0_stop_0: ok (node=overcloud-controller-1, call=341, rc=0, cib-update=173, confirmed=true)
Jul  8 11:44:02 localhost crmd[2156]:  notice: Operation my-stonith-xvm-controller1_start_0: ok (node=overcloud-controller-1, call=342, rc=0, cib-update=174, confirmed=true)
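
The crmd error above reports timeout=20000ms for the start operation (this appears to be the 20s default operation timeout), while the failed action in pcs status reports exec=21010ms. A minimal sketch of that comparison, assuming the two line formats shown in this report; the function name timeout_overrun_ms is hypothetical.

```python
import re

# crmd error line from /var/log/messages, e.g.
# ... crmd[2156]:   error: Operation my-stonith-xvm-controller0_start_0: Timed Out (node=..., call=332, timeout=20000ms)
CRMD_TIMEOUT = re.compile(
    r"Operation (?P<op>\S+): Timed Out \(node=(?P<node>[^,]+), "
    r"call=(?P<call>\d+), timeout=(?P<timeout_ms>\d+)ms\)"
)

# exec=<N>ms field from the "Failed Actions" entry in `pcs status`
FAILED_EXEC = re.compile(r"exec=(?P<exec_ms>\d+)ms")


def timeout_overrun_ms(crmd_line, failed_action_text):
    """Return exec time minus configured timeout in ms, or None if either field is missing."""
    timeout = CRMD_TIMEOUT.search(crmd_line)
    executed = FAILED_EXEC.search(failed_action_text)
    if timeout is None or executed is None:
        return None
    return int(executed.group("exec_ms")) - int(timeout.group("timeout_ms"))
```

With the values in this report, 21010 ms of execution against a 20000 ms timeout gives an overrun of 1010 ms, which is consistent with the start operation being abandoned right at the timeout rather than failing for another reason.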

Comment 3 mlammon 2017-03-20 19:25:13 UTC
It's a very minor issue, but a test today shows that it is still present.
Deploy 8.0 GA, check controller-0:

Failed Actions:
* my-stonith-xvm-controller0_start_0 on overcloud-controller-1 'unknown error' (1): call=332, status=Timed Out, exitreason='none',
    last-rc-change='Mon Mar 20 16:48:13 2017', queued=0ms, exec=21005ms


PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[heat-admin@overcloud-controller-0 ~]$ date
Mon Mar 20 18:49:24 UTC 2017

Comment 4 Chris Jones 2017-10-24 13:54:36 UTC
We're closing this because we believe it is not a bug: OSP 8 did not support deploying fencing, and fence_xvm implies a virtual deployment, which is only suitable for proof-of-concept deployments.

