Bug 1684419 - [ovn_cluster]master node can't be up after restart openvswitch
Summary: [ovn_cluster]master node can't be up after restart openvswitch
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.11
Version: FDP 19.01
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Dan Williams
QA Contact: haidong li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-01 08:49 UTC by haidong li
Modified: 2019-03-13 12:16 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Attachments

Description haidong li 2019-03-01 08:49:45 UTC
Description of problem:
This bug is similar to an OVS 2.9 bug on RHEL 7: bz1684363.

Version-Release number of selected component (if applicable):
[root@hp-dl388g8-02 ovn_ha]# uname -a
Linux hp-dl388g8-02.rhts.eng.pek2.redhat.com 4.18.0-64.el8.x86_64 #1 SMP Wed Jan 23 20:50:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@hp-dl388g8-02 ovn_ha]# rpm -qa | grep openvswitch
openvswitch2.11-ovn-common-2.11.0-0.20190129gitd3a10db.el8fdb.x86_64
kernel-kernel-networking-openvswitch-ovn_ha-1.0-30.noarch
openvswitch2.11-ovn-central-2.11.0-0.20190129gitd3a10db.el8fdb.x86_64
openvswitch-selinux-extra-policy-1.0-10.el8fdp.noarch
openvswitch2.11-ovn-host-2.11.0-0.20190129gitd3a10db.el8fdb.x86_64
openvswitch2.11-2.11.0-0.20190129gitd3a10db.el8fdb.x86_64

How reproducible:
Every time.

Steps to Reproduce:
1. Set up a cluster with 3 nodes as ovndb_servers (see the setup sketch below).
2. Restart openvswitch on the master node.
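For reference, a minimal sketch of how such a pacemaker-managed setup is typically created. The resource names follow the pcs status output below; the ovndb-servers parameters (master_ip, manage_northd) and the constraint role keyword are assumptions about a typical OVN HA configuration, not taken from this report:

# Sketch only -- parameter names are assumptions, not the reporter's exact commands
pcs resource create ip-70.0.0.50 ocf:heartbeat:IPaddr2 ip=70.0.0.50
pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
    master_ip=70.0.0.50 manage_northd=yes promotable
# keep the virtual IP on the node currently holding the master role
# (role keyword may vary with the pcs version)
pcs constraint colocation add ip-70.0.0.50 with master ovndb_servers-clone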

[root@hp-dl388g8-02 ovn_ha]# pcs status
Cluster name: my_cluster
Stack: corosync
Current DC: 70.0.0.2 (version 2.0.1-3.el8-0eb7991564) - partition with quorum
Last updated: Fri Mar  1 03:40:50 2019
Last change: Fri Mar  1 03:33:33 2019 by root via crm_attribute on 70.0.0.2

3 nodes configured
4 resources configured

Online: [ 70.0.0.2 70.0.0.12 70.0.0.20 ]

Full list of resources:

 ip-70.0.0.50	(ocf::heartbeat:IPaddr2):	Started 70.0.0.2
 Clone Set: ovndb_servers-clone [ovndb_servers] (promotable)
     Masters: [ 70.0.0.2 ]
     Slaves: [ 70.0.0.12 70.0.0.20 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@hp-dl388g8-02 ovn_ha]# systemctl restart openvswitch
[root@hp-dl388g8-02 ovn_ha]# pcs status
Cluster name: my_cluster
Stack: corosync
Current DC: 70.0.0.2 (version 2.0.1-3.el8-0eb7991564) - partition with quorum
Last updated: Fri Mar  1 03:41:21 2019
Last change: Fri Mar  1 03:33:33 2019 by root via crm_attribute on 70.0.0.2

3 nodes configured
4 resources configured

Online: [ 70.0.0.2 70.0.0.12 70.0.0.20 ]

Full list of resources:

 ip-70.0.0.50	(ocf::heartbeat:IPaddr2):	Started 70.0.0.2
 Clone Set: ovndb_servers-clone [ovndb_servers] (promotable)
     Masters: [ 70.0.0.2 ]
     Slaves: [ 70.0.0.12 70.0.0.20 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@hp-dl388g8-02 ovn_ha]# pcs status
Cluster name: my_cluster
Stack: corosync
Current DC: 70.0.0.2 (version 2.0.1-3.el8-0eb7991564) - partition with quorum
Last updated: Fri Mar  1 03:41:25 2019
Last change: Fri Mar  1 03:33:33 2019 by root via crm_attribute on 70.0.0.2

3 nodes configured
4 resources configured

Online: [ 70.0.0.2 70.0.0.12 70.0.0.20 ]

Full list of resources:

 ip-70.0.0.50	(ocf::heartbeat:IPaddr2):	Started 70.0.0.2
 Clone Set: ovndb_servers-clone [ovndb_servers] (promotable)
     ovndb_servers	(ocf::ovn:ovndb-servers):	FAILED 70.0.0.2
     Slaves: [ 70.0.0.12 70.0.0.20 ]

Failed Resource Actions:
* ovndb_servers_demote_0 on 70.0.0.2 'not running' (7): call=17, status=complete, exitreason='',
    last-rc-change='Fri Mar  1 03:41:22 2019', queued=0ms, exec=97ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@hp-dl388g8-02 ovn_ha]# pcs status
Cluster name: my_cluster
Stack: corosync
Current DC: 70.0.0.2 (version 2.0.1-3.el8-0eb7991564) - partition with quorum
Last updated: Fri Mar  1 03:41:27 2019
Last change: Fri Mar  1 03:33:33 2019 by root via crm_attribute on 70.0.0.2

3 nodes configured
4 resources configured

Online: [ 70.0.0.2 70.0.0.12 70.0.0.20 ]

Full list of resources:

 ip-70.0.0.50	(ocf::heartbeat:IPaddr2):	Started 70.0.0.2
 Clone Set: ovndb_servers-clone [ovndb_servers] (promotable)
     ovndb_servers	(ocf::ovn:ovndb-servers):	FAILED 70.0.0.2
     Slaves: [ 70.0.0.12 70.0.0.20 ]

Failed Resource Actions:
* ovndb_servers_demote_0 on 70.0.0.2 'not running' (7): call=17, status=complete, exitreason='',
    last-rc-change='Fri Mar  1 03:41:22 2019', queued=0ms, exec=97ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@hp-dl388g8-02 ovn_ha]# 



Actual results:
The master node does not come back up after openvswitch is restarted; the ovndb_servers resource stays FAILED on that node.

Expected results:
The master node comes back up after openvswitch is restarted.
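A quick way to confirm recovery, reusing the pcs output format shown above, would be to check that the restarted node reappears under Masters:

pcs status | grep -A3 'Clone Set: ovndb_servers-clone'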

Additional info:

