Bug 1519575 - [RFE] Support for OVS and OVS-DPDK on the same compute node
Summary: [RFE] Support for OVS and OVS-DPDK on the same compute node
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Vijay Chundury
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Duplicates: 1538690
Depends On:
Blocks:
 
Reported: 2017-11-30 23:10 UTC by Don Weeks
Modified: 2019-04-10 14:46 UTC
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:



Description Don Weeks 2017-11-30 23:10:44 UTC
Description of problem:

Your competitors (Big Switch, 6wind) allow separate NIC interfaces on the same node to be used as DPDK and non-DPDK. Today in RHOSP 10, only one type can exist on a compute node: either DPDK or non-DPDK. This creates unnecessary zoning and prevents reuse of hardware. Most hardware can serve both DPDK and non-DPDK VMs, and zoning should be reserved for other concerns.


Version-Release number of selected component (if applicable): 10


How reproducible:
This is documented as a restriction 

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:

OVS and OVS+DPDK should be usable on a NIC by NIC basis


Additional info:

Comment 1 Sanjay Upadhyay 2017-12-11 08:37:47 UTC
AFAIK, 

RHOSP 10 supports role-based deployment, so we can create separate roles: one with OVS only and one with OVS-DPDK. That way the same NIC hardware can be utilized differently on different compute nodes. Check https://goo.gl/KmDH16
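As an illustration of the role-based approach (a sketch under assumptions: the `openstack overcloud roles generate` command and the built-in ComputeOvsDpdk role come from newer TripleO releases; on RHOSP 10 / Newton the extra role would be added to roles_data.yaml by hand):

~~~
# Generate a roles file containing both a plain-OVS compute role and an
# OVS-DPDK compute role, then deploy nodes into whichever role fits their NICs.
openstack overcloud roles generate -o ~/roles_data.yaml \
    Controller Compute ComputeOvsDpdk
~~~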

Please do expand on the use case if I have understood you wrong.

Comment 2 Don Weeks 2018-01-29 13:45:37 UTC
No, we mean being able to create an OVS bridge and an OVS+DPDK bridge on the same node. The link provided says there are DPDK nodes and non-DPDK nodes. This is not what we need and not what your competitors provide. We mean OVS+DPDK ports and non-DPDK ports on the same host, but on different NICs.

Comment 3 Don Weeks 2018-01-29 13:50:56 UTC
Example: I have 4 NIC cards: NIC1, NIC2, NIC3, NIC4. On NIC3 and NIC4, I create two bonded OVS+DPDK bridges. On NIC1, I create a Linux bond for OpenStack communication (API, Storage, Storage Mgmt). But on NIC2, I want an OVS bridge.
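For reference, this is roughly what that layout looks like at the plain-OVS level, outside of TripleO (a hedged sketch only: bridge names, interface names and PCI addresses are placeholders, and the dpdk-devargs syntax reflects newer OVS releases). The RFE is about being able to express such a mixed layout through the RHOSP nic-config templates:

~~~
# Desired per-node layout (names/addresses are placeholders):
#   NIC1: Linux bond for control-plane traffic (API, Storage, Storage Mgmt)
#   NIC2: kernel OVS bridge                      <-- the part this RFE asks for
#   NIC3/NIC4: bonded OVS-DPDK (netdev) bridge

# Kernel OVS bridge on NIC2 ("system" datapath is the default)
ovs-vsctl add-br br-tenant
ovs-vsctl add-port br-tenant eno2

# Userspace (DPDK) bridge with a DPDK bond across NIC3 and NIC4
ovs-vsctl add-br br-dpdk -- set Bridge br-dpdk datapath_type=netdev
ovs-vsctl add-bond br-dpdk dpdkbond0 dpdk0 dpdk1 \
    -- set Interface dpdk0 type=dpdk options:dpdk-devargs=0000:05:00.0 \
    -- set Interface dpdk1 type=dpdk options:dpdk-devargs=0000:05:00.1
~~~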

Comment 4 Don Weeks 2018-01-29 13:57:22 UTC
In the document you linked, see section 4.2.5, the second note. Your competitors do not have this restriction and removing that restriction is what is being requested by this RFE.

Comment 6 Franck Baudin 2018-02-08 13:58:47 UTC
Please open an RFE; this is not a bug. But I would like to make sure that I understand your request in detail, so I can check whether there is an alternative to an RFE that would fulfill it.

Do competitors have an upstream-based solution, or is it based on proprietary patches to Neutron and Nova? Because as long as all VMs are connected to a single br-int, br-int is either a kernel bridge or a userspace (DPDK) bridge.
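For illustration, the datapath type is a per-bridge OVS attribute, and br-int gets exactly one of the two values (a minimal sketch; br-int is the bridge created by the ML2/OVS agent):

~~~
# "system" = kernel datapath, "netdev" = userspace/DPDK datapath
ovs-vsctl get Bridge br-int datapath_type
# Inspect all bridges on the node:
ovs-vsctl --columns=name,datapath_type list Bridge
~~~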

In https://bugzilla.redhat.com/show_bug.cgi?id=1519575#c3 you state:  "But on NIC2, I want an OVS bridge". Why do you want an OVS kernel bridge? To connect a VM on top of it? To use a function that is not available in OVS-DPDK?

Comment 8 Don Weeks 2018-02-09 16:15:23 UTC
It was my intention that this was an RFE and not a bug. 

It is a complete re-roll of OpenStack, so I am not as familiar with the internals. 6wind, for sure, is proprietary.

On your other question: yes, we want a VM on top of it. Most of our apps require both DPDK and non-DPDK ports in the same VM. Right now, we are having to use a non-DPDK port.

Comment 9 Don Weeks 2018-02-09 16:16:52 UTC
Right now, we are having to use a non-DPDK port on top of a DPDK port. This is not ideal.

Comment 10 Franck Baudin 2018-02-12 14:07:12 UTC
Hi Don,

(In reply to Don Weeks from comment #9)
> Right now, we are having to use a non DPDK port on top of a DPDK port. This
> is not ideal.

The main issue that I see is that OVS-DPDK doesn't support TSO/GSO.

If you feel that you really need an RFE, can you send us a detailed description of your use case so we have convincing arguments for the community?

Comment 13 Franck Baudin 2018-02-15 10:43:43 UTC
*** Bug 1538690 has been marked as a duplicate of this bug. ***

Comment 14 Franck Baudin 2018-03-21 09:48:41 UTC
If the only original ask is to be able to share a physical NIC between OVS-DPDK and the kernel, the following development will provide a solution: https://blueprints.launchpad.net/tripleo/+spec/sriov-vfs-as-network-interface

Is there any feature missing in the proposal above? Can I close this BZ and mark it as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1534587, which is tracking the development above?

Comment 15 Don Weeks 2018-03-21 14:26:56 UTC
We have no intention of partitioning our NICs with SR-IOV. Each NIC should be capable of being used as DPDK or non-DPDK while still participating in the OVS switch. BTW, we are now seeing that running non-DPDK guest interfaces over DPDK OVS bridges is causing us issues: the guest kernel does not pull packets off fast enough. To make up for not having this feature, we are asking to have the ring buffer size increased.

The use case is that typical gateway NFV VNFs have DPDK-accelerated NICs for media and non-DPDK NICs for signaling in the same guest. This is very common. Gateway applications are driven by signaling and process media based on that signaling. It is not practical for signaling to live in a separate VNF in this case: the VNF is the gateway app, and the app will use both signaling and media interfaces.

Comment 16 Franck Baudin 2018-03-23 10:14:12 UTC
Thanks for your quick answer Don!

Comment 17 Andreas Karis 2018-03-23 14:58:39 UTC
Hi Don,

~~~
BTW, we are now seeing that running non-DPDK guest interfaces over DPDK OVS bridges is causing us issues: the guest kernel does not pull packets off fast enough. To make up for not having this feature, we are asking to have the ring buffer size increased.
~~~
This is being worked on in a separate ticket and a separate Bugzilla; let's discuss it over there. The link is: https://bugzilla.redhat.com/show_bug.cgi?id=1512941

I shared the details in the support case, but here they are again: the current state is that this feature made it into QEMU and libvirt, but the Nova part is still missing: https://review.openstack.org/#/c/539605/
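For what it's worth, once the Nova side is available, the tuning would presumably end up as compute-node nova.conf settings along these lines (a sketch only: the [libvirt] rx_queue_size / tx_queue_size option names and the 1024 value are assumptions based on the existing QEMU/libvirt support, not a statement about what the linked review implements):

~~~
# On the compute node (256/512/1024 are the sizes libvirt accepts)
crudini --set /etc/nova/nova.conf libvirt rx_queue_size 1024
crudini --set /etc/nova/nova.conf libvirt tx_queue_size 1024
# then restart nova-compute (service or container, depending on the release)
~~~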

~~~
The use case is that typical gateway NFV VNFs have DPDK-accelerated NICs for media and non-DPDK NICs for signaling in the same guest. This is very common. Gateway applications are driven by signaling and process media based on that signaling. It is not practical for signaling to live in a separate VNF in this case: the VNF is the gateway app, and the app will use both signaling and media interfaces.
~~~
Just to clarify: for performance reasons, it is not possible to mix userspace and non-userspace datapaths. The current limitation is that Neutron's ML2 OVS plugin creates a common bridge (br-int) that connects to all OVS bridges on a given system. If one added bridges for the kernel datapath, they would therefore necessarily be connected to netdev (userspace) bridges, and packets would risk being context-switched several times, suffering additional processing time.

The current design only allows you to partition on a per-hypervisor basis.

~~~
We have no intention of partitioning our NICs with SR-IOV. Each NIC should be capable of being used as DPDK or non-DPDK while still participating in the OVS switch.
~~~
Each NIC (or, more granularly, each VF of a NIC) needs to be assigned to a specific datapath: kernel or userspace. Even if/when a feature lands in OSP where you can have the system datapath and the netdev datapath on the same hypervisor, you will need at a minimum:
1 pNIC for the netdev datapath, dedicated to all userspace bridges and instance vNICs.
1 pNIC for the kernel datapath, dedicated to all kernel bridges and instance vNICs.

In a design with redundancy, this makes for 5 NICs (if we assume that the hypervisor needs one for the mgmt interface, at least).

The above holds until https://blueprints.launchpad.net/tripleo/+spec/sriov-vfs-as-network-interface is implemented, which will allow you to use SR-IOV VFs as interfaces for the bridges (the same way you use physical NICs as members of OVS bridges today). This is why Franck was referring to this feature for partitioning NICs via SR-IOV.
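To illustrate what that blueprint would enable at the host level (a hypothetical sketch; the PF name is a placeholder and the VF netdev name varies per system and driver):

~~~
# Create two VFs on the physical function, then use them as ordinary bridge members
echo 2 > /sys/class/net/enp5s0f0/device/sriov_numvfs
ovs-vsctl add-br br-kernel
ovs-vsctl add-port br-kernel enp5s0f0v0      # first VF stays on the kernel datapath
# The second VF could instead be bound to vfio-pci and added to a netdev (DPDK) bridge.
~~~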

If I interpret Franck's question correctly, he'd just like to know how important this feature is, in order to triage it appropriately. The logic being that the use case of your application / NFV might (currently) be more easily and better addressed by a technology such as SR-IOV, where you won't have to consider whether an interface is used for userspace-datapath-accelerated or kernel-space (offloading features) use cases.

Another point to consider: if it were possible to assign one interface from the kernel-space datapath to the instance and another from the userspace datapath, would your instance and application be able to deal with two different routes for the two different tasks within the instance?

For the above reasons, it will definitely never be possible to mix both datapaths on the same vNIC. I think that in the end, we are having this discussion because OVS's DPDK datapath is currently not at feature parity with OVS's kernel datapath. This comes as no surprise: all features need to be re-implemented in userspace, so some features may land in the userspace implementation very late or never. In a design where you assume your instances must handle loads that would simultaneously benefit from the advantages and suffer from the disadvantages of both the OVS kernel datapath and the OVS netdev datapath, you should consider a technology such as SR-IOV. SR-IOV won't force you to make a choice, as it passes the VF directly through to your instances.

By the way, SR-IOV may have very few to no disadvantages compared to OVS DPDK in scenarios where very few instances reside on a given hypervisor and where (live-)migration is not an issue (with that said, live-migration doesn't work with OVS DPDK at the moment, either).

Comment 18 Franck Baudin 2018-09-07 16:06:34 UTC
Requires Neutron and Nova enhancements, and an end-to-end design.

Comment 25 Don Weeks 2019-02-14 15:08:54 UTC
Franck, Andreas,

To address SR-IOV as an alternative: we are using 8 transmit/receive queues per interface. This is not possible with the current batch of SR-IOV interfaces, whose driver code limits SR-IOV to 2 TX/RX queue pairs. That caps the throughput of SR-IOV too much for us to achieve the throughput we get with DPDK. To achieve the same throughput, we would exceed the third-party guest application's limits.
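For context, the queue counts in question can be checked from inside the guest (a sketch; eth1 is a placeholder for the workload interface):

~~~
ethtool -l eth1              # show maximum and currently configured channel counts
ethtool -L eth1 combined 8   # raise the queue-pair count where the driver allows it
~~~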

