Bug 1043808 - For an interface with multiple VLAN interfaces, rhev Host assigns highest mtu of a vlan interface to all vlan interface under the parent interface .
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Alona Kaplan
QA Contact: Meni Yakove
URL:
Whiteboard: network
Depends On:
Blocks: rhev3.5beta 1156165
Reported: 2013-12-17 09:24 UTC by Jaison Raju
Modified: 2018-12-06 15:35 UTC (History)

Fixed In Version: vt1.3
Doc Type: Bug Fix
Doc Text:
Previously, for a host interface that had multiple VLAN interfaces, the highest MTU configured on any of them was assigned to all VLAN interfaces under that interface, which could cause the host to enter a non-responsive state. This bug fix moves the host-level default MTU to the engine side, so a default value is in place if the MTU is not manually set. You can set the default MTU by setting the 'DefaultMTU' property using the engine-config tool. The default host-level MTU must be the same as the data center-level MTU; otherwise the network is considered out of synchronization. After upgrading to Red Hat Enterprise Virtualization 3.5, if the host-level and data center-level MTUs are not the same, the network will be out of synchronization.
Clone Of:
Environment:
Last Closed: 2015-02-11 17:56:52 UTC
oVirt Team: Network
Target Upstream Version:
nyechiel: Triaged+


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2015:0158 normal SHIPPED_LIVE Important: Red Hat Enterprise Virtualization Manager 3.5.0 2015-02-11 22:38:50 UTC

Description Jaison Raju 2013-12-17 09:24:57 UTC
Description of problem:
For an interface with multiple VLAN interfaces, the RHEV host assigns the highest MTU of any VLAN interface to all VLAN interfaces under that interface.

Version-Release number of selected component (if applicable):
* RHEV 3.2

How reproducible:


Steps to Reproduce:
1. Create multiple logical interfaces on VLANs, and make sure at least one VLAN has an MTU greater than 1500 (say, 9000).
2. Add the host to the cluster and confirm the network configuration on the RHEV host.


Actual results:
ifconfig shows that all VLAN interfaces grouped under the same interface have
an MTU of 9000.
The ifcfg files show that the VLAN interface configured with the higher MTU has
its MTU set, and that the parent interface (such as eth1) also has its MTU set.

The issue is that when the other VLANs are not configured to use an MTU of 9000,
this ends in network problems and the host going non-responsive or non-operational.


Expected results:


Additional info:

I could confirm the following behaviour, which causes this issue.

i. By default, the MTU of an interface is 1500. To enable any MTU above this value, an 'MTU=' line must be present in its ifcfg file
( /etc/sysconfig/network-scripts/ifcfg-* ).
ii. A VLAN interface cannot be assigned a higher MTU than the parent interface on which it sits.
    (Hence, by default, rhevm makes sure that the em1 interface gets the highest MTU among all the VLAN interfaces it creates on it.)
iii. For a logical network in rhevm, if no MTU is specified, rhevm does not add an 'MTU=' line to the ifcfg configuration files.
iv. If an MTU is not specified in the ifcfg configuration of a VLAN, the network scripts assign it the MTU of the underlying interface.

All of the above behaviour is expected from a Linux networking perspective.
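
Points i. to iv. can be condensed into a small model. This is only a sketch of the reasoning, not RHEV-M, Vdsm, or network-scripts code: the function name effective_mtus, the constant KERNEL_DEFAULT_MTU, and the device names are hypothetical, and the model assumes the parent interface never drops below the 1500 default.

```python
KERNEL_DEFAULT_MTU = 1500  # point i: MTU when the ifcfg file has no MTU= line


def effective_mtus(vlan_mtus):
    """Model the MTU each VLAN ends up with on a shared parent interface.

    vlan_mtus maps a VLAN device name to its configured MTU, or to None
    when its ifcfg file carries no MTU= line (point iii).
    Returns (parent_mtu, {vlan_name: effective_mtu}).
    """
    explicit = [mtu for mtu in vlan_mtus.values() if mtu is not None]
    # Point ii: the parent must be at least as large as its largest VLAN.
    parent_mtu = max(explicit + [KERNEL_DEFAULT_MTU])
    # Point iv: a VLAN without MTU= takes the MTU of the underlying interface.
    result = {
        name: (mtu if mtu is not None else parent_mtu)
        for name, mtu in vlan_mtus.items()
    }
    return parent_mtu, result


# Hypothetical host: one VLAN left at the default, one set to 9000.
parent, vlans = effective_mtus({"eth1.100": None, "eth1.200": 9000})
print(parent, vlans)
```

In this model the VLAN that was left at its default silently ends up with the jumbo MTU of its sibling, which is the behaviour reported here.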

The main issue here is that rhevm goes ahead and assigns an MTU of 9000 to interfaces without making the admin aware of it.
I propose that rhevm should show a warning message while setting up networks on a RHEV host if it is grouping multiple
logical networks on one interface and at least one of these networks has an MTU set while another does not.
The message should mention the logical networks assigned to the interface and the default MTU that will be set for networks for
which Override MTU is not selected.
Since modifying the Linux networking component for this behaviour may not be feasible, the above modification
would be the best option to make the admin aware of the implications of grouping VLANs with and without an MTU set.

Expected behaviour:
i. While grouping multiple VLANs on a RHEV host (where at least one network has an MTU set and one uses the default), OR
ii. while adding multiple logical networks to a cluster in a way that results in scenario i. above,
there should be a popup warning in the web admin GUI.
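
The condition for the proposed warning could be checked roughly as follows. This is a hedged sketch of the proposal only, not existing engine code: mixed_mtu_warning, DEFAULT_MTU, and the network names are hypothetical, and the message wording is made up for illustration.

```python
DEFAULT_MTU = 1500  # assumed host default when no MTU is configured


def mixed_mtu_warning(nic, network_mtus):
    """Return a warning when networks on one NIC mix set and unset MTUs.

    network_mtus maps a logical network name to its MTU, or to None when
    Override MTU was not selected. Returns None when no warning is needed.
    """
    explicit = [mtu for mtu in network_mtus.values() if mtu is not None]
    unset = sorted(name for name, mtu in network_mtus.items() if mtu is None)
    if not explicit or not unset:
        # All networks set an MTU, or none does: nothing changes silently.
        return None
    inherited = max(explicit + [DEFAULT_MTU])
    return (
        "Networks without an explicit MTU on {nic} ({names}) "
        "will use MTU {mtu}."
    ).format(nic=nic, names=", ".join(unset), mtu=inherited)


print(mixed_mtu_warning("eth1", {"storage": 9000, "display": None}))
```

The message names both the affected networks and the MTU they will receive, as requested above.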

Comment 2 Dan Kenigsberg 2014-01-05 11:21:04 UTC
Why is the reported issue a bug? What adverse effects does it have? Why is a warning needed/useful?

Comment 3 Dan Kenigsberg 2014-01-06 13:42:09 UTC
Answering myself, by reading comment 0 more carefully. If eth0.100 was defined with no MTU whatsoever, and eth0.200 has MTU=9000, eth0.100 would not see its former default of 1500, but the high MTU of eth0.

To avoid that, Vdsm could explicitly interpret "no MTU" as MTU=1500. That's quite easy to do, but we went a long distance in the original MTU support to avoid that: we wanted not to add Vdsm-specific default on top of the one in the kernel.

Personally, I do not mind over-riding the kernel default value: it is not going to change from its current 1500 too soon.

However, Engine can have its own default, and send it explicitly instead of sending nothing. I find this cleaner - if oVirt needs a default value for MTU, let us keep it at the top, where it is guaranteed to be shared by all clustered hosts or changed in a distant future.

Comment 4 Moti Asayag 2014-01-06 14:00:22 UTC
(In reply to Dan Kenigsberg from comment #3)
> Answering myself, by reading comment 0 more carefully. If eth0.100 was
> defined with no MTU whatsoever, and eth0.200 has MTU=9000, eth0.100 would
> not see its former default of 1500, but the high MTU of eth0.
> 
> To avoid that, Vdsm could explicitly interpret "no MTU" as MTU=1500. That's
> quite easy to do, but we went a long distance in the original MTU support to
> avoid that: we wanted not to add Vdsm-specific default on top of the one in
> the kernel.
> 
> Personally, I do not mind over-riding the kernel default value: it is not
> going to change from its current 1500 too soon.
> 
> However, Engine can have its own default, and send it explicitly instead of
> sending nothing. I find this cleaner - if oVirt needs a default value for
> MTU, let us keep it at the top, where it is guaranteed to be shared by all
> clustered hosts or changed in a distant future.

From the engine's point of view, the consensus in the internal discussions around the 'jumbo frame' feature was not to define any host default on the engine side, as it may vary between different OSs/HW.

If no MTU is configured on a specific network, its value will be presented to the user as "MTU: Host's default" in the network general sub-tab, stating that the engine doesn't guess the host-specific default MTU.

http://www.ovirt.org/Features/Design/Network/Jumbo_frames

Comment 5 Dan Kenigsberg 2014-01-06 15:51:01 UTC
Moti, the only question is whether our former resolution is worth the confusion to the customer. I see three options:

1. Keep things as they are. A customer that cares about his MTU should define it explicitly.
2. Keep things as they are, but add a warning when Engine expects that the meaning of "default mtu" is about to change for an existing network. That's what Jaison Raju has asked.
3.a. Eliminate the concept of "MTU: Host's default", and use an oVirt-level default.
3.b. implement the oVirt-level default on Vdsm side.

I like option 3.a. What is your opinion?

Comment 6 Dan Kenigsberg 2014-02-03 16:11:12 UTC
Until this issue is fixed, I can only ask the customer to set an explicit MTU value on his non-jumbo networks, too.

Option 3.b is as simple as the following patch, but there's some more dead code to be removed, due to handling of non-existing mtu. I am reluctant to implement it since it introduces an oVirt default MTU through the back door.

--- a/vdsm/configNetwork.py
+++ b/vdsm/configNetwork.py
@@ -234,6 +234,8 @@ def addNetwork(network, vlan=None, bonding=None, nics=None, ipaddr=None,
 
     if mtu:
         mtu = int(mtu)
+    else:
+        mtu = netinfo.DEFAULT_MTU
 
     if prefix:
         if netmask:
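
The effect of the patched branch can be tried in isolation. This is a standalone sketch, not Vdsm code: normalize_mtu is a hypothetical helper, and DEFAULT_MTU = 1500 is an assumption standing in for netinfo.DEFAULT_MTU, based on the 1500 default discussed in comment 3.

```python
DEFAULT_MTU = 1500  # assumption: stand-in for netinfo.DEFAULT_MTU


def normalize_mtu(mtu):
    """Mirror the patched branch: convert a supplied MTU, else fall back."""
    if mtu:
        return int(mtu)
    # Without the patch, an unset MTU stays unset and the VLAN takes the
    # MTU of its parent interface; with the patch it becomes explicit.
    return DEFAULT_MTU


print(normalize_mtu("9000"), normalize_mtu(None))
```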

Comment 7 Moti Asayag 2014-02-04 14:19:16 UTC
(In reply to Dan Kenigsberg from comment #5)
> Moti, the only question is whether our former resolution is worth the
> confusion to the customer. I see three options:
> 
> 1. Keep things as they are. A customer that cares about his MTU should
> define it explicitly.
> 2. Keep things as they are, but add a warning when Engine expects that the
> meaning of "default mtu" is about to change for an existing network. That's
> what Jaison Raju has asked.
> 3.a. Eliminate the concept of "MTU: Host's default", and use an oVirt-level
> default.
> 3.b. implement the oVirt-level default on Vdsm side.
> 
> I like option 3.a. What is your opinion?

I'm in favor of 3.a as well.

Note that as part of implementing 3.a, we should include the following:
1. An upgrade script to update the network's mtu column (in network table) to 1500. This will simplify the 'sync network' check.
2. Provide the user the ability to define its desired default MTU (1500 will be the system default).

This will reflect the actual configured MTU when adding/updating a network via the webadmin/rest.
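
The two items above could fit together roughly as follows. This is a sketch under stated assumptions, not the engine's upgrade script or sync logic: both function names are hypothetical, and an unset MTU is modelled here as None or 0 because the stored representation is not given in this bug.

```python
DEFAULT_MTU = 1500  # system default from item 2; meant to be user-configurable


def upgrade_network_mtu(stored_mtu, default_mtu=DEFAULT_MTU):
    """Item 1: replace an unset MTU (modelled as None or 0) with the default."""
    return stored_mtu if stored_mtu else default_mtu


def is_mtu_in_sync(dc_mtu, host_mtu):
    """With a concrete value on both sides, the check is a plain comparison."""
    return dc_mtu == host_mtu


# A network that had no MTU before the upgrade, on a host whose VLAN took
# an MTU of 9000 from its parent interface.
dc_mtu = upgrade_network_mtu(None)
print(dc_mtu, is_mtu_in_sync(dc_mtu, 9000))
```

Under these assumptions, a network that previously relied on the host's default can show up as out of sync once the engine stores a concrete value for it.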

Comment 8 Meni Yakove 2014-07-17 06:50:54 UTC
ovirt-engine-3.5.0-0.0.master.20140715172116.git4687dc1.el6.noarch
vdsm-4.16.0-27.git00146ed.el6.x86_64

Comment 9 Lior Vernia 2014-11-30 09:48:27 UTC
Alona, please properly document the behavior - including the possible/probable marking of networks as out-of-sync when upgrading the engine to 3.5.

Comment 10 Julie 2014-12-01 06:13:17 UTC
Hi Alona,
   I've edited the doc text but I'm not sure if I have understood it correctly. Can you please have a look and let me know if anything needs to be changed?

Kind regards,
Julie

Comment 11 Alona Kaplan 2014-12-01 06:42:21 UTC
Hi Julie,
I would also mention that because of this fix, upgrade to 3.5 engine can cause networks with default mtu to be marked as out-of-sync. (As I explained in the previous doc-text).

Comment 12 Julie 2014-12-01 07:04:11 UTC
(In reply to Alona Kaplan from comment #11)
> Hi Julie,
> I would also mention that because of this fix, upgrade to 3.5 engine can
> cause networks with default mtu to be marked as out-of-sync. (As I explained
> in the previous doc-text).

hi Alona,
 After upgrading to 3.5, networks will get out of sync for this reason: 'Also note that the default host level MTU must be the same as the data center level MTU, otherwise the network is considered out of synchronization.'
Is this not enough to cover this message?

Do you also want to suggest here how to fix the out of sync issue after upgrade?

BTW, what's your IRC nick? probably easier if I just ping you.

Comment 13 Alona Kaplan 2014-12-01 07:24:43 UTC
(In reply to Julie from comment #12)
> (In reply to Alona Kaplan from comment #11)
> > Hi Julie,
> > I would also mention that because of this fix, upgrade to 3.5 engine can
> > cause networks with default mtu to be marked as out-of-sync. (As I explained
> > in the previous doc-text).
> 
> hi Alona,
>  Upgrading to 3.5 will get out of sync because of this reason: 'Also note
> that the default host level MTU must be the same as the data center level
> MTU, otherwise the network is considered out of synchronization.'
> Is this not enough to cover this message?
> 

It is the reason. But I still think it is worth mentioning the upgrade issue so that the user is not surprised that some of his networks are out-of-sync after the upgrade.
Please also mention that setting the value of the default mtu is done by setting 'DefaultMTU' using ovirt-engine-config.

> Do you also want to suggest here how to fix the out of sync issue after
> upgrade?

The fix is just to mark the network as 'has to be synced' via the setup networks. It is trivial; I don't think it is worth mentioning.

> 
> BTW, what's your IRC nick? probably easier if I just ping you.

alkaplan, I'm available on ovirt channel.

Comment 14 Alona Kaplan 2014-12-01 07:28:39 UTC
(In reply to Alona Kaplan from comment #13)
> (In reply to Julie from comment #12)
> > (In reply to Alona Kaplan from comment #11)
> > > Hi Julie,
> > > I would also mention that because of this fix, upgrade to 3.5 engine can
> > > cause networks with default mtu to be marked as out-of-sync. (As I explained
> > > in the previous doc-text).
> > 
> > hi Alona,
> >  Upgrading to 3.5 will get out of sync because of this reason: 'Also note
> > that the default host level MTU must be the same as the data center level
> > MTU, otherwise the network is considered out of synchronization.'
> > Is this not enough to cover this message?
> > 
> 
> It is the reason. But I still think it is worth mentioning the upgrade issue so
> that the user is not surprised that some of his networks are out-of-sync after
> the upgrade.
> Please also mention that setting the value of the default mtu is done by
> setting 'DefaultMTU' using ovirt-engine-config.

engine-config -s DefaultMTU=1500

> 
> > Do you also want to suggest here how to fix the out of sync issue after
> > upgrade?
> 
> The fix is just to mark the network as 'has to be synced' via the setup
> networks. It is trivial; I don't think it is worth mentioning.
> 
> > 
> > BTW, what's your IRC nick? probably easier if I just ping you.
> 
> alkaplan, I'm available on ovirt channel.

Comment 16 errata-xmlrpc 2015-02-11 17:56:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0158.html

