Bug 1051297 - setupNetworks: nic with dhcp cannot be bonded
Summary: setupNetworks: nic with dhcp cannot be bonded
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.4.0
Assignee: Lior Vernia
QA Contact: Meni Yakove
URL:
Whiteboard: network
Depends On:
Blocks: rhev3.4beta 1082296 1142926
 
Reported: 2014-01-10 02:07 UTC by Bryan Yount
Modified: 2018-12-05 16:56 UTC
CC List: 15 users

Fixed In Version: av3
Doc Type: Bug Fix
Doc Text:
Previously, bonds on physical host network interface cards could not be configured via the Administration Portal when the interface was configured with DHCP, because validation required that no boot protocol, IP address, subnet mask, or gateway be defined on slave network interfaces. It is now possible to configure bonds on physical host network interface cards configured with DHCP.
Clone Of:
: 1082296 (view as bug list)
Environment:
Last Closed: 2014-06-09 15:08:26 UTC
oVirt Team: Network
Target Upstream Version:


Attachments
log files (deleted)
2014-01-31 21:12 UTC, Patrick Tavares


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 25554 None None None Never
oVirt gerrit 25580 None None None Never
Red Hat Knowledge Base (Solution) 726433 None None None Never
Red Hat Product Errata RHSA-2014:0506 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Virtualization Manager 3.4.0 update 2014-06-09 18:55:38 UTC

Description Bryan Yount 2014-01-10 02:07:12 UTC
Description of problem:
After upgrading RHEV-M from 3.0 to 3.2, the customer was building a new Data Center / Cluster and was unable to configure the network on the RHEL 6.4 hosts through the Admin Portal. They had to manually edit the ifcfg-ethX files to be able to successfully configure the network ports from the Admin Portal.

Version-Release number of selected component (if applicable):
rhevm-3.2.3
vdsm-4.10.2-27.0

How reproducible:
Unsure

Steps to Reproduce:
1. Create a new Data Center and Cluster
2. Install a RHEL host with the following network configuration:

   /etc/sysconfig/network-scripts/ifcfg-eth0
   DEVICE="eth0"
   HWADDR="(removed mac addr)"
   ONBOOT="yes"
   BOOTPROTO="static"
   BROADCAST=xx.xx.xx.255
   NETWORK=xx.xx.xx.0
   NETMASK=255.255.252.0
   IPADDR=xx.xx.xx.xx

3. Add it to the new DC/Cluster.
4. Select the host and make a change to the network. Click OK.
5. (This part isn't entirely clear yet, but the customer reports being unable to configure the network at this step; the exact error message is not known.)
6. Then log into each host and edit the ifcfg-ethX files, changing the following line (a scripted sketch of this edit appears after these steps)

   from:
   BOOTPROTO="static"

   to:
   BOOTPROTO="none"

7. Restart the network service or "ifup" the ports and have RHEV-M reflect the same:
   service vdsmd stop
   service network restart
   service vdsmd start

8. From the Admin Portal, select the host again, open the network tab and click the link to set up networks. Set up bonds, attach the networks and save. This results in an "internal engine failure" error after a while (every time).

9. Log back into the host and run the vdsm network recovery script. (This will only work after the failed configuration attempt above.)

   service vdsm-restore-net-config start
   service vdsmd restart

10. Configure again through the Admin Portal and this time the changes made will save correctly.

11. Reboot to double check.
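
For reference, the edit in steps 6-7 can be scripted. The following is a minimal sketch, assuming the NICs involved are eth0 and eth1 and that BOOTPROTO=none is the desired end state; adjust paths and interface names to the environment:

    #!/usr/bin/env python
    # Hypothetical helper that applies the workaround from steps 6-7:
    # rewrite any BOOTPROTO value to "none" in the ifcfg files, then
    # bounce networking around vdsmd. Interface names are assumptions.
    import re
    import subprocess

    IFACES = ["eth0", "eth1"]  # adjust to the NICs being bonded

    for iface in IFACES:
        path = "/etc/sysconfig/network-scripts/ifcfg-%s" % iface
        with open(path) as f:
            text = f.read()
        # Replace the BOOTPROTO line; keep the rest of the file as-is.
        new_text = re.sub(r'(?m)^BOOTPROTO=.*$', 'BOOTPROTO=none', text)
        if new_text != text:
            with open(path, "w") as f:
                f.write(new_text)

    # Restart networking with vdsmd stopped, as in step 7.
    subprocess.check_call(["service", "vdsmd", "stop"])
    subprocess.check_call(["service", "network", "restart"])
    subprocess.check_call(["service", "vdsmd", "start"])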

Comment 7 Dan Kenigsberg 2014-01-30 17:13:19 UTC
Patrick, if this issue is easily reproducible, would you lay out clear steps to do it, and include vdsm.log, supervdsm.log and the engine log of the whole process? Would you make sure to use the latest rhev-3.3 when you do that? (Frankly, I'd prefer if you could test it with the latest ovirt-3.4-beta, but that may be too much to ask.)

I see no steps in comment #1, and comment #0 is missing a crucial step 5, which would explain the nature of the requested change in networking.

Comment 8 Patrick Tavares 2014-01-31 21:12:55 UTC
Created attachment 857981 [details]
log files

Comment 9 Patrick Tavares 2014-01-31 21:16:31 UTC
Dan,

My existing env:
- RHEV-M 3.3.0-0.46 (all-in-one test env with a RHEL 6.5 + KVM box A, though customer had RHEV-M 3.2 managing their RHEL + KVM box)
- Second RHEL 6.4 + KVM box B
- Both boxes were vanilla, default RHEL 6.5 installs.  Both have two NICs
- Box A was added as a hypervisor to the environment via the all-in-one install method.  Box B was added from RHEV-M manually (after being subscribed to appropriate channels in Satellite)
- Both boxes have time synchronized for easier correlation :)
- 'Default' datacenter, 'Default' cluster

Steps to reproduce:
1) After a freshly added RHEL 6.5 hypervisor box B, select Box B from Hosts tab, choose 'Setup Host Networks' from 'Network Interfaces' sub-tab.
2) Fresh Box B should show the 'rhevm' network assigned to eth0 by default (it does in my env and did in the customer's).  Drag eth1 onto eth0 to attempt to create a bond0 interface for the 'rhevm' network (the bond mode setting does not matter but I chose mode 4).
3) Click the 'Ok' button and see the error message in attachment #855232.

Please reference the requested logs in attachment #857981 [details]. I've also included the vdsm/supervdsm logs from boxA. I triggered the error a few times between 14:58 and 15:00 in the log files.

Please let me know if I can provide any further info.

Comment 10 Patrick Tavares 2014-01-31 21:18:17 UTC
I will try to duplicate with an ovirt-3.4 beta environment if I find some time to re-build my env with those bits.

Comment 11 Dan Kenigsberg 2014-02-01 22:40:12 UTC
Moti, can you make something of it? I see nothing damning in the reported getCaps.

{'HBAInventory': {'FC': [],
                  'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:3818f73e83f9'}]},
 'ISCSIInitiatorName': 'iqn.1994-05.com.redhat:3818f73e83f9',
 'bondings': {'bond0': {'addr': '',
                        'cfg': {},
                        'hwaddr': '00:00:00:00:00:00',
                        'mtu': '1500',
                        'netmask': '',
                        'slaves': []},
              'bond1': {'addr': '',
                        'cfg': {},
                        'hwaddr': '00:00:00:00:00:00',
                        'mtu': '1500',
                        'netmask': '',
                        'slaves': []},
              'bond2': {'addr': '',
                        'cfg': {},
                        'hwaddr': '00:00:00:00:00:00',
                        'mtu': '1500',
                        'netmask': '',
                        'slaves': []},
              'bond3': {'addr': '',
                        'cfg': {},
                        'hwaddr': '00:00:00:00:00:00',
                        'mtu': '1500',
                        'netmask': '',
                        'slaves': []},
              'bond4': {'addr': '',
                        'cfg': {},
                        'hwaddr': '00:00:00:00:00:00',
                        'mtu': '1500',
                        'netmask': '',
                        'slaves': []}},
 'bridges': {'rhevm': {'addr': '192.168.1.14',
                       'cfg': {'BOOTPROTO': 'dhcp',
                               'DELAY': '0',
                               'DEVICE': 'rhevm',
                               'IPV6INIT': 'yes',
                               'MTU': '1500',
                               'NM_CONTROLLED': 'no',
                               'ONBOOT': 'yes',
                               'TYPE': 'Bridge',
                               'UUID': 'b2e23a67-784e-4ff1-b164-f0c88df82ed1'},
                       'mtu': '1500',
                       'netmask': '255.255.255.0',
                       'ports': ['eth0'],
                       'stp': 'off'}},
 'clusterLevels': ['3.0', '3.1', '3.2'],
 'cpuCores': '8',
 'cpuFlags': u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,aperfmperf,pni,dtes64,monitor,ds_cpl,vmx,est,tm2,ssse3,cx16,xtpr,pdcm,dca,sse4_1,lahf_lm,dts,tpr_shadow,vnmi,flexpriority,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_n270',
 'cpuModel': 'Intel(R) Xeon(R) CPU           E5440  @ 2.83GHz',
 'cpuSockets': '2',
 'cpuSpeed': '2833.000',
 'cpuThreads': '8',
 'emulatedMachines': [u'rhel6.5.0',
                      u'pc',
                      u'rhel6.4.0',
                      u'rhel6.3.0',
                      u'rhel6.2.0',
                      u'rhel6.1.0',
                      u'rhel6.0.0',
                      u'rhel5.5.0',
                      u'rhel5.4.4',
                      u'rhel5.4.0'],
 'guestOverhead': '65',
 'hooks': {},
 'kvmEnabled': 'true',
 'lastClient': '192.168.1.13',
 'lastClientIface': 'rhevm',
 'management_ip': '',
 'memSize': '7871',
 'netConfigDirty': 'False',
 'networks': {'rhevm': {'addr': '192.168.1.14',
                        'bridged': True,
                        'cfg': {'BOOTPROTO': 'dhcp',
                                'DELAY': '0',
                                'DEVICE': 'rhevm',
                                'IPV6INIT': 'yes',
                                'MTU': '1500',
                                'NM_CONTROLLED': 'no',
                                'ONBOOT': 'yes',
                                'TYPE': 'Bridge',
                                'UUID': 'b2e23a67-784e-4ff1-b164-f0c88df82ed1'},
                        'gateway': '192.168.1.1',
                        'iface': 'rhevm',
                        'mtu': '1500',
                        'netmask': '255.255.255.0',
                        'ports': ['eth0'],
                        'stp': 'off'}},
 'nics': {'eth0': {'addr': '',
                   'cfg': {'BRIDGE': 'rhevm',
                           'DEVICE': 'eth0',
                           'HWADDR': '00:e0:81:b5:02:c1',
                           'IPV6INIT': 'yes',
                           'MTU': '1500',
                           'NM_CONTROLLED': 'no',
                           'ONBOOT': 'yes',
                           'UUID': 'b2e23a67-784e-4ff1-b164-f0c88df82ed1'},
                   'hwaddr': '00:e0:81:b5:02:c1',
                   'mtu': '1500',
                   'netmask': '',
                   'speed': 1000},
          'eth1': {'addr': '',
                   'cfg': {'BOOTPROTO': 'dhcp',
                           'DEVICE': 'eth1',
                           'HWADDR': '00:E0:81:B5:02:C0',
                           'NM_CONTROLLED': 'yes',
                           'ONBOOT': 'no',
                           'TYPE': 'Ethernet',
                           'UUID': 'a481615c-c9a2-4e26-92e5-62937bb38584'},
                   'hwaddr': '00:e0:81:b5:02:c0',
                   'mtu': '1500',
                   'netmask': '',
                   'speed': 0}},
 'operatingSystem': {'name': 'RHEL',
                     'release': '6.5.0.1.el6',
                     'version': '6Server'},
 'packages2': {'kernel': {'buildtime': 1386939500.0,
                          'release': '431.3.1.el6.x86_64',
                          'version': '2.6.32'},
               'libvirt': {'buildtime': 1386770011L,
                           'release': '29.el6_5.2',
                           'version': '0.10.2'},
               'qemu-img': {'buildtime': 1384327329L,
                            'release': '2.415.el6_5.3',
                            'version': '0.12.1.2'},
               'qemu-kvm': {'buildtime': 1384327329L,
                            'release': '2.415.el6_5.3',
                            'version': '0.12.1.2'},
               'spice-server': {'buildtime': 1385990636L,
                                'release': '6.el6_5.1',
                                'version': '0.12.4'},
               'vdsm': {'buildtime': 1385472772L,
                        'release': '28.0.el6ev',
                        'version': '4.10.2'}},
 'reservedMem': '321',
 'software_revision': '28.0',
 'software_version': '4.10',
 'supportedENGINEs': ['3.0', '3.1', '3.2'],
 'supportedProtocols': ['2.2', '2.3'],
 'supportedRHEVMs': ['3.0'],
 'uuid': '5FE23BBC-CDCA-32C1-B040-B22721E40BD6',
 'version_name': 'Snow Man',
 'vlans': {},
 'vmTypes': ['kvm']}

Comment 12 Moti Asayag 2014-02-02 09:44:25 UTC
We faced the same issue last week on users@ovirt:

http://lists.ovirt.org/pipermail/users/2014-January/020458.html

'eth1' is configured with the 'dhcp' boot protocol, and therefore cannot serve as a slave.

Providing it via the REST API without a boot protocol would have worked fine, but setting up the network via the UI doesn't clear any configuration reported from the host (see the sketch below).
The origin of this issue is on Bug 907240.
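
A rough, illustrative sketch of what such a request might look like, assuming the oVirt/RHEV 3.x setupnetworks action on the host NICs collection; the engine URL, host UUID, credentials and exact element names are placeholders/assumptions and should be checked against the API's RSDL. The key point is that the slave host_nic elements carry no boot protocol or IP configuration:

    # Illustrative only: create bond0 carrying the 'rhevm' network via the
    # REST API, with the slaves listed bare (no boot protocol / IP on
    # eth0 or eth1). URL, host id and credentials are placeholders.
    import requests

    ENGINE = "https://rhevm.example.com"
    HOST_ID = "00000000-0000-0000-0000-000000000000"  # placeholder host UUID

    body = """
    <action>
      <host_nics>
        <host_nic>
          <name>bond0</name>
          <network><name>rhevm</name></network>
          <boot_protocol>dhcp</boot_protocol>
          <bonding>
            <options>
              <option><name>mode</name><value>4</value></option>
            </options>
            <slaves>
              <host_nic><name>eth0</name></host_nic>
              <host_nic><name>eth1</name></host_nic>
            </slaves>
          </bonding>
        </host_nic>
      </host_nics>
      <checkConnectivity>true</checkConnectivity>
    </action>
    """

    resp = requests.post(
        "%s/api/hosts/%s/nics/setupnetworks" % (ENGINE, HOST_ID),
        data=body,
        headers={"Content-Type": "application/xml"},
        auth=("admin@internal", "password"),
        verify=False,  # test environment only
    )
    resp.raise_for_status()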

One option is that configuring a bond via the UI should clear any pre-configuration from the NICs that were selected to act as slaves.
By doing so, we'll override any user settings that existed on the slaves before they were selected to act as slaves.

The alternative is to remove the boot protocol from 'eth1' in this case, and restart network & vdsm (the same solution as was suggested on users@ovirt).

I'm in favor of the first option (clearing the pre-configured settings).

Comment 13 Dan Kenigsberg 2014-02-03 10:06:00 UTC
When setting up a network on top of NICs with an existing dhcp/static address, it would be nice if it were easy to copy that address to the new network, though it is less clear what should be done when multiple NICs have different static addresses.

In any case, being able to override pre-configured address is important.

Comment 14 Patrick Tavares 2014-02-03 16:39:06 UTC
I agree with the solution where configuration of a bond via the UI should clear any pre-configuration from the NICs that were selected to act as slaves. Having said this, I think a warning/confirmation dialog should probably alert the user to the pre-existing config, create a backup file, and possibly add comments to the ifcfg-eth* file stating something to the effect of "This file was updated/modified by vdsm/ovirt-engine/something else on <date>" to inform any CLI junkies that this host is/was being managed by ovirt-engine.

Thoughts?

Comment 15 Dan Kenigsberg 2014-02-03 17:11:54 UTC
Our ifcfg files begin with the likes of
  # Generated by VDSM version 4.14.0

During years of messing with ifcfg files, I do not recall a true need to keep a backup of the pre-ovirt config. However, having something like that seems prudent, and requires its own RFE.

Comment 17 Lior Vernia 2014-03-09 12:03:22 UTC
While testing my patch that tried to clear the boot protocol on the engine side, I've found that VDSM apparently doesn't rewrite the boot protocol according to what's sent from the engine (so even if it is removed on the engine, it won't change on the host).

Since this needs to be fixed on the VDSM side, that in itself could be a solution to the bug without any intervention on the engine side; if slave interfaces can't have a boot protocol defined, then VDSM can clear it itself (instead of the engine asking to clear it).
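
Purely as an illustration of that option (this is not VDSM's actual code), a minimal sketch of stripping per-slave configuration, using the cfg dict shape VDSM reports in getCaps; the function name and key list are assumptions:

    # Hypothetical sketch: given the ifcfg-style cfg dict reported for a NIC
    # (as in the getCaps output in comment 11), drop the settings that a
    # slave interface must not carry before it is enslaved to a bond.
    SLAVE_FORBIDDEN_KEYS = ('BOOTPROTO', 'IPADDR', 'NETMASK', 'GATEWAY')

    def strip_slave_cfg(nic_cfg):
        """Return a copy of the cfg dict with slave-forbidden keys removed."""
        return dict((k, v) for k, v in nic_cfg.items()
                    if k not in SLAVE_FORBIDDEN_KEYS)

    # Example: eth1's cfg from the getCaps dump above.
    eth1_cfg = {'BOOTPROTO': 'dhcp', 'DEVICE': 'eth1', 'ONBOOT': 'no',
                'NM_CONTROLLED': 'yes', 'TYPE': 'Ethernet'}
    print(strip_slave_cfg(eth1_cfg))
    # BOOTPROTO is gone, so a strict slave check would no longer trip on it.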

Comment 18 Lior Vernia 2014-03-09 12:56:31 UTC
Hmm, this is not as simple as I made it sound; that contradicts the solution to Bug 907240. It appears to me one of them will have to remain unsolved.

Comment 19 Lior Vernia 2014-03-10 07:56:54 UTC
After some further verification, the only thing causing the problem turns out to be validation that is too strict on the engine side.
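
The real fix is in the engine's validation code (see the gerrit links above); the following is only a schematic illustration of the behavioral change, with all names invented for the example:

    # Schematic illustration only (invented names, not ovirt-engine code).
    # Old behaviour: reject a slave candidate that reports a boot protocol
    # or address. New behaviour: the reported configuration no longer
    # blocks bonding, since enslaving the NIC makes it irrelevant anyway.
    def can_be_slave_old(nic):
        cfg = nic.get('cfg', {})
        return not cfg.get('BOOTPROTO') and not cfg.get('IPADDR')

    def can_be_slave_new(nic):
        return True

    eth1 = {'cfg': {'BOOTPROTO': 'dhcp', 'DEVICE': 'eth1'}}
    print(can_be_slave_old(eth1))  # False: this produced the failure
    print(can_be_slave_new(eth1))  # True: behaviour after the fix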

Comment 20 Meni Yakove 2014-03-16 14:20:37 UTC
rhevm-3.4.0-0.5.master.el6ev.noarch

Comment 23 errata-xmlrpc 2014-06-09 15:08:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0506.html

