Bug 1366998 - drop NM masking code, to allow deployment over bond and vlan
Summary: drop NM masking code, to allow deployment over bond and vlan
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-node
Classification: oVirt
Component: General
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ovirt-4.0.2
Target Release: ---
Assignee: Ryan Barry
QA Contact: Meni Yakove
URL:
Whiteboard:
Duplicates: 1366562
Depends On:
Blocks:
 
Reported: 2016-08-15 08:44 UTC by dguo
Modified: 2016-08-22 12:32 UTC
CC List: 19 users

Fixed In Version: redhat-release-virtualization-host-4.0-2.el7.x86_64.rpm
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-22 12:32:46 UTC
oVirt Team: Node
rule-engine: ovirt-4.0.z+
ykaul: blocker+
ylavi: planning_ack+
fdeutsch: devel_ack+
ycui: testing_ack+


Attachments
engine.log (deleted), 2016-08-15 08:44 UTC, dguo
network_script on rhvh (deleted), 2016-08-15 08:47 UTC, dguo
sosreport on rhvh (deleted), 2016-08-15 08:48 UTC, dguo
rhvh: /var/log (deleted), 2016-08-15 08:48 UTC, dguo
Creating vlan over p3p1 (deleted), 2016-08-16 02:36 UTC, dguo
Details of "p3p1.20" (deleted), 2016-08-16 02:37 UTC, dguo
After creating the vlan (deleted), 2016-08-16 02:38 UTC, dguo
host deploy log (deleted), 2016-08-17 06:49 UTC, dguo
ifcfg file after creating bond (deleted), 2016-08-17 06:53 UTC, dguo
engine.log_part_aa (deleted), 2016-08-17 07:02 UTC, dguo
engine.log_part_ab (deleted), 2016-08-17 07:03 UTC, dguo
engine.log_part_ac (deleted), 2016-08-17 07:05 UTC, dguo

Description dguo 2016-08-15 08:44:43 UTC
Created attachment 1190819 [details]
engine.log

Description of problem:
Failed to add RHVH to engine with vlan configured

Version-Release number of selected component (if applicable):
redhat-virtualization-host-4.0-20160812.0.x86_64
imgbased-0.8.4-1.el7ev.noarch
vdsm-4.18.11-1.el7ev.x86_64
Red Hat Virtualization Manager Version: 4.0.2.6-0.1.el7ev

How reproducible:
100%

Steps to Reproduce:
1. Install RHVH
2. Configure a VLAN on RHVH
3. On the RHEVM portal, create a new data center and a new cluster.
4. On the RHEVM portal, select "Network", select the target ovirtmgmt network, click "Edit", and check "Enable VLAN tagging".
5. Add RHVH from the engine side using the VLAN IP configured in step 2

Actual results:
After step 5, adding the RHVH to the engine failed. The error shown was "Failed to install Host dguo_vlan. Processing stopped due to timeout."

Expected results:
After step 5, the RHVH should be added to the engine successfully

Additional info:
During the installation period, the VLAN IP on the RHVH disappeared.

Comment 1 dguo 2016-08-15 08:47:33 UTC
Created attachment 1190821 [details]
network_script on rhvh

Comment 2 dguo 2016-08-15 08:48:16 UTC
Created attachment 1190822 [details]
sosreport on rhvh

Comment 3 dguo 2016-08-15 08:48:39 UTC
Created attachment 1190823 [details]
rhvh: /var/log

Comment 4 Yaniv Lavi 2016-08-15 12:48:38 UTC
Can you have a look?

Comment 5 Edward Haas 2016-08-15 12:57:10 UTC
(In reply to dguo from comment #0)
> [...]

Please provide more details on what exactly was done:
- How was the VLAN added, and to what (and where)?
- If the management network has been moved to a VLAN, how could it still be accessed? (It was obviously accessible when there was no VLAN, so why do you still expect to be able to access it with a VLAN?)

As a general point: the management network is a special one and should be edited with care, as you can end up losing the host.
If the management network needs to be on a VLAN, this must be configured before adding the host to Engine.

Comment 6 Edward Haas 2016-08-15 13:02:11 UTC
Comment on attachment 1190823 [details]
rhvh: /var/log

No VDSM logs

Comment 7 dguo 2016-08-16 02:35:24 UTC
(In reply to Edward Haas from comment #5)
> [...]
> Please provide more details on what exactly was done:
> - How was the VLAN added, and to what (and where)?
> - If the management network has been moved to a VLAN, how could it still be
> accessed? (It was obviously accessible when there was no VLAN, so why do
> you still expect to be able to access it with a VLAN?)
> 
> As a general point: the management network is a special one and should be
> edited with care, as you can end up losing the host.
> If the management network needs to be on a VLAN, this must be configured
> before adding the host to Engine.

Indeed, more than one NIC exists on the RHVH side. One of them (em1) is connected to a public switch; the other (p3p1) is connected to a VLAN switch which also connects to the RHEVM. So the RHVH is always accessible via em1, and the VLAN was created over p3p1.

1. Created a DHCP VLAN over p3p1 from Cockpit, called p3p1.20, with IP 192.168.20.99
2. From the engine side, which also has an internal IP 192.168.20.41, added the RHVH with VLAN tag "20"

For details, I will attach the Cockpit screenshots.
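For reference, a DHCP VLAN like p3p1.20 typically corresponds to an initscripts ifcfg file along these lines (a minimal sketch using standard ifcfg keys; the exact file Cockpit generates may differ):

# /etc/sysconfig/network-scripts/ifcfg-p3p1.20 (illustrative sketch)
DEVICE=p3p1.20   # <parent>.<vlan-id> naming
VLAN=yes         # VLAN device on top of p3p1
BOOTPROTO=dhcp   # matches the DHCP lease 192.168.20.99 above
ONBOOT=yes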

Comment 8 dguo 2016-08-16 02:36:53 UTC
Created attachment 1191036 [details]
Creating vlan over p3p1

Comment 9 dguo 2016-08-16 02:37:58 UTC
Created attachment 1191038 [details]
Details of "p3p1.20"

Comment 10 dguo 2016-08-16 02:38:47 UTC
Created attachment 1191039 [details]
After creating the vlan

Comment 11 Edward Haas 2016-08-16 05:58:53 UTC
Thank you for the clarification.

Please provide the VDSM and Engine logs (the VDSM ones were missing from the previous set) from when the failure occurred.

We will also need the list of ifcfg files (with content) after the Cockpit configuration has been completed and just before you attempt to add the host to the engine. Please also run the 'ip addr' command and provide the output.

Final question (to get some context for all of this): is this a regression test? I mean, has this scenario of adding a host through a VLAN management interface been tested in 3.6?

Comment 12 Edward Haas 2016-08-16 18:24:45 UTC
Results from the tests conducted by mburman:
- Avoiding masking NetworkManager allowed the host deployment to finish.
- Configuring a VLAN or bond device through Cockpit creates an ifcfg file with a UUID as its name. VDSM does not support such naming; it expects the ifcfg file to be named after the device (see the sketch below).
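To illustrate the naming mismatch (the UUID below is hypothetical; only the naming pattern matters):

# What Cockpit/NetworkManager may write (UUID-based name, hypothetical):
#   /etc/sysconfig/network-scripts/ifcfg-5f8a3f2e-0bb0-45f1-9aed-d6edd65f3e03
# What VDSM expects (named after the device):
#   /etc/sysconfig/network-scripts/ifcfg-p3p1.20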

Comment 13 Fabian Deutsch 2016-08-16 19:22:33 UTC
Great findings. We'll revert the masking.
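For context, the masking in question is the standard systemd mechanism that prevents a unit from being started at all; a minimal illustration of applying and reverting it, assuming plain systemctl is used:

# Mask NetworkManager: the unit is linked to /dev/null and cannot start
systemctl mask NetworkManager
# Revert the masking, as proposed above
systemctl unmask NetworkManager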

This bug will be used to track the issue that causes the deployment failure.

The ifcfg issue is covered in bug 1367378.

One question: How did you define the bond/vlan?

Comment 14 Dan Kenigsberg 2016-08-17 06:39:37 UTC
*** Bug 1366562 has been marked as a duplicate of this bug. ***

Comment 15 dguo 2016-08-17 06:48:33 UTC
(In reply to Edward Haas from comment #11)
> [...]
> Final question (to get some context for all of this): is this a regression
> test? I mean, has this scenario of adding a host through a VLAN management
> interface been tested in 3.6?

Please see the logs and config files attached.


For the final question, do you mean NGN 3.6? That scenario was blocked by bug 1329956.

Comment 16 dguo 2016-08-17 06:49:46 UTC
Created attachment 1191475 [details]
host deploy log

Comment 17 dguo 2016-08-17 06:53:18 UTC
Created attachment 1191476 [details]
ifcfg file after creating bond

[root@dell-op790-01 ~]# ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: p4p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:10:18:81:a4:a0 brd ff:ff:ff:ff:ff:ff
3: p4p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:10:18:81:a4:a2 brd ff:ff:ff:ff:ff:ff
4: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether d4:be:d9:95:61:ca brd ff:ff:ff:ff:ff:ff
    inet 10.66.148.7/22 brd 10.66.151.255 scope global dynamic em1
       valid_lft 13589sec preferred_lft 13589sec
    inet6 2620:52:0:4294:d6be:d9ff:fe95:61ca/64 scope global noprefixroute dynamic 
       valid_lft 2591968sec preferred_lft 604768sec
    inet6 fe80::d6be:d9ff:fe95:61ca/64 scope link 
       valid_lft forever preferred_lft forever
5: p3p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:1b:21:27:47:0b brd ff:ff:ff:ff:ff:ff
6: p3p1.20@p3p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 00:1b:21:27:47:0b brd ff:ff:ff:ff:ff:ff
    inet 192.168.20.99/24 brd 192.168.20.255 scope global dynamic p3p1.20
       valid_lft 85751sec preferred_lft 85751sec
    inet6 fe80::21b:21ff:fe27:470b/64 scope link 
       valid_lft forever preferred_lft forever

Comment 18 dguo 2016-08-17 07:02:03 UTC
Created attachment 1191477 [details]
engine.log_part_aa

Comment 19 dguo 2016-08-17 07:03:47 UTC
Created attachment 1191478 [details]
engine.log_part_ab

Comment 20 dguo 2016-08-17 07:05:18 UTC
Created attachment 1191479 [details]
engine.log_part_ac

Comment 21 dguo 2016-08-19 02:41:22 UTC
Re-tested the scenarios below with build 20160817.0; in all cases the host was added to the engine successfully.

1. dhcp vlan
2. static vlan
3. dhcp bond
4. static bond
5. dhcp bond+vlan
6. static bond+vlan

Test steps:
1. Install RHVH
2. Update the ifcfg files for each scenario (see the sketch below)
3. Restart the network service to bring up the bond/vlan
4. Add the RHVH to the engine

Actual result:
1. The RHVH was added to the engine successfully
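As an illustration of step 2 for the static bond+vlan scenario, the manually created ifcfg files would look roughly like this (a minimal sketch with example device names, addresses, and bonding options; the actual files used in the test may differ):

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
BONDING_OPTS="mode=active-backup miimon=100"  # example bonding options
BOOTPROTO=none
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-p3p1 (slave of bond0)
DEVICE=p3p1
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-bond0.20 (VLAN 20 on top of the bond)
DEVICE=bond0.20
VLAN=yes
BOOTPROTO=none        # static scenario
IPADDR=192.168.20.99  # example address
PREFIX=24
ONBOOT=yes

Restarting networking then brings the devices up (step 3), e.g. 'systemctl restart network'.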

Comment 22 dguo 2016-08-20 05:47:21 UTC
The above verification was done based on creating the ifcfg files manually.

The issue of how VDSM handles the ifcfg file naming is covered in bug 1367378.

Comment 23 dguo 2016-08-20 05:50:58 UTC
Ryan,

Could you please provide the patch for this?

