Bug 1060524

Summary: rhev 3.3 does not work with vdsm 3.4 : Command PollVDS execution failed
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ohad Basan <obasan>
Component: vdsm
Assignee: Douglas Schilling Landgraf <dougsland>
Status: CLOSED ERRATA
QA Contact: sefi litmanovich <slitmano>
Severity: high
Docs Contact:
Priority: urgent
Version: 3.4.0
CC: aberezin, bazulay, danken, dougsland, eedri, gklein, iheim, lbopf, lpeer, obasan, pstehlik, talayan, yeylon
Target Milestone: ---
Keywords: AutomationBlocker
Target Release: 3.4.0
Flags: danken: needinfo? (obasan)
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version: av1
Doc Type: Bug Fix
Doc Text:
Previously, connecting a 3.3 Manager to a host running VDSM 4.14.1 would fail. Now, the host deploys successfully.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-09 13:28:53 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Ohad Basan 2014-02-02 11:16:17 UTC
Description of problem:

Trying to connect engine 3.3.0-0.46.el6ev to a host running vdsm 4.14.1-17.git40a4b45.
Deployment is successful and vdsm starts up correctly, but the engine fails to bring the host up.

2014-02-02 12:39:53,296 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) Correlation ID: 93667c4, Call Stack: null, Custom Event ID: -1, Message: Installing Host cinteg07.ci.lab.tlv.redhat.com. Stage: Termination.
2014-02-02 12:39:53,493 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (pool-4-thread-4) [93667c4] java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
2014-02-02 12:39:53,499 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (pool-4-thread-4) [93667c4] Command PollVDS execution failed. Exception: RuntimeException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
2014-02-02 12:39:54,005 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (pool-4-thread-4) [93667c4] java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
2014-02-02 12:39:54,006 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (pool-4-thread-4) [93667c4] Command PollVDS execution failed. Exception: RuntimeException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException
2014-02-02 12:39:55,046 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (pool-4-thread-4) [93667c4] org.ovirt.engine.core.vdsbroker.vdsbroker.VDSRecoveringException: Recovering from crash or Initializing
2014-02-02 12:39:55,046 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (pool-4-thread-4) [93667c4] Command PollVDS execution failed. Exception: VDSRecoveringException: Recovering from crash or Initializing
2014-02-02 12:39:55,568 INFO  [org.ovirt.engine.core.bll.network.NetworkConfigurator] (pool-4-thread-4) [93667c4] Engine managed to communicate with VDSM agent on host cinteg07.ci.lab.tlv.redhat.com
2014-02-02 12:39:55,748 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-4-thread-4) [93667c4] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Network virbr0 is not attached to any interface on host cinteg07.ci.lab.tlv.redhat.com.
2014-02-02 12:39:55,806 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-4-thread-4) [93667c4] START, SetVdsStatusVDSCommand(HostName = cinteg07.ci.lab.tlv.redhat.com, HostId = 9f74e867-18a6-4baf-9b6a-6f4be66b9035, status=Initializing, nonOperationalReason=NONE), log id: 3bc06419
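
To see at a glance which exceptions PollVDS is failing with, the engine log can be summarized with a short script. This is a minimal sketch: the regular expression is an assumption based only on the lines quoted above, the function name `count_pollvds_failures` is hypothetical, and the sample lines abbreviate the logger name.

```python
import re
from collections import Counter

# Pattern inferred from the "Command PollVDS execution failed" lines in the
# excerpt above; other engine versions may format the message differently.
FAILURE_RE = re.compile(r"Command PollVDS execution failed\. Exception: (\w+):")


def count_pollvds_failures(lines):
    """Return a Counter mapping exception name to number of failed polls."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Two lines adapted from the excerpt (logger name shortened for readability).
sample = [
    "2014-02-02 12:39:54,006 ERROR [PollVDSCommand] (pool-4-thread-4) [93667c4] "
    "Command PollVDS execution failed. Exception: RuntimeException: "
    "java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException",
    "2014-02-02 12:39:55,046 ERROR [PollVDSCommand] (pool-4-thread-4) [93667c4] "
    "Command PollVDS execution failed. Exception: VDSRecoveringException: "
    "Recovering from crash or Initializing",
]
summary = count_pollvds_failures(sample)
print(summary)
```

For a real log, pass an open file object instead of `sample`, for example `count_pollvds_failures(open(path))`, where `path` points at the engine log.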

Comment 3 Yaniv Bronhaim 2014-02-04 12:00:45 UTC
Something is wrong with vdsm there: it restarts in a loop after receiving SIGTERM, not sure why yet. Is it reproducible, or does it relate only to that specific host?

Comment 5 Dan Kenigsberg 2014-02-04 12:54:50 UTC
Could you do `su - vdsm -s /bin/bash`
and then run /usr/share/vdsm/vdsm
to see why vdsm resets itself so often?

BTW, is sysvinit-tools installed on your host?

Comment 6 Ohad Basan 2014-02-04 13:13:36 UTC
it just gets stuck like this:
[root@cinteg07 ~]# su - vdsm -s /bin/bash
-bash-4.1$ /usr/share/vdsm/vdsm
(PID: 26271) I am the actual vdsm 4.14.1-21.git291df0d.el6 cinteg07.ci.lab.tlv.redhat.com (2.6.32-431.el6.x86_64)
Run and protect: registerDomainStateChangeCallback(callbackFunc=<bound method clientIF.contEIOVms of <clientIF.clientIF instance at 0x284b320>>)
Run and protect: registerDomainStateChangeCallback, Return response: None
Starting up MOM
Setting channels' timeout to 30 seconds.
Starting VM channels listener thread.
trying to connect libvirt


Regarding your question: yes, it is installed.

Comment 7 Dan Kenigsberg 2014-02-04 14:39:26 UTC
Would you make sure to run `service vdsmd stop` before starting a new vdsm instance?

When you run Vdsm manually, is it suddenly responsive to Engine's polling?

Comment 8 Ohad Basan 2014-02-04 15:26:00 UTC
I stopped vdsmd, but it could be that supervdsm restarted it.
How can I disable supervdsm?

Comment 9 Dan Kenigsberg 2014-02-04 15:37:36 UTC
supervdsm does not restart vdsm.

I do not understand your report then: with `service vdsmd start`, vdsm constantly crashes, and from the command line all is well?

When you run Vdsm manually, is it suddenly responsive to Engine's polling?

Comment 10 Yaniv Bronhaim 2014-02-04 16:06:58 UTC
You get a lot of:
libvir: XML-RPC error : authentication failed: Authorization requires authentication but no agent is available.
libvir: XML-RPC error : authentication failed: Authorization requires authentication but no agent is available.
libvir: XML-RPC error : authentication failed: Authorization requires authentication but no agent is available.
libvir: XML-RPC error : authentication failed: Authorization requires authentication but no agent is available.
libvir: XML-RPC error : authentication failed: Authorization requires authentication but no agent is available.
Vm's recovery failed
Traceback (most recent call last):
  File "/usr/share/vdsm/clientIF.py", line 390, in _recoverExistingVms
    caps.CpuTopology().cores())
  File "/usr/share/vdsm/caps.py", line 127, in __init__
    self._topology = _getCpuTopology(capabilities)
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 832, in __call__
    value = self.func(*args)
  File "/usr/share/vdsm/caps.py", line 147, in _getCpuTopology
    capabilities = _getCapsXMLStr()
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 832, in __call__
    value = self.func(*args)
  File "/usr/share/vdsm/caps.py", line 141, in _getCapsXMLStr
    return libvirtconnection.get().getCapabilities()
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 146, in get
    conn = utils.retry(libvirtOpenAuth, timeout=10, sleep=0.2)
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 949, in retry
    return func()
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 102, in openAuth
    if ret is None:raise libvirtError('virConnectOpenAuth() failed')
libvirtError: authentication failed: Authorization requires authentication but no agent is available.

trying to connect libvirt
libvir: XML-RPC error : authentication failed: Authorization requires authentication but no agent is available.

familiar?
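
The traceback shows the connection being opened through `utils.retry(libvirtOpenAuth, timeout=10, sleep=0.2)`. The sketch below illustrates the general retry-until-deadline pattern those keyword arguments suggest. It is not vdsm's actual `utils.retry`; the `expected` parameter and the `flaky_connect` helper are hypothetical and exist only for illustration.

```python
import time


def retry(func, timeout=10, sleep=0.2, expected=Exception):
    """Call func until it succeeds or `timeout` seconds have elapsed.

    Illustrative only: a generic retry loop, not vdsm's implementation.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return func()
        except expected:
            # Re-raise the last error once the deadline has passed.
            if time.monotonic() >= deadline:
                raise
            time.sleep(sleep)


# Hypothetical stand-in for a connection that fails twice, then succeeds.
attempts = []


def flaky_connect():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("authentication failed")
    return "connected"


result = retry(flaky_connect, timeout=2, sleep=0.01)
print(result, len(attempts))
```

A loop like this only helps with transient failures. If every attempt is rejected for the same reason, as with the authentication error above, the call fails again after each timeout.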

Comment 11 Douglas Schilling Landgraf 2014-02-04 18:09:58 UTC
(In reply to Yaniv Bronhaim from comment #10)

Hi,

Can you please check whether the virsh tool works?
Also, I see a report [1] about "Authorization requires authentication but no agent is available", although it might not be related to this one.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=872166

Comment 12 Douglas Schilling Landgraf 2014-02-04 18:31:19 UTC
Answering myself, virsh works:

# virsh
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # list
Please enter your authentication name: vdsm@rhevh
Please enter your password: 
 Id    Name                           State
----------------------------------------------------

virsh # list --all
 Id    Name                           State
----------------------------------------------------

virsh # net-list
Name                 State      Autostart     Persistent
--------------------------------------------------
default              active     no            yes
vdsm-rhevm           active     yes           yes
vdsm-virbr0          active     yes           yes

Comment 13 Douglas Schilling Landgraf 2014-02-04 19:18:18 UTC
Given this scenario, it looks like the polkit directory doesn't contain any rule, so vdsm keeps showing:

"trying to connect libvirt"

# ls /var/lib/polkit-1/localauthority/10-vendor.d
# 

# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.5 (Santiago)


From our vdsm.spec:
=======================
# Required paths
%if 0%{?fedora} >= 18
%global _polkitdir %{_datadir}/polkit-1/rules.d
%else
%global _polkitdir %{_localstatedir}/lib/polkit-1/localauthority/10-vendor.d
%endif


and

%install
# Install the polkit for libvirt
%if 0%{?fedora} >= 18
install -Dm 0644 vdsm/vdsm-libvirt-access.rules \
                 %{buildroot}%{_polkitdir}/10-vdsm-libvirt-access.rules
%else
install -Dm 0644 vdsm/vdsm-libvirt-access.pkla \
                 %{buildroot}%{_polkitdir}/10-vdsm-libvirt-access.pkla
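
The conditional in the spec excerpt can be mirrored in a few lines to work out where the polkit file should end up on a given host. This is only a sketch: the function name `expected_polkit_rule` is hypothetical, and the defaults assume the usual macro expansions of `%{_datadir}` to `/usr/share` and `%{_localstatedir}` to `/var`.

```python
import posixpath


def expected_polkit_rule(fedora_version=None,
                         datadir="/usr/share",
                         localstatedir="/var"):
    """Return the install path implied by the spec excerpt above.

    fedora_version is None on non-Fedora hosts such as RHEL 6, which
    corresponds to `0%{?fedora}` evaluating to 0 in the spec file.
    """
    if fedora_version is not None and fedora_version >= 18:
        return posixpath.join(datadir, "polkit-1", "rules.d",
                              "10-vdsm-libvirt-access.rules")
    return posixpath.join(localstatedir, "lib", "polkit-1", "localauthority",
                          "10-vendor.d", "10-vdsm-libvirt-access.pkla")


rhel6_path = expected_polkit_rule()
fedora19_path = expected_polkit_rule(fedora_version=19)
print(rhel6_path)
print(fedora19_path)
```

On the RHEL 6.5 host above, the first path is the one that `ls` shows to be missing.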

Comment 14 Douglas Schilling Landgraf 2014-02-04 20:36:09 UTC
More evidence:
# rpm -qa | grep -i vdsm
vdsm-python-4.14.1-24.gitb4b06ab.el6.x86_64
vdsm-hook-fileinject-4.12.0-120.git83a430b.el6.noarch
vdsm-xmlrpc-4.14.1-24.gitb4b06ab.el6.noarch
vdsm-4.14.1-24.gitb4b06ab.el6.x86_64
vdsm-python-zombiereaper-4.14.1-24.gitb4b06ab.el6.noarch
vdsm-cli-4.14.1-24.gitb4b06ab.el6.noarch


# yumdownloader vdsm
# rpm2cpio ./vdsm-4.14.1-24.gitb4b06ab.el6.x86_64.rpm | cpio -div
# cd vdsm*
# find . -name '*pkla*'
No polkit file is available in the package.

- vdsm not starting

Now:
=========
- scp 10-vdsm-libvirt-access.pkla to host
# cp 10-vdsm-libvirt-access.pkla /var/lib/polkit-1/localauthority/10-vendor.d
- started vdsm manually
# service vdsmd status
VDS daemon server is running

# vdsClient -s 0 getVdsCaps (works)

# su - vdsm -s /bin/bash
-bash-4.1$ /usr/share/vdsm/vdsm 
(PID: 17731) I am the actual vdsm 4.14.1-24.gitb4b06ab.el6 cinteg07.ci.lab.tlv.redhat.com (2.6.32-431.el6.x86_64)
Run and protect: registerDomainStateChangeCallback(callbackFunc=<bound method clientIF.contEIOVms of <clientIF.clientIF instance at 0x2170320>>)
Run and protect: registerDomainStateChangeCallback, Return response: None
Starting up MOM
Setting channels' timeout to 30 seconds.
trying to connect libvirt (no longer blocking here)
Starting VM channels listener thread.
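
The package check above (`find` on the extracted tree) can also be scripted, for example as part of an automated build check. A minimal sketch, assuming the package contents have already been extracted into a directory; `find_files` is a hypothetical helper and, unlike `find -name`, it matches regular file names only, not directories. The temporary tree below is fabricated purely to demonstrate the call.

```python
import fnmatch
import os
import tempfile


def find_files(root, pattern="*pkla*"):
    """Yield paths under root whose file name matches the glob pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)


# Demonstration on a throwaway tree that imitates a fixed package layout.
with tempfile.TemporaryDirectory() as root:
    vendor_dir = os.path.join(root, "var", "lib", "polkit-1",
                              "localauthority", "10-vendor.d")
    os.makedirs(vendor_dir)
    open(os.path.join(vendor_dir, "10-vdsm-libvirt-access.pkla"), "w").close()
    found = [os.path.relpath(path, root) for path in find_files(root)]

print(found)
```

An empty result for a real extracted package would correspond to the situation reported in this comment.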

Comment 15 Yaniv Bronhaim 2014-02-05 09:38:09 UTC
Still, I don't understand how anyone managed to work with vdsm after the with_systemd patch [1] was merged; it's been almost a year since then.

[1] http://gerrit.ovirt.org/#/c/12086

Comment 16 sefi litmanovich 2014-03-17 09:56:36 UTC
Verified on rhevm-3.4.0-0.3.master.el6ev.noarch: host deployment for a host with vdsm-4.14.2-0.4.el6ev.x86_64.rpm works fine.

Comment 17 errata-xmlrpc 2014-06-09 13:28:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0504.html