Bug 1356192 - Adding a node using an SSH public key fails even if the key is authorized
Summary: Adding a node using an SSH public key fails even if the key is authorized
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: ovirt-node
Classification: oVirt
Component: Installation & Update
Version: 4.0
Hardware: All
OS: Linux
high
medium
Target Milestone: ovirt-4.0.5
: ---
Assignee: Ryan Barry
QA Contact: dguo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-07-13 15:42 UTC by Federico Fortini
Modified: 2016-09-22 08:12 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-22 08:12:09 UTC
oVirt Team: Node
rule-engine: ovirt-4.0.z+
rule-engine: ovirt-4.1+
mgoldboi: planning_ack+
fdeutsch: devel_ack+
cshao: testing_ack+


Attachments
successfully log in to node without password from CLI (deleted)
2016-07-13 15:42 UTC, Federico Fortini
no flags Details
engine.log error about failed auth (deleted)
2016-07-13 15:43 UTC, Federico Fortini
no flags Details
Details about public key and node fingerprint (deleted)
2016-07-13 15:47 UTC, Federico Fortini
no flags Details
var_log_secure_engine.log and var_log_secure_node.log (deleted)
2016-09-07 11:16 UTC, Yihui Zhao
no flags Details

Description Federico Fortini 2016-07-13 15:42:08 UTC
Created attachment 1179332 [details]
successfully log in to node without password from CLI

Description of problem:
On a fresh installation of "oVirt Node 4.0.0", adding the node to a cluster using "SSH Public Key" authentication fails, even though I can log in from the engine server to the node without a password.

Version-Release number of selected component (if applicable):


How reproducible:
Add a new host from the engine and select SSH Public Key instead of password.

Actual results:
The add host procedure fails with the error: "Error while executing action: Cannot add Host. SSH authentication failed, verify authentication parameters are correct (Username/Password, public-key etc.) You may refer to the engine.log file for further details."

Expected results:
Successfully complete the add host Wizard

Additional info:
I'm able to run the command ssh root@node5.ovirt.corp.vcube.it and log in without a password. The engine server's public key was installed on the host using the command "ssh-copy-id node5.ovirt.corp.vcube.it"

Comment 1 Federico Fortini 2016-07-13 15:43:02 UTC
Created attachment 1179333 [details]
engine.log error about failed auth

Comment 2 Federico Fortini 2016-07-13 15:47:31 UTC
Created attachment 1179335 [details]
Details about public key and node fingerprint

Comment 3 Ryan Barry 2016-08-02 13:42:25 UTC
Can you please check /var/log/secure on the Node to see if there's any information there?

Comment 4 Fabian Deutsch 2016-08-18 13:32:21 UTC
My guess is that we do not copy /root/.ssh, which basically disables ssh pubkey authentication after updates.

This should be covered by the osupdater

Comment 5 Ryan Barry 2016-08-31 14:15:53 UTC
(In reply to Fabian Deutsch from comment #4)
> My guess is that we do not copy /root/.ssh which basically disables ssh
> pubkey authentication after updates.
> 
> This should be covered by the osupdater

In investigating this, it appears that /root is already rsynced by osupdater.

Additionally, comment#1 indicates that pubkey auth is working.

QE, can you reproduce this? If not, we may need to move it out, since there's no obvious cause, and I can't reproduce.

Comment 6 Ying Cui 2016-09-01 02:45:20 UTC
daijie, please see this bug and comment 5. Could you try to reproduce it on the QE side?

Rough Steps:
1. Install Node and make sure networking is enabled.
2. SSH to the engine server.
3. Derive the public key from the engine's SSH private key and copy it to the node:
# cd /etc/pki/ovirt-engine/keys/
# ssh-keygen -y -f engine_id_rsa > engine_id_rsa.pub
# ssh-copy-id -i engine_id_rsa.pub root@node_host
4. Verify whether adding the node using SSH Public Key authentication in the engine succeeds.
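
Before step 4, it can help to confirm that the key derived in step 3 is really the one installed on the node. A minimal sketch, assuming the public key file and a copy of the node's authorized_keys are both available locally; the function name key_is_authorized is hypothetical and not part of any oVirt tool:

```shell
# Hypothetical helper: succeed if the key blob from a public key file
# appears in an authorized_keys file.
# $1 = public key file (e.g. engine_id_rsa.pub), $2 = authorized_keys file
key_is_authorized() {
    pub_file=$1
    auth_file=$2
    # A public key line is "<type> <base64-blob> [comment]". Compare only the
    # blob, because the comment may differ between the two files.
    blob=$(awk 'NF >= 2 { print $2; exit }' "$pub_file")
    [ -n "$blob" ] || return 2
    # Skip comment lines, then look for the blob as a whole field so that
    # entries with leading options are also matched.
    awk -v blob="$blob" '
        /^[[:space:]]*#/ { next }
        { for (i = 1; i <= NF; i++) if ($i == blob) found = 1 }
        END { exit !found }
    ' "$auth_file"
}
```

For example, after copying /root/.ssh/authorized_keys from the node to the engine as node_authorized_keys (an assumed file name), running key_is_authorized engine_id_rsa.pub node_authorized_keys returns 0 when the engine key is present.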

Comment 7 Yihui Zhao 2016-09-07 11:04:54 UTC
Hi all,
   Thank you for ycui's steps.

#1. Following the steps in comment 6:

1. Install Node and make sure networking is enabled.
2. SSH to the engine server.
3. Derive the public key from the engine's SSH private key and copy it to the node:
# cd /etc/pki/ovirt-engine/keys/
# ssh-keygen -y -f engine_id_rsa > engine_id_rsa.pub
# ssh-copy-id -i engine_id_rsa.pub root@node_host

1) After step 3, the contents of engine_id_rsa.pub on the engine and authorized_keys on the node are the same, but SSH from the engine to the node still prompts for a password, and adding the node to RHEVM failed.
[root@rhevm-40-1 keys]# ssh root@10.66.10.37
root@10.66.10.37's password:
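
One possible explanation for the prompt above: by default, a plain ssh root@... offers only root's default identities (such as ~/.ssh/id_rsa), not the engine key under /etc/pki/ovirt-engine/keys/, so it does not exercise the key installed in step 3. A minimal sketch for testing that key directly; try_key_login and the SSH override variable are hypothetical names introduced here for illustration:

```shell
# Hypothetical helper: attempt a key-only login with an explicit identity file.
# $1 = private key file, $2 = user@host
# Set SSH=echo to print the arguments instead of connecting (dry run).
try_key_login() {
    key_file=$1
    target=$2
    # BatchMode disables password prompts, so a failing key shows up as an
    # error instead of an interactive prompt. IdentitiesOnly restricts ssh
    # to the identity given with -i.
    "${SSH:-ssh}" -i "$key_file" \
        -o BatchMode=yes \
        -o IdentitiesOnly=yes \
        -o PreferredAuthentications=publickey \
        "$target" true
}
```

Running try_key_login /etc/pki/ovirt-engine/keys/engine_id_rsa root@10.66.10.37 on the engine would then return 0 only if the node accepts the engine key.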


#2. I added further steps to those in comment 6:

1. Install Node and make sure networking is enabled.
2. SSH to the engine server.
3. Derive the public key from the engine's SSH private key and copy it to the node:
# cd /etc/pki/ovirt-engine/keys/
# ssh-keygen -y -f engine_id_rsa > engine_id_rsa.pub
# ssh-copy-id -i engine_id_rsa.pub root@node_host
4. cp engine_id_rsa.pub /root/.ssh/authorized_keys
5. cp engine_id_rsa /root/.ssh/id_rsa

1) After step 5, SSH from the engine to the node no longer needs a password, but adding the node to the engine still failed.
[root@rhevm-40-1 .ssh]# ls
authorized_keys  id_rsa  known_hosts


[root@rhevm-40-1 .ssh]# ssh root@10.66.10.37
Last login: Wed Sep  7 18:49:10 2016 from 10.66.148.99

  imgbase status: OK

[root@dhcp-10-37 ~]# 


Additionally, /var/log/secure from the engine and from the node are attached as var_log_secure_engine.log and var_log_secure_node.log.

Thanks,
Yihui

Comment 8 Yihui Zhao 2016-09-07 11:16:56 UTC
Created attachment 1198652 [details]
var_log_secure_engine.log and var_log_secure_node.log

Comment 9 Ying Cui 2016-09-07 11:25:00 UTC
Ryan, could you please check comment 7? It seems QE also cannot add the RHVH host to the engine via SSH Public Key. The environment has been kept.

Comment 10 Ryan Barry 2016-09-07 14:43:46 UTC
It appears that auth actually works (even from engine), but host-deploy logs indicate that the deployment failed because vdsm failed to start.

vdsm failed to start because:

Sep 07 16:30:04 dhcp-10-37.nay.redhat.com systemd[1]: ovirt-imageio-daemon.service holdoff time over, scheduling restart.
Sep 07 16:30:04 dhcp-10-37.nay.redhat.com systemd[1]: Starting oVirt ImageIO Daemon...
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: Traceback (most recent call last):
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/bin/ovirt-imageio-daemon", line 14, in <module>
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: server.main(sys.argv)
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 48, in main
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: start(config)
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 68, in start
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: secure_server(config, image_server)
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib/python2.7/site-packages/ovirt_imageio_daemon/server.py", line 90, in secure_server
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: keyfile=config.key_file, server_side=True)
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib64/python2.7/ssl.py", line 913, in wrap_socket
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: ciphers=ciphers)
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: File "/usr/lib64/python2.7/ssl.py", line 526, in __init__
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: self._context.load_cert_chain(certfile, keyfile)
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com ovirt-imageio-daemon[2983]: IOError: [Errno 2] No such file or directory
Sep 07 16:30:05 dhcp-10-37.nay.redhat.com systemd[1]: ovirt-imageio-daemon.service: main process exited, code=exited, status=1/FAILURE


Investigating host-deploy...
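
The traceback above ends in load_cert_chain raising IOError: [Errno 2], which means the certificate file or key file passed to it does not exist. A minimal sketch for listing which files are missing; the function name report_missing is hypothetical, and the paths in the usage note are assumptions (the usual vdsm PKI locations), not values taken from this report:

```shell
# Hypothetical helper: print each given path that is not a readable regular
# file. Returns non-zero if at least one path is missing or unreadable.
report_missing() {
    rc=0
    for f in "$@"; do
        if [ ! -f "$f" ] || [ ! -r "$f" ]; then
            printf 'missing or unreadable: %s\n' "$f"
            rc=1
        fi
    done
    return "$rc"
}
```

On the node, this could be called as report_missing /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/vdsm/certs/vdsmcert.pem, substituting whatever key and certificate paths the daemon's configuration actually names.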

Comment 11 Ryan Barry 2016-09-07 15:31:56 UTC
I'm still not able to reproduce this locally, unfortunately.

It's clear from the logs on the systems that SSH auth is working, and that it's failing when vdsm checks is_configured (by checking ovirt-imageio-daemon), but it's not clear why. I need access to RHEVM to do any debugging here...

I also don't know the authentication for your rhevm instance to try re-deploying. The clocks on the systems are slightly off, but I'm not sure if that's significant here.

Can you please provide auth details for rhevm-40-1.englab tomorrow?

Comment 18 Ryan Barry 2016-09-13 14:06:43 UTC
We don't have a reproducer for this. Can we move it out?

Comment 19 Fabian Deutsch 2016-09-14 22:04:41 UTC
Federico, can you still reproduce this issue with a more recent Node image?

Comment 20 Fabian Deutsch 2016-09-22 08:12:09 UTC
Closing this for now because we cannot reproduce it.

Please reopen if necessary.

