Bug 1517295 - SSH from a RHEL-6.6 client to a RHEL-7.4 OpenSSH server fails with: "ssh_dispatch_run_fatal: [...] : error in libcrypto [preauth]"
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: pre-dev-freeze
Target Release: ---
Assignee: Eduardo Habkost
QA Contact: Shai Revivo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-24 14:05 UTC by Aniket Bhavsar
Modified: 2018-09-03 11:34 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-15 14:39:12 UTC


Attachments
Server side sshd debug logs (deleted)
2017-11-28 17:19 UTC, Aniket Bhavsar
Client side sshd debug logs (deleted)
2017-11-28 17:19 UTC, Aniket Bhavsar

Description Aniket Bhavsar 2017-11-24 14:05:35 UTC
Description of problem:
SSH connections fail with the error "key_verify failed for server_host_key" when the server runs openssl-1.0.2k-8.el7.

Version-Release number of selected component (if applicable):
openssl-1.0.2k-8.el7

Additional info:
According to the customer, the issue occurs only when RHEL 7.4 runs in their OpenStack environment with openssl-1.0.2k-8.el7 installed; otherwise ssh works without any issue. If the customer downgrades the openssl and openssl-libs RPMs to openssl-1.0.1e-60.el7, or when RHEL 7 is not running in the OpenStack environment, the issue goes away.

Comment 2 Tomas Mraz 2017-11-24 14:36:29 UTC
I need the debug information from the ssh server. Also what is the cat /proc/cpuinfo output on the server?

Comment 3 Aniket Bhavsar 2017-11-28 17:19:01 UTC
Created attachment 1359982
Server side sshd debug logs

Comment 4 Aniket Bhavsar 2017-11-28 17:19:41 UTC
Created attachment 1359983
Client side sshd debug logs

Comment 6 Tomas Mraz 2017-11-28 17:40:09 UTC
What is the qemu version on the host that runs the openstack guests?

Comment 7 Aniket Bhavsar 2017-11-28 20:46:14 UTC
Please find below required details:

[root@comp1 ~]# rpm -qf /usr/libexec/qemu-kvm
qemu-kvm-rhev-2.9.0-10.el7.x86_64
[root@comp1 ~]# /usr/libexec/qemu-kvm --version
QEMU emulator version 2.9.0(qemu-kvm-rhev-2.9.0-10.el7)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
[root@comp1 ~]# rpm -qi qemu-kvm-rhev
Name        : qemu-kvm-rhev
Epoch       : 10
Version     : 2.9.0
Release     : 10.el7
Architecture: x86_64
Install Date: Wed 25 Oct 2017 07:42:58 AM PDT
Group       : Development/Tools
Size        : 11748154
License     : GPLv2+ and LGPLv2+ and BSD
Signature   : RSA/SHA256, Wed 14 Jun 2017 09:04:58 AM PDT, Key ID 199e2f91fd431d51
Source RPM  : qemu-kvm-rhev-2.9.0-10.el7.src.rpm
Build Date  : Tue 13 Jun 2017 06:33:29 AM PDT
Build Host  : x86-019.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor      : Red Hat, Inc.
URL         : http://www.qemu.org/
Summary     : QEMU is a machine emulator and virtualizer
Description :
qemu-kvm-rhev is an open source virtualizer that provides hardware
emulation for the KVM hypervisor. qemu-kvm-rhev acts as a virtual
machine monitor together with the KVM kernel modules, and emulates the
hardware for a full system such as a PC and its associated peripherals.

Comment 8 Tomas Mraz 2017-11-29 11:09:27 UTC
Unfortunately, I am unable to reproduce the issue, even when running the same qemu-kvm-rhev package on a CPU with this cpuinfo:

vendor_id	: AuthenticAMD
cpu family	: 21
model		: 1
model name	: AMD Opteron(TM) Processor 6272
stepping	: 2
microcode	: 0x600063d
cpu MHz		: 2094.668
cache size	: 2048 KB
physical id	: 0
siblings	: 16
core id		: 2
cpu cores	: 8
apicid		: 34
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb cpb hw_pstate arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips	: 4189.33
TLB size	: 1536 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb

As can be seen above, it is not exactly the same CPU, though.
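
To make the "not exactly the same CPU" check concrete, here is a minimal Python sketch that compares the identifying fields of two /proc/cpuinfo dumps. The first sample uses the values from the cpuinfo above; the second sample (model 2, stepping 0) is a made-up placeholder, not the customer's actual CPU, and the choice of identity fields is an assumption.

```python
# Fields assumed (for this sketch) to identify a CPU model in /proc/cpuinfo.
IDENTITY_FIELDS = ("vendor_id", "cpu family", "model", "model name", "stepping")


def parse_cpuinfo(text):
    """Return the key/value pairs of the first processor block."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def identity_mismatches(a, b):
    """Map each differing identity field to its (a, b) pair of values."""
    return {
        field: (a.get(field), b.get(field))
        for field in IDENTITY_FIELDS
        if a.get(field) != b.get(field)
    }


# Values taken from the cpuinfo quoted in this comment.
TEST_CPU = parse_cpuinfo(
    "vendor_id\t: AuthenticAMD\n"
    "cpu family\t: 21\n"
    "model\t\t: 1\n"
    "model name\t: AMD Opteron(TM) Processor 6272\n"
    "stepping\t: 2\n"
)

# Hypothetical second CPU, for illustration only.
OTHER_CPU = parse_cpuinfo(
    "vendor_id\t: AuthenticAMD\n"
    "cpu family\t: 21\n"
    "model\t\t: 2\n"
    "model name\t: AMD Opteron(TM) Processor 6272\n"
    "stepping\t: 0\n"
)

if __name__ == "__main__":
    for field, (ours, theirs) in identity_mismatches(TEST_CPU, OTHER_CPU).items():
        print(f"{field}: {ours} != {theirs}")
```

An empty result would mean the two dumps agree on every identity field; it says nothing about the flags, which need a separate comparison.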

I am reassigning to qemu-kvm-rhev for further investigation.

Comment 10 Kashyap Chamarthy 2017-12-01 12:24:10 UTC
There's not quite enough detail here.

- Please also post the output of /proc/cpuinfo from the *host*.  This is important for comparing the CPU flags.

  We'd also need the guest XML, and also the QEMU command-line (located in /var/log/libvirt/qemu/instance-XXXXXX.log).
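
  Once the QEMU command line is available, the CPU model and the explicitly toggled features can be pulled out of its `-cpu` argument. The following is a rough Python sketch under a few assumptions: the command line is on a single line, the `-cpu` value uses the usual `model,+feat,-feat,name=value` form, and the sample command line is invented for illustration (it is not taken from the customer's log).

```python
import shlex


def parse_cpu_option(cmdline):
    """Return (model, features) from the -cpu argument of a QEMU command line.

    features maps a feature name to True/False for +feat/-feat and
    feat=on/off, or to the raw string for any other name=value property.
    Returns (None, {}) when no -cpu argument is present.
    """
    args = shlex.split(cmdline)
    try:
        spec = args[args.index("-cpu") + 1]
    except (ValueError, IndexError):
        return None, {}

    model, *props = spec.split(",")
    features = {}
    for prop in props:
        if prop.startswith("+"):
            features[prop[1:]] = True
        elif prop.startswith("-"):
            features[prop[1:]] = False
        elif "=" in prop:
            name, value = prop.split("=", 1)
            features[name] = {"on": True, "off": False}.get(value, value)
    return model, features


# Hypothetical command line, for illustration only.
SAMPLE_CMDLINE = (
    "/usr/libexec/qemu-kvm -name guest=instance-00000001 -machine pc "
    "-cpu Opteron_G4,+3dnow,+3dnowext,-svm -m 4096"
)

if __name__ == "__main__":
    model, features = parse_cpu_option(SAMPLE_CMDLINE)
    print(model, features)
```

  Features that come implicitly from the named model do not appear in this output; only the explicit overrides do.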

Comment 11 Kashyap Chamarthy 2017-12-01 12:53:16 UTC
[I sent my previous message too soon.]

After some investigation, and talking to QEMU folks, what I've learnt so 
far is:

- Dave Gilbert from the QEMU team observes: the guest CPU is seeing the
  CPU flags: '3dnowext' and '3dnow' being declared, however, we don't
  see such flags when we cross-checked on a *real* AMD Opteron 
  (Processor 6272).

- Comparing the CPU from the SSH server (the RHEL 7.4 host) in comment#5
  with the CPU flags of a *real* AMD Opteron, the guest is seeing the
  following *extra* flags:
  
    3dnow, 3dnowext, acpi, adx, arat, art, bmi1, bmi2, clflushopt,
    eagerfpu, erms, fsgsbase, nopl, pu, sep, smap, smep, ss, xgetbv1,
    xsaveopt

- And talking to Dan Berrange, it seems: Potentially the crypto library
  is taking a different code path based on the CPUID.  And probably the
  crypto library has some check where if it sees CPUID for feature X
  enabled, it assumes it can use feature X & Y, but then QEMU CPU model
  isn't exposing Y.
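
  The two checks described above (which flags are extra in the guest, and whether some feature X is advertised without a feature Y it relies on) can be sketched in Python roughly as follows. The flag lists in the samples are shortened, made-up examples, and the ASSUMED_REQUIRES table is only an illustrative assumption; it is not the list of checks the crypto library or QEMU actually performs.

```python
def parse_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of a cpuinfo dump."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def extra_flags(guest_text, reference_text):
    """Flags present in the guest dump but absent from the reference dump."""
    return sorted(parse_flags(guest_text) - parse_flags(reference_text))


# Illustrative assumption: feature -> features it is assumed to depend on.
ASSUMED_REQUIRES = {
    "avx": {"xsave"},
    "xsaveopt": {"xsave"},
}


def unmet_requirements(flags, requires=ASSUMED_REQUIRES):
    """Map each advertised feature to the assumed dependencies it is missing."""
    return {
        feature: sorted(deps - flags)
        for feature, deps in requires.items()
        if feature in flags and deps - flags
    }


# Shortened, hypothetical flag lines, for illustration only.
GUEST = "flags\t\t: fpu sse2 avx 3dnow 3dnowext\n"
REFERENCE = "flags\t\t: fpu sse2 avx xsave\n"

if __name__ == "__main__":
    print("extra in guest:", extra_flags(GUEST, REFERENCE))
    print("unmet:", unmet_requirements(parse_flags(GUEST)))
```

  Running the same comparison on the real guest and host /proc/cpuinfo output would show whether the guest CPU model exposes an inconsistent feature set.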


(I'm also Ccing Eduardo Habkost from the QEMU team, who works on CPU
modelling infrastructure.  Eduardo, any comments here?)

Comment 12 Tomas Mraz 2017-12-01 13:06:01 UTC
If it was like that, I'd expect seeing SIGILL when the Y feature is used.

Comment 13 Kashyap Chamarthy 2017-12-01 14:28:05 UTC
*Important*: Along with what I asked for in comment#10 (which is critical), also get the 'sosreport' from the *bare metal* host where the guests are running; this is essential.  (The sosreports I currently see in 'collab-shell' are only for "host-5-232-59-106", which is the *guest* (OpenSSH server) running in your OpenStack environment.)

And assuming this is a Compute host, also post /etc/nova/nova.conf.

Comment 17 Eduardo Habkost 2017-12-09 01:51:22 UTC
(In reply to Tomas Mraz from comment #12)
> If it was like that, I'd expect seeing SIGILL when the Y feature is used.

This might not happen if the host CPU does support the feature but the VCPU has the feature disabled.  In this case the hypervisor has no way to know that the instruction was used, and might skip saving/loading registers that the guest is not supposed to be using.  This wouldn't be my first bet, but it's still possible.

We really need to try to get a reproducer for this, or it might be very difficult to debug.  Was anybody from Red Hat able to reproduce this in a test environment?

Comment 18 Kashyap Chamarthy 2017-12-13 15:41:07 UTC
Adding NEEDINFO on the reporter back; Karen seems to have accidentally cleared it.

Aniket: Please provide the info asked in comment#10 and comment#13

Comment 23 Aniket Bhavsar 2018-09-03 11:34:24 UTC
This is to inform you that the customer has closed the case, so it is no longer possible to provide the requested data/details.

