Bug 1513528 - [Regression] RHEV - Last Refresh: Failed to fetch attributes for vm
Summary: [Regression] RHEV - Last Refresh: Failed to fetch attributes for vm
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: GA
Target Release: 5.10.0
Assignee: Juan Hernández
QA Contact: Hayk Hovsepyan
URL:
Whiteboard:
Duplicates: 1525037 (view as bug list)
Depends On: 1525302
Blocks: 1526065
 
Reported: 2017-11-15 15:14 UTC by Hayk Hovsepyan
Modified: 2018-06-21 21:13 UTC
CC List: 13 users

Fixed In Version: 5.10.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1526065 (view as bug list)
Environment:
Last Closed: 2018-06-21 21:13:47 UTC
Category: ---
Cloudforms Team: RHEVM


Attachments
complete evm.log and rhevm.log (deleted)
2017-12-08 19:00 UTC, Filip Brychta
output of asynchronous_inventory.rb (deleted)
2017-12-12 11:15 UTC, Filip Brychta
debug.log and evm.log (deleted)
2017-12-12 13:10 UTC, Filip Brychta


Links
System ID Priority Status Summary Last Updated
Github ManageIQ manageiq-providers-ovirt pull 176 None None None 2017-12-13 02:19:12 UTC
Github ManageIQ manageiq-providers-ovirt pull 177 None None None 2017-12-13 15:32:17 UTC

Description Hayk Hovsepyan 2017-11-15 15:14:41 UTC
Description of problem:
An error is shown in the "Last Refresh" field on the RHEV provider summary page; no data is shown.

Version-Release number of selected component (if applicable):
CFME: 5.9.0.8.20171109215303_ed87902
RHEVM: 4.1.6.2-0.1.el7

How reproducible:
Only with the Hawkular QE team's RHEVM after upgrading to 4.1.6.
Before that, it was last tested on our RHEVM 3.x, where it worked.

Steps to Reproduce:
1. Add RHEV Infrastructure provider.
2. Go to Providers details page.
3. Refresh provider relationships.
4. Verify that an error is shown for "Last Refresh".
No VM or any other data is loaded for that provider; the summary page is empty.


evm.log contains:

[----] E, [2017-11-15T05:51:19.789437 #55552:12e713c] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::FuturesCollector#wait_on_all_futures_ignoring_results) failed waiting on #<ManageIQ::Providers::Redhat::InfraManager::FuturesCollector::KeyedValue:0x0000000aa06450 @key="vm_6891a00d-d518-49a7-9f0a-848f3fd7343a_disk_attachments", @value=#<OvirtSDK4::Future:0x0000000aa06590 @service=#<OvirtSDK4::DiskAttachmentsService:0x0000000aa07490 @parent=#<OvirtSDK4::VmService:0x0000000aa07788 @parent=#<OvirtSDK4::VmsService:0x0000000386f7d8 @parent=#<OvirtSDK4::SystemService:0x0000000aaed008 @parent=#<OvirtSDK4::Connection:0x0000000aaed670 @url="API", @username="$USERNAME", @password="$PASSWORD", @token="nCFoxK8_QtciykUHh4qSsz3EbA1s4KuhVT2fPelVu4mLJwBlZL0JsE5YdyGFPmlCyxlyr1OhEWRgoVKr_8JkwQ", @insecure=true, @ca_file=nil, @ca_certs=nil, @debug=false, @log=#<Vmdb::Loggers::MulticastLogger:0x00000002b20910 @loggers=#<Set: {#<VMDBLogger:0x00000002b20b40 @progname=nil, @level=1, @default_formatter=#<Logger::Formatter:0x00000002b20ac8 @datetime_format=nil>, @formatter=#<VMDBLogger::Formatter:0x00000002b20988 @datetime_format=nil>, @logdev=#<Logger::LogDevice:0x00000002b20a78 @shift_size=1048576, @shift_age=0, @filename=#<Pathname:/var/www/miq/vmdb/log/rhevm.log>, @dev=#<File:/var/www/miq/vmdb/log/rhevm.log>, @mon_owner=nil, @mon_count=0, @mon_mutex=#<Thread::Mutex:0x00000002b20a50>>, @write_lock=#<Thread::Mutex:0x00000002b20960>, @local_levels={}, @thread_hash_level_key=:"ThreadSafeLogger#22611360@level">}>, @level=1, @thread_hash_level_key=:"ThreadSafeLogger#22611080@level">, @kerberos=false, @timeout=3600, @connect_timeout=60, @compress=true, @proxy_url=nil, @proxy_username=nil, @proxy_password=nil, @headers=nil, @connections=0, @pipeline=0, @ca_store=nil, @mutex=#<Thread::Mutex:0x0000000aaed620>, @client=#<OvirtSDK4::HttpClient:0x0000000aaed5d0>, @system_service=#<OvirtSDK4::SystemService:0x0000000aaed008 ...>>, @path="", @clusters_service=#<OvirtSDK4::ClustersService:0x00000003846928 @parent=#<OvirtSDK4::SystemService:0x0000000aaed008 ...>, @path="clusters">, @storage_domains_service=#<OvirtSDK4::StorageDomainsService:0x00000003856d78 @parent=#<OvirtSDK4::SystemService:0x0000000aaed008 ...>, @path="storagedomains">, @hosts_service=#<OvirtSDK4::HostsService:0x0000000385e488 @parent=#<OvirtSDK4::SystemService:0x0000000aaed008 ...>, @path="hosts">, @vms_service=#<OvirtSDK4::VmsService:0x0000000386f7d8 ...>, @templates_service=#<OvirtSDK4::TemplatesService:0x00000003876c68 @parent=#<OvirtSDK4::SystemService:0x0000000aaed008 ...>, @path="templates">, @networks_service=#<OvirtSDK4::NetworksService:0x00000003879418 @parent=#<OvirtSDK4::SystemService:0x0000000aaed008 ...>, @path="networks">, @data_centers_service=#<OvirtSDK4::DataCentersService:0x00000003883378 @parent=#<OvirtSDK4::SystemService:0x0000000aaed008 ...>, @path="datacenters">, @disks_service=#<OvirtSDK4::DisksService:0x00000003884520 @parent=#<OvirtSDK4::SystemService:0x0000000aaed008 ...>, @path="disks">>, @path="vms">, @path="6891a00d-d518-49a7-9f0a-848f3fd7343a", @disk_attachments_service=#<OvirtSDK4::DiskAttachmentsService:0x0000000aa07490 ...>>, @path="diskattachments">, @request=#<OvirtSDK4::HttpRequest:0x0000000aa07350>, @block=#<Proc:0x0000000aa064c8@/opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.12/lib/ovirtsdk4/service.rb:149>>>, due to: Can't send request: SSL connect error
I, [2017-11-15T05:51:19.795078 #55552]  INFO -- : Exception in realtime_block :ems_refresh - Timings: {:fetch_all=>16.23128867149353, :collect_inventory_for_targets=>18.38281750679016, :ems_refresh=>18.382860898971558}
[----] E, [2017-11-15T05:51:19.795340 #55552:12e713c] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Refresh::Strategies::Api4#refresh) EMS: [rhevm.jonqe.lab.eng.bos.redhat.com], id: [2] Refresh failed
[----] E, [2017-11-15T05:51:19.795528 #55552:12e713c] ERROR -- : [RuntimeError]: Failed to fetch attributes for vm  Method:[block in method_missing]
[----] E, [2017-11-15T05:51:19.795622 #55552:12e713c] ERROR -- : /opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/inventory/strategies/v4.rb:80:in `collect_inv_with_attributes_async'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/inventory/strategies/v4.rb:60:in `block (2 levels) in refresh'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/inventory/strategies/v4.rb:60:in `each'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/inventory/strategies/v4.rb:60:in `block in refresh'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/api_integration.rb:96:in `with_provider_connection'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/inventory/strategies/v4.rb:56:in `refresh'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/refresh/refresher.rb:25:in `block (2 levels) in collect_inventory_for_targets'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:28:in `realtime_block'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/refresh/refresher.rb:25:in `block in collect_inventory_for_targets'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/refresh/refresher.rb:12:in `collect'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-deff16d97491/app/models/manageiq/providers/redhat/infra_manager/refresh/refresher.rb:12:in `collect_inventory_for_targets'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:80:in `block in refresh_targets_for_ems'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:28:in `realtime_block'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:79:in `refresh_targets_for_ems'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:24:in `block (2 levels) in refresh'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:35:in `realtime_block'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:24:in `block in refresh'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:14:in `each'
/var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:14:in `refresh'
/var/www/miq/vmdb/app/models/manageiq/providers/base_manager/refresher.rb:9:in `refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:97:in `block in refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:96:in `each'
/var/www/miq/vmdb/app/models/ems_refresh.rb:96:in `refresh'
/var/www/miq/vmdb/app/models/miq_queue.rb:449:in `block in dispatch_method'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:91:in `block in timeout'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `block in catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:106:in `timeout'
/var/www/miq/vmdb/app/models/miq_queue.rb:448:in `dispatch_method'
/var/www/miq/vmdb/app/models/miq_queue.rb:425:in `block in deliver'
/var/www/miq/vmdb/app/models/user.rb:253:in `with_user_group'
/var/www/miq/vmdb/app/models/miq_queue.rb:425:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:104:in `deliver_queue_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:134:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:152:in `block in do_work'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `loop'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `do_work'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:329:in `block in do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:153:in `run'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:127:in `start'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:22:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:357:in `block in start_runner_via_fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:355:in `start_runner_via_fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:349:in `start_runner'
/var/www/miq/vmdb/app/models/miq_worker.rb:396:in `start'
/var/www/miq/vmdb/app/models/miq_worker.rb:266:in `start_worker'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:68:in `start_worker_for_ems'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:46:in `block in sync_workers'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:45:in `each'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:45:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:53:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `each'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:22:in `monitor_workers'
/var/www/miq/vmdb/app/models/miq_server.rb:329:in `block in monitor'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:28:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:329:in `monitor'
/var/www/miq/vmdb/app/models/miq_server.rb:351:in `block (2 levels) in monitor_loop'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:35:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:351:in `block in monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:350:in `loop'
/var/www/miq/vmdb/app/models/miq_server.rb:350:in `monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:234:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:27:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:48:in `start'
/var/www/miq/vmdb/lib/workers/bin/evm_server.rb:4:in `<main>'
[----] E, [2017-11-15T05:51:19.795751 #55552:12e713c] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Refresh::Strategies::Api4#refresh) EMS: [rhevm.jonqe.lab.eng.bos.redhat.com], id: [2] Unable to perform refresh for the following targets:
[----] E, [2017-11-15T05:51:19.795830 #55552:12e713c] ERROR -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Refresh::Strategies::Api4#refresh)  --- ManageIQ::Providers::Redhat::InfraManager [rhevm.jonqe.lab.eng.bos.redhat.com] id [2]
[----] I, [2017-11-15T05:51:19.810722 #55552:12e713c]  INFO -- : MIQ(ManageIQ::Providers::Redhat::InfraManager::Refresh::Strategies::Api4#refresh) Refreshing all targets...Complete
[----] E, [2017-11-15T05:51:19.811160 #55552:12e713c] ERROR -- : MIQ(MiqQueue#deliver) Message id: [10122], Error: [Failed to fetch attributes for vm]
[----] E, [2017-11-15T05:51:19.811434 #55552:12e713c] ERROR -- : [EmsRefresh::Refreshers::EmsRefresherMixin::PartialRefreshError]: Failed to fetch attributes for vm  Method:[block in method_missing]
[----] E, [2017-11-15T05:51:19.811680 #55552:12e713c] ERROR -- : /var/www/miq/vmdb/app/models/ems_refresh/refreshers/ems_refresher_mixin.rb:50:in `refresh'
/var/www/miq/vmdb/app/models/manageiq/providers/base_manager/refresher.rb:9:in `refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:97:in `block in refresh'
/var/www/miq/vmdb/app/models/ems_refresh.rb:96:in `each'
/var/www/miq/vmdb/app/models/ems_refresh.rb:96:in `refresh'
/var/www/miq/vmdb/app/models/miq_queue.rb:449:in `block in dispatch_method'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:91:in `block in timeout'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `block in catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:106:in `timeout'
/var/www/miq/vmdb/app/models/miq_queue.rb:448:in `dispatch_method'
/var/www/miq/vmdb/app/models/miq_queue.rb:425:in `block in deliver'
/var/www/miq/vmdb/app/models/user.rb:253:in `with_user_group'
/var/www/miq/vmdb/app/models/miq_queue.rb:425:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:104:in `deliver_queue_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:134:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:152:in `block in do_work'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `loop'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `do_work'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:329:in `block in do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:153:in `run'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:127:in `start'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:22:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:357:in `block in start_runner_via_fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:355:in `start_runner_via_fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:349:in `start_runner'
/var/www/miq/vmdb/app/models/miq_worker.rb:396:in `start'
/var/www/miq/vmdb/app/models/miq_worker.rb:266:in `start_worker'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:68:in `start_worker_for_ems'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:46:in `block in sync_workers'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:45:in `each'
/var/www/miq/vmdb/app/models/mixins/per_ems_worker_mixin.rb:45:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:53:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `each'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:22:in `monitor_workers'
/var/www/miq/vmdb/app/models/miq_server.rb:329:in `block in monitor'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:28:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:329:in `monitor'
/var/www/miq/vmdb/app/models/miq_server.rb:351:in `block (2 levels) in monitor_loop'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-f1c90023a2e3/lib/gems/pending/util/extensions/miq-benchmark.rb:35:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:351:in `block in monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:350:in `loop'
/var/www/miq/vmdb/app/models/miq_server.rb:350:in `monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:234:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:27:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:48:in `start'
/var/www/miq/vmdb/lib/workers/bin/evm_server.rb:4:in `<main>'

Comment 5 Piotr Kliczewski 2017-11-17 08:52:31 UTC
This bug seems to be related to the "async" refresh. I see here:

>> Can't send request: SSL connect error

which caused:

>> [RuntimeError]: Failed to fetch attributes for vm  Method:[block in method_missing]

which occurred here -> `collect_inv_with_attributes_async`

Comment 6 Piotr Kliczewski 2017-11-17 08:54:32 UTC
Please share the contents of rhevm.log. I wonder what caused: Can't send request: SSL connect error

Comment 9 Piotr Kliczewski 2017-11-23 13:26:25 UTC
I tried to reproduce the issue with the latest master and saw no failure. Based on the provided information, this issue occurs during a regular refresh in the async code. I performed the steps specified in comment #1 without any luck. Is there anything specific about the VM whose refresh failed? Please retest with the latest build.

Comment 10 Hayk Hovsepyan 2017-11-28 14:47:04 UTC
I believe it happens only on that particular RHEV instance I used; it works fine with the latest or any other RHEV version.

Comment 11 Oved Ourfali 2017-11-28 14:49:56 UTC
Closing as worksforme.
If it happens again, please re-open.

Comment 12 Filip Brychta 2017-12-08 18:57:32 UTC
Reopening. The issue is still visible in the latest CFME version, 5.9.0.12.20171205180333_3e32b3d.

Confirmed that the issue is NOT visible in the latest ManageIQ master.
I'm attaching the complete evm.log and rhevm.log.

The issue is reproducible on the RHEVM instance, with the user described in comment 2, using the steps described in the original description.

I compared GET requests and 200 responses in rhevm.log. In the case of ManageIQ, e.g. for the following GET request:
[----] I, [2017-12-08T12:53:46.571737 #1946:2b006f07313c]  INFO -- : Sending GET request to URL 'https://rhevm.jonqe.lab.eng.bos.redhat.com/ovirt-engine/api/vms/a8822247-aa11-4c5a-a8db-7b48f63e7045/nics'.

it took 3s to receive the response:
[----] I, [2017-12-08T12:53:49.755620 #1946:2b006f07313c]  INFO -- : Received response code 200 for GET request to URL 'https://rhevm.jonqe.lab.eng.bos.redhat.com/ovirt-engine/api/vms/a8822247-aa11-4c5a-a8db-7b48f63e7045/nics'. 

and the issue was not visible.

But in the case of CFME:
[----] I, [2017-12-08T08:52:25.827357 #47005:110114c]  INFO -- : Sending GET request to URL 'https://rhevm.jonqe.lab.eng.bos.redhat.com/ovirt-engine/api/vms/a8822247-aa11-4c5a-a8db-7b48f63e7045/nics'.

it took 6s:
[----] I, [2017-12-08T08:52:31.158011 #47005:110114c]  INFO -- : Received response code 200 for GET request to URL 'https://rhevm.jonqe.lab.eng.bos.redhat.com/ovirt-engine/api/vms/a8822247-aa11-4c5a-a8db-7b48f63e7045/nics'.

Some responses to GET requests were not received at all, which is probably the cause of this issue.


Is there a timeout (it looks like 10s) responsible for this issue, i.e. if the response to a GET request is not received within 10s, the error is thrown?
It seems that for some reason the time between request and response can be longer than 10s in some cases. Could you please check whether this is what is happening here? Another question is why it is visible only in CFME and not in ManageIQ master. Is CFME fetching more data than ManageIQ, which could result in longer response times?

Our RHEVM instance is running on slow storage, which would explain the high response times and why the issue is not visible on other RHEVM instances.

This issue is blocking our automation. If it is really caused by timeouts, is it possible to increase the timeout somehow to verify?
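
For whoever wants to check the timeout theory directly against the SDK, here is a minimal sketch (placeholder URL and credentials; it relies on the `timeout` and `connect_timeout` options of OvirtSDK4::Connection) that fetches the same per-VM data with a longer read timeout:

  require 'logger'
  require 'ovirtsdk4'

  # Placeholder URL and credentials. `timeout` is the per-request read
  # timeout in seconds; `connect_timeout` limits connection establishment.
  connection = OvirtSDK4::Connection.new(
    url:             'https://rhevm.example.com/ovirt-engine/api',
    username:        'admin@internal',
    password:        'password',
    insecure:        true, # equivalent of "Verify TLS Certificates: No"
    timeout:         60,
    connect_timeout: 60,
    log:             Logger.new('timeout_check.log')
  )

  # Walk the VMs and fetch one of the sub-collections that the refresh
  # retrieves, the same kind of request that appears to time out above.
  vms_service = connection.system_service.vms_service
  vms_service.list.each do |vm|
    nics = vms_service.vm_service(vm.id).nics_service.list
    puts "#{vm.name}: #{nics.length} NICs"
  end
  connection.close

If the error disappears with the larger timeout on this slow-storage instance, that would support the timeout theory.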

Comment 13 Filip Brychta 2017-12-08 19:00:13 UTC
Created attachment 1365006 [details]
complete evm.log and rhevm.log

Comment 14 Filip Brychta 2017-12-08 19:06:25 UTC
Forgot one more thing, which is visible on the RHEVM side after adding the RHEVM provider.
In the case of ManageIQ, the user logs in just once and stays logged in.
In the case of CFME, the user logs in and logs out periodically, every ~15s.

Could it be relevant? Any explanation for that?

Comment 15 Piotr Kliczewski 2017-12-11 09:22:49 UTC
In the logs provided in comment #13 I see the issue below. Please make sure that your credentials are correct, or provide steps where credential validation passes but the credentials are not correct.

[----] E, [2017-12-08T08:51:01.131651 #13139:110114c] ERROR -- : MIQ(MiqQueue#deliver) Message id: [823], Error: [Error during SSO authentication: access_denied: Cannot authenticate user 'haw_test_user@internal': The username or password is incorrect..]
[----] E, [2017-12-08T08:51:01.132046 #13139:110114c] ERROR -- : [OvirtSDK4::Error]: Error during SSO authentication: access_denied: Cannot authenticate user 'haw_test_user@internal': The username or password is incorrect..  Method:[block in method_missing]
[----] E, [2017-12-08T08:51:01.132183 #13139:110114c] ERROR -- : /opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.13/lib/ovirtsdk4/connection.rb:234:in `create_access_token'
/opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.13/lib/ovirtsdk4/connection.rb:616:in `internal_send'
/opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.13/lib/ovirtsdk4/connection.rb:203:in `block in send'
/opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.13/lib/ovirtsdk4/connection.rb:203:in `synchronize'
/opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.13/lib/ovirtsdk4/connection.rb:203:in `send'
/opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.13/lib/ovirtsdk4/service.rb:184:in `internal_get'
/opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.13/lib/ovirtsdk4/services.rb:20790:in `get'
/opt/rh/rh-ruby23/root/usr/share/gems/gems/ovirt-engine-sdk-4.1.13/lib/ovirtsdk4/connection.rb:367:in `test'
/opt/rh/cfme-gemset/bundler/gems/manageiq-providers-ovirt-73d0ac5992be/app/models/manageiq/providers/redhat/infra_manager/api_integration.rb:314:in `check_connect_api'

Comment 16 Filip Brychta 2017-12-11 09:56:06 UTC
Sorry, I did not realize there was one failed attempt to validate credentials when I had a typo in the password.

The credentials in comment 2 ARE correct; I just had a typo during the first try, which is the error you pasted in the previous comment. After this error the provider was added successfully with the correct credentials; see 2017-12-08T08:51:31.537966 in evm.log and below. You can also check rhevm.log, where you see GET requests and valid 200 responses, e.g.:
[----] I, [2017-12-08T08:52:25.998770 #47005:110114c]  INFO -- : Received response code 200 for GET request to URL 'https://rhevm.jonqe.lab.eng.bos.redhat.com/ovirt-engine/api/vms/9f227c90-f856-49b5-a3a1-1581c50d317a/reporteddevices'.


Bottom line: you can ignore "Cannot authenticate user 'haw_test_user@internal': The username or password is incorrect..", which comes from the first failed validation due to the typo in the password. The second attempt was successful.

Comment 17 Piotr Kliczewski 2017-12-11 11:01:06 UTC
So what are the steps to reproduce?

Comment 18 Filip Brychta 2017-12-11 11:40:23 UTC
Use the latest CFME version, 5.9.0.12.20171205180333_3e32b3d:
1. Add the RHEV Infrastructure provider (hostname and credentials in comment 2; set Verify TLS Certificates to No).
2. Go to the provider's details page.
3. Refresh provider relationships.
No VM or any other data is loaded for that provider; the summary page is empty. Check evm.log for the error.

See comment 12 for more info about the issue.

The failed validation attempt described in comment 16 has NO effect on the result; you can simply ignore it.

Comment 19 Piotr Kliczewski 2017-12-11 12:39:00 UTC
OK, I will try to reproduce it using provided steps.

Comment 20 Piotr Kliczewski 2017-12-11 14:10:50 UTC
I performed the steps using cfme 5.9.0.12.20171205180333_3e32b3d and rhv 4.1.8.2-0.1. I was not able to reproduce the issue. Are there any additional steps that you perform prior to seeing the issue?

Comment 21 Filip Brychta 2017-12-11 15:04:21 UTC
What is rhv 4.1.8.2-0.1? Is it some component in CFME, or is it the version of the Red Hat Enterprise Virtualization Manager used?

To see the issue you need to use the Red Hat Enterprise Virtualization Manager specified in comment 2.

Comment 22 Piotr Kliczewski 2017-12-11 15:58:31 UTC
(In reply to Filip Brychta from comment #21)
> What is rhv 4.1.8.2-0.1? Is it some component in CFME, or is it the
> version of the Red Hat Enterprise Virtualization Manager used?
> 
> To see the issue you need to use the Red Hat Enterprise Virtualization
> Manager specified in comment 2.

This is exactly what I am doing. So you are saying that the error occurs with rhv 4.1.6.2-0.1 and not with 4.1.8.2-0.1. Is that correct?

Comment 23 Filip Brychta 2017-12-11 16:18:45 UTC
If you read comment 12 you will find:

"The issue is reproducible using rhevm with user described in comment2 using steps described in original description." 
(I added more detailed steps in comment 18.)

"Our rhevm instance is running on slow storage which would explain high response times and why the issue is not visible for other rhevm instances."

That means that the issue is visible only on the given RHEVM instance described in comment 2. A possible reason is described in comment 12 too (slow storage -> high response times -> possible timeouts).

So to see the issue you HAVE TO use the instance from comment 2 (rhevm.jonqe.lab.eng.bos.redhat.com).

Please read comment 12, where I suggest possible causes of this issue.

Comment 24 Oved Ourfali 2017-12-11 18:18:52 UTC
Filip, does the issue reproduce only in that environment?

Juan, you did some tests around latency. Any input you have on this?

Comment 25 Juan Hernández 2017-12-11 19:42:39 UTC
In the approximately 6 seconds from when the request for the virtual machine NICs was sent until the response was received, 168 other requests were sent and 240 responses were received. So those 6 seconds of latency for that particular request are reasonable. I don't think there is an issue with the response time of the RHV environment.

What we need to find is the reason for that "SSL connect error", as that interrupts the refresh process.

Filip, can you check the `/var/log/httpd/ssl_access_log` and `/var/log/httpd/ssl_request_log` files on the RHV server? Do you see any errors there?

It would also be nice if you could run the following script and check whether it produces the same error:

  https://github.com/oVirt/ovirt-engine-sdk-ruby/blob/master/sdk/examples/asynchronous_inventory.rb

That script does something similar to what the provider does, and it generates a very detailed `example.log` file, which can help us determine the reason for the SSL error.
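
For reference, a sketch of the connection options to edit at the top of that script (placeholder credentials; `ca_file` should point at the RHV CA certificate, and `Logger` comes from the standard `logger` library):

  connection_opts = {
    url:      'https://rhevm.jonqe.lab.eng.bos.redhat.com/ovirt-engine/api',
    username: 'admin@internal',   # placeholder, use the comment 2 user
    password: 'your_password',    # placeholder
    ca_file:  '/tmp/apache-ca.pem',
    log:      Logger.new('example.log')
  }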

Comment 26 Filip Brychta 2017-12-12 09:52:15 UTC
Reproduced again to be sure: there are no errors in /var/log/httpd/ssl_error_log or /var/log/httpd/error_log while this issue is happening. I can see in /var/log/httpd/ssl_access_log that CFME is repeatedly communicating with RHEVM:
10.8.196.251 - - [12/Dec/2017:04:23:17 -0500] "GET /ovirt-engine/api HTTP/1.1" 200 942
10.8.196.251 - - [12/Dec/2017:04:23:18 -0500] "GET /api HTTP/1.1" 404 201
10.16.23.33 - - [12/Dec/2017:04:23:18 -0500] "POST /ovirt-engine/sso/oauth/token HTTP/1.1" 200 234
10.16.23.33 - - [12/Dec/2017:04:23:18 -0500] "POST /ovirt-engine/sso/oauth/token-info HTTP/1.1" 200 330

where 10.8.196.251 is the CFME host and 10.16.23.33 is the RHEVM host.

Side note: this issue is not visible in ManageIQ master, and we also have a Foreman instance talking to this RHEVM instance over https://rhevm.jonqe.lab.eng.bos.redhat.com/ovirt-engine/api without any issues, so this issue is specific to CFME. Also confusing is that some requests from CFME are successful (a 200 response is received) while some never receive a response at all (this is visible in CFME's rhevm.log).

I will add the output of https://github.com/oVirt/ovirt-engine-sdk-ruby/blob/master/sdk/examples/asynchronous_inventory.rb a bit later; I have never worked with Ruby...

Comment 27 Filip Brychta 2017-12-12 11:14:35 UTC
Attaching the example.log created by https://github.com/oVirt/ovirt-engine-sdk-ruby/blob/master/sdk/examples/asynchronous_inventory.rb, as requested in comment 25.

It did not show any error. Looking at the content, isn't it similar to CFME's rhevm.log, which I mentioned in comment 12?

The asynchronous_inventory.rb script uses the same user; the only difference is that it also uses a certificate (ca_file: '/tmp/apache-ca.pem'). In the repro steps for this BZ I did not use certificate verification at all (Verify TLS Certificates set to No).

I tried to add the provider with Verify TLS Certificates set to Yes: validation (clicking the Validate button) works correctly, but the actual addition (clicking the Add button) fails with "Certificate authority nested asn1 error".

Comment 28 Filip Brychta 2017-12-12 11:15:14 UTC
Created attachment 1366552 [details]
output of asynchronous_inventory.rb

Comment 29 Juan Hernández 2017-12-12 11:43:31 UTC
Filip, can you share more details about the failure that you get when trying to add the provider with TLS verification enabled? It should work correctly.

Also, can you please re-run the `asynchronous_inventory.rb` script adding these parameters to the constructor of the connection:

  debug: true,

I forgot to mention that it will generate a much more verbose log. It may be too big to attach to the BZ, and it will contain all the HTTP traffic, including your user name and password, so you may want to share it by mail instead of attaching it to the BZ.

Note that if it doesn't generate the SSL error then it won't be really useful. Can you try to run it without SSL verification, like you did from CFME? To do so, remove the `ca_file` parameter and add this instead:

  insecure: true,

The complete call to the constructor should be like this:

  connection_opts = {
    url: 'https://your_server/ovirt-engine/api',
    username: 'admin@internal',
    password: 'your_password',
    debug: true,
    insecure: true,
    log: Logger.new('example.log')
  }

Finally, did you check the `/var/log/httpd/ssl_error_log` file in the RHV server? Anything relevant there?

Comment 30 Juan Hernández 2017-12-12 11:52:50 UTC
If you can easily repeat your tests from CFME, there is one additional thing I would like you to try: modify the RHV provider so that it generates a complete debug log of the HTTP traffic. To do so, locate the `app/models/manageiq/providers/redhat/infra_manager/api_integration.rb` file on your CFME machine. Open it and look for these lines:

  ManageIQ::Providers::Redhat::ConnectionManager.instance.get(
    options[:id],
    :url             => url.to_s,
    :username        => options[:username],
    :password        => options[:password],
    :timeout         => read_timeout,
    :connect_timeout => open_timeout,
    :insecure        => options[:verify_ssl] == OpenSSL::SSL::VERIFY_NONE,
    :ca_certs        => ca_certs,
    :log             => $rhevm_log,
    :connections     => options[:connections] || ::Settings.ems.ems_redhat.connections,
    :pipeline        => options[:pipeline] || ::Settings.ems.ems_redhat.pipeline
  )

Then modify it so that it looks like this:

  require 'logger'
  ManageIQ::Providers::Redhat::ConnectionManager.instance.get(
    options[:id],
    :url             => url.to_s,
    :username        => options[:username],
    :password        => options[:password],
    :timeout         => read_timeout,
    :connect_timeout => open_timeout,
    :insecure        => options[:verify_ssl] == OpenSSL::SSL::VERIFY_NONE,
    :ca_certs        => ca_certs,
    :debug           => true,
    :log             => Logger.new('/tmp/debug.log'),
    :connections     => options[:connections] || ::Settings.ems.ems_redhat.connections,
    :pipeline        => options[:pipeline] || ::Settings.ems.ems_redhat.pipeline
  )

Note the new `debug` parameter, and the change in the `log` parameter.

After changing this, restart your CFME appliance, repeat the test, verify that it fails again with that SSL error, and then share the generated `/tmp/debug.log` file.

Remember to restore the modified file once you finish.

Comment 31 Filip Brychta 2017-12-12 12:15:02 UTC
So I found the cause of the issue when trying to add the provider with TLS verification enabled and filed bz1524992 for that. I was able to add the provider successfully with TLS verification enabled, and the original issue "Failed to fetch attributes for vm" is still visible, so the TLS verification option has no effect on the result.

Will add the info you asked for a bit later.

Comment 32 Filip Brychta 2017-12-12 13:09:28 UTC
"Finally, did you check the `/var/log/httpd/ssl_error_log` file in the RHV server? Anything relevant there?"

Yes, I checked that; neither /var/log/httpd/ssl_error_log nor /var/log/httpd/error_log contains any errors.



I tried to run asynchronous_inventory.rb with insecure: true, but it did not show any error, so I am not attaching the output.


I reproduced the issue with debug logging enabled; attaching debug.log and evm.log containing the error.

Comment 33 Filip Brychta 2017-12-12 13:10:03 UTC
Created attachment 1366645 [details]
debug.log and evm.log

Comment 34 Piotr Kliczewski 2017-12-12 13:12:47 UTC
This is the error:

D, [2017-12-12T14:02:50.929019 #9154] DEBUG -- : Connection cache is full, closing the oldest one.
D, [2017-12-12T14:02:50.929073 #9154] DEBUG -- : Closing connection 113

It seems that the connection cache (pool) is not big enough.

Comment 35 Piotr Kliczewski 2017-12-12 13:17:41 UTC
At least that was the issue I saw when connecting. In the provided log I see:

D, [2017-12-12T07:55:22.602300 #2439] DEBUG -- : Connection 7 seems to be dead!
D, [2017-12-12T07:55:22.602383 #2439] DEBUG -- : Closing connection 7

Comment 36 Juan Hernández 2017-12-12 13:38:27 UTC
The cache-full message shouldn't be a problem; libcurl knows how to handle the cache and will open new connections as needed. In addition, I don't see that message in the log provided by Filip. Anyhow, it is worth studying, as we should never need 113 connections; that is a waste of resources. Piotr, can you provide more details on where you found that? Maybe in a separate BZ?

The second problem, the dead connection, is more worrying, but it happens in the event catcher, not in the refresh worker, so it isn't the cause of the refresh failure.

Comment 37 Piotr Kliczewski 2017-12-12 13:59:22 UTC
(In reply to Juan Hernández from comment #36)
>  Piotr, can you provide more details of where you find that? Maybe in a separate BZ?

You can find it here: BZ #1525037.

Comment 38 Juan Hernández 2017-12-13 02:14:58 UTC
I believe that there are two problems here:

1. The oVirt Ruby SDK doesn't explicitly control the number of requests sent to libcurl. With versions of libcurl older than 7.30.0 this results in too many open connections if the application doesn't control it explicitly. The fix for that will be tracked in bug 1525302.

2. The oVirt provider doesn't honour the `pipeline` and `connections` parameters from the configuration. The fix will be tracked in this bug.
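
Roughly, honouring those parameters means reading them from the provider settings and passing them through to the SDK connection, where they cap the number of libcurl connections and pipelined requests. A sketch, assuming the same option names that the provider code in comment 30 already passes (the actual fix is in the linked pull requests):

  # Inside the provider, where ::Settings is available:
  connections = ::Settings.ems.ems_redhat.connections # max open connections
  pipeline    = ::Settings.ems.ems_redhat.pipeline    # max pipelined requests

  connection = OvirtSDK4::Connection.new(
    url:         'https://rhevm.example.com/ovirt-engine/api', # placeholder
    username:    'admin@internal',                             # placeholder
    password:    'password',                                   # placeholder
    connections: connections,
    pipeline:    pipeline
  )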

Comment 39 Juan Hernández 2017-12-13 02:22:03 UTC
*** Bug 1525037 has been marked as a duplicate of this bug. ***

Comment 40 Juan Hernández 2017-12-13 09:12:23 UTC
The fix for the provider has been merged, but we should also release a new version of the SDK, and update the provider to use it. I am keeping the bug ON_DEV because of that. Will move it to POST when the pull request to use the new SDK is ready.

Comment 42 CFME Bot 2017-12-14 08:04:26 UTC
New commit detected on ManageIQ/manageiq-providers-ovirt/master:
https://github.com/ManageIQ/manageiq-providers-ovirt/commit/e5a94d5b010299efdca0f1f065f03746eff30648

commit e5a94d5b010299efdca0f1f065f03746eff30648
Author:     Juan Hernandez <juan.hernandez@redhat.com>
AuthorDate: Wed Dec 13 16:27:34 2017 +0100
Commit:     Juan Hernandez <juan.hernandez@redhat.com>
CommitDate: Wed Dec 13 16:28:04 2017 +0100

    Update to version 4.2.1 of the oVirt Ruby SDK
    
    The more relevant changes in this new version of the SDK are the
    following:
    
    * Fix handling of the `all_content` parameter
      https://bugzilla.redhat.com/1525555[#1525555].
    
    * Limit the number of requests sent to `libcurl`
      https://bugzilla.redhat.com/1525302[#1525302].
    
    Fixes https://bugzilla.redhat.com/1513528
    
    Signed-off-by: Juan Hernandez <juan.hernandez@redhat.com>

 manageiq-providers-ovirt.gemspec | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comment 43 Juan Hernández 2017-12-14 08:44:58 UTC
Satoe, all the changes required to fix this bug are now ready. Note that you will need to back-port two pull requests:

  Fix location of `pipeline` and `connections` settings
  https://github.com/ManageIQ/manageiq-providers-ovirt/pull/176

  Update to version 4.2.1 of the oVirt Ruby SDK
  https://github.com/ManageIQ/manageiq-providers-ovirt/pull/177

The second one won't be a clean cherry-pick, because the version of the SDK used in the master branch was changed earlier, and that change was not back-ported:

  Update to version 4.2.0 of the oVirt Ruby SDK
  https://github.com/ManageIQ/manageiq-providers-ovirt/pull/166

To resolve the conflict just set the version to 4.2.1. Or you can just back-port this pull request before the other two.
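
In the gemspec the resolution would look roughly like this (a hypothetical line; the exact receiver name in `manageiq-providers-ovirt.gemspec` may differ):

  # Pin the oVirt Ruby SDK at 4.2.1, which includes the libcurl
  # request-limiting fix tracked in bug 1525302.
  spec.add_dependency "ovirt-engine-sdk", "~> 4.2.1"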

Let me know if you need a specific pull request.

Comment 44 Satoe Imaishi 2017-12-14 17:35:15 UTC
Thanks Juan. I'll backport PR 166 first, followed by 177, to avoid the conflict, as 166 would be indirectly backported anyway.

