Bug 985496 - oo-admin-chk level1 times out
Summary: oo-admin-chk level1 times out
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Abhishek Gupta
QA Contact: libra bugs
Depends On:
Reported: 2013-07-17 15:45 UTC by Sten Turpin
Modified: 2015-05-15 00:18 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2013-08-07 22:55:29 UTC
Target Upstream Version:


Description Sten Turpin 2013-07-17 15:45:26 UTC
Description of problem: oo-admin-chk level 1 times out

Version-Release number of selected component (if applicable): openshift-origin-broker-util-1.10.6-1.el6oso.noarch

How reproducible: always, in production

Steps to Reproduce:
1. Execute oo-admin-chk --level 1 on a node
2. Wait 40-150 minutes

Actual results:

Stack trace: 

Started at: 2013-07-17 10:37:49 -0400
             /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:306:in `rescue in receive_message_on_socket': Operation failed with the following exception: Connection timed out (Mongo::ConnectionFailure)
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:298:in `receive_message_on_socket'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:159:in `receive_header'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:150:in `receive'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/networking.rb:117:in `receive_message'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:529:in `send_get_more'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:463:in `refresh'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:124:in `next'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/cursor.rb:285:in `each'
        from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.10.7/lib/openshift/data_store.rb:23:in `block in find'
        from /opt/rh/ruby193/root/usr/local/share/gems/gems/mongo-1.8.1/lib/mongo/collection.rb:276:in `find'
        from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-controller-1.10.7/lib/openshift/data_store.rb:22:in `find'
        from /usr/sbin/oo-admin-chk:245:in `<main>'

Expected results:
An oo-admin-chk level 1 report, produced in a more reasonable time frame

Additional info:

Comment 1 Dan McPherson 2013-07-17 16:06:15 UTC
If you take away all the bloat inside the block you get:

>> require "/var/www/openshift/broker/config/environment"
=> true
?> start_time =
=> 2013-07-17 09:45:00 -0400
>> apps = []
=> []
>> app_selection = {:fields => ["name", "uuid", "created_at", "domain_id", "group_instances.gears._id","group_instances.gears.uuid", "group_instances.gears.uid", "group_instances.gears.server_identity", "group_instances._id", "component_instances._id", "component_instances.cartridge_name", "component_instances.group_instance_id", "group_overrides", "", "app_ssh_keys.content"], :timeout => false}
=> {:fields=>["name", "uuid", "created_at", "domain_id", "group_instances.gears._id", "group_instances.gears.uuid", "group_instances.gears.uid", "group_instances.gears.server_identity", "group_instances._id", "component_instances._id", "component_instances.cartridge_name", "component_instances.group_instance_id", "group_overrides", "", "app_ssh_keys.content"], :timeout=>false}
>> app_query = {"group_instances.gears.0" => {"$exists" => true}}
=> {"group_instances.gears.0"=>{"$exists"=>true}}
>> OpenShift::DataStore.find(:applications, app_query, app_selection) do |app|
?>   apps << app
>> end
=> nil
?> puts apps.size
=> nil
>> puts - start_time
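
The session above times a single block-yielding find over the applications collection; the timing expressions themselves did not survive in this transcript. A minimal, self-contained sketch of the same measurement pattern follows. FakeDataStore is a hypothetical stand-in for OpenShift::DataStore (so the snippet runs without a broker or MongoDB), the sample documents are invented, and the use of Process.clock_gettime is an assumption rather than what was typed in the original session.

```ruby
# Hypothetical stand-in for OpenShift::DataStore so the sketch runs anywhere.
# It only mimics the block-yielding shape of `find` seen in the transcript.
class FakeDataStore
  def initialize(docs)
    @docs = docs
  end

  # Yields each document in which some group instance has at least one gear,
  # roughly what {"group_instances.gears.0" => {"$exists" => true}} selects.
  def find(_collection, _query, _selection)
    @docs.each do |doc|
      has_gear = doc.fetch("group_instances", []).any? do |gi|
        !gi.fetch("gears", []).empty?
      end
      yield doc if has_gear
    end
    nil
  end
end

# Runs the block and returns [block result, elapsed seconds].
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Invented sample data: "b" has no gears and should be skipped.
store = FakeDataStore.new([
  { "name" => "a", "group_instances" => [{ "gears" => [{ "uuid" => "g1" }] }] },
  { "name" => "b", "group_instances" => [{ "gears" => [] }] },
  { "name" => "c", "group_instances" => [{ "gears" => [] },
                                         { "gears" => [{ "uuid" => "g2" }] }] }
])

apps = []
_, elapsed = timed do
  store.find(:applications, {}, {}) { |app| apps << app }
end
puts "#{apps.size} apps in #{format('%.3f', elapsed)}s"
```

A monotonic clock is used here so the measurement is not affected by wall-clock adjustments during a long run.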

Comment 2 Rajat Chopra 2013-07-17 17:37:16 UTC
About time we employ multiple threads/processes.
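
One way to read that suggestion is to hand the per-application checks to a small pool of workers. The sketch below is only an illustration with plain Ruby threads; check_app, the worker count, and the sample data are hypothetical, and nothing in this report says the eventual fix took this shape.

```ruby
require "thread"

# Hypothetical per-application check: returns an error string or nil.
def check_app(app)
  app["gears"].empty? ? "#{app['name']}: no gears" : nil
end

# Runs check_app over apps with a fixed number of worker threads
# that pull work items from a shared queue until it is empty.
def check_in_parallel(apps, workers: 4)
  queue = Queue.new
  apps.each { |app| queue << app }
  errors = Queue.new # thread-safe collector for results

  threads = Array.new(workers) do
    Thread.new do
      loop do
        app = begin
          queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        err = check_app(app)
        errors << err if err
      end
    end
  end
  threads.each(&:join)

  # Drain the collector; sort so the output order is deterministic.
  Array.new(errors.size) { errors.pop }.sort
end

# Invented sample data.
apps = [
  { "name" => "a", "gears" => ["g1"] },
  { "name" => "b", "gears" => [] },
  { "name" => "c", "gears" => [] }
]
errors = check_in_parallel(apps, workers: 2)
puts errors
```

On MRI, threads mainly help when the work is I/O-bound (for example waiting on database or network round trips); CPU-bound checks would need separate processes to run in parallel.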

Comment 4 Jianwei Hou 2013-07-31 02:46:00 UTC
Verified on devenv_3588

Querying apps now takes much less time. No timeout is reported when running oo-admin-chk at level 1.

irb(main):001:0> require "/var/www/openshift/broker/config/environment"
=> true
irb(main):002:0> start_time =
=> 2013-07-30 22:41:11 -0400
irb(main):003:0> apps = []
=> []
irb(main):004:0> app_selection = {:fields => ["name", "uuid", "created_at", "domain_id", "group_instances.gears._id","group_instances.gears.uuid", "group_instances.gears.uid", "group_instances.gears.server_identity", "group_instances._id", "component_instances._id", "component_instances.cartridge_name", "component_instances.group_instance_id", "group_overrides", "", "app_ssh_keys.content"], :timeout => false}
=> {:fields=>["name", "uuid", "created_at", "domain_id", "group_instances.gears._id", "group_instances.gears.uuid", "group_instances.gears.uid", "group_instances.gears.server_identity", "group_instances._id", "component_instances._id", "component_instances.cartridge_name", "component_instances.group_instance_id", "group_overrides", "", "app_ssh_keys.content"], :timeout=>false}
irb(main):005:0> app_query = {"group_instances.gears.0" => {"$exists" => true}}
=> {"group_instances.gears.0"=>{"$exists"=>true}}
irb(main):006:0> OpenShift::DataStore.find(:applications, app_query, app_selection) do |app|
irb(main):007:1* apps << app
irb(main):008:1> end
=> nil
irb(main):009:0> puts apps.size
=> nil
irb(main):010:0> puts - start_time

Comment 6 Abhishek Gupta 2013-07-31 18:24:28 UTC
Marking it as verified again. Will reopen if it's still a problem in PROD.

Comment 14 Jianwei Hou 2013-08-07 03:21:15 UTC
This bug is verified on devenv-stage_439
