Bug 451798 - ec2 status for terminated and removed job not handled properly
Summary: ec2 status for terminated and removed job not handled properly
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: 2.0
Target Release: ---
Assignee: grid-maint-list
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-06-17 13:49 UTC by Matthew Farrellee
Modified: 2011-01-07 17:58 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-07 17:58:27 UTC
Target Upstream Version:



Description Matthew Farrellee 2008-06-17 13:49:52 UTC
EC2's DescribeInstances command keeps instances in the terminated state around
for a set period of time. After that period they are no longer reported, and a
query for them can result in either no response or InvalidInstanceID.NotFound
(the former is observed). If the latter is reported, the amazon-gahp returns
'1' '0' '' '' '' to the gridmanager, not InvalidInstanceID.NotFound.

Comment 1 Jaime Frey 2008-09-10 20:10:59 UTC
This behavior is intentional. The amazon_gahp will turn InvalidInstanceID.NotFound errors into successful operations for AMAZON_VM_STATUS and AMAZON_VM_STOP commands. This was done to simplify the error-handling in the gridmanager. If we don't want the gahp to eat these errors, we'll have to modify the gridmanager to recognize them and react appropriately.
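
A minimal sketch of the translation described in this comment, to make the behavior concrete. It is not the amazon_gahp source: `translate_result`, `NOT_FOUND`, and `SWALLOW_NOT_FOUND` are hypothetical names, and the `(succeeded, error_code)` return shape is an assumption made for illustration only.

```python
# Hypothetical sketch; none of these names come from the amazon_gahp source.
NOT_FOUND = "InvalidInstanceID.NotFound"

# Commands for which a missing instance is reported as success (per this comment).
SWALLOW_NOT_FOUND = {"AMAZON_VM_STATUS", "AMAZON_VM_STOP"}


def translate_result(command, error_code):
    """Return (succeeded, error_code) as the caller would see it.

    error_code is None when EC2 reported no error.
    """
    if error_code is None:
        return (True, None)
    if error_code == NOT_FOUND and command in SWALLOW_NOT_FOUND:
        # The error is hidden from the caller: the operation counts as successful.
        return (True, None)
    # Every other error is passed through unchanged.
    return (False, error_code)
```

Under this sketch the gridmanager never sees InvalidInstanceID.NotFound for a status or stop request, which is why it would need its own handling if the gahp stopped eating the error.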

Comment 2 Matthew Farrellee 2008-09-11 04:31:47 UTC
I think my description may have been poorly worded. The issue was that instances that are no longer reported by EC2 as even existing were not being handled properly. This is the case of EC2 removing all knowledge of an instance before AMAZON_VM_STATUS can be sent to query the instance. The semantics should be that in such a case the instance is considered terminated, but that wasn't happening. Testing this means withholding AMAZON_VM_STATUS commands until the instance has moved into the terminated state and been flushed from the output of ec2-describe-instances.
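
The intended semantics can be sketched roughly as follows. This is illustrative only and not gridmanager code: `effective_state` is a hypothetical helper, and the `reported` mapping of instance id to state stands in for whatever ec2-describe-instances currently returns.

```python
# Hypothetical sketch of the intended semantics, not gridmanager code.
def effective_state(instance_id, reported):
    """State to assume for instance_id.

    reported maps instance id -> state for every instance EC2 still lists.
    """
    # An instance EC2 no longer reports at all is considered terminated.
    return reported.get(instance_id, "terminated")


# Example: the second instance has already been flushed from EC2's output.
reported = {"i-aaaa1111": "running"}
states = [effective_state(i, reported) for i in ("i-aaaa1111", "i-bbbb2222")]
```

Here `states` holds the reported state for the first instance and "terminated" for the one that EC2 has forgotten.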

Comment 3 Jan Sarenik 2008-12-08 08:57:46 UTC
FYI: amazon-gahp was renamed to amazon_gahp
in version 7.2.0-0.8 (7.2.0 pre-release)

Comment 4 Matthew Farrellee 2011-01-07 17:58:27 UTC
If this appears again or more frequently it can be re-opened.

