Bug 595010 - job_server: calling GetJobSummaries on a submission with live jobs causes seg fault
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: Development
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: 1.3
Target Release: ---
Assignee: Pete MacKinnon
QA Contact: Tomas Rusnak
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-05-22 19:57 UTC by Pete MacKinnon
Modified: 2010-07-22 17:08 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-07-22 17:08:56 UTC
Target Upstream Version:



Description Pete MacKinnon 2010-05-22 19:57:38 UTC
Stack dump for process 18818 at timestamp 1274558043 (19 frames)
condor_job_server(dprintf_dump_stack+0xc7)[0x80fc5db]
condor_job_server[0x80fc7b2]
[0x675400]
condor_job_server(_ZN8AttrList8NextExprEv+0x54)[0x8159546]
condor_job_server(_Z15jobToVariantMapPK3JobRSt3mapISsN4qpid5types7VariantESt4lessISsESaISt4pairIKSsS5_EEEPPKc+0xde)[0x80d7403]
condor_job_server(_ZN16SubmissionObject15GetJobSummariesERSt3mapISsN4qpid5types7VariantESt4lessISsESaISt4pairIKSsS3_EEERSs+0x289)[0x80ce1dd]
condor_job_server(_ZN16SubmissionObject16ManagementMethodEjRN4qpid10management4ArgsERSs+0x38)[0x80ce848]
condor_job_server(_ZN3qmf3com6redhat4grid10Submission8doMethodERSsRKSt3mapISsN4qpid5types7VariantESt4lessISsESaISt4pairIKSsS8_EEERSF_+0x5f0)[0x80c69ce]
/usr/lib/libqmf.so.1(_ZN4qpid10management19ManagementAgentImpl19invokeMethodRequestERKSsS3_S3_+0x1057)[0x13fba7]
/usr/lib/libqmf.so.1(_ZN4qpid10management19ManagementAgentImpl13pollCallbacksEj+0xc6)[0x147276]
condor_job_server(_Z16HandleMgmtSocketP7ServiceP6Stream+0x1f)[0x80c8427]
condor_job_server(_ZN10DaemonCore24CallSocketHandler_workerEibP6Stream+0x187)[0x80e26c7]
condor_job_server(_ZN10DaemonCore35CallSocketHandler_worker_demarshallEPv+0x3b)[0x80e252f]
condor_job_server(_ZN13CondorThreads8pool_addEPFvPvES0_PiPKc+0x29)[0x813dd2b]
condor_job_server(_ZN10DaemonCore17CallSocketHandlerERib+0x1bc)[0x80e24f2]
condor_job_server(_ZN10DaemonCore6DriverEv+0x180e)[0x80e2200]
condor_job_server(main+0x1ce0)[0x80f6a87]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7c4a86]
condor_job_server[0x80be061]
Segmentation fault

Something appears to be wrong with the parent chaining of the classads: the crash is in AttrList::NextExpr(), called from jobToVariantMap.
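
The mangled frames in the dump above are easier to read once demangled. A minimal sketch, assuming a GCC or Clang toolchain that provides abi::__cxa_demangle from <cxxabi.h>; the helper name demangle is hypothetical and not part of condor:

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: demangle one Itanium-ABI symbol taken from a stack frame.
// Returns the input unchanged if the demangler rejects it.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With null buffer arguments, __cxa_demangle allocates the result with malloc.
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);  // the caller owns the malloc'd buffer
    return result;
}
```

For example, demangle("_ZN8AttrList8NextExprEv") yields AttrList::NextExpr(), the frame where the fault occurred. The command-line tool c++filt does the same for a whole dump.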

Comment 1 Pete MacKinnon 2010-05-24 21:35:35 UTC
Trying to externalize the cluster-to-job classads was causing all sorts of problems. Switched to a model where the LiveJob ctor chains the parent classad.
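
To illustrate the idea in Comment 1, here is a rough sketch of chaining the parent ad inside the job object's constructor. This is not the condor code: Ad and LiveJobSketch are hypothetical stand-ins for the classad and LiveJob types, and holding the parent through a shared_ptr is an assumption made here so the sketch is safe to run.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a classad: local attributes plus an optional parent.
class Ad {
public:
    void set(const std::string& key, const std::string& value) {
        attrs_[key] = value;
    }

    // Chain this ad to a parent (e.g. the cluster ad).
    void chainToParent(std::shared_ptr<const Ad> parent) {
        parent_ = std::move(parent);
    }

    // Look up locally first, then fall back to the chained parent.
    // Returns nullptr if the attribute is not found anywhere in the chain.
    const std::string* lookup(const std::string& key) const {
        auto it = attrs_.find(key);
        if (it != attrs_.end()) {
            return &it->second;
        }
        return parent_ ? parent_->lookup(key) : nullptr;
    }

private:
    std::map<std::string, std::string> attrs_;
    std::shared_ptr<const Ad> parent_;  // assumption: shared ownership of the parent
};

// Hypothetical job wrapper: the chaining happens once, in the constructor,
// so no caller has to set it up (or can forget to) afterwards.
class LiveJobSketch {
public:
    explicit LiveJobSketch(std::shared_ptr<const Ad> clusterAd) {
        ad_.chainToParent(std::move(clusterAd));
    }

    Ad& ad() { return ad_; }

private:
    Ad ad_;
};
```

With this shape, an attribute such as Cmd set only on the cluster ad is still visible through the job's ad, which is what a summary call walking the job's attributes would rely on.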

Comment 2 Pete MacKinnon 2010-07-20 15:09:01 UTC
1) ensure condor is set up for QMF plugins
2) set QMF_PUBLISH_SUBMISSIONS=True
3) submit a job using condor_submit
4) start qpid-tool
5) "list com.redhat.grid:submission"
6) choose the corresponding submission from the list in step 5 (the submission name should have the cluster id at the end of the string)
7) "call some_qmf_object_number_from_step_6 GetJobSummaries"

The call should return a map of job details such as cmd, args, etc.

