Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 451011 - durable perftest in fanout mode fails with sync store
Summary: durable perftest in fanout mode fails with sync store
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: beta
Hardware: ia64
OS: Linux
Target Milestone: ---
: ---
Assignee: Kim van der Riet
QA Contact: Kim van der Riet
Depends On:
TreeView+ depends on / blocked
Reported: 2008-06-12 11:41 UTC by Gordon Sim
Modified: 2009-05-07 20:09 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2008-06-13 11:26:17 UTC
Target Upstream Version:

Attachments (Terms of Use)
Fix option 1 (deleted)
2008-06-12 17:59 UTC, Gordon Sim
no flags Details | Diff
Option 2: change to qpid (deleted)
2008-06-12 18:02 UTC, Gordon Sim
no flags Details
Option 2: change to rhm store (deleted)
2008-06-12 18:03 UTC, Gordon Sim
no flags Details
remove sync store as option (deleted)
2008-06-12 19:51 UTC, Gordon Sim
no flags Details

Description Gordon Sim 2008-06-12 11:41:37 UTC
perftest using durable messages in fanout mode fais on mrg5

Comment 1 Gordon Sim 2008-06-12 12:08:04 UTC
correction to title: only fails on sync store

Comment 2 Alan Conway 2008-06-12 15:53:04 UTC
I've seen two modes of failure - only with sync store. Client exits with:

Processing 2 messages from pub_done .SubscribeThread exception: Closed: 

I've also seen the test hang.

There is a second issue: the client is not displaying the correct error message.
The client is receiving the error: 

2008-jun-12 07:52:07 warning Broker closed connection: 541, Error dequeing
message, persistence id not set (BdbMessageStore.cpp:1208)

but it is getting overwritten with the generic Closed: message.

Comment 3 Gordon Sim 2008-06-12 16:50:02 UTC
The sync store assumes incorrectly that a message will be enqueued on all
matching queues before any dequeues can occur. This is no longer correct and
concurrent enqueues and dequeues on the sync store are unsafe as the code stands.

Comment 4 Gordon Sim 2008-06-12 17:59:37 UTC
Created attachment 309111 [details]
Fix option 1

One possible fix. This has the advantage of being only two lines, but the
disadvantage of affecting all codepaths, not just sync stored msgs.

Comment 5 Gordon Sim 2008-06-12 18:02:42 UTC
Created attachment 309115 [details]
Option 2: change to qpid

This is another option that minimises the impact on transient and async store
codepaths, but requires changes to both qpid and rhm store.

Comment 6 Gordon Sim 2008-06-12 18:03:53 UTC
Created attachment 309116 [details]
Option 2: change to rhm store

This is the corresponding change to the store for option 2.

Comment 7 Gordon Sim 2008-06-12 18:05:33 UTC
The third option would be locking within the store. However, as well as being
less efficient, that would be quite involved especially to avoid impacting the
standard async path, so I feel its not worth pursuing.

Comment 8 Alan Conway 2008-06-12 19:20:21 UTC
Committed fix to client error message problem revision 667205

Comment 9 Alan Conway 2008-06-12 19:37:57 UTC
We really need to perf test the impact of these fixes. 

Fix 1 looks risky to me, it puts two explicit mallocs and a new variable-size
data structure on the critical path.

Fix 2 has no impact for transient messages but I don't know what impact it might
have on async persistence, we would need to test that on Shaks rig before going

I'd strongly lean towards disabling sync store immediately. If there are still
situations where async store is broken then we need to fix it pronto. If we
support sync store for GA we may be dogged by it for months/years to come.

Comment 10 Gordon Sim 2008-06-12 19:51:27 UTC
Created attachment 309139 [details]
remove sync store as option

Comment 11 Gordon Sim 2008-06-12 19:54:08 UTC
I agree removing store is best option (though I don't see where the mallocs are
added), patch attached above to remove the option.

Comment 12 Gordon Sim 2008-06-13 11:26:17 UTC
Agreed solution was to disable the option to runthe store in sync mode.

Note You need to log in before you can comment on or make changes to this bug.