Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 446213 - optimise transactions for transient-only messaging
Summary: optimise transactions for transient-only messaging
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.0
Hardware: All
OS: Linux
urgent
medium
Target Milestone: 1.1
: ---
Assignee: Gordon Sim
QA Contact: Kim van der Riet
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2008-05-13 15:21 UTC by Gordon Sim
Modified: 2009-02-04 15:35 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-02-04 15:35:07 UTC


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2009:0035 normal SHIPPED_LIVE Red Hat Enterprise MRG Messaging 1.1 Release 2009-02-04 15:33:44 UTC

Description Gordon Sim 2008-05-13 15:21:17 UTC
Currently transactions will always result in a write to disk even if all the
published (or consumed) messages are transient. Would like to optimise this
special case.

Comment 1 Gordon Sim 2008-06-20 08:26:57 UTC

*** This bug has been marked as a duplicate of 452119 ***

Comment 2 Gordon Sim 2008-06-20 13:33:41 UTC
Mistakenly marked this a duplicate.

Comment 3 Carl Trieloff 2008-10-09 18:06:31 UTC
The broker is fine for publish side transactions. It matches sync for tx=1 ans sync and is within 20% of non-tx for async.

It does however have a slowdown on consume txns.

The issue is in the TxAccept, in that is looks at all messages and not just the txn range. All CPU time is attributed to evaluating a range that is to wide.


89153    42.7759  libqpidcommon.so.0.1.0   qpid::RangeSet<qpid::framing::SequenceNumber>::contains(qpid::framing::SequenceNumber const&) const
41731    20.0226  libqpidbroker.so.0.1.0   void std::list<qpid::broker::DeliveryRecord, std::allocator<qpid::broker::DeliveryRecord> >::remove_if<std::const_mem_fun_ref_t<bool, qpid::broker::DeliveryRecord> >(std::const_mem_fun_ref_t<bool, qpid::broker::DeliveryRecord>)
37460    17.9734  libqpidbroker.so.0.1.0   qpid::broker::DeliveryRecord::isRedundant() const
7896      3.7885  libqpidcommon.so.0.1.0   __gnu_cxx::__normal_iterator<qpid::Range<qpid::framing::SequenceNumber> const*, std::vector<qpid::Range<qpid::framing::SequenceNumber>, qpid::InlineAllocator<std::allocator<qpid::Range<qpid::framing::SequenceNumber> >, 3ul> > > std::lower_bound<__gnu_cxx::__normal_iterator<qpid::Range<qpid::framing::SequenceNumber> const*, std::vector<qpid::Range<qpid::framing::SequenceNumber>, qpid::InlineAllocator<std::allocator<qpid::Range<qpid::framing::SequenceNumber> >, 3ul> > >, qpid::Range<qpid::framing::SequenceNumber> >(__gnu_cxx::__normal_iterator<qpid::Range<qpid::framing::SequenceNumber> const*, std::vector<qpid::Range<qpid::framing::SequenceNumber>, qpid::InlineAllocator<std::allocator<qpid::Range<qpid::framing::SequenceNumber> >, 3ul> > >, __gnu_cxx::__normal_iterator<qpid::Range<qpid::framing::SequenceNumber> const*, std::vector<qpid::Range<qpid::framing::SequenceNumber>, qpid::InlineAllocator<std::allocator<qpid::Range<qpid::framing::SequenceNumber> >, 3ul> > >, qpid::Range<qpid::framing::SequenceNumber> const&)
7713      3.7007  libqpidbroker.so.0.1.0   qpid::broker::TxAccept::commit()
7627      3.6595  libqpidbroker.so.0.1.0   qpid::broker::TxAccept::prepare(qpid::broker::TransactionContext*)

Comment 4 Gordon Sim 2008-10-28 13:38:39 UTC
I've revised the handling of the tranasactional accepts to remove the redundant work mentioned above. With sync commit this is still slow; with async commit its much faster.

Comment 5 Frantisek Reznicek 2008-10-31 15:08:41 UTC
No clear test info, expectations are clear. Moving to NEEDINFO.

Comment 6 Frantisek Reznicek 2008-10-31 15:37:56 UTC
Mistakenly moved to ASSIGNED, returning back to MODIFIED. NEEDINFO flag set.

Comment 7 Gordon Sim 2008-10-31 16:23:59 UTC
There isn't really a clearly defined failure/success criteria for this on. In general a test such as  ./src/tests/perftest --tx 1 --async-commit true should not be 'too much slower' than a simple ./src/tests/perftest with no args. Very hard to specify a hard and fast rule for what that means in practice. You certainly should not see an order of magnitude difference. Is that sufficient?

Comment 9 Frantisek Reznicek 2008-12-10 13:09:40 UTC
The data summary says:
rev.724474
time { perftest --tx 1 --async-commit true; }
real    10m6.475s
time { perftest; }
real    1m15.135s

mrg1.0.1 rev.687156
time { perftest --tx 1; }  // more than 2 hours, stopped after approx. 2 hours
real    117m5.733s
time { perftest; }
real    0m29.285s

Detail transcript is pasted below.

After discussion with gsim (also pasted below) we come to conclusion that improvement is noticable, so bug is treated as validated (on RHEL5).

->VERIFIED


Detailed transcript:

qpidd --auth no --worker-threads 8 --mgmt-enable no

MRG-candidate
=============

==================================================================
dhcp-0-199:~# rpm -q python-qpid qpidc qpidc-perftest qpidd
python-qpid-0.3.722953-1.el5
qpidc-0.3.724474-2.el5
qpidc-perftest-0.3.724474-2.el5
qpidd-0.3.724474-2.el5

==================================================================
dhcp-0-199:~# time { perftest --tx 1 --async-commit true; }
Processing 1 messages from sub_ready . done.
Sending start 1 times to pub_start
Processing 1 messages from pub_done . done.
Processing 1 messages from sub_done . done.

Total 1000000 transfers of 1024 bytes in 606.416 seconds.

Publish transfers/sec:    
824.764
Average: 824.764

Subscribe transfers/sec:  
824.565
Average: 824.565

Total transfers/sec:      1649.03
Total Mbytes/sec: 1.61038

real    10m6.475s
user    0m10.332s
sys     6m3.207s

==================================================================
dhcp-0-199:~# time { perftest; }
Processing 1 messages from sub_ready . done.
Sending start 1 times to pub_start
Processing 1 messages from pub_done . done.
Processing 1 messages from sub_done . done.

Total 1000000 transfers of 1024 bytes in 75.0588 seconds.

Publish transfers/sec:    
6664.2
Average: 6664.2

Subscribe transfers/sec:  
6660.88
Average: 6660.88

Total transfers/sec:      13322.9
Total Mbytes/sec: 13.0106

real    1m15.135s
user    0m1.205s
sys     0m55.134s

MRG
===

==================================================================
dhcp-0-199:~# rpm -q python-qpid qpidc qpidc-perftest qpidd
python-qpid-0.2.668378-1.el5
qpidc-0.2.687156-1.el5
qpidc-perftest-0.2.687156-1.el5
qpidd-0.2.687156-1.el5

==================================================================
dhcp-0-199:~# time { perftest --tx 1; }
Processing 1 messages from sub_ready . done.
Sending start 1 times to pub_start
Processing 1 messages from pub_done . done.
Processing 1 messages from sub_done .
^C

real    117m5.733s
user    0m13.084s
sys     12m59.402s

==================================================================
dhcp-0-199:~# time { perftest; }
Processing 1 messages from sub_ready . done.
Sending start 1 times to pub_start
Processing 1 messages from pub_done . done.
Processing 1 messages from sub_done . done.

Total 1000000 transfers of 1024 bytes in 23.1444 seconds.

Publish transfers/sec:    
21693.7
Average: 21693.7

Subscribe transfers/sec:  
21621.4
Average: 21621.4

Total transfers/sec:      43207.1
Total Mbytes/sec: 42.1944

real    0m29.285s
user    0m1.651s
sys     0m10.969s


Discussion about BZ:
 
14:05:26
freznice  
gsim:Hi Gordon, I just got data on https://bugzilla.redhat.com/show_bug.cgi?id=446213
Would you have the time for review?
gsim: rev.724474
time { perftest --tx 1 --async-commit true; }
real    10m6.475s
time { perftest; }
real    1m15.135s

mrg1.0.1 rev.687156
time { perftest --tx 1; }  // more than 2 hours, stopped after approx. 2 hours
real    117m5.733s
time { perftest; }
real    0m29.285s
14:07:35
gsim  
freznice: I think we should just close it
freznice: there may well be more that can be done, but its an improvement on what was there and we didn't have any explicit targets
freznice: my view is there are more important defects to verify
freznice: paste in the perftest output by all means (i.e. throughputs)
14:08:54
freznice  
gsim: ok, I'll will
gsim: ok, thanks

Comment 11 errata-xmlrpc 2009-02-04 15:35:07 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0035.html


Note You need to log in before you can comment on or make changes to this bug.