Bug 1053749 - [linearstore] Recovery of store failure with "JERR_MAP_NOTFOUND: Key not found in map." error message
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: 3.0
Target Release: ---
Assignee: Kim van der Riet
QA Contact: Frantisek Reznicek
URL:
Whiteboard:
Depends On:
Blocks: 709325
Reported: 2014-01-15 17:27 UTC by Kim van der Riet
Modified: 2015-01-21 12:56 UTC

Fixed In Version: qpid-cpp-0.22-35
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-01-21 12:56:04 UTC
Target Upstream Version:


Attachments
Store files that need to be expanded (deleted)
2014-01-15 17:27 UTC, Kim van der Riet


Links
System       ID         Priority  Status  Summary  Last Updated
Apache JIRA  QPID-5480  None      None    None     Never

Description Kim van der Riet 2014-01-15 17:27:47 UTC
Created attachment 850597
Store files that need to be expanded

While running a qpid-txtest soak test, a store was encountered which could not be recovered. The broker stopped with the following error message:

[Broker] critical Unexpected error: Error commitjexception 0x0b01 wmgr::dequeue() threw JERR_MAP_NOTFOUND: Key not found in map. (rid=0x3e4f0) (/home/kpvdr/RedHat/qpid/cpp/src/qpid/linearstore/TxnCtxt.cpp:55)

The store in question is attached. An additional similar error message is seen during store shutdown.

To reproduce:

1. Expand the store into a store directory. Note that the store directory is the one that contains the qls directory in the attached archive, and is supplied to the broker using the --store-dir parameter.

2. Start the broker so as to recover the store in the store directory:

./qpidd --load-module linearstore.so -m no --auth no --default-flow-stop-threshold 0 --default-flow-resume-threshold 0 --default-queue-limit 0 --store-dir <path-to-store-dir> --log-enable info+ --truncate no

The store does not recover, and the broker exits with the above error message.
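
The reproduction steps above can be scripted. The following is a minimal sketch only: the option list is copied from the command in step 2, but the helper names (build_qpidd_command, recovery_failed, run_recovery), the default ./qpidd path, the 60-second timeout, and the assumption that the broker's log output can be captured from stdout/stderr are all illustrative and not part of this report.

```python
import subprocess

# Error text taken from the broker message quoted in this report.
ERROR_TEXT = "JERR_MAP_NOTFOUND"


def build_qpidd_command(store_dir, qpidd="./qpidd"):
    """Return the argv list for the recovery run described in step 2."""
    return [
        qpidd,
        "--load-module", "linearstore.so",
        "-m", "no",
        "--auth", "no",
        "--default-flow-stop-threshold", "0",
        "--default-flow-resume-threshold", "0",
        "--default-queue-limit", "0",
        "--store-dir", store_dir,
        "--log-enable", "info+",
        "--truncate", "no",
    ]


def recovery_failed(log_text):
    """True if the captured broker output contains the recovery error."""
    return ERROR_TEXT in log_text


def run_recovery(store_dir, timeout=60):
    """Hypothetical driver: start the broker and report whether recovery failed.

    Assumes the broker writes its log to stdout/stderr. A broker that is
    still running when the timeout expires is treated as having recovered.
    """
    try:
        proc = subprocess.run(
            build_qpidd_command(store_dir),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return recovery_failed(proc.stdout + proc.stderr)
```

Calling run_recovery with the directory that contains the qls directory would then return True on an affected build, under the assumptions stated above.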

Comment 1 Kim van der Riet 2014-01-15 17:28:51 UTC
Upstream bug: https://issues.apache.org/jira/browse/QPID-5480

Comment 2 Kim van der Riet 2014-02-05 19:16:09 UTC
Fixed in r.1564877.

This checkin also contains fixes for several other recovery edge cases discovered while testing this fix.

Comment 4 Kim van der Riet 2014-03-10 20:42:32 UTC
The fix for this bug changes the way files are recycled to the empty file pool. The bug occurred because files that contained only transactionally dequeued records for which the transaction had not yet committed were being returned to the empty file pool prematurely. This should not happen until the transaction has committed. On recovery, the enqueue records that the open transactional dequeues referred to were missing, and hence the error JERR_MAP_NOTFOUND was thrown.
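
To illustrate the recycling rule described above, here is a toy model in Python. It is a sketch of one reading of this comment, not the linearstore code: the class ToyJournal, its methods, and the release_on_commit switch are all hypothetical, and the open transactions are kept in memory rather than being read back from journal files as a real recovery would do.

```python
class ToyJournal:
    """Toy model: tracks which record ids keep each journal file in use."""

    def __init__(self, release_on_commit):
        # True models the fixed policy, False the premature recycling.
        self.release_on_commit = release_on_commit
        self.files = {}       # file number -> record ids still pinning the file
        self.pending = {}     # transaction id -> list of (file number, record id)
        self.empty_pool = []  # file numbers returned to the empty file pool

    def enqueue(self, file_no, rid):
        self.files.setdefault(file_no, set()).add(rid)

    def txn_dequeue(self, xid, file_no, rid):
        """Dequeue a record inside a transaction that is not yet committed."""
        self.pending.setdefault(xid, []).append((file_no, rid))
        if not self.release_on_commit:
            self._release(file_no, rid)  # premature: transaction is still open

    def commit(self, xid):
        for file_no, rid in self.pending.pop(xid, []):
            if self.release_on_commit:
                self._release(file_no, rid)

    def _release(self, file_no, rid):
        records = self.files[file_no]
        records.discard(rid)
        if not records:
            # Nothing pins the file any more, so it goes back to the pool.
            del self.files[file_no]
            self.empty_pool.append(file_no)

    def recover(self):
        """Rebuild the enqueue map from surviving files and check open dequeues."""
        enqueue_map = {rid for records in self.files.values() for rid in records}
        for entries in self.pending.values():
            for _, rid in entries:
                if rid not in enqueue_map:
                    raise KeyError(f"Key not found in map (rid={rid:#x})")
        return enqueue_map


# Premature recycling: the file is pooled while the transaction is still open.
buggy = ToyJournal(release_on_commit=False)
buggy.enqueue(1, 0x3E4F0)
buggy.txn_dequeue("txn-1", 1, 0x3E4F0)

# Fixed policy: the file stays in use until the transaction commits.
fixed = ToyJournal(release_on_commit=True)
fixed.enqueue(1, 0x3E4F0)
fixed.txn_dequeue("txn-1", 1, 0x3E4F0)
```

In this model, calling buggy.recover() raises KeyError because the enqueue record went away with the recycled file, while fixed.recover() succeeds and the file is only pooled once fixed.commit("txn-1") is called.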

QE notes:
---------

Because of the nature of this bug and its fix, the example store supplied with this bug cannot be used to test the fix: the error lies in the missing journal files, not in how the store handles recovery of existing data, as is the case with many other bugs.

This bug was found by soak-testing the store using a test similar to QE's qpid-txtest soak. I suggest that if soak-testing does not turn up a similar bug, then the issue is resolved.

Comment 6 Zdenek Kraus 2014-03-11 05:28:29 UTC
According to Comment 4, bug 1052518 has to be retested.

Comment 11 Frantisek Reznicek 2014-03-19 12:59:51 UTC
An extended run of the transaction integrity tests (qpid_test_transaction_integrity, qpid_txtest_fails_bz458053) on three individual bare-metal machines (RHEL 6.5 i686 / x86_64) showed that the issue has been reliably resolved.
No broker JERR_MAP_NOTFOUND error was detected in more than 1400 test cycles.

Tested package set:
perl-qpid-0.22-11.el6.i686
perl-qpid-debuginfo-0.22-11.el6.i686
python-qpid-0.22-12.el6.noarch
python-qpid-proton-0.6-1.el6.i686
python-qpid-qmf-0.22-28.el6.i686
qpid-cpp-client-0.22-36.el6.i686
qpid-cpp-client-devel-0.22-36.el6.i686
qpid-cpp-client-devel-docs-0.18-20.el6.noarch
qpid-cpp-client-rdma-0.22-36.el6.i686
qpid-cpp-debuginfo-0.22-36.el6.i686
qpid-cpp-server-0.22-36.el6.i686
qpid-cpp-server-devel-0.22-36.el6.i686
qpid-cpp-server-ha-0.22-36.el6.i686
qpid-cpp-server-linearstore-0.22-36.el6.i686
qpid-cpp-server-rdma-0.22-36.el6.i686
qpid-cpp-server-xml-0.22-36.el6.i686
qpid-java-client-0.22-6.el6.noarch
qpid-java-common-0.22-6.el6.noarch
qpid-java-example-0.22-6.el6.noarch
qpid-jca-0.18-8.el6.noarch
qpid-jca-xarecovery-0.18-8.el6.noarch
qpid-proton-c-0.6-1.el6.i686
qpid-proton-c-devel-0.6-1.el6.i686
qpid-proton-debuginfo-0.6-1.el6.i686
qpid-qmf-0.22-28.el6.i686
qpid-qmf-debuginfo-0.22-28.el6.i686
qpid-qmf-devel-0.22-28.el6.i686
qpid-snmpd-1.0.0-16.el6.i686
qpid-snmpd-debuginfo-1.0.0-16.el6.i686
qpid-tests-0.22-14.el6.noarch
qpid-tools-0.22-9.el6.noarch
rh-qpid-cpp-tests-0.22-36.el6.i686
ruby-qpid-0.7.946106-2.el6.i686
ruby-qpid-qmf-0.22-28.el6.i686

-> VERIFIED

