Bug 162814 - Assertion failure in log_do_checkpoint
Summary: Assertion failure in log_do_checkpoint
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Stephen Tweedie
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 167343 200434
Depends On: 123137
Blocks: 168429
Reported: 2005-07-08 21:23 UTC by Stephen Tweedie
Modified: 2018-10-19 19:17 UTC
CC List: 5 users

Fixed In Version: RHSA-2006-0132
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-07 19:17:12 UTC


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2005:808 | normal | SHIPPED_LIVE | Important: kernel security update | 2005-10-27 04:00:00 UTC
Red Hat Product Errata | RHSA-2006:0132 | qe-ready | SHIPPED_LIVE | Moderate: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 3 | 2006-03-09 16:31:00 UTC

Description Stephen Tweedie 2005-07-08 21:23:13 UTC
+++ This bug was initially created as a clone of Bug #123137 +++

Description of problem:
Assertion failure in log_do_checkpoint() at fs/jbd/checkpoint.c:361:
"drop_count != 0 || cleanup_ret != 0"
------------[ cut here ]------------
kernel BUG at fs/jbd/checkpoint.c:361!
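
For readers unfamiliar with the condition quoted above, here is a minimal sketch of what an assertion of the form "drop_count != 0 || cleanup_ret != 0" checks. It is a Python model, not the kernel's C code. The function names are hypothetical, and reading the two values as "something was dropped during this pass" and "a cleanup step reported progress" is an assumption based only on their names.

```python
def made_progress(drop_count: int, cleanup_ret: int) -> bool:
    """Mirror the quoted condition: true if either value is nonzero."""
    return drop_count != 0 or cleanup_ret != 0


def check_pass(drop_count: int, cleanup_ret: int) -> None:
    """Hypothetical helper: fail loudly when a pass reports no progress.

    The kernel hits a BUG at this point; this model raises instead.
    """
    if not made_progress(drop_count, cleanup_ret):
        raise AssertionError("drop_count != 0 || cleanup_ret != 0")
```

Under this reading, the assertion only fires when both values are zero at the same time, that is, when a pass appears to have accomplished nothing.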

Version-Release number of selected component (if applicable):
2.6.5-1.327

How reproducible:
Rare

System was a dual Xeon with AMI Megaraid RAID controller.  File
systems are Ext3.

I'll attach the oops output in a second.

Comment 3 Dave Jones 2005-09-05 03:53:35 UTC
*** Bug 167343 has been marked as a duplicate of this bug. ***

Comment 4 Jeff Welden 2005-09-12 23:29:40 UTC
Jan Kara has a one-line fix for this in the vanilla Linux kernel as of
2.6.11.12:
    http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.11.12

Additional discussion:
    http://lkml.org/lkml/2005/6/1/34
    http://marc.theaimsgroup.com/?l=linux-kernel&m=111761151011571&w=2

Is it possible for you to create a patch for this for the 2.6.9-11 EL smp kernel?

Comment 5 Need Real Name 2005-09-14 22:52:55 UTC
I've tried this patch, and it DOES seem to fix this problem! Well done!
Hopefully Red Hat will create a kernel update ASAP.


Comment 6 Need Real Name 2005-10-03 20:07:40 UTC
This patch has been in production for 3 weeks now without a single problem. 
These machines would PANIC almost daily before, mostly at night when we were
running backups.  

Maybe this problem is mostly associated with high-end hardware, like DL380s, but
I would think that Red Hat would be interested in fixing such a serious problem,
especially one that affects their target hardware.

So far, I've heard nothing to show that Red Hat is interested in fixing this.

Will this patch be included in a future kernel?

Comment 7 Stephen Tweedie 2005-10-05 20:24:57 UTC
Yes, this fix looks good, and it matches the upstream fix.  It will be queued
subject to the usual internal review for the U3 kernel.

I have a kernel built based on U2 plus 3 filesystem fixes:
* readahead fixes for random >4k read performance
* ext3 performance fix for very slow performance when writing large files on
huge filesystems
* this log_do_checkpoint fix.

i686 and x86_64 kernels are available from:

http://people.redhat.com/sct/.private/test-kernels/kernel-2.6.9-22.EL.sct.4/

Comment 9 Stephen Tweedie 2005-11-07 19:12:33 UTC
Fix committed for inclusion in U3.

Comment 12 Red Hat Bugzilla 2006-03-07 19:17:12 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html


Comment 15 Jason Baron 2006-07-27 19:37:41 UTC
*** Bug 200434 has been marked as a duplicate of this bug. ***

