Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.

Bug 235686

Summary: Kernel BUG at dm_cmirror_server while recovering region
Product: [Retired] Red Hat Cluster Suite Reporter: Corey Marthaler <cmarthal>
Component: cmirrorAssignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED CURRENTRELEASE QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 4CC: agk, cfeist, dwysocha, jbrassow, kanderso, mbroz, prockai
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-08-05 21:41:53 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Corey Marthaler 2007-04-09 16:27:16 UTC
Description of problem:
I was running a standard cmirror failure scenario and while the recovering was
taking place, link-08 died:

Senario: Kill primary leg of synced 2 leg mirror

****** Mirror hash info for this scenario ******
* name:      fail_primary_synced_2_legs
* sync:      1
* disklog:   1
* failpv:    /dev/sdb1
* legs:      2
* pvs:       /dev/sdb1 /dev/sdf1 /dev/sde1

Creating mirror on link-02...
qarsh root@link-02 lvcreate -m 1 -n fail_primary_synced_2_legs -L 800M
helter_skelter /dev/sdb1:0-500 /dev/sdf1:0-500 /dev/sde1:0-50

Waiting for mirror to sync
Verifying that the mirror is fully syncd, currently at
 ...6.50% ...17.50% ...28.50% ...39.00% ...49.50% ...60.50% ...70.50% ...81.00%
...91.50% ...100.00%

Creating gfs on top of mirror on link-02...
Creating mnt point /mnt/fail_primary_synced_2_legs on link-02...
Mounting gfs on link-02...
Creating mnt point /mnt/fail_primary_synced_2_legs on link-04...
Mounting gfs on link-04...
Creating mnt point /mnt/fail_primary_synced_2_legs on link-07...
Mounting gfs on link-07...
Creating mnt point /mnt/fail_primary_synced_2_legs on link-08...
Mounting gfs on link-08...

Disabling device sdb on link-02
Disabling device sda on link-04
Disabling device sdb on link-07
Disabling device sdb on link-08

Attempting I/O to cause mirror down conversion on link-02
10+0 records in
10+0 records out
Verifying the down conversion from mirror to linear
  /dev/sdb1: open failed: No such device or address

GFS: fsid=LINK_128:gfs.3: Joined cluster. Now mounting FS...
GFS: fsid=LINK_128:gfs.3: jid=3: Trying to acquire journal lock...
GFS: fsid=LINK_128:gfs.3: jid=3: Looking at journal...
GFS: fsid=LINK_128:gfs.3: jid=3: Done
Apr  9 06:23:23 link-08 kernel: GFS: fsid=LINK_128:gfs.3: Joined cluster. Now
mounting FS...
Apr  9 06:23:23 link-08 kernel: GFS: fsid=LINK_128:gfs.3: jid=3: Trying to
acquire journal lock...
Apr  9 06:23:23 link-08 kernel: GFS: fsid=LINK_128:gfs.3: jid=3: Looking at
Apr  9 06:23:23 link-08 kernel: GFS: fsid=LINK_128:gfs.3: jid=3: Done
Apr  9 06:23:23 link-08 qarshd[5509]: That's enough
Apr  9 06:23:28 link-08 qarshd[5541]: Talking to peer
Apr  9 06:23:28 link-08 qarshd[5541]: Running cmdline: echo offline >
Apr  9 06:23:28 link-08 qarshd[5541]: That's enough
dm-cmirror: Recovering region mismatch: (5/18446744073709551615)
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at dm_cmirror_server:703
invalid operand: 0000 [1] SMP
Modules linked in: lock_dlm(U) gfs(U) lock_harness(U) dm_cmirror(U) dlm(U)
cman(U) md5 ipv6 parport_pc lp parport autofs4 sunrpc ds yenta_socket
pcmcia_core button battery ac ohci_hcd hw_random k8_edac edac_mc tg3 floppy
dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx scsi_transport_fc
mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
Pid: 5467, comm: cluster_log_ser Not tainted 2.6.9-50.ELsmp
RIP: 0010:[<ffffffffa025be6e>]
RSP: 0018:000001001a805e28  EFLAGS: 00010212
RAX: 0000000000000044 RBX: 000001003d66c920 RCX: ffffffff803e5768
RDX: ffffffff803e5768 RSI: 0000000000000246 RDI: ffffffff803e5760
RBP: 000001003c1f2680 R08: ffffffff803e5768 R09: 000001003d66c920
R10: 0000000100000000 R11: ffffffff8011e9a4 R12: 000001001f789400
R13: ffffffffa02617a0 R14: 0000000000000001 R15: 000001003c1f26a0
FS:  0000002a95562b00(0000) GS:ffffffff804ee200(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000031d2f2f8d8 CR3: 0000000000101000 CR4: 00000000000006e0
Process cluster_log_ser (pid: 5467, threadinfo 000001001a804000, task
Stack: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
       0000000000000000 0000000000000000 000001003b5434c0 000001003b5434c0
       000001001a805eb8 0000000000000010
Call Trace:<ffffffff80134660>{default_wake_function+0}

Code: 0f 0b 52 e1 25 a0 ff ff ff ff bf 02 49 c7 44 24 40 ff ff ff
RIP <ffffffffa025be6e>{:dm_cmirror:cluster_log_serverd+3721} RSP <000001001a805e28>
 <0>Kernel panic - not syncing: Oops

Version-Release number of selected component (if applicable):

How reproducible:
Only once so far

Comment 1 Jonathan Earl Brassow 2007-04-09 16:54:59 UTC
no machines killed during this test?

Comment 2 Corey Marthaler 2007-04-09 17:55:57 UTC
no machines were killed as apart of this test.

Comment 3 Jonathan Earl Brassow 2007-04-09 17:59:19 UTC
Here is the code causing the machine to panic:

			if (lr->u.lr_region != lc->recovering_region) {
				DMERR("Recovering region mismatch: (%Lu/%Lu)",
				      lr->u.lr_region, lc->recovering_region);

The question is, how do we have a record state saying that there is a region in
recovery _and_ lc->recovering_region saying that there is not a region in recovery.

Comment 4 Jonathan Earl Brassow 2007-04-10 07:14:00 UTC
Several fixes have gone in to fix the handling of this bug:
1) During server relocation (which can happen due to machine failure
or normal mirror suspension), the server value could get set before
the client had a chance to clean-up.  This caused the server to
become confused and issue a BUG().

2) perform a flush of the log before suspending.  This ensures
that regions which are in-sync get correctly flushed to the disk
log.  Without this, there will always be recovery work to be done
when a mirror starts up - even if it was properly in-sync during

3) clean-up memory used to record region users when a mirror is
shutdown.  It was possible for some regions to be left over
(causing a memory leak) during certain fault scenarios.

4) properly initialize the state field (ru_rw) in the region
user structure when a mark occurs.  Without the initialization,
it was sometimes possible for the region to be misinterpretted
as recovering instead of marked.

5) resolve and unhandled case in server_complete_resync_work

6) reset a variable in cluster_complete_resync_work.  Failure
to do so was causing a retry to include the wrong value for
the completion of the resync work - confusing the server.

assigned -> post

Comment 5 Rob Kenna 2007-04-10 10:55:44 UTC
Important repair for cluster mirror release in 4.5.  pm-ack

Comment 6 Corey Marthaler 2007-04-10 18:20:07 UTC
This BUG can still be tripped when attempting the 'non synced secondary leg
failure' scenario. Marking back to ASSIGNED, with the QA Whiteboard tag.

Comment 7 Jonathan Earl Brassow 2007-04-10 18:23:26 UTC
Here are the comments I put into the code validating the scenario encountered
and obviating the need for the BUG();

<               BUG();
>               /*
>                * This is a valid case, when the following happens:
>                * 1) a region is recovering and has waiting writes
>                * 2) recovery fails and calls complete_resync_work (w/ failure)
>                * 2.1) RU is removed from our list
>                * 3) waiting writes are released
>                * 3.1) writes do not mark, because b/c region state != RH_CLEAN
>                * 4) write fails and calls complete_resync_work (w/ failure)
>                * 5) boom, we are here.
>                *
>                * Not a bug to be here
>                */

assigned -> post

Comment 8 Jonathan Earl Brassow 2007-08-22 17:37:25 UTC
post -> modified.

Comment 9 Corey Marthaler 2007-08-24 18:42:06 UTC
Fix verified with the latest code.

Comment 10 Chris Feist 2008-08-05 21:41:53 UTC
Fixed in current release (4.7).