Bug 453672 - system appears to deadlock (OOM) during 3-way cmirror I/O plus failure
Summary: system appears to deadlock (OOM) during 3-way cmirror I/O plus failure
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cmirror-kernel
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-07-01 19:57 UTC by Corey Marthaler
Modified: 2010-05-14 19:59 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-05-14 19:59:27 UTC


Attachments
log and kern dump from taft-01 (deleted)
2008-07-01 20:40 UTC, Corey Marthaler

Description Corey Marthaler 2008-07-01 19:57:36 UTC
Description of problem:
I created 3 3-way mirrors and then started I/O to all 3 mirrors from all 4 nodes
(taft-0[1234]). I noticed that taft-01 started to slow way down right away.
Then, after failing /dev/sdh it became almost unresponsive. I then killed
taft-02 (in an attempt to test bz 233034). That caused taft-01 to just about
completely lock up. All the other nodes' recovery is stuck waiting for taft-01 to
fence taft-02.

  mirror1            taft       Mwi-ao 15.00G                    mirror1_mlog 100.00         mirror1_mimage_0(0),mirror1_mimage_1(0),mirror1_mimage_2(0)
  [mirror1_mimage_0] taft       iwi-ao 15.00G                                                /dev/sdb1(0)
  [mirror1_mimage_1] taft       iwi-ao 15.00G                                                /dev/sdc1(0)
  [mirror1_mimage_2] taft       iwi-ao 15.00G                                                /dev/sdd1(0)
  [mirror1_mlog]     taft       lwi-ao  4.00M                                                /dev/sdh1(0)

  mirror2            taft       Mwi-ao 15.00G                    mirror2_mlog 100.00         mirror2_mimage_0(0),mirror2_mimage_1(0),mirror2_mimage_2(0)
  [mirror2_mimage_0] taft       iwi-ao 15.00G                                                /dev/sde1(0)
  [mirror2_mimage_1] taft       iwi-ao 15.00G                                                /dev/sdf1(0)
  [mirror2_mimage_2] taft       iwi-ao 15.00G                                                /dev/sdg1(0)
  [mirror2_mlog]     taft       lwi-ao  4.00M                                                /dev/sdd1(3840)

  mirror3            taft       Mwi-ao 15.00G                    mirror3_mlog
100.00         mirror3_mimage_0(0),mir
age_2(0)
  [mirror3_mimage_0] taft       iwi-ao 15.00G                                                /dev/sdh1(1)
  [mirror3_mimage_1] taft       iwi-ao 15.00G                                                /dev/sdb1(3840)
  [mirror3_mimage_2] taft       iwi-ao 15.00G                                                /dev/sdc1(3840)
  [mirror3_mlog]     taft       lwi-ao  4.00M                                                /dev/sdd1(3841)
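
Going by the listing above, the failed disk /dev/sdh seems to back more than one mirror component. The following is only a rough sketch for cross-checking that: the `LAYOUT` table is transcribed by hand from the listing, and the helper name `affected_lvs` and the prefix match from whole disk to partition are assumptions made for illustration, not part of any LVM tool.

```python
# Sub-LV -> physical volume, transcribed by hand from the listing above.
LAYOUT = {
    "mirror1_mimage_0": "/dev/sdb1",
    "mirror1_mimage_1": "/dev/sdc1",
    "mirror1_mimage_2": "/dev/sdd1",
    "mirror1_mlog": "/dev/sdh1",
    "mirror2_mimage_0": "/dev/sde1",
    "mirror2_mimage_1": "/dev/sdf1",
    "mirror2_mimage_2": "/dev/sdg1",
    "mirror2_mlog": "/dev/sdd1",
    "mirror3_mimage_0": "/dev/sdh1",
    "mirror3_mimage_1": "/dev/sdb1",
    "mirror3_mimage_2": "/dev/sdc1",
    "mirror3_mlog": "/dev/sdd1",
}


def affected_lvs(disk, layout=LAYOUT):
    """Return the sorted sub-LVs whose PV is a partition of `disk`.

    Assumption: a partition name is the disk name plus a suffix
    (e.g. /dev/sdh -> /dev/sdh1), so a simple prefix match is enough
    for the single-letter disk names used here.
    """
    return sorted(lv for lv, pv in layout.items() if pv.startswith(disk))


if __name__ == "__main__":
    for lv in affected_lvs("/dev/sdh"):
        print(lv)
```

Under these assumptions, failing /dev/sdh would take out the mirror1 log and the first image of mirror3, while mirror2 has nothing on that disk.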

I'll attach a kern dump from taft-01.


Version-Release number of selected component (if applicable):
2.6.9-71.ELsmp

lvm2-2.02.37-3.el4    BUILT: Thu Jun 12 10:09:19 CDT 2008
lvm2-cluster-2.02.37-3.el4    BUILT: Thu Jun 12 10:22:07 CDT 2008
device-mapper-1.02.25-2.el4    BUILT: Mon Jun  9 09:28:41 CDT 2008
cmirror-1.0.1-1    BUILT: Tue Jan 30 17:28:02 CST 2007
cmirror-kernel-2.6.9-41.4    BUILT: Tue Jun  3 13:54:29 CDT 2008

Comment 1 Corey Marthaler 2008-07-01 20:38:38 UTC
Looks like this is some kind of memory leak.

Comment 2 Corey Marthaler 2008-07-01 20:40:05 UTC
Created attachment 310717
log and kern dump from taft-01

Comment 4 Jonathan Earl Brassow 2010-05-14 19:59:27 UTC
No 3-way cluster mirrors on rhel4.

If the bug is present in the rewrite in later releases, please open new bug(s).

