Bug 236580 - [HA LVM]: Bringing site back on-line after failure causes problems
Summary: [HA LVM]: Bringing site back on-line after failure causes problems
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: rgmanager
Version: 4
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
Depends On:
Reported: 2007-04-16 15:31 UTC by Jonathan Earl Brassow
Modified: 2009-04-16 20:05 UTC (History)
3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2009-02-05 00:19:57 UTC

Attachments (Terms of Use) script with bad device exclusion (deleted)
2007-04-16 15:35 UTC, Jonathan Earl Brassow

Description Jonathan Earl Brassow 2007-04-16 15:31:48 UTC
Our HA cluster is configured as follows:
Two node cluster - one node in the 'B' datacentre and one node in the 'C' datacentre.
Two disk arrays - one in each datacentre.
Two services - each using LVM volumes that are mirrored across the two disk arrays.
IPMI is the only automatic fencing mechanism.
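For context, a configuration along these lines would look roughly like the following cluster.conf excerpt. This is a hypothetical sketch: all node names, addresses, credentials, and volume names below are placeholders, not taken from the actual cluster.

```xml
<!-- Hypothetical excerpt; names, IPs and credentials are placeholders -->
<clusternodes>
  <clusternode name="node-B" votes="1">
    <fence><method name="1"><device name="ipmi-B"/></method></fence>
  </clusternode>
  <clusternode name="node-C" votes="1">
    <fence><method name="1"><device name="ipmi-C"/></method></fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <fencedevice agent="fence_ipmilan" name="ipmi-B" ipaddr="10.0.0.11" login="admin" passwd="secret"/>
  <fencedevice agent="fence_ipmilan" name="ipmi-C" ipaddr="10.0.0.12" login="admin" passwd="secret"/>
</fencedevices>
<rm>
  <!-- One HA LVM service per mirrored volume, managed by rgmanager's lvm agent -->
  <service name="svc1" autostart="1">
    <lvm name="halvm1" vg_name="vg_svc1" lv_name="lv_svc1"/>
  </service>
</rm>
```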

To simulate a failure of the 'C' datacentre, we simultaneously shut the power off to the 'C' node while 
disabling the SAN ports to the 'C' disk array. To prevent the 'B' node from fencing the 'C' node - the 'B' 
node network interface that connects to the 'C' node IPMI device was disabled. 

After the 'C' node had missed too many heartbeats, the 'B' node attempted to fence the 'C' node using 
fence_ipmilan. This failed because the 'B' node couldn't connect to the 'C' node IPMI device. 

I then initiated a manual fence with the fence_ack_manual command. The 'B' node successfully took 
over the services from the 'C' node. It handled the volume group inconsistencies, and successfully 
activated the previously mirrored volumes as linear volumes. 
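For anyone reproducing this, the manual override step is roughly the following (the node name is a placeholder; confirm the fenced node really is powered off before acknowledging, since a false acknowledgement risks data corruption):

```shell
# On the surviving 'B' node, after verifying node-C is truly down,
# acknowledge the fence manually so recovery can proceed:
fence_ack_manual -n node-C
```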

Up to this point I'm very happy with how it's operating! 

The problems begin if I then power on the 'C' node again. At the point when the 'C' node is powered on, 
the 'B' node is running all the services and the SAN ports to the 'C' disk array are still unavailable. 

When rgmanager starts on the 'C' node, it attempts to stop all the resources that are running on the 'B' node. It then appears to attempt to start the services locally, even though they are running on the 'B' node. When I run clustat on the 'C' node, it now reports that all the services are failed and that the last node they ran on was the 'C' node.

I wanted to see if the logical volumes were still active on the 'B' node; however, when I entered the 'lvs -a -o +devices,tags' command on the 'B' node, it hung and never returned. No LVM commands would return on the 'B' node. The only way I could recover the 'B' node was to power it off and on again. I couldn't reboot the node because the Cluster Suite services were hung.

When I enter the 'lvs -a -o +devices,tags' command on the 'C' node, the Cluster Suite-managed volumes are NOT active, but they are tagged with BOTH node names!
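HA LVM decides which node may activate a volume group by matching host-name tags on the VG against the volume_list filter in lvm.conf, so a stale tag left by the failed node is what blocks clean recovery. Assuming an LVM2 version with tag support, the tags can be inspected and the stale one cleared by hand roughly like this (the VG and node names are placeholders):

```shell
# Show the tags on the HA volume group; in the bad state described
# above, both node names appear here:
vgs -o vg_name,vg_tags vg_svc1

# Remove the stale tag left behind by the failed node:
vgchange --deltag node-C vg_svc1
```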

Comment 1 Jonathan Earl Brassow 2007-04-16 15:34:16 UTC
I'm seeing something very different from you, but it may be worth trying with the changes I've made.

Here's what I see:
Site fail-over works fine.  If I reactivate the failed site (including the storage device), when the service 
tries to move back, it fails to activate due to a conflict it sees in the available devices.  [The failed device 
has now come back, leaving an LVM metadata conflict.]  This leaves the service in the 'failed' state.

Here's what I've done.
I've added some code to determine what the valid devices are, and use those and only those devices 
when activating.  This solved the problem for me.  You will need to ensure that this works fine with 
your multipath setup.  I don't think there should be issues in that regard, but I don't want to guess.
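The attached script does this inside the resource agent, but the same idea can be sketched from the command line: activate with an explicit device filter so the returned array, whose metadata is stale, is ignored. The device path and VG name below are placeholders.

```shell
# Activate using only the devices known to hold current metadata,
# rejecting everything else (including the just-returned stale array):
vgchange -ay --config 'devices { filter = [ "a|^/dev/mapper/good-dev$|", "r|.*|" ] }' vg_svc1
```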

This may not be the issue you are seeing, but the bug I found could certainly cause similar problems.  
Be sure that you have the latest updates.  I've attached the file to be placed in /usr/share/cluster 
on all the machines.  When we've gone through a few successful iterations of testing we will be sure to 
commit the changes.

Comment 2 Jonathan Earl Brassow 2007-04-16 15:35:49 UTC
Created attachment 152701 [details] script with bad device exclusion

Comment 3 Jonathan Earl Brassow 2007-04-18 18:17:17 UTC
bad device exclusion script with minor changes checked-in

assigned -> post

Another concern I have about the user's implementation is the initrd.  The initrd
should (must) contain the correctly modified lvm.conf.
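On RHEL4 that means rebuilding the initrd after editing /etc/lvm/lvm.conf, along these lines (a typical invocation; verify the image name matches the bootloader entry on your machines):

```shell
# Rebuild the initrd so it embeds the current /etc/lvm/lvm.conf;
# -f overwrites the existing image for the running kernel:
mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)
```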

Comment 4 Chris Feist 2009-02-05 00:19:57 UTC
This has been built and is in the current RHEL4 release of rgmanager.
