Bug 163168 - clustat hangs when member is fenced off.
Summary: clustat hangs when member is fenced off.
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: dlm
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 171153
Blocks:
Reported: 2005-07-13 17:43 UTC by jim wilcox
Modified: 2009-04-16 19:59 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-05-05 07:27:57 UTC


Attachments
strace of a clustat hang (deleted), 2005-09-09 18:07 UTC, Henry Harris
strace of a clustat hang (deleted), 2005-09-09 18:07 UTC, Henry Harris
strace of clusvcadm hang (deleted), 2005-09-09 18:09 UTC, Henry Harris
strace of clusvcadm hang (deleted), 2005-09-09 18:09 UTC, Henry Harris
This is a DLM hang, not an rgmanager/clustat problem per se. (deleted), 2005-12-16 18:46 UTC, Lon Hohberger

Description jim wilcox 2005-07-13 17:43:32 UTC
Description of problem:

- With both 2- and 3-node clusters (and we suspect this is true for any number
of nodes in a cluster), when a member is manually fenced off, clustat hangs on
the nodes still in quorum.
- All GFS-related operations also hang, which we would expect, but we still
expect clustat to keep functioning and give accurate status.

Version-Release number of selected component (if applicable):

kernel - 2.6.9-11.EL_smp
gfs - 6.1
cluster suite - 4

How reproducible:

Every time

Steps to Reproduce:
1. Configure a 2- or 3-node cluster (we believe it will be the same with any
number of nodes) for manual fencing.
2. Pull the heartbeat, or do something else to stop the heartbeat
communication to a member.
3. Verify the member was fenced off.
4. Go to another node that should still have quorum and try executing clustat.

Actual results:

clustat hangs
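
For anyone scripting the reproduction, a hang can be detected by running the command under a time limit. This is only a sketch: the helper name command_hangs and the 10-second default are illustrative choices, not part of Cluster Suite.

```python
import subprocess
import sys


def command_hangs(argv, timeout_s=10.0):
    """Return True if the command does not exit within timeout_s seconds.

    On timeout, subprocess.run() kills the child before raising, so the
    stuck command is not left running.
    """
    try:
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return True
    return False
```

On a node that still has quorum, command_hangs(["clustat"]) returning True would correspond to the behaviour reported here.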

Expected results:

- clustat should not hang, and it should show the fenced-off node as no longer
in the cluster.

Additional info:

Please let me know if any additional info is required to reproduce. This is a
high priority item for us.

thanks in advance.

Jim

Comment 1 Lon Hohberger 2005-07-13 18:39:35 UTC
clustat is really a piece of rgmanager, which will stop during transitions if
GFS is in use.

clustat should probably time out after a few seconds of trying to reach clurgmgrd.
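
A rough illustration of that kind of client-side timeout follows. It is not how clustat actually talks to clurgmgrd; the TCP transport, the request bytes, the function name query_daemon and the 3-second default are all assumptions made for the example.

```python
import socket


def query_daemon(host, port, request, timeout_s=3.0):
    """Send a request and return the reply, or None if no answer arrives.

    The timeout passed to create_connection() also applies to the later
    sendall() and recv() calls on the returned socket, so a daemon that
    accepts the connection but never replies cannot block the caller
    indefinitely.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(request)
            return sock.recv(4096)
    except OSError:
        # Covers timeouts (socket.timeout is an OSError subclass) as well
        # as refused or reset connections.
        return None
```

A caller that gets None back could print something like "resource group manager not responding" and still report membership, instead of hanging.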





Comment 2 Henry Harris 2005-09-09 18:07:26 UTC
Created attachment 118651 [details]
strace of a clustat hang

This clustat hang occurred while running the test described in bug #166701.

Comment 3 Henry Harris 2005-09-09 18:07:39 UTC
Created attachment 118652 [details]
strace of a clustat hang

This clustat hang occurred while running the test described in bug #166701.

Comment 4 Henry Harris 2005-09-09 18:09:09 UTC
Created attachment 118653 [details]
strace of clusvcadm hang

This clusvcadm hang occurred while running the test described in bug #166701.

Comment 5 Henry Harris 2005-09-09 18:09:17 UTC
Created attachment 118654 [details]
strace of clusvcadm hang

This clusvcadm hang occurred while running the test described in bug #166701.

Comment 8 Lon Hohberger 2005-12-16 18:46:17 UTC
Created attachment 122339 [details]
This is a DLM hang, not an rgmanager/clustat problem per se. 

Rgmanager goes into D (disk-wait/task-uninterruptible state) waiting on the
DLM; here's a Sysrq-T when this happens.
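
A lighter check than a full Sysrq-T dump is to look for processes in state D under /proc. The sketch below only lists them; it does not show kernel stacks, and it only inspects main threads, not every thread of a process. The function names are illustrative.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first "(" and the last ")".
    """
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found
```

If clurgmgrd shows up in the result while clustat is stuck, that would be consistent with the state described above.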

Comment 10 Christine Caulfield 2005-12-22 10:21:11 UTC
With luck this will turn out to be the same bug as #175805.

Has anybody tested with this fix in place?

