Bug 159740 - cluster heartbeat interface not used
Summary: cluster heartbeat interface not used
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: clumanager
Version: 3
Hardware: i386
OS: Linux
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
Depends On:
Reported: 2005-06-07 17:39 UTC by Luis Alexandre Fontes
Modified: 2009-04-16 20:17 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2005-06-08 19:25:12 UTC


Description Luis Alexandre Fontes 2005-06-07 17:39:15 UTC
We have 2 machines, each one with the following configuration:
- RHEL 3.0 AS u3;
- clumanager 1.2.22-2;
- 2 NICs (Intel Gigabit): eth0 for the public network and eth1 for the private network;
- 1 HBA (QLogic 2200) attached to EMC storage;

this is the physical model:

                            +-------------+
                            |             |
                            |   router    |
                            |             |
                            |             |
                            +------+------+
                                   |
        eth0 +---------------------+--------------------+ eth0
             |               (virtual ip)               |
             |                                          |
     +-------+------+ eth1                      +-------+------+
     |              |                           |              |
     | linpro035ias +---------heartbeat---------+ linpro036ias |
     |              |                           |              |
     +-------+------+                      eth1 +-------+------+
         hba |                                          | hba
             +---------------------+--------------------+
                                   |
                            +------+------+
                            |             |
                            | EMC storage |
                            |  /dev/sdc3  |
                            |             |
                            +-------------+

- the member names are linpro035ias (which points to and 
linpro036ias (which points to;
- the virtual ip ( associated with the service in the cluster;
- the heartbeat is broadcast;
- the tiebreaker is network (ip, the router/gateway);
- there's a crossover cable for private networking (eth1);

The problem:

If clumembd%broadcast_primary_only is not defined (or set to no, the default 
setting), heartbeat packets are sent over both eth0 (public) and eth1 (private),
but if I unplug the crossover cable, the cluster continues as if nothing had 
happened.

If clumembd%broadcast_primary_only is set to yes, heartbeat packets are sent 
just over eth0.

So, I conclude that a private network between the two nodes is not being used.
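
The observation above (heartbeats on both interfaces, no reaction when the crossover cable is unplugged) would also be consistent with a membership check that treats a peer as alive while heartbeats still arrive on at least one link. A minimal sketch under that assumption follows; the any-link rule and the function name peer_alive are illustrative only and are not taken from clumanager.

```python
def peer_alive(heartbeat_seen):
    """Return True while heartbeats arrive on at least one link.

    heartbeat_seen maps an interface name to whether heartbeats from the
    peer are currently received on it. The any-link rule is an assumption
    made for illustration, not clumanager's documented logic.
    """
    return any(heartbeat_seen.values())


# Default setting: heartbeats travel over eth0 and eth1.
cable_plugged = {"eth0": True, "eth1": True}
# Crossover cable unplugged: only eth0 still delivers heartbeats.
cable_unplugged = {"eth0": True, "eth1": False}

print(peer_alive(cable_plugged), peer_alive(cable_unplugged))
```

Under this assumption the peer is only declared down when every link stops delivering heartbeats, so losing eth1 alone changes nothing visible in /var/log/cluster.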

Steps to reproduce:

01. set eth0 on first machine to; on the second machine to (both connected to a router/gateway);
02. set eth1 on first machine to; on the second machine to 
(connected using a crossover cable);
03. set a cluster service to use httpd (/etc/init.d/httpd);
04. set a service ip address to;
05. set a device (/dev/sdc3, attached to the storage) to mount /u02 as ext3;
06. /u02 must contain the www directory (mv /var/www /u02);
07. /etc/httpd/conf/httpd.conf must be edited to replace /var/www with /u02/www;
08. in the Cluster Daemon Properties, enable Broadcast Heartbeating and Network 
Tiebreaker ( -> the router/gateway);
09. edit /etc/syslog.conf and append the following line:
    local4.*  /var/log/cluster
    and restart the syslog service;
10. start the rawdevices service;
11. start the clumanager service;

What happened after you performed the steps above? 
1. if the crossover cable (which should be used for heartbeating) is unplugged, 
nothing happens; this can be monitored using 'tail -f /var/log/cluster';
2. if the crossover cable is plugged in again and clumembd%broadcast_primary_only 
is set to yes
   ( cludb -put clumembd%broadcast_primary_only yes ), the heartbeat packets go 
through eth0 only (enhancement requested by Lon Hohberger), so eth1 is not used.

What should have happened instead?

Lon Hohberger said on =>
"Cluster Manager requires that all members coexist on the same fully connected 
subnet and that the link(s) used for cluster communication are the same link(s) 
used to monitor the tiebreaker IP address."

So, if the link used for cluster communication must be the same link used to 
monitor the tiebreaker IP address:

a) packets that serve the httpd service go through eth0;
b) packets used to monitor the tiebreaker IP address go through eth0;
c) heartbeat packets go through eth0;
d) eth1 (which should be used for heartbeating) is useless (and never used with 
clumembd%broadcast_primary_only set to yes);

So, setting "clumembd%broadcast_primary_only" to yes makes the service 
packets (httpd), heartbeat packets and tiebreaker monitoring packets all go 
through the same physical interface (i.e. eth0).
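
The traffic layout described in this report can be summarized as a small lookup. This is only a sketch of the behaviour I observed; the function names traffic_by_interface and idle_interfaces are made up for illustration and are not part of clumanager.

```python
def traffic_by_interface(broadcast_primary_only):
    """Map each interface to the traffic classes it carries, as observed."""
    carried = {
        "eth0": ["httpd service", "tiebreaker monitoring", "heartbeat"],
        "eth1": [],
    }
    if not broadcast_primary_only:
        # Default setting: heartbeats are also broadcast on the private link.
        carried["eth1"].append("heartbeat")
    return carried


def idle_interfaces(broadcast_primary_only):
    """Return the interfaces that carry no cluster-related traffic."""
    layout = traffic_by_interface(broadcast_primary_only)
    return [nic for nic, traffic in layout.items() if not traffic]


for setting in (False, True):
    print(setting, idle_interfaces(setting))
```

With the setting at yes, eth1 is the only idle interface; with the default, eth1 carries heartbeats but nothing that eth0 does not also carry.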

I think that we should have a parameter, or an option in the cluster 
configuration GUI, to specify which network interface broadcast heartbeating 
will use (e.g. eth1), like this:

# cludb -put clumembd%broadcast_interface eth1
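
To make the request concrete, here is a rough sketch of how the heartbeat broadcast targets could be chosen if such a key existed. Everything in it is an assumption: clumembd%broadcast_interface is only the parameter proposed in this report, the function broadcast_targets is hypothetical, and the addresses are placeholders from the documentation ranges, not the real addresses of these machines.

```python
import ipaddress

# Placeholder addresses (RFC 5737 documentation ranges); the real
# addresses of eth0 and eth1 are not part of this sketch.
INTERFACES = {
    "eth0": ipaddress.ip_interface("192.0.2.35/24"),
    "eth1": ipaddress.ip_interface("198.51.100.35/24"),
}


def broadcast_targets(interfaces, primary_only=False, broadcast_interface=None):
    """Return {interface name: broadcast address} used for heartbeats.

    primary_only mirrors clumembd%broadcast_primary_only.
    broadcast_interface models the proposed (hypothetical) key
    clumembd%broadcast_interface and takes precedence when set.
    """
    if broadcast_interface is not None:
        names = [broadcast_interface]
    elif primary_only:
        names = list(interfaces)[:1]  # first (primary) interface only
    else:
        names = list(interfaces)      # default: every interface
    return {
        name: str(interfaces[name].network.broadcast_address)
        for name in names
    }


print(broadcast_targets(INTERFACES, broadcast_interface="eth1"))
```

With the proposed key set to eth1, only the private subnet's broadcast address would be selected, which is the behaviour I am asking for.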

Comment 1 Lon Hohberger 2005-06-08 19:25:12 UTC
Expected behavior for all cases.  I've attempted to explain this in a general
manner here:

For additional configuration assistance, please contact Red Hat Support.
