Bug 228530 - the VIP doesn't go down when the first service is deactivated
Summary: the VIP doesn't go down when the first service is deactivated
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: piranha
Version: 4
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Marek Grac
QA Contact: Cluster QE
Depends On:
Blocks: 249312
Reported: 2007-02-13 16:54 UTC by Bryn M. Reeves
Modified: 2018-10-19 21:13 UTC (History)
2 users

Fixed In Version: RHBA-2008-0794
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2008-07-25 19:08:49 UTC

Attachments (Terms of Use)
Add check for inactive services to pulse's deactivateLvs (deleted)
2007-02-13 16:54 UTC, Bryn M. Reeves
example that reproduces the problem (deleted)
2007-02-13 17:23 UTC, Bryn M. Reeves

System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2008:0794 normal SHIPPED_LIVE piranha bug fix and enhancement update 2008-07-25 19:08:36 UTC

Description Bryn M. Reeves 2007-02-13 16:54:45 UTC
Description of problem:
This is related to bug 123342. The patch for that bug added two identical guards
in activateFOSMonitors and sendLvsArps:

+	      if (!config->failoverServices[j].isActive)
+		continue;

This causes us to skip inactive services so that we do not incorrectly think
their VIPs are already active if they are used by other services.

A similar problem exists in deactivateLvs:

  /* deactivate the interfaces */
  for (i = 0; i < config->numVirtServers; i++) {

      if (config->virtServers[i].failover_service) {
          piranha_log (flags, (char *) "Warning; skipping failover service");
          continue;             /* This should not be possible anymore */
      }

      for (j = 0; j < i; j++)
          if (!memcmp (&config->virtServers[i].virtualAddress,
                       &config->virtServers[j].virtualAddress,
                       sizeof (config->virtServers[i].virtualAddress)))
              break;

      if (j == i)
          disableInterface (config->virtServers[i].virtualDevice, flags);
  }

In the inner loop, we will incorrectly break and avoid deactivating the
interface in the case that virtServers[j] is inactive but its virtualAddress
matches the other service.

This needs another check to see if virtServers[j] is inactive and continue if
that is the case.

This problem causes the VIP to remain active on the LVS router that is shutting
down, leading to it being active on both the primary and backup router in the
case of a failover.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create an LVS configuration with at least two virtual servers sharing a
single VIP
2. Disable the first service by setting "active = 0" in its virtual server
definition
3. Start the pulse service on both primary & backup routers
4. VIP should start correctly on primary
5. Stop pulse on the primary router
6. Confirm that VIP has been failed over to the backup router
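A minimal lvs.cf fragment matching the steps above (the VIP, device, and server names are illustrative; per step 2, the first virtual server is disabled while both share the same address):

```
virtual web1 {
    active = 0
    address = 192.168.10.100 eth0:1
    port = 80
    protocol = tcp
    scheduler = wlc
}
virtual web2 {
    active = 1
    address = 192.168.10.100 eth0:1
    port = 443
    protocol = tcp
    scheduler = wlc
}
```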

Actual results:
The VIP is active on both primary and backup LVS routers

Expected results:
The VIP is active only on one router at a time (the backup router in this example).

Additional info:
The same effect is seen when failing back to the primary by re-starting pulse on
the primary router then stopping pulse on the backup router.

Comment 1 Bryn M. Reeves 2007-02-13 16:54:46 UTC
Created attachment 148004 [details]
Add check for inactive services to pulse's deactivateLvs

Comment 2 Bryn M. Reeves 2007-02-13 17:23:12 UTC
Created attachment 148008 [details]
example that reproduces the problem

Comment 4 Lon Hohberger 2007-06-14 19:53:47 UTC
Reassigning to component owner

Comment 6 Marek Grac 2007-07-23 19:01:54 UTC
Patch is in the CVS branch RHEL4

Comment 10 errata-xmlrpc 2008-07-25 19:08:49 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
