Bug 156280 - multipath-tools tests active paths but never uses the status to fail them
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: device-mapper-multipath
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Alasdair Kergon
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-04-28 16:27 UTC by Lars Marowsky-Bree
Modified: 2007-11-30 22:11 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-09-04 23:05:59 UTC




External Links:
Novell 81679 (Priority: None, Status: None, Last Updated: Never)

Description Lars Marowsky-Bree 2005-04-28 16:27:23 UTC
multipath-tools does check all paths, but when it finds that a previously
active path has failed, it never tells the kernel about it.

(Reported by Edward.)

Comment 1 Christophe Varoqui 2005-05-02 21:08:57 UTC
Candidate fix in 0.4.5-pre2
Please confirm the new behaviour is what is expected.

Comment 2 Lan Tran 2005-05-02 22:33:49 UTC
Hm, I just tried 0.4.5-pre2, but it doesn't appear fixed... 

After disabling a switch port, the multipathd path checker detects that the 2
paths are down, but the internal dm path state is still 'active'. I would
expect it to go to 'failed'.

1IBM     2105            739FCA30
[size=953 MB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [enabled][first]
  \_ 1:0:0:1 sdbn 68:16   [ready ][active]
  \_ 1:0:1:1 sdbv 68:144  [ready ][active]
  \_ 0:0:0:1 sdb  8:16    [faulty][active]
  \_ 0:0:1:1 sdj  8:144   [faulty][active]
 


Comment 3 Christophe Varoqui 2005-05-02 22:46:58 UTC
The framework is in place; it certainly needs debugging now:

multipathd/main.c:checkerloop() calls fail_path(), which calls dm_fail_path()
on path-down events.

I just verified the log received the "checker failed path %s in map %s" message
when removing a path through sysfs.

If you don't beat me to it, I'll see what I can do tomorrow.
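A minimal sketch of the flow described above. Only the function names checkerloop(), fail_path(), and dm_fail_path() come from the comment; the structs, the stub bodies, and check_path() are invented here for illustration of the intended design, not the actual multipathd source:

```c
#include <stdio.h>

/* Simplified, hypothetical stand-ins for multipathd's structures. */
enum path_state { PATH_UP, PATH_DOWN };

struct multipath { const char *alias; };

struct path {
    const char *dev;
    enum path_state state;   /* last state the checker saw */
    struct multipath *mpp;   /* owning multipath map */
};

static int dm_fail_calls;    /* stub counter for kernel notifications */

/* Stub: the real dm_fail_path() messages the kernel's dm-multipath target. */
static void dm_fail_path(struct multipath *mpp, const char *dev)
{
    printf("checker failed path %s in map %s\n", dev, mpp->alias);
    dm_fail_calls++;
}

static void fail_path(struct path *pp)
{
    dm_fail_path(pp->mpp, pp->dev);
}

/* One checker-loop pass for one path: new_state is what the path
 * checker just reported. Only an up->down transition notifies the kernel. */
static void check_path(struct path *pp, enum path_state new_state)
{
    if (pp->state == PATH_UP && new_state == PATH_DOWN)
        fail_path(pp);       /* the call the original daemon never made */
    pp->state = new_state;
}
```

With this wiring, a path whose checker result flips from up to down produces exactly one kernel notification; the bug being reported is that the state was recorded but fail_path() was never invoked.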

Comment 4 Lan Tran 2005-05-03 13:08:40 UTC
Hi Christophe, 

It turns out that dm_fail_path() is never called in my setup because the
check for !pp->mpp always fails. I'm not sure why you removed the initial
multipath reconfiguration from multipathd (restored in the patch below),
because without it the multipath maps are never created. And since you removed
the signal handling, moving to uevents I believe, I'm not quite sure how the
multipath daemon's allpaths list gets updated when multipath is run. It looks
like a uevent is triggered only when removing/adding underlying sd devices?

I also get this odd behavior where the multipathd process just dies on me if I
try to restart it when there are already maps configured. Not sure why, as I
see no debug messages.

(BTW, I was running 0.4.5-pre2 on the RHEL4 U1 beta1 kernel.) 

--- multipath-tools-0.4.5-pre2/multipathd/main.c        2005-04-28 16:52:56.000000000 -0700
+++ multipath-tools-0.4.5-pre2-patched/multipathd/main.c        2005-05-03 05:50:47.203453424 -0700
@@ -468,7 +471,7 @@
        }

        log_safe(LOG_NOTICE, "initial reconfigure multipath maps");
-//     execute_program(conf->multipath, buff, 1);
+       execute_program(conf->multipath, buff, 1);

        while (1) {
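As a hedged illustration of why the !pp->mpp check matters (the structures below are invented stand-ins; only the guard itself comes from the discussion): if the daemon never runs the initial reconfigure, paths are never linked to a map, pp->mpp stays NULL, and fail_path() returns before ever reaching dm_fail_path():

```c
#include <stddef.h>

/* Hypothetical, simplified structures for illustration only. */
struct multipath { const char *alias; };

struct path {
    const char *dev;
    struct multipath *mpp;   /* NULL until the path is linked to a map */
};

/* Returns 1 if the failure would be forwarded to the kernel,
 * 0 if the guard short-circuits because no map is known. */
static int fail_path(struct path *pp)
{
    if (!pp->mpp)
        return 0;            /* without the initial reconfigure, always here */
    /* dm_fail_path(pp->mpp, pp->dev) would run in the real daemon */
    return 1;
}
```

This is why restoring execute_program(conf->multipath, buff, 1) matters: the initial run creates the maps that populate pp->mpp.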


Comment 5 Christophe Varoqui 2005-05-03 16:18:21 UTC
> And as you removed the signal handling, moving to uevents I believe,
> I'm not quite sure how the multipath daemon's allpaths gets updated when
> multipath is run. It looks like uevent is triggered only when
> removing/adding underlying sd devices?

multipath is run from hotplug/udev, and only for "add" events.
For each hotplug "add" event, the daemon will receive an "add" uevent.
The signal handling was safe to kill.

As for the initial multipath run: if your hotplug/udev setup is right, you
don't need it; the maps should already be configured when the daemon starts.

Even if your setup has multipath.dev disabled, you'd better put the multipath
run either in the initrd or in the multipathd startup script.

Now, multipathd dying on you needs to be fixed. But as far as I can see the
design directions are right.

Comment 7 Alasdair Kergon 2005-05-05 21:46:44 UTC
MODIFIED means it should be fixed but we're awaiting confirmation of that.
If there are no comments in a week or so, then we assume it is fixed and close
the bug.  If subsequently it's found not to have been fixed, then we simply
reopen it.

Comment 8 Lan Tran 2005-05-26 15:29:47 UTC
Using the multipath-tools git snapshot from May 16, 2005:
without any I/O running, I disabled a port (bringing down half the paths for
each multipath device), and the disabled paths were correctly put into the
'[faulty][failed]' state.

However, I also just tried this on the May 26 git snapshot, and it doesn't work
anymore. It seems that multipathd keeps dying whenever I try to start it up
using RHEL4 U1's '/etc/init.d/multipathd start' script. Not sure what's going on.

(I'm using the RHEL4 U1 beta 2.6.9-9.ELsmp kernel.)

Comment 9 Lan Tran 2005-05-27 01:07:55 UTC
> Instinctively I would say I messed up the case where no "failback" keyword
> is provided in the config file, meaning the culprit is the last commit.
Just checked out the latest from the git repository and tried failing paths with
and without I/O running. The paths are failed and recovered as expected under
both scenarios. Thanks Christophe. 


Comment 10 Rahul Sundaram 2005-09-04 23:05:59 UTC

Closing as per previous comment. 

