Bug 1065366 - ZFS filesystems not being indexed
Summary: ZFS filesystems not being indexed
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: mlocate
Version: rawhide
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Michal Sekletar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-02-14 13:39 UTC by Michal Sekletar
Modified: 2014-07-25 19:46 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1023779
Environment:
Last Closed: 2014-07-25 19:46:01 UTC


Attachments
do not prune zfs filesystems (deleted)
2014-07-22 15:09 UTC, Bill McGonigle
msekleta: review+

Description Michal Sekletar 2014-02-14 13:39:01 UTC
+++ This bug was initially created as a clone of Bug #1023779 +++

Description:

I've determined that mlocate's cronjob (/etc/cron.daily/mlocate.cron) doesn't index any ZFS filesystems. 

The reason for this is that the cronjob builds a list of all "nodev" filesystem types listed in /proc/filesystems, which unfortunately includes ZFS.

I can understand mlocate's rationale for doing that (to avoid indexing obvious pseudo-filesystems like /proc, /sys, etc.), but in ZFS's case it's wrong. I can also understand ZFS's rationale for listing itself as "nodev": a ZFS filesystem is not directly connected to a device, it's connected to a ZFS pool (which in turn is connected to a device).

So, it seems we have an unintentional side-effect of the two rationales above. 
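
The effect can be reproduced without a ZFS machine. This is a minimal sketch, assuming the stock cron script filters /proc/filesystems with an awk one-liner that keeps every "nodev" type except rootfs; the function name and the sample data are made up for illustration.

```shell
#!/bin/sh
# Print the filesystem types the cron script would prune: every "nodev"
# entry except rootfs. Reads /proc/filesystems-format text on stdin.
# (list_pruned_types is a hypothetical helper name.)
list_pruned_types() {
    awk '$1 == "nodev" && $2 != "rootfs" { print $2 }'
}

# Hypothetical sample in /proc/filesystems format (tab-separated fields;
# device-backed types such as ext4 have an empty first column).
sample=$(printf 'nodev\tsysfs\nnodev\trootfs\nnodev\tproc\n\text4\nnodev\tzfs\n')

pruned=$(printf '%s\n' "$sample" | list_pruned_types)
printf '%s\n' "$pruned"
```

With this sample, zfs lands in the prune list next to sysfs and proc, while ext4 does not, which is why nothing on a ZFS mount gets indexed.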

To fix it, I patched the cronjob to source a config file (/etc/sysconfig/mlocate) which defines a single env var called NODEV_FS_TO_SCAN_REGEXP. This should be an awk regexp matching every filesystem which should be indexed by mlocate *even* if it's listed as "nodev" in /proc/filesystems. This way we have a generic fix which can be used for other filesystem types, in case others fall into the same category as ZFS in the future.
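
A rough sketch of how such a regexp could be applied, not the patch itself: the variable name comes from the description above, but the helper function, the way the regexp is matched (awk's `!~`), and the sample data are assumptions made for illustration.

```shell
#!/bin/sh
# Assumed contents of /etc/sysconfig/mlocate, set inline so the sketch is
# self-contained. A real script would source the file instead.
NODEV_FS_TO_SCAN_REGEXP='^(zfs)$'

# Print nodev types to prune, skipping rootfs and anything matching the
# regexp passed as $1. An empty regexp exempts nothing.
# (list_pruned_types is a hypothetical helper name.)
list_pruned_types() {
    awk -v keep="$1" '
        $1 == "nodev" && $2 != "rootfs" && (keep == "" || $2 !~ keep) { print $2 }'
}

# Hypothetical sample in /proc/filesystems format.
sample=$(printf 'nodev\tsysfs\nnodev\trootfs\n\text4\nnodev\tzfs\nnodev\tproc\n')

with_regexp=$(printf '%s\n' "$sample" | list_pruned_types "$NODEV_FS_TO_SCAN_REGEXP")
without_regexp=$(printf '%s\n' "$sample" | list_pruned_types "")
echo "with regexp:    $(echo $with_regexp)"
echo "without regexp: $(echo $without_regexp)"
```

The explicit check for an empty regexp matters: in awk an empty pattern matches every string, so without the guard an unset variable would exempt all nodev filesystems from pruning.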

Version-Release number of selected component (if applicable): 0.22.2-4.el6 on RHEL 6.4.


How reproducible: 100%


Steps to Reproduce:
1. Run "/etc/cron.daily/mlocate" with a mounted ZFS filesystem;
2. Wait for it to finish;
3. Try to locate any file in the ZFS filesystem using "locate filename"

Actual results:
Files in ZFS filesystem aren't shown by "locate".

Expected results:
Files in ZFS filesystem should be shown by "locate".

Additional info:
Problem originally reported in the zfs-discuss mailing list (see my post here: https://groups.google.com/a/zfsonlinux.org/forum/#!msg/zfs-discuss/nxlya9MNmYc/2u9jZTr2W4oJ); I've fixed the issue and determined that "upstream" in this case is Red Hat EL6, so I'm reporting it here along with my patch.

Comment 1 Michal Sekletar 2014-02-17 17:32:55 UTC
I've cloned the original bug, since I plan to introduce the fix in Fedora first. Whether this is going to be fixed in RHEL6 is TBD by Red Hat Product Management.

I am not sure about the proposed approach though. I don't really like a config file in /etc/sysconfig/. Maybe in this case we could just fix the cron script. If cases like this one pop up in the future, we should think about a proper fix via a native updatedb option.

Comment 2 Bill McGonigle 2014-07-22 06:13:34 UTC
It looks to me like the established config is in updatedb.conf and one special case of removing known-valuable nodev devices ("rootfs") is hardcoded in the cron script.

I presume the original solution here was done because updatedb gets grumpy about unknown parameters in its config file.  I hacked on conf.c a bit to give it some knowledge (so it wouldn't just fail) about a NO_PRUNE parameter I added like this:

::::::::::::::
/etc/updatedb.conf
::::::::::::::
NO_PRUNE = "rootfs zfs"

::::::::::::::
/etc/cron.daily/mlocate
::::::::::::::
#!/bin/sh
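# Read the space-separated NO_PRUNE list from updatedb.conf and strip the quotes.
export no_prune=$(awk -F ' *= *' '$1 == "NO_PRUNE" { print $2 }' < /etc/updatedb.conf | sed 's/"//g')
# List every nodev filesystem type except those named in NO_PRUNE.
nodevs=$(awk '$1 == "nodev" { split(ENVIRON["no_prune"], np); do_print = 1; for (fs in np) if ($2 == np[fs]) do_print = 0; if (do_print) print $2 }' < /proc/filesystems)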
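
For anyone who wants to try the idea without touching /etc, here is a self-contained sketch of the same two steps run against sample text. NO_PRUNE is my local addition, not a stock updatedb.conf option, and the function names and sample inputs below are hypothetical.

```shell
#!/bin/sh
# Extract the NO_PRUNE value from updatedb.conf-style text on stdin and
# strip the surrounding quotes.
read_no_prune() {
    awk -F ' *= *' '$1 == "NO_PRUNE" { print $2 }' | sed 's/"//g'
}

# Print nodev types from /proc/filesystems-style text on stdin, minus the
# space-separated types given in $1.
list_pruned_types() {
    awk -v keep="$1" '
        BEGIN { n = split(keep, np, " "); for (i = 1; i <= n; i++) skip[np[i]] = 1 }
        $1 == "nodev" && !($2 in skip) { print $2 }'
}

# Hypothetical sample inputs.
conf=$(printf 'PRUNE_BIND_MOUNTS = "yes"\nNO_PRUNE = "rootfs zfs"\n')
procfs=$(printf 'nodev\tsysfs\nnodev\trootfs\n\text4\nnodev\tzfs\nnodev\tproc\n')

no_prune=$(printf '%s\n' "$conf" | read_no_prune)
nodevs=$(printf '%s\n' "$procfs" | list_pruned_types "$no_prune")
echo "NO_PRUNE: $no_prune"
echo "pruned:   $(echo $nodevs)"
```

Building the skip table once in BEGIN avoids re-splitting the list for every input line, and the explicit nodev check keeps device-backed types such as ext4 out of the result.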


That works for me to get my /home on zfs indexed, but I have conf.c in a state that wouldn't make tremendous sense to the next guy.   It seems to me that the right thing to do next is to keep going along this path and make updatedb handle NO_PRUNE and /proc/filesystems natively and simplify the cron script.

I put what I got done so far here:

  https://www.bfccomputing.com/downloads/fedora/mlocate/

if anybody wants to take a look.

Comment 3 Miloslav Trmač 2014-07-22 14:04:45 UTC
Honestly the simplest thing to do is to just add an explicit case excluding zfs from the list to the script, like rootfs is currently being excluded.

There’s no obvious need for this list to be configurable.

I do rather strongly object to having updatedb and the cron script use the same config file (with necessarily different semantics), and loosening the updatedb.conf parser for it.  The horrible updatedb.conf parser from slocate has been one of the motivations for creating mlocate in the first place :)

Comment 4 Bill McGonigle 2014-07-22 14:27:56 UTC
so:

2c2
< nodevs=$(< /proc/filesystems awk '$1 == "nodev" && $2 != "rootfs" { print $2 }')
---
> nodevs=$(< /proc/filesystems awk '$1 == "nodev" && $2 != "rootfs" && $2 != "zfs" { print $2 }')

?
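
A quick way to sanity-check the one-line change is to run both awk filters over the same sample. This is only a sketch: the function names and the sample /proc/filesystems text are made up, and nothing here calls updatedb.

```shell
#!/bin/sh
# The cron script's awk filter before and after the proposed change,
# wrapped in hypothetical helper functions that read stdin.
prune_before() { awk '$1 == "nodev" && $2 != "rootfs" { print $2 }'; }
prune_after()  { awk '$1 == "nodev" && $2 != "rootfs" && $2 != "zfs" { print $2 }'; }

# Hypothetical sample in /proc/filesystems format.
sample=$(printf 'nodev\tsysfs\nnodev\trootfs\nnodev\tproc\n\text4\nnodev\tzfs\n')

before=$(printf '%s\n' "$sample" | prune_before)
after=$(printf '%s\n' "$sample" | prune_after)
echo "before: $(echo $before)"
echo "after:  $(echo $after)"
```

Only zfs should drop out of the list; the real pseudo-filesystems stay pruned.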

(In reply to Miloslav Trmač from comment #3)
> 
> There’s no obvious need for this list to be configurable.
> 

Agree, there's no code-level need, but I see people talking about this cron script in Fedora vis-a-vis zfs as early as November '09 and here we are.  It's not likely to ever get into EL6 (or EL7?) in my opinion (but happy to be proven wrong) - having sysadmins need to edit "code" files rather than config files is suboptimal and precludes a proper update path.  IMO we shouldn't say "sysadmins should need to edit cron job scripts because we don't want to maintain a more complex parser".

Not that I'm a huge fan of coding around process but I'm also thinking about the next new filesystem that is currently being invented - 10 years is a long time on the scale of filesystems.

I'll happily admit to being too general for the purposes of getting *this* bug fixed!

Comment 5 Miloslav Trmač 2014-07-22 14:37:19 UTC
(In reply to Bill McGonigle from comment #4)
> so:
> 
> 2c2
> < nodevs=$(< /proc/filesystems awk '$1 == "nodev" && $2 != "rootfs" { print
> $2 }')
> ---
> > nodevs=$(< /proc/filesystems awk '$1 == "nodev" && $2 != "rootfs" && $2 != "zfs" { print $2 }')

Yes.

> (In reply to Miloslav Trmač from comment #3)
> > 
> > There’s no obvious need for this list to be configurable.
> 
> Agree, there's no code-level need, but I see people talking about this cron
> script in Fedora vis-a-vis zfs as early as November '09 and here we are. 

That’s a fair enough point; I do not “own” the mlocate package though so it is not up to me.  (I’m here only somewhat imposing my wishes as the original and upstream author.)

Comment 6 Bill McGonigle 2014-07-22 15:09:11 UTC
Created attachment 919948 [details]
do not prune zfs filesystems

just to make it formal.

Just an additional thought: maybe the original patch author went for the /etc/sysconfig option because that is a proper file for a sysadmin to be editing to make local system changes.  We could handle all that as a separate RFE, though.

I'm guessing Miloslav is the component owner? I don't have access to the .list on this bz.

Comment 7 Miloslav Trmač 2014-07-22 15:10:59 UTC
(In reply to Bill McGonigle from comment #6)
> Created attachment 919948 [details]
> do not prune zfs filesystems
> 
> just to make it formal.
> 
> Just an additional thought: maybe the original patch author went for the
> /etc/sysconfig option because that is a proper file for a sysadmin to be
> editing to make local system changes.
Yes, a separate /etc/sysconfig for the cron script would be more palatable to me (but the hardcoding is IMHO sufficient, and I’m not the decision maker anyway).

> I'm guessing Miloslav is the component owner? I don't have access to the
> .list on this bz.
No, Michal Sekletář is.

Comment 8 Michal Sekletar 2014-07-25 18:07:39 UTC
Patch looks fine to me. Thanks!

