Bug 97843

Summary: LVM snapshots don't mount
Product: [Retired] Red Hat Linux
Reporter: Deon George <dizzy>
Component: lvm
Assignee: Dave Jones <davej>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.3
CC: chad, eg_bugzilla, galens, harrisl, jch, menscher, mpaesold, olaf.meske, pfrields, pspencer, sct
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2004-01-05 03:43:40 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Attachments:
Description: Minimal fix for LVM snapshotting
Flags: none

Description Deon George 2003-06-23 06:44:20 UTC
Description of problem:
LVM snapshots of mounted filesystems don't mount.

[root@merlin-1 root]# mount
/dev/vgroot/lvexchange on /vmware/exchange type reiserfs (rw)

[root@merlin-1 root]# lvcreate -n snap -s -L2G /dev/vgroot/lvexchange
lvcreate -- WARNING: the snapshot will be automatically disabled once it gets full
lvcreate -- INFO: using default snapshot chunk size of 64 KB for
lvcreate -- doing automatic backup of "vgroot"
lvcreate -- logical volume "/dev/vgroot/snap" successfully created

[root@merlin-1 root]# mount -t reiserfs /dev/vgroot/snap /mnt/tmp -oro
mount: wrong fs type, bad option, bad superblock on /dev/vgroot/snap,
       or too many mounted file systems


Now, if the file system is not mounted when the snapshot is taken (or is
mounted, but there has been NO I/O), the snapshot will mount successfully.

I did see somewhere (sorry, I haven't been able to find it again) something
saying that a snapshot created faster than the reiserfs journal could be
synced/closed would cause the problem. It also mentioned that this would be
rare, but since I'm running a busy SMP system, I'm wondering if I fall into
this category.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. mount file system
2. do some I/O on it
3. create a snapshot
4. snapshot fails to mount.

If you unmount the file system and then create the snapshot, the snapshot will mount.

Expected Results:  Snapshot should mount - otherwise why bother hey?
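
The steps to reproduce above can be sketched as a small script. This is only an illustration, not something from the original report: the device and mount point names are taken from the transcript above, while the `run` helper, the `DRY_RUN` switch, the `dd` command used to generate I/O, and the `io-test.tmp` file name are assumptions added here. By default it only prints the commands; running it for real needs root and would create a snapshot.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
ORIGIN="${ORIGIN:-/dev/vgroot/lvexchange}"    # origin LV from the report
ORIGIN_MNT="${ORIGIN_MNT:-/vmware/exchange}"  # where the origin is mounted in the report
SNAP_NAME="${SNAP_NAME:-snap}"                # snapshot name from the report
SNAP_DEV="${SNAP_DEV:-/dev/vgroot/snap}"      # resulting snapshot device
SNAP_MNT="${SNAP_MNT:-/mnt/tmp}"              # mount point for the snapshot

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # Step 1 (mount the file system) is assumed to be done already.
    # Step 2: do some I/O on the mounted origin (assumed method: write a scratch file).
    run dd if=/dev/zero of="$ORIGIN_MNT/io-test.tmp" bs=1024k count=16
    # Step 3: create the snapshot, as in the report.
    run lvcreate -n "$SNAP_NAME" -s -L2G "$ORIGIN"
    # Step 4: try to mount the snapshot read-only; this is the step that fails.
    run mount -t reiserfs -o ro "$SNAP_DEV" "$SNAP_MNT"
}

reproduce
```

Setting `DRY_RUN=0` makes the script execute the commands instead of printing them.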

Comment 1 Alain Spineux 2003-08-22 09:37:24 UTC
I used Red Hat 8.0 with kernel 2.4.20-18.8smp and had the same problem.

> Now, if the file system is not mounted when the snapshot is taken (or is
> mounted, but there has been NO I/O, it will mount successfully).

I have not tested this!

It looks like this is a bug in kernel 2.4.20.

Search the web for the keywords: 2.4.20 lvm snapshot

Comment 2 Eric Bollengier 2003-10-22 14:56:09 UTC
I used Red Hat 7.3 with kernel 2.4.20-20.7smp and had the same problem.
The lvm-VFS-lock patch is OK, but I have to do something like
fuser -km /path/to/app
umount /path/to/app
to make my snapshot...
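
As a rough sketch of that workaround (kill users of the mount point, unmount, snapshot, remount), something like the following could be used. Only the `fuser` and `umount` commands and the `/path/to/app` placeholder come from the comment; the origin LV `/dev/vg0/lvapp`, the snapshot name and size, the remount step, and the `run`/`DRY_RUN` helper are assumptions for illustration. It prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of the unmount-then-snapshot workaround. Dry run by default.
DRY_RUN="${DRY_RUN:-1}"
APP_MNT="${APP_MNT:-/path/to/app}"   # placeholder path from the comment
ORIGIN="${ORIGIN:-/dev/vg0/lvapp}"   # hypothetical origin LV
SNAP_NAME="${SNAP_NAME:-snap}"       # hypothetical snapshot name
SNAP_SIZE="${SNAP_SIZE:-2G}"         # hypothetical snapshot size

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

snapshot_while_unmounted() {
    # Kill processes using the mount point. fuser exits non-zero when
    # nothing is using it, so that result is ignored.
    run fuser -km "$APP_MNT" || true
    # Without a successful unmount there is no point in continuing.
    run umount "$APP_MNT" || return 1
    # Take the snapshot while the origin is unmounted.
    run lvcreate -s -L "$SNAP_SIZE" -n "$SNAP_NAME" "$ORIGIN"
    status=$?
    # Remount the origin whether or not the snapshot succeeded.
    run mount "$ORIGIN" "$APP_MNT" || status=$?
    return "$status"
}

snapshot_while_unmounted
```

Note that `fuser -km` kills every process using the mount point, so the application has to be restarted afterwards.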

Comment 3 Stephen Tweedie 2003-10-23 17:23:11 UTC
Created attachment 95434 [details]
Minimal fix for LVM snapshotting

Comment 4 Stephen Tweedie 2003-10-23 17:27:33 UTC
There are two parts to the problem.  One is the absence of the
LVM_VFS_ENHANCEMENT #define, which managed to get lost for reasons unknown
during an update of our trees.  The other is that even with that fix, you get
benign but annoying messages like

lvm - lvm_map: ll_rw_blk write for readonly LV /dev/spock/snap1
Can't write to read-only device 3a:06

in the logs.  Those are harmless, but fixable; but the fix is moderately risky
so I'll send it upstream in the first instance.

The minimal fix, for LVM_VFS_ENHANCEMENT (plus an additional special-case fix
for data=writeback mode) is in the previous attachment; it's also in our errata
trees now.  But somehow that fix only got picked up into the 2.4.20-20.8 errata
for 8.0; the RHL 7.x and 9 versions of the kernel managed to miss it.

Anyway, it's fixed internally with the patch above, which you can use in the
meantime.  Reassigning to our errata maintainer.

Comment 5 Stephen Tweedie 2003-10-23 17:32:03 UTC
*** Bug 88115 has been marked as a duplicate of this bug. ***

Comment 6 Stephen Tweedie 2003-10-23 17:33:37 UTC
*** Bug 84278 has been marked as a duplicate of this bug. ***

Comment 7 Damian Menscher 2003-10-23 18:12:49 UTC
Do you have an ETA for the errata release?  While we wait, others might be 
interested in my fix for doing dumps: rsync to another partition, then dump the 
copy.  Of course, this requires double the disk space....
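
A minimal sketch of that rsync-then-dump approach might look like this. All three paths are hypothetical placeholders, the choice of a level 0 dump is an assumption, and the `run`/`DRY_RUN` helper is added here so that the script only prints the commands by default.

```shell
#!/bin/sh
# Sketch of the "rsync to another partition, then dump the copy" workaround.
# Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
SRC="${SRC:-/path/to/live}"                     # hypothetical: active file system
COPY="${COPY:-/path/to/copy}"                   # hypothetical: mount point of the spare partition
DUMP_FILE="${DUMP_FILE:-/path/to/backup.dump}"  # hypothetical: dump output file

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

backup_via_copy() {
    # Mirror the live tree onto the spare partition, removing files that
    # no longer exist in the source.
    run rsync -a --delete "$SRC/" "$COPY/" || return 1
    # Dump the copy, which is idle while dump reads it (level 0 = full dump).
    run dump -0 -f "$DUMP_FILE" "$COPY"
}

backup_via_copy
```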

Comment 8 Philip Spencer 2003-12-02 05:11:44 UTC
Well, a new errata release of the kernel (2.4.20-24.9) for RHL9 is out now.

And it's still broken! Which sort of contradicts the earlier posting about the fix being "in the errata tree" (implying it would be released with the next errata).

Attention RedHat: support and timely release of bugfixes have seriously deteriorated over the past six months. Like this one -- how many months of a patch being "in the errata tree" is it reasonable to have pass before it starts being reflected in the errata which are actually released? The same has been true of the bug reports that I've filed recently.

Judging by the past track record, it now seems likely that this bug will never be fixed, and we will now have to build our own kernel RPMS.

Once the "support" period for RHL9 expires in April, customers will have to choose between paying more for Enterprise Linux or switching to a free version (Fedora or another distribution).

Six months ago, Enterprise seemed an attractive option; RedHat's support was good. However, now that support has deteriorated so much, I think RedHat will have a hard time persuading anybody to pay more money for it. Which is a shame, because it used to be a very well supported and put together distribution, and it's sad to see it so much in decline with serious bugs like this languishing unfixed for so many months.

Comment 9 Damian Menscher 2003-12-02 05:34:11 UTC
Anyone know if this has been fixed in Fedora?  What about in RHEL?

BTW, I agree that support is severely lacking here.  The bug was 
reported in February with fixes suggested as early as May.  Now even 
when fixes are in the errata tree, they just get "missed" time after 
time.  Considering the inability of "dump" to create a consistent 
backup of an active filesystem, we're forced to resort to rsync to 
extra disks to get our backups.

We're considering RHEL as a future option.  Maybe I can beat RH into 
action when I have a support contract and a judge.  :P

Comment 10 Damian Menscher 2003-12-02 06:05:11 UTC
I just checked the kernel sources on a Fedora box, and the patch made 
it in there.  I also downloaded the .src.rpm for RHEL3 and the patch 
was there.  I checked a RH8.0 machine with 2.4.20-20.8 and it does 
NOT appear to have the patch, contrary to the comment above.
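
One way to run this kind of check on an unpacked kernel source tree is to search it for the define named in comment 4. This is a sketch rather than the exact check used above: the default tree location `/usr/src/linux-2.4` and the function name are assumptions, and a match only shows that `LVM_VFS_ENHANCEMENT` is defined somewhere in the tree, not that the rest of the patch is applied.

```shell
#!/bin/sh
# Sketch: report whether LVM_VFS_ENHANCEMENT is #defined anywhere in a kernel source tree.
KSRC="${KSRC:-/usr/src/linux-2.4}"   # assumed location of the unpacked kernel sources

has_vfs_enhancement() {
    # $1: root of an unpacked kernel source tree.
    # Matches "#define LVM_VFS_ENHANCEMENT" lines only, not "#ifdef" uses.
    grep -rqE '^[[:space:]]*#[[:space:]]*define[[:space:]]+LVM_VFS_ENHANCEMENT' "$1"
}

if [ ! -d "$KSRC" ]; then
    echo "no kernel source tree at $KSRC"
elif has_vfs_enhancement "$KSRC"; then
    echo "LVM_VFS_ENHANCEMENT is defined in $KSRC"
else
    echo "LVM_VFS_ENHANCEMENT is NOT defined in $KSRC"
fi
```

For a .src.rpm, the sources and patches would have to be unpacked and applied first so that the search covers the patched tree.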

Comment 11 Michael Paesold 2003-12-02 11:22:38 UTC
We already have RHEL v.3. Nevertheless, some core services currently
run on 9.0. We cannot update those servers. We have to install the
services on a new machine and then switch. But what about the 500 GB of
LVM-managed data? And we can't make consistent backups because of
this bug! The bug history -- it's a shame -- RedHat!

Comment 12 Stephen Tweedie 2003-12-04 19:10:23 UTC
The fixes _are_ in the bugfix errata tree, but the last kernel was not
built from that branch --- it was a security errata with minimal
change against the old 2.4.20-20.9 kernel, released at short notice
due to the do_brk exploit.

We expect the proper bugfix errata to be released shortly.

Comment 13 Chris Adams 2003-12-09 15:28:58 UTC
I'm running kernel 2.4.20-24.8 rebuilt with this patch and
LVM_VFS_ENHANCEMENT defined, and I had a crash last night that I think
may have been when a snapshot was being created, so there may still be
a problem here somewhere.  See bug 111735 for more information.

Comment 14 Stephen Tweedie 2003-12-09 19:56:24 UTC
There's no sign of an LVM footprint in that oops, and the crash is
accessing a data structure which is (a) often associated with bad
memory, and (b) not touched by anything on LVM.  Is it reproducible? 
I'd be inclined to suspect something else at this stage, but obviously
if you see it again that will give more info.

Comment 15 Dave Jones 2003-12-14 00:10:20 UTC
*** Bug 111337 has been marked as a duplicate of this bug. ***

Comment 16 Dave Jones 2003-12-14 00:12:13 UTC
There is another bugfix errata currently in QA, that should be out
'real soon'. I apologise for this fix not making it into the recent
update, but that was a quick release in order to fix the recent do_brk
security problem. To 'rush' that kernel through QA, a kernel with
minimal change vs the previous errata kernel was deemed necessary.