Bug 1691785 - Unable to mount BTRFS file system to remove missing drive
Summary: Unable to mount BTRFS file system to remove missing drive
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: btrfs-progs
Version: 29
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Josef Bacik
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-22 14:03 UTC by Chris Guilbault
Modified: 2019-03-23 03:53 UTC
CC: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-23 03:53:41 UTC



Description Chris Guilbault 2019-03-22 14:03:38 UTC
Description of problem:
Lost a drive due to either a drive failure or a SATA controller failure. Attempted to mount the file system in degraded mode, but it would not mount. It can be mounted read-only, but 'btrfs device delete missing /<mount point>' cannot be used due to the read-only flag.

Version-Release number of selected component (if applicable):


How reproducible:
Unable to test since I only have one btrfs array.

Steps to Reproduce:
1. Have a drive go missing, either through drive failure or SATA controller failure
2. Attempt to mount in degraded mode
3. Attempt to remove missing drives.

Actual results:
Unable to mount in rw mode due to missing drives.

Expected results:
Mount in rw degraded mode allowing the removal or replacement of missing devices.

Additional info:
wolf@io:~$sudo mount -t btrfs -o degraded /dev/sda /array/
mount: /array: wrong fs type, bad option, bad superblock on /dev/sda, missing codepage or helper program, or other error.
wolf@io:~$sudo dmesg | tail
[  282.976861] BTRFS warning (device sdb1): writeable mount is not allowed due to too many missing devices
[  282.998878] BTRFS error (device sdb1): open_ctree failed
[  300.138362] BTRFS info (device sdb1): allowing degraded mounts
[  300.138363] BTRFS info (device sdb1): disk space caching is enabled
[  300.148898] BTRFS warning (device sdb1): devid 6 uuid 16b078bc-623e-4a41-8d4b-5e1aaf45d981 is missing
[  300.630138] BTRFS info (device sdb1): bdev /dev/sdf1 errs: wr 8, rd 22, flush 0, corrupt 0, gen 0
[  300.630148] BTRFS info (device sdb1): bdev (null) errs: wr 8, rd 84, flush 0, corrupt 0, gen 0
[  301.989691] BTRFS warning (device sdb1): chunk 16795374387200 missing 1 devices, max tolerance is 0 for writeable mount
[  301.989695] BTRFS warning (device sdb1): writeable mount is not allowed due to too many missing devices
[  302.006720] BTRFS error (device sdb1): open_ctree failed
wolf@io:~$sudo mount -t btrfs -ro degraded /dev/sda /array/
wolf@io:~$sudo btrfs filesystem show
Label: 'fedora-server_io'  uuid: 9dc6a74b-5dad-422b-8010-a99cd5725788
        Total devices 7 FS bytes used 6.91TiB
        devid    1 size 1.82TiB used 1.49TiB path /dev/sdb1
        devid    2 size 1.82TiB used 1.48TiB path /dev/sde2
        devid    3 size 1.82TiB used 1.49TiB path /dev/sdf1
        devid    4 size 1.82TiB used 1.48TiB path /dev/sdc
        devid    5 size 1.82TiB used 340.00GiB path /dev/sdd
        devid    7 size 1.82TiB used 340.00GiB path /dev/sda
        *** Some devices missing

wolf@io:~$sudo btrfs device delete missing /array/
ERROR: error removing device 'missing': Read-only file system

Comment 1 Chris Murphy 2019-03-22 17:35:16 UTC
The read-only degraded mount policy is enforced by the kernel code, not the btrfs user space tools. Strictly speaking it's not a bug; it's a heavy-handed policy to avoid making the file system more inconsistent, and in particular it depends on the mkfs profile used. What do you get for:

# btrfs fi us /mntpoint/

This is a read-only command, and will work on an ro mounted drive.

Comment 2 Chris Guilbault 2019-03-23 01:17:00 UTC
Overall:
    Device size:                  12.74TiB
    Device allocated:              6.93TiB
    Device unallocated:            5.80TiB
    Device missing:                1.82TiB
    Used:                          6.92TiB
    Free (estimated):              5.81TiB      (min: 2.91TiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,RAID0: Size:6.91TiB, Used:6.90TiB
   /dev/sda      340.00GiB
   /dev/sdb1       1.48TiB
   /dev/sdc        1.48TiB
   /dev/sdd      340.00GiB
   /dev/sde2       1.48TiB
   /dev/sdf1       1.48TiB
   missing       340.00GiB

Metadata,RAID1: Size:9.00GiB, Used:8.15GiB
   /dev/sdb1       7.00GiB
   /dev/sdc        5.00GiB
   /dev/sdf1       6.00GiB

System,RAID1: Size:32.00MiB, Used:384.00KiB
   /dev/sdb1      32.00MiB
   /dev/sdf1      32.00MiB

Unallocated:
   /dev/sda        1.49TiB
   /dev/sdb1     340.98GiB
   /dev/sdc      343.02GiB
   /dev/sdd        1.49TiB
   /dev/sde2     348.02GiB
   /dev/sdf1     341.98GiB
   missing         1.49TiB

I feel kind of bad that I haven't balanced the array in ages, but I guess the upside is it's only missing 340 GB of data...

Comment 3 Chris Murphy 2019-03-23 03:53:41 UTC
(In reply to Chris Guilbault from comment #2)

> Data,RAID0: Size:6.91TiB, Used:6.90TiB
>    /dev/sda      340.00GiB
>    /dev/sdb1       1.48TiB
>    /dev/sdc        1.48TiB
>    /dev/sdd      340.00GiB
>    /dev/sde2       1.48TiB
>    /dev/sdf1       1.48TiB
>    missing       340.00GiB

This is the raid0 profile for data, so there's no redundancy. With any other raid0 (mdadm, lvm, hardware), a single missing device means total data loss. With Btrfs it's a little different: you might be able to extract some things, but it depends.

Anyway, it's going read only because it's raid0 and is missing a device.

Were all of these drives part of the original mkfs? Or were some added later? If you did something like sdb1, sdc, sde2, sdf1 at mkfs, filled it up with 1.2TB of stuff, and then later added sda, sdd, sdg (missing), something very interesting happens on Btrfs. Quite a lot of your original data is in stripes on those original drives. Only after adding the device that's now missing will there be data with 64KiB holes in it due to the missing device. So depending on which drives were present at mkfs and how the drives were populated, that missing 340GiB isn't likely to affect most of the data. But 340GiB of holes is still quite a lot...


> Metadata,RAID1: Size:9.00GiB, Used:8.15GiB
>    /dev/sdb1       7.00GiB
>    /dev/sdc        5.00GiB
>    /dev/sdf1       6.00GiB
> 
> System,RAID1: Size:32.00MiB, Used:384.00KiB
>    /dev/sdb1      32.00MiB
>    /dev/sdf1      32.00MiB

These block groups are the file system itself; it's raid1, so there are two copies of metadata, distributed across three devices, none of which are missing. So as weird as it seems, the file system itself is OK; you just have a bunch of 64KiB holes in some of your data due to the raid0 striping and the missing device.

> 
> Unallocated:
>    /dev/sda        1.49TiB
>    /dev/sdb1     340.98GiB
>    /dev/sdc      343.02GiB
>    /dev/sdd        1.49TiB
>    /dev/sde2     348.02GiB
>    /dev/sdf1     341.98GiB
>    missing         1.49TiB
> 
> I feel kind of bad that I haven't balanced the array in ages, but I guess the
> upside is it's only missing 340 GB of data...

If the data profile were single, that would be true. But the data profile is raid0, so everything is striped with a stripe element size of 64KiB. So depending on the history of the file system, some or all files have 64KiB holes in them due to the missing device.

A balance before this failure would have meant almost everything is damaged, because a balance restripes across all devices. So if the devices are different ages and the dead drive was added later, there's a good chance of less damage because you didn't do a balance.

If you have a 128KiB file striped over two devices and you lose 64KiB of it, is that 64KiB of data loss or 128KiB? The file is cut in half. If it's a 1GiB file, you'll have a bunch of 64KiB holes in it, because it's striped across all the devices. Of course it's more complicated because you have so many drives; maybe some smaller files are stored only on the non-missing drives. And any files smaller than about 3K are actually stored in metadata leaves, so they're fine.
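
As a rough back-of-the-envelope illustration (this assumes a file whose extents live in chunks that stripe across all seven devices including the missing one, which the allocation above shows is only true for part of the data):

   1GiB file / 64KiB stripe element   = 16384 stripe elements
   16384 elements / 7 devices         ≈ 2341 elements land on the missing device
   2341 elements * 64KiB              ≈ 146MiB of that file lost, scattered as 64KiB holes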

A few things to note:

- Btrfs will not copy out corrupted files from a mounted file system. If you copy a corrupt file, Btrfs won't give you a partial file; it will return EIO, so you get nothing. But it will report a path to the corrupt file, so at least you know what's damaged. The path information for bad files is recorded in kernel messages, `journalctl -k`. So if you want a list of corruptions, you'll have to parse the journal.
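
  Something along these lines should pull the relevant messages out of the journal; the exact wording of the kernel messages varies by kernel version, so treat the grep pattern as a starting point rather than a definitive filter:

  # journalctl -k | grep -i btrfs | grep -iE 'error|corrupt|csum'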

- You can do a scrub to get a list of corrupt files. Since it's mounted ro, you'll need to do 'btrfs scrub start -r /mountpoint/', and again the corrupt files will be listed in kernel messages.
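
  For example (a sketch; 'btrfs scrub status' only reports summary error counters, the per-file paths still come from the kernel log):

  # btrfs scrub start -r /mountpoint/
  # btrfs scrub status /mountpoint/
  # journalctl -k | grep -i btrfs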

- You can do an offline scrape using 'btrfs restore', which can be configured to tolerate errors and will give you partial files with the holes in them.
https://btrfs.wiki.kernel.org/index.php/Restore
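
  Roughly, with the array unmounted, that would look something like the sketch below; /mnt/recovery-target is a made-up destination and needs to be a separate, healthy file system with enough free space (-i ignores errors, -v lists files as they're restored):

  # btrfs restore -i -v /dev/sdb1 /mnt/recovery-target/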

- You'd need to ask on the upstream Btrfs list whether a patch is possible that would let you build a kernel that tolerates mounting this volume read-write, so you can delete all the corrupt files. They'd have to be deleted before Btrfs will allow you to remove the missing device. Then you'd rebalance, and at that point you can convert data from raid0 to single if you want; that would make a future lost drive more tolerable. If you post to the btrfs list you can add me to the cc; I'm active on that list.
http://vger.kernel.org/vger-lists.html#linux-btrfs
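
  If you get that far (corrupt files deleted and the volume mountable read-write under such a patched kernel), the remaining steps would look roughly like this sketch, using the /array/ mount point from the original report:

  # btrfs device delete missing /array/
  # btrfs balance start -dconvert=single /array/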

Anyway, I'm gonna mark this as NOTABUG, since given the circumstances the behavior is expected.

