Bug 1359596 - All recent kernels fail to boot claiming metadata corruption. 4.2.3 boots fine [NEEDINFO]
Summary: All recent kernels fail to boot claiming metadata corruption. 4.2.3 boots fine
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 23
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: fs-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-07-25 06:10 UTC by Phil V
Modified: 2016-10-26 16:57 UTC
CC: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-26 16:57:23 UTC
labbott: needinfo? (pv.bugzilla)


Attachments
Photo of failed boot of kernel 4.6.4 (deleted)
2016-07-25 06:10 UTC, Phil V
no flags Details
Photo of failed boot of kernel 4.5.7-202 (deleted)
2016-07-25 06:12 UTC, Phil V
no flags Details
Photo of failed boot of kernel 4.5.7-200 (deleted)
2016-07-25 06:13 UTC, Phil V
no flags Details
Photo of failed boot of kernel 4.5.5-201 (deleted)
2016-07-25 06:14 UTC, Phil V
no flags Details
Photo of failed boot of kernel 4.4.9 (deleted)
2016-07-25 06:16 UTC, Phil V
no flags Details
Photo of failed boot of kernel 4.4.8 (deleted)
2016-07-25 06:17 UTC, Phil V
no flags Details

Description Phil V 2016-07-25 06:10:20 UTC
Created attachment 1183552 [details]
Photo of failed boot of kernel 4.6.4

Fedora 23 system boots successfully with the original kernel, 4.2.3.
All recent kernels, from 4.4.8 through 4.6.4, fail to boot, reporting

XFS (rootPartition): Metadata corruption detected at xfs_dir3_block_read_verify+0x5e/0x100 [xfs], block 0x19188

(On newer kernels the message also includes "xfs_dir3_block" after the comma.)

The system then announces
Entering emergency mode. 
(This emergency mode is extremely limited, and will not reboot)


Please note that booting into CentOS and running 'xfs_repair -n /dev/sdaX' reports no errors.


See attached photos.

Comment 1 Phil V 2016-07-25 06:12:36 UTC
Created attachment 1183554 [details]
Photo of failed boot of kernel 4.5.7-202

Comment 2 Phil V 2016-07-25 06:13:49 UTC
Created attachment 1183560 [details]
Photo of failed boot of kernel 4.5.7-200

Comment 3 Phil V 2016-07-25 06:14:39 UTC
Created attachment 1183561 [details]
Photo of failed boot of kernel 4.5.5-201

Comment 4 Phil V 2016-07-25 06:16:17 UTC
Created attachment 1183562 [details]
Photo of failed boot of kernel 4.4.9

Comment 5 Phil V 2016-07-25 06:17:10 UTC
Created attachment 1183567 [details]
Photo of failed boot of kernel 4.4.8

Comment 6 Carlos Maiolino 2016-07-25 12:46:29 UTC
Have you tested all the kernels on the same machine without reformatting it?

From all the images, it looks like you have a corrupted filesystem rather than a bug.

What the images you posted show are I/O errors on the same block, in the same pattern. Have you tried running xfs_repair on it?

Comment 7 Brian Foster 2016-07-25 12:52:42 UTC
Most of the images say something like "systemd-fstab-generator[NNN]: Failed to open /sysroot/etc/fstab: Structure needs cleaning," which is interesting. That suggests the fs mounts fine and whatever this utility is doing triggers or stumbles upon the corruption.

Given that you have the ability to boot this system with other kernels/OS, could you try to reproduce this corruption more directly (outside of this utility)? For example, boot with a live image or some such, mount the fs and try 'ls <mnt>/etc', 'cat <mnt>/etc/fstab', etc. If you can reproduce the corruption in that manner, the best bet might be to provide the steps and create an xfs_metadump of the fs (note that xfs_metadump -o disables obfuscation, otherwise we'll probably need the inode number of etc/fstab as well in order to find it).
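For example, something along these lines from the live environment (a sketch only; substitute the actual device and mount point):

# mkdir -p /mnt/test
# mount /dev/sdaX /mnt/test
# ls /mnt/test/etc
# cat /mnt/test/etc/fstab
# dmesg | tail        # check for new XFS corruption messages after each access
# umount /mnt/test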

Comment 8 Phil V 2016-07-27 03:06:35 UTC
(In reply to Carlos Maiolino from comment #6)
> Have you tested all kernels in the same machine without reformatting it?

I generated the uploaded images on the same day, without intentionally making changes to the filesystem. Each time emergency mode is reached, it refuses to reboot or shut down (it claims doing so would be destructive), so I force the hardware to power off. On reboot I choose a different kernel. If I choose the original kernel, the system loads and runs fine.
 
> From all images, looks like you have a corrupted filesystem and not a bug.

Yes, that is how it appears. Please note that when I boot the CentOS partition and scan the supposedly corrupted filesystem with 'xfs_repair -n', no errors are reported.

> What the images you posted shows, are IO errors to the same block, in the
> same pattern. Have you tried to run xfs_repair on it?

I do not recall whether xfs_repair works at all in the emergency mode.
I am not sure I should trust it to make changes if it does.

Comment 9 Phil V 2016-07-27 03:31:55 UTC
Thank you, Carlos and Brian, for looking into this.

(In reply to Brian Foster from comment #7)
> Most of the images say something like "systemd-fstab-generator[NNN]: Failed
> to open /sysroot/etc/fstab: Structure needs cleaning," which is interesting.
> That suggests the fs mounts fine and whatever this utility is doing triggers
> or stumbles upon the corruption.

Yes, it seems this way to me.

> Given that you have the ability to boot this system with other kernels/OS,
> could you try to reproduce this corruption more directly (outside of this
> utility)? For example, boot with a live image or some such, mount the fs and
> try 'ls <mnt>/etc', 'cat <mnt>/etc/fstab', etc. If you can reproduce the
> corruption in that manner, the best bet might be to provide the steps and
> create an xfs_metadump of the fs (note that xfs_metadump -o disables
> obfuscation, otherwise we'll probably need the inode number of etc/fstab as
> well in order to find it).

Brian, I am not sure I understand how your suggestion differs from what I have already tried. If my response here is not complete enough, please clarify:

If I boot with the original kernel: there are no problems using the partition.

If I boot the CentOS partition and mount the accused filesystem, there are no errors when browsing it from the command line or with a GUI file browser. If I unmount it and run 'xfs_repair -n', the results are:

xfs_repair -n /dev/sda9
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

It is my impression that this output indicates no filesystem errors.

Comment 10 Phil V 2016-07-27 03:39:24 UTC
Booting with CentOS:
'dmesg|grep sda9'
only gives lines like:

[] XFS (sda9): Mounting V5 Filesystem
[] XFS (sda9): Ending clean mount
[] SELinux: initialized (dev sda9, type xfs), uses xattr
[] XFS (sda9): Unmounting Filesystem

Comment 11 Eric Sandeen 2016-07-27 03:43:36 UTC
Yes, that looks like a clean xfs_repair.  Strange.

There were a few spurious verifier failures in the past but this one isn't ringing a bell.

At this point I wonder if an xfs_metadump would be the best bet, so we could examine the metadata directly.  If you can boot a rescue or live CD with xfs_metadump on it,

# xfs_metadump -o /dev/<rootdev> - | bzip2 > /mounted/usb/stick/metadump.bz2

or something like that.  If filenames are considered sensitive, you can remove the "-o".  The metadump image will contain *no* file data, only metadata like names, timestamps, structures, etc.
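(For what it's worth, on the receiving end the image can be restored to a sparse file with xfs_mdrestore and examined offline, roughly:

# bunzip2 metadump.bz2
# xfs_mdrestore metadump restored.img
# xfs_repair -n restored.img

so nothing on your disk gets touched in the process.)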

If you'd rather not attach the image or it's too big let one of us know and we can arrange a different way to transfer.

I suppose another option might be to boot a bleeding-edge rawhide kernel, and see if it persists.  But to get to the bottom of the issue I think the metadump would be most useful.

Thanks,
-Eric

Comment 12 Eric Sandeen 2016-07-27 03:44:28 UTC
(you can find me on freenode irc in #xfs for a little while longer tonight, if you'd like, or in the #fedora-kernel channel)

-Eric

Comment 13 Eric Sandeen 2016-07-27 03:54:04 UTC
Another approach, if the metadump route is not possible, is to use xfs_db (which I think *might* be in the rescue shell, not sure) to examine the block in question:

# xfs_db /dev/sda9
xfs_db> daddr 0x19188
xfs_db> type dir3
xfs_db> print

-Eric

Comment 14 Dave Chinner 2016-07-27 04:46:00 UTC
A couple of things. This is a CRC-enabled filesystem, so the xfs_repair in CentOS is not going to find everything that may be wrong: CentOS ships with a 3.2.x version, and so does not have many of the fixes we've added over the past 18+ months. I'd recommend making sure that you run repair from a more recent xfsprogs package, such as 4.5.0, or 4.7-rc2 directly from the git tree (which will become 4.7.0 in the next couple of days).
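(A rough sketch of getting a newer xfs_repair without touching the installed packages, assuming the usual build dependencies are present; run ./configure first if plain make complains:

$ git clone git://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git
$ cd xfsprogs-dev
$ make
$ ./repair/xfs_repair -V

The freshly built binaries sit under repair/, db/, etc. and can be run in place.)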

Also, from the data dump in the photos, the owner info in the directory block says it belongs to inode 0x60, which is often the root inode. Can you add the output of "ls -i /" on the problem filesystem so we can see the inode numbers of the entries in the root directory, and check whether it really is the owner inode we are expecting for this directory block?
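(For comparison purposes: 0x60 is 96 in decimal, which is the form 'ls -i' prints; 'ls -di /' will show the inode number of the root directory itself.)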

-Dave.

Comment 15 Phil V 2016-07-27 06:58:32 UTC
Should I run 'ls -i /' after booting that partition?
Or do you want the output of (booted into CentOS):
ls -i /run/media/USER/PARTITIONUUID/    ?

(In reply to Dave Chinner from comment #14)
> couple of things. This is a crc enabled filesystem, so the xfs_repair in
> CentOS is not going to find everything that may be wrong - centos ships with
> a 3.2.x version, and so not have many of the fixes we've added in the past
> 18+ months. I'd recommend making sure that you run repair from a more recent
> xfsprogs package, such as 4.5.0 or 4.7-rc2 direct from the git tree (which
> will become 4.7.0 in the next couple of days).
> 
> Also, from the data dump in the photos, the owner info in the directory
> block says it belongs to inode 0x60, which is often the root inode. can you
> add the output of "ls -i /" on th eproblem filesystem so we can see the
> inode numbers of the entries in the root directory to see if it really is
> the owner inode we are expecting for this directory block....
> 
> -Dave.

Comment 16 Phil V 2016-07-27 07:02:30 UTC
If I should run 'ls -i /' after booting that partition, does it matter whether I use 4.2.3 or a newer kernel that falls into emergency mode?

(In reply to Dave Chinner from comment #14)

> Also, from the data dump in the photos, the owner info in the directory
> block says it belongs to inode 0x60, which is often the root inode. can you
> add the output of "ls -i /" on th eproblem filesystem so we can see the
> inode numbers of the entries in the root directory to see if it really is
> the owner inode we are expecting for this directory block....
> 
> -Dave.

Comment 17 Brian Foster 2016-07-27 11:50:24 UTC
(In reply to Phil V from comment #9)
...
> 
> > Given that you have the ability to boot this system with other kernels/OS,
> > could you try to reproduce this corruption more directly (outside of this
> > utility)? For example, boot with a live image or some such, mount the fs and
> > try 'ls <mnt>/etc', 'cat <mnt>/etc/fstab', etc. If you can reproduce the
> > corruption in that manner, the best bet might be to provide the steps and
> > create an xfs_metadump of the fs (note that xfs_metadump -o disables
> > obfuscation, otherwise we'll probably need the inode number of etc/fstab as
> > well in order to find it).
> 
> Brian, I am not sure if I am understanding how your suggestion is different
> from what I have already tried. If my response here is not complete enough
> please clarify:
> 

What I mean to ask is: can we narrow down the reproducer on a kernel that exhibits the problem? Right now we know boot fails at systemd-fstab-generator, but we don't really know what it's doing. We do know that the fs mounts fine, however, since boot gets far enough that systemd accesses files on it. So can we boot directly to emergency mode (or some such means of using a "bad" kernel), mount the fs manually, poke around that file, and see what causes the crash? That might even include running something like 'strace systemd-fstab-generator' if all else fails (note that I have no idea what that utility does beyond what is implied by the name).
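Roughly, from the emergency shell, something like the following (a sketch only; the generator path is where I'd expect Fedora to install it, and generators normally take three output directories as arguments, so adjust as needed):

# mount /dev/sda9 /mnt
# ls /mnt/etc
# cat /mnt/etc/fstab
# mkdir -p /tmp/gen
# strace -f /usr/lib/systemd/system-generators/systemd-fstab-generator /tmp/gen /tmp/gen /tmp/gen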

Also note it might be wise to run xfs_metadump first in case we do something that happens to clear the state. If the above is not feasible, a metadump at the very least would allow us to try some similar experiments on the metadata image.

An unrelated question: the initial report mentions that a variety of kernels have been tested. Some appear to be official Fedora kernels (4.5.7-202), while at other times you refer to 4.4.8, 4.6.4, etc., which implies pristine upstream kernels. Are you indeed referring to pure upstream kernels in those cases, or are all of these Fedora kernels?

Comment 18 Brian Foster 2016-07-27 11:53:56 UTC
(In reply to Phil V from comment #16)
> If I should run ls -i / after booting that partition, does it matter whether
> with 4.2.3 or a new kernel that falls into emergency mode?
> 

Not sure I follow what you mean above, but I believe Dave is asking for the 'ls -i /' output from a fully successful boot. E.g., use a kernel that works, complete the boot as normal and run the aforementioned command to get some information on the contents of the root directory.

Note that we'll also be able to get this information from the metadump if you can provide one.

Comment 19 Phil V 2016-07-29 01:31:39 UTC
(In reply to Dave Chinner from comment #14)
> couple of things. This is a crc enabled filesystem, so the xfs_repair in
> CentOS is not going to find everything that may be wrong - centos ships with
> a 3.2.x version, and so not have many of the fixes we've added in the past
> 18+ months. I'd recommend making sure that you run repair from a more recent
> xfsprogs package, such as 4.5.0 or 4.7-rc2 direct from the git tree (which
> will become 4.7.0 in the next couple of days).

I booted with the Fedora 24 install DVD. 

$ xfs_repair -V
xfs_repair version 4.5.0

# xfs_repair -n /dev/sda9
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 2
        - agno = 1
        - agno = 4
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
Maximum metadata LSN (51:10436) is ahead of log (8:22393).
Would format log to cycle 54.
No modify flag set, skipping filesystem flush and exiting.

Comment 20 Dave Chinner 2016-07-29 04:34:15 UTC
(In reply to Phil V from comment #19)
> (In reply to Dave Chinner from comment #14)
> > couple of things. This is a crc enabled filesystem, so the xfs_repair in
> > CentOS is not going to find everything that may be wrong - centos ships with
> > a 3.2.x version, and so not have many of the fixes we've added in the past
> > 18+ months. I'd recommend making sure that you run repair from a more recent
> > xfsprogs package, such as 4.5.0 or 4.7-rc2 direct from the git tree (which
> > will become 4.7.0 in the next couple of days).
> 
> I booted with the Fedora 24 install DVD. 
> 
> $ xfs_repair -V
> xfs_repair version 4.5.0

[....]

> Phase 7 - verify link counts...
> Maximum metadata LSN (51:10436) is ahead of log (8:22393).
> Would format log to cycle 54.

Yeah. That. Log recovery is going to do unpredictable things, because running an old version of xfs_repair has zeroed the log and reset the sequence numbers used to determine whether log recovery should replay an item in the log or not. That is a corruption vector; whether or not it is the cause of the problem you're having remains to be seen. Please run xfs_repair (the v4.5.0 version) without the "-n" to reformat the log with the correct sequence numbers, so we know log recovery is doing the right thing from now on.

-Dave.

Comment 21 Phil V 2016-07-29 05:19:34 UTC
Dave, do I want to back up the partition before running xfs_repair, or after?

Comment 22 Dave Chinner 2016-07-29 08:28:36 UTC
(In reply to Phil V from comment #21)
> Dave, do I want to back up the partition before running xfs_repair, or after?

Before, if you want to protect against something bad happening and losing data. The likelihood of that happening is extremely low, however. It all depends on how paranoid you are. Your data, your choice.
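(If you do back it up first, a plain block-level copy of the unmounted partition is enough, e.g.:

# dd if=/dev/sda9 of=/path/to/backup/sda9.img bs=4M

where the destination path is a placeholder on a different filesystem with sufficient free space.)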

-Dave.

Comment 23 Phil V 2016-07-29 09:19:09 UTC
# xfs_repair -v /dev/sda9
Phase 1 - find and verify superblock...
        - block cache size set to 3070416 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 27130 tail block 27130
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 4
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (51:10436) is ahead of log (8:27130).
Format log to cycle 54.

        XFS_REPAIR Summary    Fri Jul 29 05:18:14 2016

Phase		Start		End		Duration
Phase 1:	07/29 05:18:12	07/29 05:18:12	
Phase 2:	07/29 05:18:12	07/29 05:18:12	
Phase 3:	07/29 05:18:12	07/29 05:18:13	1 second
Phase 4:	07/29 05:18:13	07/29 05:18:13	
Phase 5:	07/29 05:18:13	07/29 05:18:13	
Phase 6:	07/29 05:18:13	07/29 05:18:14	1 second
Phase 7:	07/29 05:18:14	07/29 05:18:14	

Total run time: 2 seconds
done

Comment 24 Phil V 2016-07-29 10:18:45 UTC
Thank you all -- the system now boots successfully with the kernels I have tried!

However, after a minute of inactivity the screen goes blank and keyboard activity cannot reactivate it.
Furthermore, the keyboard seems to be unresponsive: the Caps Lock and Num Lock LEDs do not toggle.

Do you think this is related or a different bug?

I note the Closed/EOL bug https://bugzilla.redhat.com/show_bug.cgi?id=1185597

What is the best way to move forward?

Comment 25 Brian Foster 2016-07-29 12:29:02 UTC
Sounds like a possible kernel panic. There's no way of knowing what it's related to without seeing the log output with stack trace and whatnot. Is the machine accessible via the network when this occurs?

If not, you could try to monitor via a serial console, or, perhaps easier, configure kdump, which on panic will automatically jump to a standby kernel, generate a vmcore and dmesg output under /var/crash/ (this may take a few minutes to complete), and restart. (Please do file a separate bug with that data if this is not related to the original problem.)
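Roughly, the kdump setup on Fedora looks like this (a sketch; the crashkernel reservation size depends on the machine and may need tuning):

# dnf install kexec-tools
# grubby --update-kernel=ALL --args="crashkernel=256M"
# reboot
# systemctl enable --now kdump

After that, a panic should leave a vmcore and a dmesg text file under /var/crash/.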

Comment 26 Laura Abbott 2016-09-23 19:50:05 UTC
*********** MASS BUG UPDATE **************
 
We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 23 kernel bugs.
 
Fedora 23 has now been rebased to 4.7.4-100.fc23.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 24 or 25, and are still experiencing this issue, please change the version to Fedora 24 or 25.
 
If you experience different issues, please open a new bug report for those.

Comment 27 Laura Abbott 2016-10-26 16:57:23 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

