Bug 1361952 - crash: page excluded: kernel virtual address: ... type: ["fill_task_struct"|"list entry"]
Summary: crash: page excluded: kernel virtual address: ... type: ["fill_task_struct"|"...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: crash
Version: 24
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Dave Anderson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-01 03:54 UTC by Ian Wienand
Modified: 2016-08-18 13:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-18 13:40:48 UTC



Description Ian Wienand 2016-08-01 03:54:55 UTC
Description of problem:

Trying to debug an oops on Fedora 24 (4.6.4-301.fc24.x86_64) kernel

The "crash" that came with F24 didn't seem to work at all, so I've built a git version ("make lzo" required, head b349598bb7553774869467a5495834f56baee08e)

On startup I get a lot of "page excluded: ... fill_task_struct" messages, and "bt -F" gives the same with "list entry"

---
[ianw@iwienand-f24-test crash]$ sudo ./crash /var/crash/127.0.0.1-2016-08-01-00\:28\:51/vmcore /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux 

crash 7.1.5++
...
This GDB was configured as "x86_64-unknown-linux-gnu"...

please wait... (gathering task table data)      
crash: page excluded: kernel virtual address: ffff880236251000  type: "fill_task_struct"

[ a lot more of this message]

WARNING: active task ffff880159c09e80 on cpu 1 not found in PID hash
WARNING: active task ffff880236251e80 on cpu 2 not found in PID hash
WARNING: active task ffff8801ab191e80 on cpu 3 not found in PID hash


crash: page excluded: kernel virtual address: ffff8801ab191e80  type: "fill_task_struct"
      KERNEL: /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2016-08-01-00:28:51/vmcore  [PARTIAL DUMP]
        CPUS: 4
        DATE: Mon Aug  1 00:28:49 2016
      UPTIME: 00:33:54
LOAD AVERAGE: 4.98, 3.31, 1.83
       TASKS: 333
    NODENAME: iwienand-f24-test
     RELEASE: 4.6.4-301.fc24.x86_64
     VERSION: #1 SMP Tue Jul 12 11:50:00 UTC 2016
     MACHINE: x86_64  (2497 Mhz)
      MEMORY: 8 GB
       PANIC: "kernel BUG at net/core/skbuff.c:104!"
         PID: 3781
     COMMAND: "ovs-vswitchd"
        TASK: ffff880234193d00  [THREAD_INFO: ffff880179ae4000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt -F
PID: 3781   TASK: ffff880234193d00  CPU: 0   COMMAND: "ovs-vswitchd"
 #0 [ffff88023fc03930] machine_kexec at ffffffff8105ce78
    ffff88023fc03938: 000088023fc03998 ffff880000000000 
    ffff88023fc03948: 000000002d001000 ffff88002d001000 
    ffff88023fc03958: 000000002d000000 ffff88023fc03978 
    ffff88023fc03968: 00000000a75d3f05 ffff88023fc03998 
    ffff88023fc03978: 000000000000000b ffff88023fc03c08 
    ffff88023fc03988: ffff88023fc03a50 __crash_kexec+93 
 #1 [ffff88023fc03990] __crash_kexec at ffffffff81134c1d
    ffff88023fc03998: 0000000000000000 mld2_all_mcr     
    ffff88023fc039a8: in6addr_any      bt: page excluded: kernel virtual address: ffff880179a69d28  type: "list entry"
crash> 
----
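
When the startup output contains many of these messages, tallying them by read type can show at a glance which kinds of reads are failing. The following is a minimal Python sketch, assuming the crash output has been saved as text; the regular expression is modelled only on the message format shown above, and the helper name summarize_excluded is made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the messages shown in this report, for example:
#   crash: page excluded: kernel virtual address: ffff880236251000  type: "fill_task_struct"
PAGE_EXCLUDED = re.compile(
    r'page excluded: kernel virtual address: ([0-9a-f]+)\s+type: "([^"]+)"'
)


def summarize_excluded(log_text):
    """Count "page excluded" messages per read type (hypothetical helper)."""
    counts = Counter()
    for _address, read_type in PAGE_EXCLUDED.findall(log_text):
        counts[read_type] += 1
    return counts


# Three lines copied from the session above.
sample = '''
crash: page excluded: kernel virtual address: ffff880236251000  type: "fill_task_struct"
crash: page excluded: kernel virtual address: ffff8801ab191e80  type: "fill_task_struct"
    ffff88023fc039a8: in6addr_any      bt: page excluded: kernel virtual address: ffff880179a69d28  type: "list entry"
'''

for read_type, count in summarize_excluded(sample).items():
    print(f"{read_type}: {count}")
```

The same function can be pointed at a full saved session instead of the three sample lines.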

Comment 1 Ian Wienand 2016-08-01 05:08:18 UTC
I have saved the kdump and vmlinux files to [1], which should hopefully make this easy to reproduce

[1] https://drive.google.com/file/d/0B6xc98OTIzA3cUFycHFMTVFMYTA/view?usp=sharing

Comment 2 Dave Anderson 2016-08-08 20:21:23 UTC
Sorry for the delay -- just back from vacation today...

There's not much the crash utility can do when vital pages have been
excluded/filtered out of the vmcore by makedumpfile.  Taking just the task
structures of the 4 active tasks as an example, only the active task on
cpu 0 is included in the dumpfile:

crash> p current_task:a
per_cpu(current_task, 0) = $9 = (struct task_struct *) 0xffff880234193d00
per_cpu(current_task, 1) = $10 = (struct task_struct *) 0xffff880159c09e80
per_cpu(current_task, 2) = $11 = (struct task_struct *) 0xffff880236251e80
per_cpu(current_task, 3) = $12 = (struct task_struct *) 0xffff8801ab191e80
crash> rd 0xffff880234193d00
ffff880234193d00:  0000000000000000                    ........
crash> rd 0xffff880159c09e80
rd: page excluded: kernel virtual address: ffff880159c09e80  type: "64-bit KVADDR"
crash> rd 0xffff880236251e80 
rd: page excluded: kernel virtual address: ffff880236251e80  type: "64-bit KVADDR"
crash> rd 0xffff8801ab191e80
rd: page excluded: kernel virtual address: ffff8801ab191e80  type: "64-bit KVADDR"
crash> 

And other crash commands show other required kernel memory pages being 
excluded for some reason.   

If you set makedumpfile to only do compression with -c (no -d <level>),
or just set the core_collector in /etc/kdump.conf to be "scp", what happens?

And for that matter, are you using the most recent kexec-tools package?

Comment 3 Ian Wienand 2016-08-09 04:28:58 UTC
This was on fedora 24 with just whatever defaults are there, and the default kernel.  I didn't modify anything in terms of arguments, etc

Comment 4 Dave Anderson 2016-08-09 13:12:49 UTC
> This was on fedora 24 with just whatever defaults are there, and the 
> default kernel.  I didn't modify anything in terms of arguments, etc

Right, I understand, I was just asking if you can do that.

Unfortunately -- unlike stable RHEL kernels -- kernels of a particular
Fedora version can be updated on any given day.  And tools like the
crash utility and makedumpfile can break if things that they depend
upon change. (as you saw with whatever crash utility version was
originally installed with f24)

The most recent version in Fedora is kexec-tools-2.0.13-3.fc26.src.rpm.
Its makedumpfile.h file shows the kernel version support limits:

 #define OLDEST_VERSION          KERNEL_VERSION(2, 6, 15)/* linux-2.6.15 */
 #define LATEST_VERSION          KERNEL_VERSION(4, 5, 3)/* linux-4.5.3 */

In makedumpfile.c, get_kernel_version() generates the warning message below
on the console when the vmcore is created by the secondary kdump kernel.
However, it will still attempt to create a vmcore:

        if ((version < OLDEST_VERSION) || (LATEST_VERSION < version)) {
                MSG("The kernel version is not supported.\n");
                MSG("The makedumpfile operation may be incomplete.\n");
        }

It looks like the most recent f24 version is kexec-tools-2.0.12-7.fc24.1.src.rpm
which supports up to kernel version 4.1.0:

 #define OLDEST_VERSION          KERNEL_VERSION(2, 6, 15)/* linux-2.6.15 */
 #define LATEST_VERSION          KERNEL_VERSION(4, 1, 0)/* linux-4.1.0 */
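
To make the comparison concrete, here is a small Python sketch of the same range check applied to the 4.6.4 kernel from this report. It assumes the usual Linux-style packing of KERNEL_VERSION(a, b, c) as (a << 16) + (b << 8) + c, which may differ in detail from makedumpfile's own macro; the names kernel_version and is_supported are invented for this illustration.

```python
def kernel_version(major, minor, patch):
    """Pack a version triple into one integer (assumed Linux-style encoding)."""
    return (major << 16) + (minor << 8) + patch


OLDEST_VERSION = kernel_version(2, 6, 15)

# LATEST_VERSION values quoted in this bug for the two packages.
LATEST_BY_PACKAGE = {
    "kexec-tools-2.0.12-7.fc24.1": kernel_version(4, 1, 0),
    "kexec-tools-2.0.13-3.fc26": kernel_version(4, 5, 3),
}


def is_supported(version, latest, oldest=OLDEST_VERSION):
    """Inverse of the warning condition: version < OLDEST or LATEST < version."""
    return oldest <= version <= latest


running = kernel_version(4, 6, 4)  # kernel 4.6.4-301.fc24.x86_64 from this report
for package, latest in LATEST_BY_PACKAGE.items():
    status = "supported" if is_supported(running, latest) else "not supported"
    print(f"{package}: kernel 4.6.4 is {status}")
```

Under that assumption, 4.6.4 falls above both limits, so either package would print the "not supported" warning for this kernel.
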

Kernel version problems in makedumpfile may be related to recognizing pages
to filter/exclude.  I'm not saying that is the case here, but I am saying that
if vital pages are mistakenly filtered/excluded, there's nothing the crash
utility can do but fail as it has.

However, if you modify kdump.conf to *not* filter any pages (perhaps except
zero-filled pages with -d1), and only use it to compress the vmcore with -c,
you might get a more useful vmcore.  And if that fails, you could take
makedumpfile out of the picture entirely by changing the "core_collector"
in kdump.conf to "scp", which will simply copy /proc/vmcore to a permanent
location.
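
As a rough illustration of the two options just described, the core_collector line in /etc/kdump.conf could look like one of the alternatives below. This is a sketch rather than a tested configuration: only the -c and -d options and the "scp" collector come from this comment, and whether "scp" also needs an ssh dump target configured elsewhere in the file is an assumption to check against the kdump.conf man page.

```
# Option 1: compress only, excluding nothing but zero-filled pages
core_collector makedumpfile -c -d 1

# Option 2: bypass makedumpfile and copy /proc/vmcore as-is
# (assumption: an ssh dump target is also configured in this file)
#core_collector scp
```

The kdump service would presumably need a restart (for example, systemctl restart kdump) before the next crash for the change to take effect.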

Comment 5 Ian Wienand 2016-08-09 23:52:39 UTC
Thanks, this is all great info

I'm not sure what to do ... I did manage to figure out my issue from the partial dumps anyway; if this is all down to version skew I'm not really sure there is much you can do?

I've tried to distil the gist of what I think is going on from the helpful info here into a wiki page I found useful, in a separate section "On Versions" [1].  Maybe you could have a quick read of that and make sure it's not off-base?

[1] https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes#On_versions

Comment 6 Dave Anderson 2016-08-10 13:21:39 UTC
Looks good to me.  You can always file a BZ against kexec-tools as well,
requesting an updated version of makedumpfile, but the upstream version
of makedumpfile (which gets collected inside of the kexec-tools package)
is still 1.6.0, supporting kernels up to version 4.5.3.

