Bug 1694305 - kcompactd0 using 100% CPU
Summary: kcompactd0 using 100% CPU
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 29
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-30 10:37 UTC by Fredrik Mikker
Modified: 2019-04-01 22:23 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:


Attachments
kernel log (deleted), 2019-03-30 10:37 UTC, Fredrik Mikker
trace.dat.gz (deleted), 2019-03-30 10:39 UTC, Fredrik Mikker
transparent_hugepage (deleted), 2019-03-30 10:43 UTC, Fredrik Mikker
kcompatd-trace (deleted), 2019-04-01 22:23 UTC, Fredrik Mikker

Description Fredrik Mikker 2019-03-30 10:37:08 UTC
Created attachment 1549716
kernel log

1. Please describe the problem:

The kcompactd0 process uses 100% of one CPU core during performance-demanding tasks.
In my case this happens when I run resource-intensive applications in a virtual machine.

2. What is the Version-Release number of the kernel:

5.0.4-200.fc29.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

The issue started after upgrading to the 5.0.4 kernel. It was not present in kernel-core-4.20.14-200.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Yes.

1. Run a performance-demanding task, such as a virtual machine or a build of a large source tree.
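
While the load is running, one way to watch for compaction activity is to sample the compact_* counters in /proc/vmstat. A minimal sketch follows; the function name and its optional file argument are made up for illustration, and which counters appear depends on the kernel configuration.

```shell
#!/bin/sh
# Sketch: print the compaction-related counters from /proc/vmstat.
# The optional first argument (hypothetical, for dry runs) names another
# file in the same "name value" format.
compaction_counters() {
    grep '^compact_' "${1:-/proc/vmstat}"
}

# Example: take two samples ten seconds apart and compare them by eye.
#   compaction_counters; sleep 10; compaction_counters
```

Counters that climb quickly between the two samples suggest that compaction is busy during that interval.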

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

Yes: ZFS and NVIDIA.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

The issue is known upstream and is discussed in the following LKML thread: https://lore.kernel.org/lkml/20190126200005.GB27513@amd/T/#u


A workaround is to drop the caches, which mitigates the issue for a short while:

# echo 3 > /proc/sys/vm/drop_caches
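
For anyone scripting the workaround above, here is a minimal sketch. It assumes root privileges. The DROP_CACHES_FILE variable is a hypothetical override added only so the function can be dry-run against a scratch file; it is not part of any kernel interface. Running sync first is optional, but dirty pages cannot be dropped, so flushing them first lets more cache be freed.

```shell
#!/bin/sh
# Sketch: flush dirty pages, then ask the kernel to drop clean caches.
# DROP_CACHES_FILE is a hypothetical override used only for dry runs.
drop_caches() {
    target="${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"
    sync                    # write dirty pages back so they become droppable
    echo 3 > "$target"      # 3 = page cache plus dentries and inodes
}

# Example (as root):
#   drop_caches
```

As noted above, this only mitigates the problem for a short while, so it is not a fix.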

Comment 1 Fredrik Mikker 2019-03-30 10:39:40 UTC
Created attachment 1549717
trace.dat.gz

A trace captured while kcompactd was pegged at 100% is attached. It was created with the following command:

# trace-cmd record -a -e compaction -e migrate -e kmem:mm_page_alloc -e vmscan:mm_vmscan_kswapd_wake -e vmscan:mm_vmscan_kswapd_sleep sleep 10
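
To get a quick overview of a recording like this one, the events can be counted by name. This is only a sketch: it assumes that trace-cmd report prints lines in the usual "task-pid [cpu] timestamp: event: details" layout (optionally with a flags column before the timestamp), and summarize_events is a made-up helper name.

```shell
#!/bin/sh
# Sketch: count trace events by name from "trace-cmd report" text on stdin.
# Assumes each event line looks like:
#   task-pid [cpu] [flags] timestamp: event_name: details
# Lines that do not match this layout are silently skipped.
summarize_events() {
    sed -n 's/^.*\[[0-9]*\] *[^ ]* *[0-9.]*: *\([A-Za-z0-9_]*\):.*$/\1/p' |
        sort | uniq -c | sort -rn
}

# Example, after unpacking the attachment to trace.dat:
#   trace-cmd report -i trace.dat | summarize_events
```

The most frequent events come first, which makes it easier to see whether compaction or migration events dominate the ten-second window.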

Comment 2 Fredrik Mikker 2019-03-30 10:43:08 UTC
Created attachment 1549718
transparent_hugepage

As requested upstream [1], the output of the following command is attached:

# grep -r . /sys/kernel/mm/transparent_hugepage/*

[1]: https://lore.kernel.org/lkml/20190130104020.GE9565@techsingularity.net/

Comment 3 Fredrik Mikker 2019-04-01 22:23:45 UTC
Created attachment 1550794
kcompatd-trace

Another data point requested by an upstream developer.

