Bug 1055234 - frequent ~10-30 s freezes caused by delayed disk access [NEEDINFO]
Summary: frequent ~10-30 s freezes caused by delayed disk access
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 22
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-01-19 18:16 UTC by Matej
Modified: 2015-11-23 17:19 UTC
CC List: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-23 17:19:11 UTC
jforbes: needinfo?


Attachments
strace while frozen (deleted)
2014-01-19 21:54 UTC, Matej
strace in normal state (deleted)
2014-01-19 21:55 UTC, Matej
echo l > /proc/sysrq-trigger ; echo w > /proc/sysrq-trigger (deleted)
2014-04-02 13:57 UTC, Matej
another echo l > /proc/sysrq-trigger ; echo w > /proc/sysrq-trigger (deleted)
2014-04-02 13:58 UTC, Matej
Comment (deleted)
2014-03-29 18:54 UTC, Matej

Description Matej 2014-01-19 18:16:55 UTC
Description of problem:

Frequent temporary freezes of certain applications and/or operations that require disk access. Most noticeable in Firefox. The whole application freezes for 10 to 30 seconds. This happens every few minutes and renders the system unusable.

It seems that the freezes affect the system as a whole. While Firefox is frozen, gedit also has to wait to save a file.

The affected files are stored on an ext4 filesystem. I also use ecryptfs, but the Firefox profile and the other files that were tested with gedit are not stored on ecryptfs.

Version-Release number of selected component (if applicable):

kernel-3.12.7-200.fc19.x86_64
kernel-3.12.6-200.fc19.x86_64
kernel-3.12.5-200.fc19.x86_64

Additional info:

$ mount | grep home
/dev/mapper/fedora_pmd85-home on /home type ext4 (rw,relatime,seclabel,data=ordered)
/home/.ecryptfs/matej/.Private on /home/matej/Private type ecryptfs (rw,nosuid,nodev,relatime,ecryptfs_fnek_sig=a81c48c0afdafd2f,ecryptfs_sig=073efb865fd28a0c,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs)


Kernel kernel-0:3.9.5-301.fc19.x86_64 seems to run fine.

Comment 1 Matej 2014-01-19 21:53:00 UTC
While Firefox was in the frozen state, I ran wc on a large file to generate a lot of disk accesses.

$ strace -tt wc -l kernel-3.10.11-200.fc19.x86_64.rpm

$ ls -lh kernel-3.10.11-200.fc19.x86_64.rpm
-rw-------. 1 matej matej 30M 19. led 19.08 kernel-3.10.11-200.fc19.x86_64.rpm

I analyzed the call timestamps:
- run while Firefox is frozen: http://nbviewer.ipython.org/gist/palmstrom/8511272
- run in the normal state on 3.10.11-200.fc19.x86_64: http://nbviewer.ipython.org/gist/palmstrom/8511370

Summary:
- The wc command took 31 seconds to finish while Firefox was frozen, compared with 0.55 s during normal operation.
- There were 6 calls to read() taking more than 1 second in the frozen state.
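
The timestamp comparison summarized above can be approximated with a short script. This is only a minimal sketch, not the notebook linked in this comment: it assumes the usual `strace -tt` line layout (HH:MM:SS.microseconds followed by the call), and the function name `slow_calls` and the sample lines are made up for illustration. It treats the gap to the next traced line as the duration of a call, which is a reasonable approximation only for a single-threaded program such as wc.

```python
import re
from datetime import datetime

# Wall-clock prefix that `strace -tt` puts in front of each call,
# e.g. '21:53:00.000300 read(3, "..."..., 16384) = 16384'.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6})\s+(\w+)\(")


def slow_calls(lines, threshold_s=1.0):
    """Return (syscall, seconds) pairs whose gap to the next traced line exceeds threshold_s."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            events.append((stamp, match.group(2)))

    slow = []
    for (start, name), (following, _) in zip(events, events[1:]):
        gap = (following - start).total_seconds()
        if gap < 0:  # the trace crossed midnight
            gap += 86400
        if gap > threshold_s:
            slow.append((name, gap))
    return slow


# Made-up sample lines, not taken from the attached traces.
sample = [
    '21:53:00.000100 open("kernel.rpm", O_RDONLY) = 3',
    '21:53:00.000300 read(3, "..."..., 16384) = 16384',
    '21:53:04.500300 read(3, "..."..., 16384) = 16384',
    '21:53:04.500900 close(3) = 0',
]
print(slow_calls(sample))
```

Running strace with `-T` as well would print the time spent inside each call directly, which avoids inferring durations from gaps.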

The hardware is a Lenovo ThinkPad X201i with a Seagate SSHD ST1000LM014-1EJ1 drive.

Comment 2 Matej 2014-01-19 21:54:28 UTC
Created attachment 852488 [details]
strace while frozen

strace -tt wc -l kernel-3.10.11-200.fc19.x86_64.rpm

Comment 3 Matej 2014-01-19 21:55:07 UTC
Created attachment 852489 [details]
strace in normal state

strace -tt wc -l kernel-3.10.11-200.fc19.x86_64.rpm

Comment 4 Matej 2014-01-19 21:56:26 UTC
Kernel 3.10.11-200.fc19.x86_64 also seems to be unaffected.

Comment 5 Matej 2014-01-20 14:21:46 UTC
The problem first appeared with the 3.11 kernel.

ok kernel-0:3.9.5-301.fc19.x86_64
ok kernel-0:3.10.11-200.fc19.x86_64
bad kernel-0:3.11.1-200.fc19.x86_64
bad kernel-0:3.11.10-200.fc19.x86_64
bad kernel-0:3.12.5-200.fc19.x86_64
bad kernel-0:3.12.6-200.fc19.x86_64
bad kernel-0:3.12.7-200.fc19.x86_64

Comment 6 Stanislaw Gruszka 2014-02-14 12:15:36 UTC
Please install latencytop and run it on a broken kernel version; it should identify the reason for the delay. Note that latencytop requires a kernel feature that is only enabled in the kernel-debug variant, so you have to install kernel-debug and run the tests on that kernel.

Comment 7 Matej 2014-02-14 15:13:15 UTC
Latencytop on kernel-debug-0:3.12.9-201.fc19.x86_64: https://www.dropbox.com/sh/8zqautfpxdqk772/CXL-VrR7n-#/

Up to 113s latency.

Comment 8 Stanislaw Gruszka 2014-02-14 15:46:58 UTC
We are waiting a very long time for I/O to complete, but the reason for that is still unknown. This could be a problem in ecryptfs, device-mapper or the disk device driver.

Please check whether the problem also happens if you use only unencrypted partitions. Additionally, provide the output of the lsmod command. Please also repeat the test and, while it is running, run "echo l > /proc/sysrq-trigger ; echo w > /proc/sysrq-trigger" on another console and provide the dmesg output (if the dmesg buffer wraps, the logs should be accessible using journalctl).
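
For anyone who wants to script the SysRq step during a freeze, a minimal sketch follows. It assumes it is run as root and that SysRq is enabled on the system; the helper name `trigger_sysrq` is made up for illustration. The `l` key asks the kernel for a backtrace of the active CPUs and `w` for the tasks in blocked (uninterruptible) state, as in the two echo commands requested in this comment. Each key gets its own write, mirroring the two separate echo commands.

```python
def trigger_sysrq(keys, path="/proc/sysrq-trigger"):
    """Write each SysRq command key to the trigger file, one write per key.

    `path` is a parameter only so the helper can be tried against an
    ordinary file; on a real system it stays at its default and needs root.
    """
    sent = []
    for key in keys:
        # Reopen the file for every key so each command is a separate write.
        with open(path, "w") as trigger:
            trigger.write(key)
        sent.append(key)
    return sent
```

Calling `trigger_sysrq("lw")` while the I/O is stalled should leave both dumps in the kernel log, which can then be collected with dmesg or `journalctl -k`.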

Comment 9 Stanislaw Gruszka 2014-03-28 10:38:13 UTC
Closing due to lack of response.

Comment 10 Matej 2014-03-29 18:54:04 UTC
Created attachment 915880 [details]
Comment

(This comment was longer than 65,535 characters and has been moved to an attachment by Red Hat Bugzilla).

Comment 11 Stanislaw Gruszka 2014-04-02 12:57:25 UTC
Next time, please attach data as a plain-text attachment; otherwise the text is mangled by Bugzilla.

It looks like you are using a USB dongle as storage. Is this the primary storage, or is it just mounted as some unimportant partition? If the latter, does the delay also happen without the USB dongle connected?

Comment 12 Matej 2014-04-02 13:57:48 UTC
Created attachment 881805 [details]
echo l > /proc/sysrq-trigger ; echo w > /proc/sysrq-trigger

echo l > /proc/sysrq-trigger ; echo w > /proc/sysrq-trigger

while disk IO is frozen

Comment 13 Matej 2014-04-02 13:58:43 UTC
Created attachment 881806 [details]
another echo l > /proc/sysrq-trigger ; echo w > /proc/sysrq-trigger

echo l > /proc/sysrq-trigger ; echo w > /proc/sysrq-trigger

while disk IO is frozen

Comment 14 Matej 2014-04-02 13:59:16 UTC
The USB dongle is not in use and the delays also happen without it. All the data are on Seagate SSHD ST1000LM014-1EJ1, partly on an ecryptfs partition. I'm reattaching the log above and adding one more.

Comment 15 Matej 2014-04-11 12:15:45 UTC
The problem is most probably caused by ecryptfs. I encounter no lags when ~/Private on ecryptfs is unmounted.

Comment 16 Justin M. Forbes 2014-05-21 19:30:19 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.14.4-100.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 17 Matej 2014-05-26 07:12:55 UTC
I'm still affected by the issue on the kernel 3.14.4-100.fc19.x86_64.

I have to retract the ecryptfs theory. The issue is independent of my ecryptfs ~/Private partition (it occurs even with ~/Private unmounted).

The issue is triggered by the tracker processes indexing some folders in my home (~/Documents, ~/Music). The freezes stop occurring after the indexing finishes or after I manually interrupt it (tracker-control -t).
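
One way to check whether the stalls line up with the indexing is a small latency probe that runs in the background and records slow synced writes. The following is a rough sketch under stated assumptions, not a tool that was used in this report: the function names, the 4 KiB payload and the 1 s threshold are all arbitrary choices, and the probe file path is whatever the caller passes in. Note that the probe file grows by one payload per sample.

```python
import os
import time


def timed_fsync_write(path, payload=b"x" * 4096):
    """Append payload to path, fsync it, and return the elapsed seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # force the data to disk so the timing includes real I/O
    finally:
        os.close(fd)
    return time.monotonic() - start


def watch_stalls(path, samples, interval_s=1.0, threshold_s=1.0):
    """Run `samples` probes; return (wall-clock time, latency) pairs above threshold_s."""
    stalls = []
    for _ in range(samples):
        latency = timed_fsync_write(path)
        if latency > threshold_s:
            stalls.append((time.strftime("%H:%M:%S"), latency))
        time.sleep(interval_s)
    return stalls
```

For example, `watch_stalls("/home/matej/io_probe.tmp", samples=600)` (a hypothetical path on the affected filesystem) would sample for roughly ten minutes; the recorded times can then be compared with when the tracker processes were running or were stopped.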

Comment 18 Matej 2014-08-11 14:26:15 UTC
Still the same on kernel 3.15.6-200.fc20.x86_64. The bug is triggered not only by the tracker processes but probably also by other processes that do heavy disk I/O. The system is quite often barely usable for me.

Comment 19 Justin M. Forbes 2014-11-13 16:01:04 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.17.2-200.fc20.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.

If you experience different issues, please open a new bug report for those.

Comment 20 Justin M. Forbes 2014-12-10 15:00:40 UTC
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in over 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

Comment 21 Matej 2015-08-26 16:43:37 UTC
Still having the same problem on 4.1.5-200.fc22.x86_64.

Comment 22 Christian Stadelmann 2015-09-26 22:12:47 UTC
I guess I am running into the same issue here. I only noticed it so far with firefox. How can I make sure it is the same issue?

What I see is that Firefox sometimes freezes for 5…30 seconds. While Firefox is frozen, other applications keep working as far as I could tell, but they may not have accessed the disk (synchronously) during that time.

In my case I have a btrfs root + home partition on top of a LUKS container on an SSD (Samsung 840 Series). No LVM etc, nothing weird in kernel logs. I don't know what triggers it.
I am running into this issue on all 4.0.x and 4.1.x kernels (latest: 4.1.8) on Fedora 22. I don't remember when it started.
I am not using any weird USB devices, just 2 HID devices (mouse + keyboard).

Comment 23 Justin M. Forbes 2015-10-20 19:33:10 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 22 kernel bugs.

Fedora 22 has now been rebased to 4.2.3-200.fc22.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 23, and are still experiencing this issue, please change the version to Fedora 23.

If you experience different issues, please open a new bug report for those.

Comment 24 Fedora Kernel Team 2015-11-23 17:19:11 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in over 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

