Bug 452438 - kernel-2.6.26-0.81.rc7.fc10.i686 hangs with ntfs-3g mounts ...
Summary: kernel-2.6.26-0.81.rc7.fc10.i686 hangs with ntfs-3g mounts ...
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Eric Paris
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-06-22 23:20 UTC by Tom London
Modified: 2008-09-08 23:40 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-09-08 23:40:31 UTC


Attachments
output of 'echo "t">/proc/sysrq-trigger' (deleted)
2008-06-27 17:29 UTC, Tom London
output of 'echo "t" >/proc/sysrq-trigger' with sync hanging (deleted)
2008-07-01 17:23 UTC, Tom London

Description Tom London 2008-06-22 23:20:34 UTC
Description of problem:
Downloaded kernel-2.6.26-0.81.rc7.fc10.i686 from koji.

After installing and restarting, it hung at "Mounting local filesystems", with
SELinux in either permissive or enforcing mode.

kernel-2.6.26-0.74.rc6.git4.fc10.i686 boots fine.

For quite a while, I have had the following line in /etc/fstab to mount my
Windows partition:

/dev/sda1		/mnt/windows		ntfs-3g	rw		0 0

Removing this line from /etc/fstab allows the system to boot.

With the system booted up in gnome, a mount on this partition hangs. (I
Ctrl-C'ed it after about 2 minutes): "mount -t ntfs-3g /dev/sda1 /mnt/windows"

Afterwards running "ntfs-3g /dev/sda1 /mnt/windows" reported /dev/sda1 being
"temporarily unavailable" (don't have the exact text).

There are no messages in dmesg or /var/log/messages.
Version-Release number of selected component (if applicable):
kernel-2.6.26-0.81.rc7.fc10.i686

How reproducible:
Every boot

Steps to Reproduce:
1. Add line in /etc/fstab for ntfs-3g partition
2. reboot
3. hang at "Mounting local filesystems"
  
Actual results:


Expected results:


Additional info:

Comment 1 Tom London 2008-06-24 17:05:21 UTC
Running 0.82, with ntfs partition omitted from /etc/fstab, I get "stuck" mounts
of ntfs-3g:

 3219 ?        S      0:00 /usr/libexec/gvfsd-trash --spawner :1.6 /org/gtk/gvfs/exec_spaw/0
 3240 ?        S      0:00 /usr/libexec/gvfsd-burn --spawner :1.6 /org/gtk/gvfs/exec_spaw/1
 3244 ?        S      0:00 gnome-mount -b -d /dev/sda1 -n
 3263 ?        S      0:00 /usr/libexec/hal-storage-mount
 3266 ?        S      0:00 /bin/mount -t ntfs-3g -o nosuid,nodev,uhelper=hal,locale=en_US.UTF-8 /dev/sda1 /media/IBM_PRELOAD_
 3267 ?        S      0:00 /sbin/mount.ntfs-3g /dev/sda1 /media/IBM_PRELOAD_ -o rw,nosuid,nodev,uhelper=hal,locale=en_US.UTF-8

These never complete or die.

Believe I get new mount points created in /media (IBM_PRELOAD, IBM_PRELOAD_,
etc.), but nothing mounted.

Comment 2 Tom London 2008-06-24 22:12:21 UTC
Continues to happen with 0.87.

Noticed this in /var/log/messages from session with 0.82:

Jun 24 10:53:06 localhost ntfs-3g[6178]: Version 1.2506 integrated FUSE 27
Jun 24 10:53:06 localhost ntfs-3g[6178]: Mounted /dev/sda1 (Read-Write, label "IBM_PRELOAD", NTFS 3.1)
Jun 24 10:53:06 localhost ntfs-3g[6178]: Cmdline options: (null)
Jun 24 10:53:06 localhost ntfs-3g[6178]: Mount options: silent,allow_other,nonempty,relatime,fsname=/dev/sda1,blkdev,blksize=4096
Jun 24 10:53:06 localhost ntfs-3g[6178]: Unmounting /dev/sda1 (IBM_PRELOAD)
Jun 24 10:53:45 localhost init: tty4 main process (2708) killed by TERM signal
Jun 24 10:53:45 localhost init: tty6 main process (2713) killed by TERM signal
Jun 24 10:53:45 localhost init: tty5 main process (2709) killed by TERM signal
Jun 24 10:53:45 localhost init: tty2 main process (2710) killed by TERM signal
Jun 24 10:53:45 localhost init: tty3 main process (2711) killed by TERM signal
Jun 24 10:53:45 localhost smartd[2702]: smartd received signal 15: Terminated
Jun 24 10:53:45 localhost smartd[2702]: smartd is exiting (exit status 0)

So it looks like the "mount" completed during shutdown, followed immediately by
the "unmount".

I tried doing an "strace ntfs-3g /dev/sda1 /mnt/windows" after rebooting into
single user mode.

Here are the last few lines of strace output:

open("/dev/fuse", O_RDWR|O_LARGEFILE)   = 4
getegid32()                             = 0
getgid32()                              = 0
getegid32()                             = 0
setresgid32(-1, 0, 0)                   = 0
getegid32()                             = 0
geteuid32()                             = 0
getuid32()                              = 0
geteuid32()                             = 0
setresuid32(-1, 0, 0)                   = 0
geteuid32()                             = 0
getuid32()                              = 0
lstat64("/mnt/windows", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getuid32()                              = 0
getuid32()                              = 0
getuid32()                              = 0
getuid32()                              = 0
getgid32()                              = 0
getuid32()                              = 0
geteuid32()                             = 0
getegid32()                             = 0
mount("/dev/sda1", "/mnt/windows", "fuseblk", 0, "allow_other,blksize=4096,fd=4,ro"...

Hanging on the "mount" .....


Comment 3 Chuck Ebbert 2008-06-27 16:15:59 UTC
Try removing 'quiet' and adding this to the kernel options in /etc/grub.conf:

  ignore_loglevel sysrq_always_enabled

Then, when mount hangs, run this command:

  echo "t" >/proc/sysrq-trigger

Look in /var/log/messages for the output of that and post it as an attachment.


Comment 4 Tom London 2008-06-27 17:28:48 UTC
OK. Believe I did as requested:

I rebooted with above options.  At gdm screen, I cntl-alt-F1 and logged in as root.

"ps agx" showed no processes doing "mounts".

I entered "mount /dev/sda1 /mnt&".  "ps agx" showed mount hung.

I ran 'echo "t" >/proc/sysrq-trigger', and watched the text flow by.

I rebooted and copied the output from /var/log/messages to /tmp/sysrq.txt.  I
attach it below.

Let me know if I didn't do this right, and I will rerun.

Comment 5 Tom London 2008-06-27 17:29:24 UTC
Created attachment 310463 [details]
output of 'echo "t">/proc/sysrq-trigger'

Comment 6 Tom London 2008-06-27 18:31:12 UTC
Got this comment on fedora-test (just archiving here for completeness):
	
Tom London <selinux <at> gmail.com> writes:

> Kernel versions since about 0.81 can no longer mount ntfs-3g filesystems
> for me.
> I've BZ'ed this here: https://bugzilla.redhat.com/show_bug.cgi?id=452438
>
> The symptoms appear that the call to mount just hangs.

This is probably the kernel Smack problem fixed here:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e97dcb0eadbb821eccd549d4987b653cf61e2374

If not then please send the stack traceback of the hanging
mount process (echo t > /proc/sysrq-trigger).

Regards,   Szaka

--
NTFS-3G: http://ntfs-3g.org

Comment 7 Dave Jones 2008-06-27 19:06:03 UTC
Unlikely. We don't build smack.


Comment 8 Miklos Szeredi 2008-06-27 21:25:53 UTC
Yeah, this time it's fuse vs. selinux, but the issue is similar.  This is what
happens:

sys_mount
  vfs_kern_mount
    fuse_get_sb
    security_sb_kern_mount
      selinux_sb_kern_mount
        fuse_getxattr

The mount syscall won't return until fuse_getxattr() finishes.  But
fuse_getxattr() cannot finish until the mount syscall returns -> deadlock.  The
reason fuse_getxattr cannot finish is because fuse userspace only starts request
processing after the mount has succeeded.  Since this is part of the userspace
ABI, it cannot easily be changed, and so selinux will probably have to work
around it in some way.

Looking at selinux_set_mnt_opts() in mainline, I don't actually see it calling
->getxattr().  Is this perhaps a recent addition only in fedora kernels?  That
would explain why this hang wasn't reported earlier.
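
The ordering problem described above can be modelled in userspace. The following is only an illustrative Python sketch, not kernel or FUSE code: the function name, the events, and the timeout are all made up for the example, and the timeout exists solely so the simulation can report the hang instead of blocking forever as the real mount did.

```python
import threading


def simulate_mount(xattr_before_return, timeout=0.5):
    """Toy model of the fuse vs. selinux ordering (all names hypothetical).

    Returns True if the simulated getxattr request gets a reply,
    False if it times out (standing in for the real, unbounded hang).
    """
    mount_returned = threading.Event()  # the "mount syscall" has returned
    request_queued = threading.Event()  # a getxattr request was sent to userspace
    reply_ready = threading.Event()     # the userspace daemon answered it

    def fuse_daemon():
        # The userspace side only starts processing requests after mount succeeds.
        mount_returned.wait()
        request_queued.wait()
        reply_ready.set()

    worker = threading.Thread(target=fuse_daemon, daemon=True)
    worker.start()

    if not xattr_before_return:
        # Mount returns first; any later getxattr can then be served.
        mount_returned.set()

    # The security hook issues getxattr and waits for the answer.
    request_queued.set()
    answered = reply_ready.wait(timeout)

    # Let the simulated daemon finish so the thread can be joined.
    mount_returned.set()
    worker.join()
    return answered


if __name__ == "__main__":
    print("getxattr inside mount answered:", simulate_mount(True))
    print("getxattr after mount answered: ", simulate_mount(False))
```

In this model, issuing the request before the mount returns always times out, while issuing it afterwards is answered immediately, which matches the dependency cycle in the call chain.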


Comment 9 Chuck Ebbert 2008-06-30 20:14:40 UTC
Apparently caused by linux-2.6-selinux-ecryptfs-support.patch


Comment 10 Chuck Ebbert 2008-06-30 22:40:22 UTC
Patch disabled for now. Leaving bug open.

Comment 11 Tom London 2008-07-01 17:20:38 UTC
running 0.98 (believe the patch is still in):

In addition to the above, even without attempting to mount ntfs-3g partition,
"sync" command hangs when running in runlevel 5.  "sync" completes in runlevel 3.

Could this be related to the above (and fuse)?

Here is output of "strace sync":
execve("/bin/sync", ["sync"], [/* 29 vars */]) = 0
brk(0)                                  = 0x8404000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=104008, ...}) = 0
mmap2(NULL, 104008, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb8050000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000wz\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1511052, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb804f000
mmap2(0x791000, 1513040, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x791000
mmap2(0x8fd000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16c) = 0x8fd000
mmap2(0x900000, 9808, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x900000
close(3)                                = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb804e000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb804e6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0x8fd000, 8192, PROT_READ)     = 0
mprotect(0x789000, 4096, PROT_READ)     = 0
munmap(0xb8050000, 104008)              = 0
brk(0)                                  = 0x8404000
brk(0x8425000)                          = 0x8425000
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=79736512, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7e4e000
close(3)                                = 0
sync(

I attach below the output from 'echo "t" >/proc/sysrq-trigger'.

Comment 12 Tom London 2008-07-01 17:23:22 UTC
Created attachment 310695 [details]
output of 'echo "t" >/proc/sysrq-trigger' with sync hanging

Obtained by booting with ignore_loglevel sysrq_always_enabled, booting to
runlevel 5, running "sync&" in terminal window, and running 'echo
"t">/proc/sysrq-trigger'

Comment 13 Miklos Szeredi 2008-07-01 18:07:08 UTC
(In reply to comment #12)
> Created an attachment (id=310695) [edit]
> output of 'echo "t" >/proc/sysrq-trigger' with sync hanging
> 
> Obtained by booting with ignore_loglevel sysrq_always_enabled, booting to
> runlevel 5, running "sync&" in terminal window, and running 'echo
> "t">/proc/sysrq-trigger'

That's exactly the same issue: gvfs trying to mount some fuse filesystem, which
hangs in sys_mount() holding the s_umount semaphore for write, and sys_sync()
trying to acquire s_umount for read.
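
As a rough illustration of why sync blocks, here is a small Python model of a reader/writer lock. It is a sketch under stated assumptions: the RWSemaphore class and its method names only mimic the kernel's naming and are not the kernel implementation, and the read side takes a timeout purely so the example can return instead of hanging.

```python
import threading


class RWSemaphore:
    """Minimal reader/writer lock, loosely modelled on a kernel rw_semaphore."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def down_write(self):
        # Wait until there is no writer and no readers, then take it exclusively.
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def up_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

    def down_read(self, timeout):
        # Readers may share the lock, but must wait while a writer holds it.
        with self._cond:
            acquired = self._cond.wait_for(lambda: not self._writer, timeout)
            if acquired:
                self._readers += 1
            return acquired

    def up_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()


if __name__ == "__main__":
    s_umount = RWSemaphore()
    s_umount.down_write()  # stands in for the mount stuck in sys_mount()
    print("sync got s_umount while mount is stuck:", s_umount.down_read(0.2))
    s_umount.up_write()    # what would happen if the mount ever returned
    print("sync got s_umount after mount returned:", s_umount.down_read(0.2))
```

While the writer never releases the lock, every reader waits; in the real system there is no timeout, so sync simply never completes.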


Comment 14 Tom London 2008-07-02 15:14:03 UTC
Works for me with kernel-2.6.26-0.104.rc8.git2.fc10.i686

Comment 15 Tom London 2008-08-23 22:22:59 UTC
Close this? Been working for me since 2 July.....

Issue with linux-2.6-selinux-ecryptfs-support.patch resolved?

