Bug 453929 - 100 LVs (in single PV/VG) cause long hang when booting
Summary: 100 LVs (in single PV/VG) cause long hang when booting
Alias: None
Product: Fedora
Classification: Fedora
Component: mkinitrd
Version: 11
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Peter Jones
QA Contact: Fedora Extras Quality Assurance
Depends On:
Reported: 2008-07-03 10:02 UTC by Richard W.M. Jones
Modified: 2010-01-12 15:32 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2010-01-12 15:32:25 UTC

Attachments
init script from the saved initrd (deleted)
2008-07-03 11:11 UTC, Richard W.M. Jones
lvmdump with 75 LVs (deleted)
2008-07-03 12:26 UTC, Richard W.M. Jones
nash log screen (deleted)
2008-07-03 14:12 UTC, Milan Broz

Description Richard W.M. Jones 2008-07-03 10:02:07 UTC
Description of problem:

  I need to create a large number of LVs for unrelated testing purposes.
  However when I do this, it causes a very long hang during boot
  (apparently somewhere in initrd).  This is very easy to reproduce (see
  the steps below).

Version-Release number of selected component (if applicable):


  The machine is running Rawhide and is up to date as of about
  two days ago (2008-07-01).

How reproducible:


Steps to Reproduce:

  On a machine with a single PV & VG (i.e. an ordinary Rawhide install)
  and just a little bit of free space in the VG, you can reproduce this with:

  for i in `seq 0 99`; do /sbin/lvcreate -L 32M -n Temp$i VolGroup00; done
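
A minimal sketch of the same loop with the LV count as a parameter, plus a matching cleanup step, for repeating the test at different sizes. The function names, the `DRY_RUN` switch and the use of `lvremove -f` for cleanup are assumptions of this sketch, not part of the original report. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): create and later remove N test LVs.
# VG and PREFIX mirror the names used in the report; DRY_RUN is an assumption.
VG=${VG:-VolGroup00}
PREFIX=${PREFIX:-Temp}
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create $1 logical volumes of 32M each, named ${PREFIX}0 .. ${PREFIX}(N-1).
create_lvs() {
    i=0
    while [ "$i" -lt "$1" ]; do
        run /sbin/lvcreate -L 32M -n "$PREFIX$i" "$VG"
        i=$((i + 1))
    done
}

# Remove the same $1 logical volumes again (-f skips the confirmation prompt).
remove_lvs() {
    i=0
    while [ "$i" -lt "$1" ]; do
        run /sbin/lvremove -f "$VG/$PREFIX$i"
        i=$((i + 1))
    done
}
```

For example, `create_lvs 50` prints the 50 `lvcreate` commands. Setting `DRY_RUN=0` and running as root would execute them instead; `remove_lvs 50` undoes the setup afterwards.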

Actual results:

  The machine will hang for around 15 minutes at one stage in the boot
  process.  There is no visual indication during this time that anything
  is happening at all, but if you wait long enough the machine should
  eventually reboot.

  The hang appears to happen somewhere in initrd.

Expected results:

  Machine should either reboot more quickly, or give some indication
  of progress.

Additional info:

Comment 1 Alasdair Kergon 2008-07-03 10:15:59 UTC
Can you be more precise?
What are the last messages shown before it hangs?

Comment 2 Alasdair Kergon 2008-07-03 10:19:48 UTC
If you have 50 LVs instead of 100, how long is the hang in comparison?

Comment 3 Richard W.M. Jones 2008-07-03 10:21:47 UTC
This is the last message before it hangs (I copied this by hand so it
may not be exactly correct):

device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.13.0-ioctl (2007-10-18) initialised:

I will try with 50 LVs in a moment and let you know.

Comment 4 Richard W.M. Jones 2008-07-03 10:31:09 UTC
50 LVs => 3 minutes

(These are all wallclock times, so accurate to the nearest minute).

Comment 5 Alasdair Kergon 2008-07-03 10:34:43 UTC
Can you also attach the actual 'init' script from inside the initrd?  And what
messages come next when it wakes up - try to spot whereabouts in the script the
delay is happening if you can.

And 3mins (50) -> 15mins (100)  - what about something in between like 75?

Comment 6 Alasdair Kergon 2008-07-03 10:36:27 UTC
(you said it's fully up-to-date rawhide - so I'm assuming that means you ran
mkinitrd *after* updating the lvm2 package)

Comment 7 Alasdair Kergon 2008-07-03 10:38:03 UTC
(if not, make sure you keep the problematic initrd before replacing it with a
new one)

Comment 8 Milan Broz 2008-07-03 10:44:21 UTC
Well, it should work; if not, it is probably another problem related to lvmcache.
Anyway, assigning to me; I have a test system for this.

Comment 9 Richard W.M. Jones 2008-07-03 11:04:44 UTC
Version of LVM in the saved initrd.img ('bin/lvm version'):

  LVM version:     2.02.39 (2008-06-27)
  Library version: 1.02.27 (2008-06-25)

Comment 10 Richard W.M. Jones 2008-07-03 11:09:52 UTC
The version of nash in the saved initrd.img is 6.0.54.

Comment 11 Richard W.M. Jones 2008-07-03 11:11:38 UTC
Created attachment 310911
init script from the saved initrd

Comment 12 Richard W.M. Jones 2008-07-03 12:24:33 UTC
I'm now completely up to date, and initrd has been rebuilt.

75 LVs cause a 10 minute boot delay.
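
The three timings reported so far (50 LVs: 3 minutes, 75 LVs: 10 minutes, 100 LVs: 15 minutes) suggest the delay grows faster than linearly with the number of LVs. A rough sketch of that arithmetic follows. The power-law fit is an assumption made only for illustration, and the inputs are wall-clock times accurate to the nearest minute, so the result is only indicative.

```shell
#!/bin/sh
# Wall-clock hang times reported in this bug, as "<number of LVs> <minutes>".
timings='50 3
75 10
100 15'

# Minutes per LV for each data point; a constant value would mean linear scaling.
per_lv() {
    printf '%s\n' "$timings" | awk '{ printf "%d LVs: %.3f min/LV\n", $1, $2 / $1 }'
}

# Estimate k in "time ~ n^k" from the first and last data points
# (assumes a simple power law, which the report does not claim).
growth_exponent() {
    printf '%s\n' "$timings" | awk '
        NR == 1 { n0 = $1; t0 = $2 }
        { n1 = $1; t1 = $2 }
        END { printf "%.2f\n", log(t1 / t0) / log(n1 / n0) }'
}

per_lv
growth_exponent
```

The per-LV cost rises from 0.060 to 0.150 minutes, and doubling the LV count from 50 to 100 multiplies the delay by five. That gives an exponent of about 2.3, which points to at least quadratic behaviour somewhere in the boot path.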

Comment 13 Richard W.M. Jones 2008-07-03 12:26:33 UTC
Created attachment 310918
lvmdump with 75 LVs

Comment 14 Richard W.M. Jones 2008-07-03 12:29:09 UTC

Linux thinkpad 2.6.26-0.98.rc8.git1.fc10.i686 #1 SMP Mon Jun 30 15:27:47 EDT 2008 i686 i686 i386 GNU/Linux



Comment 15 Milan Broz 2008-07-03 13:34:47 UTC
OK, I have a system with 1000 LVs. vgchange in initrd works OK (<2 min), but there
is another delay later. I also see:

Creating root device.
Mounting root filesystem.
get_netlink_msg returned No buffer space available

... probably nash has some problem here.

I'll add more debug info later.

Comment 16 Richard W.M. Jones 2008-07-03 13:46:33 UTC
As a data point, there is no problem with RHEL 5.2.

Comment 17 Milan Broz 2008-07-03 14:12:07 UTC
Created attachment 310924
nash log screen

Screenshot with timing info for the nash script below.
The time of 1:40 for vgchange is OK - it is the same as on a normal system
(but see the delay in "mount /sysroot").

echo ----------------
echo Scanning logical volumes
time lvm vgscan --ignorelockingfailure
echo Activating logical volumes
time lvm vgchange -ay --ignorelockingfailure  vg_test
echo ----------------
echo ----------------

echo resume UUID=...
resume UUID=f7e8fede-d4f0-42b8-8250-7c31fa8d0c87

echo Creating root device.
mkrootdev -t ext3 -o noatime,ro UUID=eb7582a3-b3d1-4244-ba4b-5c3b1e79e948

echo Mounting root filesystem.
mount /sysroot

echo Setting up other filesystems.

echo loadpolicy

Comment 18 Milan Broz 2008-07-03 14:20:05 UTC
Note that I have root on a normal partition in the previous example; the lvm commands are
just added there to show the problem.

# rpm -q nash

Reassigning to mkinitrd (nash is not in pkg list).

Comment 19 Milan Broz 2008-07-07 14:34:55 UTC
Also mkinitrd (during automatic kernel update) has problems.

Updating initrd with 1000 LVs is almost impossible - mkinitrd waits for
echo nash-resolveDevice ... | /sbin/nash --forcequiet

Comment 20 Alexandre Oliva 2008-07-13 07:14:50 UTC
Doesn't this make this a dupe of bug 277271? (with a 9-month-old patch never
integrated :-( )

Comment 21 Milan Broz 2008-07-13 09:36:02 UTC
Seems so.
I think with device-mapper (LVM devices) there is another scanning loop on top
of the problem mentioned in bug 277271...

Comment 22 Bug Zapper 2008-11-26 02:30:30 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:

Comment 23 Milan Broz 2009-04-08 13:16:09 UTC
This is still a problem; I cannot believe that it still remains unfixed...

Running on rawhide, mkinitrd-6.0.81-1.fc11.x86_64

Without DM devices:
# time /usr/libexec/plymouth/plymouth-update-initrd

real    0m16.124s
user    0m5.060s
sys     0m11.870s

Now let's create some fake DM devices (150 new mapped devices - return zero on access):

# for i in $(seq 1 150) ; do dmsetup create "pv$i" --table "0 1024 zero" ; done
# time /usr/libexec/plymouth/plymouth-update-initrd

real    112m21.046s
user    111m53.578s
sys     0m32.850s

nash eats a lot of memory too, and 100% CPU:
root     18471 99.9  6.2 421484 371376 pts/0   R+   13:53  56:02 /sbin/nash --forcequiet

This problem was observed with a real lvm config, about 200 PVs. The example above is just an easy reproducer without lvm involved.

A kernel update on a system with such a config and activated volumes is still almost impossible.

Comment 24 Bug Zapper 2009-06-09 09:38:09 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:

Comment 25 Hans de Goede 2010-01-12 15:32:25 UTC
This is a mass edit of all mkinitrd bugs.

Thanks for taking the time to file this bug report (and/or commenting on it).

As you may have heard, in Fedora 12 mkinitrd has been replaced by dracut. In Fedora 12 the mkinitrd package is still around as some programs depend on certain libraries it provides, but mkinitrd itself is no longer used.

In Fedora 13 mkinitrd will be removed completely. This means that all work
on initrd has stopped.

Rather than keeping mkinitrd bugs open and giving false hope they might get fixed, we are mass closing them, so as to clearly communicate that no more work will be done on mkinitrd. We apologize for any inconvenience this may cause.

If you are using Fedora 11 and are experiencing a mkinitrd bug you cannot work around, please upgrade to Fedora 12. If you experience problems with the initrd in Fedora 12, please file a bug against dracut.
