Bug 159877 - x86_64 kernel panic after force removal of active lv
Summary: x86_64 kernel panic after force removal of active lv
Keywords:
Status: CLOSED DUPLICATE
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: lvm2-cluster
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-06-08 19:26 UTC by Corey Marthaler
Modified: 2010-01-12 04:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-08-10 15:01:14 UTC



Description Corey Marthaler 2005-06-08 19:26:58 UTC
Description of problem:
I had just finished up the LVM I/O on the x86_64 cluster (link-01, link-02,
link-08) and was tearing down LVM volumes in order to make new ones for file
system testing. An lvremove attempt caused all my nodes to panic:

Unable to handle kernel paging request at 0000000030345f4e RIP:
<ffffffff801dced5>{rb_first+10}
PML4 1d829067 PGD 1f6e1067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: gnbd(U) lock_nolock(U) gfs(U) lock_dlm(U) dlm(U) cman(U)
lock_harness(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc
ds yenta_socket pcmcia_core button battery ac ohci_hcd hw_random tg3 floppy
dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx scsi_transport_fc
mptscsih mptbase sd_mod scsi_mod
Pid: 14792, comm: clvmd Not tainted 2.6.9-11.ELsmp
RIP: 0010:[<ffffffff801dced5>] <ffffffff801dced5>{rb_first+10}
RSP: 0018:000001001e743ea0  EFLAGS: 00010206
RAX: 0000000030345f36 RBX: 000001001fdbb6a8 RCX: 0000010037e49c00
RDX: 0000000000000000 RSI: 000000000000006c RDI: 000001001fdbb6a0
RBP: 000001003d64c000 R08: 0000000000000025 R09: 0000000000000000
R10: 0000000000000000 R11: ffffffff80170638 R12: 000001001fdbb6a0
R13: 000000000069b4f7 R14: 000001001fdbb760 R15: 00000000006782b0
FS:  0000000041401960(005b) GS:ffffffff804c1700(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000030345f4e CR3: 0000000000101000 CR4: 00000000000006e0
Process clvmd (pid: 14792, threadinfo 000001001e742000, task 0000010037d5c7f0)
Stack: ffffffff8016da67 000001001fdbb678 000001003d64c000 000001003a608408
        ffffffff80170649 0000000000000000 ffffffff80181672 000001003a00d6d8
        000001003ffec200 00000010010889cc
Call Trace:<ffffffff8016da67>{mpol_free_shared_policy+53}
<ffffffff80170649>{shmem_destroy_inode+17}
        <ffffffff80181672>{sys_unlink+261} <ffffffff8011003e>{system_call+126}


Code: 48 83 78 18 00 74 06 48 8b 40 18 eb f3 48 89 c2 48 89 d0 c3
RIP <ffffffff801dced5>{rb_first+10} RSP <000001001e743ea0>
CR2: 0000000030345f4e
 <0>Kernel panic - not syncing: Oops
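
A rough reading of the register dump above, offered as a sketch rather than a diagnosis: the faulting address in CR2 is exactly RAX + 0x18, and the first instruction in the Code: line (48 83 78 18 00) decodes as a compare of the quadword at RAX + 0x18 against zero. That would fit rb_first following a child pointer from a bad node address, although the struct layout is an assumption here. The low four bytes of RAX also happen to be printable ASCII, which may be a coincidence or may hint that the pointer was overwritten with string data. The helper names below are hypothetical and exist only for this illustration.

```python
# Values copied verbatim from the oops text above.
RAX = 0x0000000030345F36
CR2 = 0x0000000030345F4E
CODE = bytes.fromhex("48 83 78 18 00 74 06 48 8b 40 18 eb f3 48 89 c2 48 89 d0 c3")


def fault_offset(base: int, fault_addr: int) -> int:
    """Distance between the base register and the faulting address."""
    return fault_addr - base


def as_ascii(value: int, width: int = 4) -> str:
    """Interpret the low `width` bytes of value as little-endian ASCII."""
    raw = (value & ((1 << (8 * width)) - 1)).to_bytes(width, "little")
    return raw.decode("ascii", errors="replace")


# The 8-bit displacement is the fourth byte of the first instruction.
disp8 = CODE[3]

print(hex(fault_offset(RAX, CR2)))  # 0x18
print(hex(disp8))                   # 0x18
print(as_ascii(RAX))                # 6_40
```

The string "6_40" matches the tail of the logical volume name stripe_8_4096_40 that appears later in this report, but nothing here proves the two are related.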


Version-Release number of selected component (if applicable):
[root@link-01 ~]# rpm -qa | grep lvm2
lvm2-2.01.08-1.0.RHEL4
lvm2-cluster-2.01.09-3.1.RHEL4


How reproducible:
Still trying

Comment 1 Corey Marthaler 2005-06-08 20:25:19 UTC
Reproduced again with the exact same scenario as above.

Comment 2 Corey Marthaler 2005-06-08 20:57:02 UTC
This is caused by a force remove of an active lv.

[root@link-02 ~]# lvscan
  ACTIVE            '/dev/stripe_8_4096_4/stripe_8_4096_40' [924.00 GB] anywhere


lvremove -f /dev/stripe_8_4096_4/stripe_8_4096_40

strace:

[...]
stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
open("/dev/sdf1", O_RDWR|O_DIRECT|0x40000) = 5
fstat(5, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
ioctl(5, BLKBSZGET, 0x67f9a0)           = 0
lseek(5, 2048, SEEK_SET)                = 2048
read(5, "_\332\24\f LVM2 x[5A%r0N*>\1\0\0\0\0\10\0\0\0\0\0\0"..., 512) = 512
lseek(5, 4096, SEEK_SET)                = 4096
read(5, "stripe_8_4096_4 {\nid = \"ADcb5J-K"..., 512) = 512
close(5)                                = 0
lseek(4, 0, SEEK_SET)                   = 0
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(4, 2048, SEEK_SET)                = 2048
read(4, "_\332\24\f LVM2 x[5A%r0N*>\1\0\0\0\0\10\0\0\0\0\0\0"..., 512) = 512
lseek(4, 4096, SEEK_SET)                = 4096
read(4, "stripe_8_4096_4 {\nid = \"ADcb5J-K"..., 512) = 512
close(4)                                = 0
brk(0x6db000)                           = 0x6db000
open("/proc/devices", O_RDONLY)         = 4
fstat(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2a97c3b000
read(4, "Character devices:\n  1 mem\n  4 /"..., 1024) = 445
close(4)                                = 0
munmap(0x2a97c3b000, 4096)              = 0
open("/proc/misc", O_RDONLY)            = 4
fstat(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2a97c3b000
read(4, " 60 dlm_clvmd\n 61 gnbd_ctl\n 62 d"..., 1024) = 94
close(4)                                = 0
munmap(0x2a97c3b000, 4096)   
stat("/dev/mapper/control", {st_mode=S_IFCHR|0600, st_rdev=makedev(10, 63),
...}) = 0
open("/dev/mapper/control", O_RDWR)     = 4
ioctl(4, DM_VERSION, 0x6ba260)          = 0
ioctl(4, DM_DEV_STATUS, 0x6a41c0)       = 0
brk(0x6d3000)                           = 0x6d3000
uname({sys="Linux", node="link-01", ...}) = 0
open("/etc/lvm/archive/.lvm_link-01_9269_145392622",
O_WRONLY|O_APPEND|O_CREAT|O_EXCL, 0666) = 5
fcntl(5, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
fcntl(5, F_GETFL)                       = 0x8401 (flags
O_WRONLY|O_APPEND|O_LARGEFILE|0x8000)
fstat(5, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2a97c3b000
lseek(5, 0, SEEK_CUR)                   = 0
uname({sys="Linux", node="link-01", ...}) = 0
write(5, "# Generated by LVM2: Wed Jun  8 "..., 2292) = 2292
close(5)                                = 0
munmap(0x2a97c3b000, 4096)              = 0
open("/etc/lvm/archive", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 5
fstat(5, {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
fcntl(5, F_SETFD, FD_CLOEXEC)           = 0
getdents64(5, /* 88 entries */, 4096)   = 3472
getdents64(5, /* 0 entries */, 4096)    = 0
close(5)                                = 0
link("/etc/lvm/archive/.lvm_link-01_9269_145392622",
"/etc/lvm/archive/stripe_8_4096_4_00008.vg") = 0
stat("/etc/lvm/archive/.lvm_link-01_9269_145392622", {st_mode=S_IFREG|0600,
st_size=2292, ...}) = 0
unlink("/etc/lvm/archive/.lvm_link-01_9269_145392622") = 0
write(3, "2\0\377\277\0\0\0\0\0\0\0\0C\0\0\0\0\30\0ADcb5JKgkAFga"..., 85) = 85
read(3,



Comment 3 Christine Caulfield 2005-06-09 07:06:58 UTC
Don't we get useful tracebacks on x86_64? Oh dear.
If it is caused by removing a volume then it could be a device-mapper bug.
Does it happen on a non-clustered system?

Comment 4 Alasdair Kergon 2005-08-10 15:01:14 UTC

*** This bug has been marked as a duplicate of 158956 ***

