Bug 161823 - Kernel Oops while using CAE application medina (T-Systems)
Summary: Kernel Oops while using CAE application medina (T-Systems)
Keywords:
Status: CLOSED DUPLICATE of bug 73733
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: ---
: ---
Assignee: Larry Woodman
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-06-27 16:08 UTC by Udo Seidel
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-07-07 20:22:52 UTC


Attachments
sysreport after the reboot (deleted)
2005-06-27 16:08 UTC, Udo Seidel

Description Udo Seidel 2005-06-27 16:08:18 UTC
Description of problem:
While postprocessing with the CAE application Medina, the machine stops working
with a kernel Oops on Xeon EM64T. The only way to get the machine working again
is a hard reset.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-5.EL

How reproducible:
Every time on Xeon EM64T systems, never on AMD Opteron systems

Steps to Reproduce:
1. start Medina postprocessor
2. load protocol file



  
Actual results:


Expected results:


Additional info:

Comment 1 Udo Seidel 2005-06-27 16:08:19 UTC
Created attachment 116024
sysreport after the reboot

Comment 2 Udo Seidel 2005-06-27 16:09:21 UTC
The binary Nvidia driver for the graphics adapter is loaded.


Here is the output from the kernel Oops:


Jun 27 17:33:25 ibm1 kernel: medpost74: Corrupted page table at address 2aad17e000
Jun 27 17:33:25 ibm1 kernel: PML4 203c71067 PGD 203c80067 PMD 1f2f78067 PTE
7ffffe000000002f
Jun 27 17:33:25 ibm1 kernel: Bad pagetable: 000f [1] SMP 
Jun 27 17:33:25 ibm1 kernel: CPU 2 
Jun 27 17:33:25 ibm1 kernel: Modules linked in: parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd sunrpc ds yenta_socket pcmcia_core button battery ac nvidia(U) md5 ipv6 uhci_hcd ehci_hcd snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tg3 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod aic79xx sd_mod scsi_mod
Jun 27 17:33:25 ibm1 kernel: Pid: 4131, comm: medpost74 Tainted: P     
2.6.9-5.ELsmp
Jun 27 17:33:25 ibm1 kernel: RIP: 0033:[<0000002a96c17f10>] [<0000002a96c17f10>]
Jun 27 17:33:25 ibm1 kernel: RSP: 002b:0000007fbfffbfa8  EFLAGS: 00010202
Jun 27 17:33:25 ibm1 kernel: RAX: 0000000000002480 RBX: 0000000004669a60 RCX:
0000002aad17e000
Jun 27 17:33:25 ibm1 kernel: RDX: 0000000000400000 RSI: 0000000000000000 RDI:
0000002aace10000
Jun 27 17:33:25 ibm1 kernel: RBP: 0000007fbfffc060 R08: 0000000000000000 R09:
0000000000000200
Jun 27 17:33:25 ibm1 kernel: R10: 0000000000000041 R11: 0000000000000003 R12:
0000000000000000
Jun 27 17:33:25 ibm1 kernel: R13: 0000000000400000 R14: 0000000000002010 R15:
0000000004669a70
Jun 27 17:33:25 ibm1 kernel: FS:  0000002a98009140(0000)
GS:ffffffff804bf400(0000) knlGS:0000000000000000
Jun 27 17:33:25 ibm1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 27 17:33:25 ibm1 kernel: CR2: 0000002aad17e000 CR3: 0000000037e2e000 CR4:
00000000000006e0
Jun 27 17:33:25 ibm1 kernel: Process medpost74 (pid: 4131, threadinfo
0000010203438000, task 00000102038a2030)
Jun 27 17:33:25 ibm1 kernel: 
Jun 27 17:33:25 ibm1 kernel: RIP [<0000002a96c17f10>] RSP <0000007fbfffbfa8>
Jun 27 17:33:25 ibm1 kernel:  <1>Unable to handle kernel paging request at
000000fe0e63e3b0 RIP: 
Jun 27 17:33:25 ibm1 kernel: <ffffffff80120224>{unmap_single+50}
Jun 27 17:33:25 ibm1 kernel: PML4 0 
Jun 27 17:33:25 ibm1 kernel: Oops: 0000 [2] SMP 
Jun 27 17:33:25 ibm1 kernel: CPU 2 
Jun 27 17:33:25 ibm1 kernel: Modules linked in: parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd sunrpc ds yenta_socket pcmcia_core button battery ac nvidia(U) md5 ipv6 uhci_hcd ehci_hcd snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tg3 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod aic79xx sd_mod scsi_mod
Jun 27 17:33:25 ibm1 kernel: Pid: 4131, comm: medpost74 Tainted: P     
2.6.9-5.ELsmp
Jun 27 17:33:25 ibm1 kernel: RIP: 0010:[<ffffffff80120224>]
<ffffffff80120224>{unmap_single+50}
Jun 27 17:33:25 ibm1 kernel: RSP: 0000:0000010203439ac8  EFLAGS: 00010293
Jun 27 17:33:25 ibm1 kernel: RAX: 000001000e6e5000 RBX: 001fffffbffeb276 RCX:
0000000000000000
Jun 27 17:33:25 ibm1 kernel: RDX: ffffffffbffeb276 RSI: ffffff0000000000 RDI:
000001023fe90f30
Jun 27 17:33:25 ibm1 kernel: RBP: 0000000000000002 R08: 0000000000001000 R09:
00000101f2ee4000
Jun 27 17:33:25 ibm1 kernel: R10: 0000010000000000 R11: 0000000000000246 R12:
0000000000000000
Jun 27 17:33:25 ibm1 kernel: R13: 000001023fe90f30 R14: 000001023fe90ec0 R15:
0000010000000000
Jun 27 17:33:25 ibm1 kernel: FS:  0000000000000000(0000)
GS:ffffffff804bf400(0000) knlGS:0000000000000000
Jun 27 17:33:25 ibm1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 27 17:33:25 ibm1 kernel: CR2: 000000fe0e63e3b0 CR3: 0000000037e2e000 CR4:
00000000000006e0
Jun 27 17:33:25 ibm1 kernel: Process medpost74 (pid: 4131, threadinfo
0000010203438000, task 00000102038a2030)
Jun 27 17:33:26 ibm1 kernel: Stack: 0000000000000001 00000101f3371968
0000000000000001 ffffffff80120943 
Jun 27 17:33:26 ibm1 kernel:        0000000000000000 000001023e710fc0
00000101f3371950 00000101f3371950 
Jun 27 17:33:26 ibm1 kernel:        000000000000036e ffffffffa03d5432 
Jun 27 17:33:26 ibm1 kernel: Call Trace:<ffffffff80120943>{swiotlb_unmap_sg+191} <ffffffffa03d5432>{:nvidia:nv_vm_free_pages+283}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa03d3410>{:nvidia:nv_free_pages+734} <ffffffffa01ca083>{:nvidia:_nv001716rm+89}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa01b2acc>{:nvidia:_nv001228rm+150} <ffffffffa01b1aea>{:nvidia:_nv001241rm+154}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa01b1846>{:nvidia:_nv001246rm+60} <ffffffffa02e5713>{:nvidia:_nv004331rm+33}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa02e4eaf>{:nvidia:_nv004179rm+121} <ffffffffa01cf9d2>{:nvidia:_nv001226rm+96}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa01d0d50>{:nvidia:rm_free_unused_clients+128}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa03d1144>{:nvidia:nv_kern_ctl_close+175} <ffffffffa03d1282>{:nvidia:nv_kern_close+252}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80172bcf>{__fput+99} <ffffffff80171810>{filp_close+103}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80136dec>{put_files_struct+101} <ffffffff801375b8>{do_exit+665}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80226ba4>{do_unblank_screen+97} <ffffffff80121e04>{do_page_fault+0}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80122393>{do_page_fault+1423} <ffffffff80165be8>{do_mmap_pgoff+1593}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff801dccf8>{__up_write+19} <ffffffff80110a6d>{error_exit+0}
Jun 27 17:33:26 ibm1 kernel:        
Jun 27 17:33:26 ibm1 kernel: 
Jun 27 17:33:26 ibm1 kernel: Code: 4c 8b 0c d0 0f 94 c2 85 c9 0f 94 c0 09 d0 a8 01 74 09 fc 4c
Jun 27 17:33:26 ibm1 kernel: RIP <ffffffff80120224>{unmap_single+50} RSP <0000010203439ac8>
Jun 27 17:33:26 ibm1 kernel: CR2: 000000fe0e63e3b0
Jun 27 17:33:26 ibm1 kernel:  <1>Unable to handle kernel NULL pointer
dereference at 0000000000000048 RIP: 
Jun 27 17:33:26 ibm1 kernel: <ffffffff8013331d>{mm_release+70}
Jun 27 17:33:26 ibm1 kernel: PML4 203c71067 PGD 0 
Jun 27 17:33:26 ibm1 kernel: Oops: 0000 [3] SMP 
Jun 27 17:33:26 ibm1 kernel: CPU 2 
Jun 27 17:33:26 ibm1 kernel: Modules linked in: parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd sunrpc ds yenta_socket pcmcia_core button battery ac nvidia(U) md5 ipv6 uhci_hcd ehci_hcd snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tg3 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod aic79xx sd_mod scsi_mod
Jun 27 17:33:26 ibm1 kernel: Pid: 4131, comm: medpost74 Tainted: P     
2.6.9-5.ELsmp
Jun 27 17:33:26 ibm1 kernel: RIP: 0010:[<ffffffff8013331d>]
<ffffffff8013331d>{mm_release+70}
Jun 27 17:33:26 ibm1 kernel: RSP: 0000:00000102034398c8  EFLAGS: 00010202
Jun 27 17:33:26 ibm1 kernel: RAX: 00000102038a2030 RBX: 00000102038a2030 RCX:
0000000000000004
Jun 27 17:33:26 ibm1 kernel: RDX: 0000000000000008 RSI: 0000000000000000 RDI:
0000002a980091d0
Jun 27 17:33:26 ibm1 kernel: RBP: 0000000000000000 R08: 000000000000000f R09:
0000000000000001
Jun 27 17:33:26 ibm1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000000
Jun 27 17:33:26 ibm1 kernel: R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
Jun 27 17:33:26 ibm1 kernel: FS:  0000000000000000(0000)
GS:ffffffff804bf400(0000) knlGS:0000000000000000
Jun 27 17:33:26 ibm1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 27 17:33:26 ibm1 kernel: CR2: 0000000000000048 CR3: 0000000037e2e000 CR4:
00000000000006e0
Jun 27 17:33:26 ibm1 kernel: Process medpost74 (pid: 4131, threadinfo
0000010203438000, task 00000102038a2030)
Jun 27 17:33:26 ibm1 kernel: Stack: 0000000000000000 0000000000000000
000000fe0e63e3b0 00000102038a2030 
Jun 27 17:33:26 ibm1 kernel:        0000000000000009 ffffffff80137467
ffffffff803c7508 0000000000000046 
Jun 27 17:33:26 ibm1 kernel:        ffffffff803bc66c ffffffffffffffef 
Jun 27 17:33:26 ibm1 kernel: Call Trace:<ffffffff80137467>{do_exit+328} <ffffffff80111796>{oops_end+38}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80122286>{do_page_fault+1154} <ffffffff80157130>{free_pages_bulk+682}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80157130>{free_pages_bulk+682} <ffffffff80110a6d>{error_exit+0}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80120224>{unmap_single+50} <ffffffff80120943>{swiotlb_unmap_sg+191}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa03d5432>{:nvidia:nv_vm_free_pages+283} <ffffffffa03d3410>{:nvidia:nv_free_pages+734}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa01ca083>{:nvidia:_nv001716rm+89} <ffffffffa01b2acc>{:nvidia:_nv001228rm+150}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa01b1aea>{:nvidia:_nv001241rm+154} <ffffffffa01b1846>{:nvidia:_nv001246rm+60}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa02e5713>{:nvidia:_nv004331rm+33} <ffffffffa02e4eaf>{:nvidia:_nv004179rm+121}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa01cf9d2>{:nvidia:_nv001226rm+96} <ffffffffa01d0d50>{:nvidia:rm_free_unused_clients+128}
Jun 27 17:33:26 ibm1 kernel:        <ffffffffa03d1144>{:nvidia:nv_kern_ctl_close+175} <ffffffffa03d1282>{:nvidia:nv_kern_close+252}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80172bcf>{__fput+99} <ffffffff80171810>{filp_close+103}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80136dec>{put_files_struct+101} <ffffffff801375b8>{do_exit+665}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80226ba4>{do_unblank_screen+97} <ffffffff80121e04>{do_page_fault+0}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff80122393>{do_page_fault+1423} <ffffffff80165be8>{do_mmap_pgoff+1593}
Jun 27 17:33:26 ibm1 kernel:        <ffffffff801dccf8>{__up_write+19} <ffffffff80110a6d>{error_exit+0}
Jun 27 17:33:26 ibm1 kernel:        
Jun 27 17:33:26 ibm1 kernel: 
Jun 27 17:33:26 ibm1 kernel: Code: 41 8b 45 48 ff c8 7e 53 48 c7 83 08 02 00 00
00 00 00 00 65 
Jun 27 17:33:26 ibm1 kernel: RIP <ffffffff8013331d>{mm_release+70} RSP
<00000102034398c8>
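
For anyone reading the "Bad pagetable: 000f" and "Oops: 0000" lines in the log above: the hex number is the x86 page-fault error code, and it can be decoded bit by bit. The following is only a minimal sketch, assuming the standard x86-64 error-code layout (bit 0 present, bit 1 write, bit 2 user, bit 3 reserved bit, bit 4 instruction fetch). The helper name decode_fault_code is made up for illustration and is not part of any kernel tool.

```python
def decode_fault_code(code):
    """Describe an x86 page-fault error code (hypothetical helper).

    Assumes the standard bit layout: bit 0 = present, bit 1 = write,
    bit 2 = user mode, bit 3 = reserved bit set, bit 4 = instruction fetch.
    """
    flags = [
        "protection violation" if code & 0x01 else "page not present",
        "write access" if code & 0x02 else "read access",
        "user mode" if code & 0x04 else "kernel mode",
    ]
    if code & 0x08:
        flags.append("reserved bit set in a page-table entry")
    if code & 0x10:
        flags.append("instruction fetch")
    return flags


# The two distinct codes that appear in this report.
for text in ("000f", "0000"):
    print(text, decode_fault_code(int(text, 16)))
```

Under that assumption, 000f reads as a user-mode write that hit a page-table entry with a reserved bit set, which fits the "Corrupted page table" message, while 0000 is a kernel-mode read of a page that is not mapped.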




Comment 3 Dave Jones 2005-07-07 20:22:52 UTC
This looks to me like a bug in the NVidia driver judging from the call trace.
We've heard no other reports of page table corruption which lends more
credibility to this.
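
As a rough illustration of reading the call trace this way, the frames can be grouped by the module that owns them. This is only a sketch: it assumes the frame format seen in this report, <address>{:module:symbol+offset} for module code and <address>{symbol+offset} for the core kernel, and the names FRAME, frames and SAMPLE are made up for the example. The sample holds a few frames copied from the second Oops, joined onto whole lines.

```python
import re
from collections import Counter

# One frame: <16 hex digits>{:module:symbol+offset}, where ":module:" is
# optional. The format is assumed from the traces in this report.
FRAME = re.compile(r"<([0-9a-f]{16})>\{(?::(\w+):)?(\w+)\+(\d+)\}")


def frames(trace):
    """Return (owner, symbol) pairs; owner is 'kernel' when no module is named."""
    return [(m.group(2) or "kernel", m.group(3)) for m in FRAME.finditer(trace)]


SAMPLE = """
Call Trace:<ffffffff80120943>{swiotlb_unmap_sg+191} <ffffffffa03d5432>{:nvidia:nv_vm_free_pages+283}
       <ffffffffa03d3410>{:nvidia:nv_free_pages+734} <ffffffffa01ca083>{:nvidia:_nv001716rm+89}
       <ffffffff80172bcf>{__fput+99} <ffffffff80171810>{filp_close+103}
"""

counts = Counter(owner for owner, _ in frames(SAMPLE))
print(counts)
```

A frame that is split across two lines will not match the pattern, so wrapped log output has to be joined before it is fed to a parser like this.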

I also recommend updating to the U1 kernel which fixes a large number of bugs.

*** This bug has been marked as a duplicate of 73733 ***

