Bug 1176087 - [6.6-3.5]kernel panic occurred when boot hypervisor from UEFI machine
Summary: [6.6-3.5]kernel panic occurred when boot hypervisor from UEFI machine
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Fabian Deutsch
QA Contact: Virtualization Bugs
URL:
Whiteboard: node
Depends On:
Blocks: rhev35rcblocker rhev35gablocker
Reported: 2014-12-19 11:09 UTC by cshao
Modified: 2016-02-10 20:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-22 10:39:37 UTC
oVirt Team: Node
Target Upstream Version:


Attachments
r510.log (deleted): 2014-12-22 07:30 UTC, cshao, no flags
r510-new-output.log (deleted): 2014-12-22 08:47 UTC, cshao, no flags
init.log (deleted): 2014-12-22 08:47 UTC, cshao, no flags

Description cshao 2014-12-19 11:09:28 UTC
Created attachment 971101 [details]
kernel-panic-r510.png

Description of problem:
[6.6-3.5] A kernel panic occurred when booting the hypervisor on a UEFI machine (Dell-R510).

Version-Release number of selected component (if applicable):
rhev-hypervisor6-6.6-20141218.0.el6ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch

How reproducible:
I tested about 4 times and encountered this bug 2 times.

Steps to Reproduce:
1. Enter UEFI mode on the Dell-R510.
2. Attach virtual media and boot from it.
3. Reinstall the hypervisor on the Dell-R510 in UEFI mode.
4. Reboot.
5. Boot the hypervisor in UEFI mode.

Actual results:
A kernel panic occurred when booting on the UEFI machine (Dell-R510).

Expected results:
The hypervisor boots successfully in UEFI mode.

Additional info:
We did not hit this issue on rhev-hypervisor6-6.6-20141119.0.iso (6.6-3.5), so we consider it a regression.

Due to the kernel panic issue, I can provide more log info.

Comment 1 Fabian Deutsch 2014-12-19 11:24:49 UTC
Mike, the change between the previous RHEV-H, which was not affected by this bug, and the build that is affected is (among others):

    -kernel-2.6.32-504.1.3.el6.src.rpm
    +kernel-2.6.32-504.3.3.el6.src.rpm

I saw that many dm patches went in between those two versions. Can you tell if this might be related to those patches?

Comment 2 Mike Snitzer 2014-12-19 14:13:15 UTC
(In reply to Fabian Deutsch from comment #1)
> Mike, the change between the previous RHEV-H, which was not affected by this
> bug, and the build that is affected is (among others):
> 
>     -kernel-2.6.32-504.1.3.el6.src.rpm
>     +kernel-2.6.32-504.3.3.el6.src.rpm
> 
> I saw that many dm patches went in between those two versions. Can you tell
> if this might be related to those patches?

The changes that went in were focused on improving DM thin-provisioning.

$ git log rhel-6.6.z/master -- drivers/md | grep "RHEL6.7 PATCH" | tac
    O-Subject: [RHEL6.7 PATCH 01/25] dm thin: fix DMERR typo in pool_status error path
    O-Subject: [RHEL6.7 PATCH 02/25] dm thin: cleanup noflush_work to use a proper completion
    O-Subject: [RHEL6.7 PATCH 03/25] dm thin metadata: do not allow the data block size to change
    O-Subject: [RHEL6.7 PATCH 04/25] dm bufio: use kzalloc when allocating dm_bufio_client
    O-Subject: [RHEL6.7 PATCH 05/25] dm bufio: update last_accessed when relinking a buffer
    O-Subject: [RHEL6.7 PATCH 06/25] dm bufio: switch from a huge hash table to an rbtree
    O-Subject: [RHEL6.7 PATCH 07/25] dm bufio: evict buffers that are past the max age but retain some buffers
    O-Subject: [RHEL6.7 PATCH 08/25] dm bio prison: switch to using a red black tree
    O-Subject: [RHEL6.7 PATCH 09/25] dm thin metadata: change dm_thin_find_block to allow blocking, but not issuing, IO
    O-Subject: [RHEL6.7 PATCH 10/25] dm transaction manager: add support for prefetching blocks of metadata
    O-Subject: [RHEL6.7 PATCH 11/25] dm thin: prefetch missing metadata pages
    O-Subject: [RHEL6.7 PATCH 12/25] dm thin: throttle incoming IO
    O-Subject: [RHEL6.7 PATCH 14/25] dm thin: adjust max_sectors_kb based on thinp blocksize
    O-Subject: [RHEL6.7 PATCH 15/25] dm: improve documentation and code clarity in dm_merge_bvec
    O-Subject: [RHEL6.7 PATCH 16/25] dm thin: implement thin_merge
    O-Subject: [RHEL6.7 PATCH 17/25] dm thin: grab a virtual cell before looking up the mapping
    O-Subject: [RHEL6.7 PATCH 18/25] dm thin: performance improvement to discard processing
    O-Subject: [RHEL6.7 PATCH 19/25] dm thin: factor out remap_and_issue_overwrite
    O-Subject: [RHEL6.7 PATCH 20/25] dm thin: defer whole cells rather than individual bios
    O-Subject: [RHEL6.7 PATCH 21/25] dm thin: remap the bios in a cell immediately
    O-Subject: [RHEL6.7 PATCH 22/25] dm thin: direct dispatch when breaking sharing
    O-Subject: [RHEL6.7 PATCH 23/25] dm thin: sort the deferred cells
    O-Subject: [RHEL6.7 PATCH 24/25] dm thin: optimize retry_bios_on_resume
    O-Subject: [RHEL6.7 PATCH 25/25] dm thin: refactor requeue_io to eliminate spinlock bouncing
    O-Subject: [RHEL6.7 PATCH 26/25] dm thin: fix potential for infinite loop in pool_io_hints
    O-Subject: [RHEL6.7 PATCH v2 27/25] dm thin: fix pool_io_hints to avoid looking at max_hw_sectors

I see you're using old DM snapshot (which has nothing to do with dm-thinp), and there are errors about trying to use "DM_snapshot_cow" as a filesystem type when mounting.  But beyond that I have no context to be able to _really_ say what the system was doing.

But I really doubt these DM changes have anything to do with your UEFI boot problem.

Comment 3 Mike Snitzer 2014-12-19 14:28:32 UTC
(In reply to Mike Snitzer from comment #2)

> I see you're using old DM snapshot (which has nothing to do with dm-thinp)..

NOTE: dm-snapshot does use dm-bufio.  And there were a handful of dm-bufio changes listed in comment#2.  But I'm not aware of any potential for dm-snapshot regression with these dm-bufio changes.

I think you need to first silence the "mount: unknown filesystem type 'DM_snapshot_cow'" errors.
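
To check how many of the patches listed in comment #2 touch dm-bufio versus dm-thin, the subject lines can be grouped by their subsystem prefix. The following is only a minimal sketch: the helper names `subsystem` and `count_by_subsystem` are hypothetical, and the sample list holds just four of the subject lines copied from comment #2, not the full log.

```python
import re
from collections import Counter

# Four subject lines copied from the git log output in comment #2
# (a sample only, not the complete list).
SUBJECTS = [
    "O-Subject: [RHEL6.7 PATCH 04/25] dm bufio: use kzalloc when allocating dm_bufio_client",
    "O-Subject: [RHEL6.7 PATCH 05/25] dm bufio: update last_accessed when relinking a buffer",
    "O-Subject: [RHEL6.7 PATCH 08/25] dm bio prison: switch to using a red black tree",
    "O-Subject: [RHEL6.7 PATCH 12/25] dm thin: throttle incoming IO",
]

# Capture the text between the "[RHEL6.7 PATCH ...]" tag and the first colon.
SUBJECT_RE = re.compile(r"\[RHEL6\.7 PATCH[^\]]*\]\s*([^:]+):")


def subsystem(line):
    """Return the subsystem prefix of a subject line, or None if it does not match."""
    match = SUBJECT_RE.search(line)
    return match.group(1).strip() if match else None


def count_by_subsystem(lines):
    """Count subject lines per subsystem prefix, skipping lines that do not match."""
    return Counter(s for s in map(subsystem, lines) if s is not None)


if __name__ == "__main__":
    for name, count in sorted(count_by_subsystem(SUBJECTS).items()):
        print(name, count)
```

Feeding the full output of the `git log ... | grep "RHEL6.7 PATCH"` command from comment #2 into `count_by_subsystem` would give the per-subsystem totals for the whole series.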

Comment 4 cshao 2014-12-22 07:30:10 UTC
Created attachment 971897 [details]
r510.log

Hi fabiand, 

I just obtained the panic log via the serial console and am providing it for you to debug.
Thanks!

Comment 5 Ying Cui 2014-12-22 07:44:40 UTC
(In reply to shaochen from comment #4)
> Created attachment 971897 [details]
> r510.log
> 
> Hi fabiand, 
> 
> I just obtained the panic log via the serial console and am providing it for
> you to debug.
> Thanks!

Chen, Thanks.
We also need more information. Please add _rdshell_ and _rdinitdebug_ and remove _quiet_ to get /init.log. By the way, rdsosreport is not available on RHEL 6.6.

Thanks
Ying
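
The boot option change requested above (add rdshell and rdinitdebug, drop quiet) can be sketched as a small string transformation. This is only an illustration: the function name `debug_cmdline` and the example command line in the usage note are hypothetical and are not taken from the affected machine.

```python
def debug_cmdline(cmdline, add=("rdshell", "rdinitdebug"), remove=("quiet",)):
    """Return a kernel command line with debug options added and quiet removed.

    Options are treated as whitespace-separated tokens. Existing options keep
    their order, and options that are already present are not added twice.
    """
    # Drop every token that should be removed (by default, "quiet").
    args = [arg for arg in cmdline.split() if arg not in remove]
    # Append each debug option that is not already present.
    for option in add:
        if option not in args:
            args.append(option)
    return " ".join(args)


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    print(debug_cmdline("ro rootflags=ro quiet"))
```

With the hypothetical input shown, the function drops `quiet` and appends the two debug options at the end, leaving the other options untouched.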

Comment 6 cshao 2014-12-22 08:46:47 UTC
(In reply to Ying Cui from comment #5)
> (In reply to shaochen from comment #4)
> > Created attachment 971897 [details]
> > r510.log
> > 
> > Hi fabiand, 
> > 
> > I just obtained the panic log via the serial console and am providing it
> > for you to debug.
> > Thanks!
> 
> Chen, Thanks.
> We also need more information. Please add _rdshell_ and _rdinitdebug_ and
> remove _quiet_ to get /init.log. By the way, rdsosreport is not available on
> RHEL 6.6.
> 
> Thanks
> Ying

OK, I have added "rdshell" and "rdinitdebug" to CMD and obtained the new log. Please check "r510-new-output.log" and "init.log" for more details.

Thanks!

Comment 7 cshao 2014-12-22 08:47:22 UTC
Created attachment 971916 [details]
r510-new-output.log

Comment 8 cshao 2014-12-22 08:47:59 UTC
Created attachment 971917 [details]
init.log

Comment 11 cshao 2014-12-22 10:39:37 UTC
Test version:
rhev-hypervisor6-6.6-20141218.0.el6ev
ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch

I tested 5 times after pulling out the USB disk and did not hit the kernel panic any more, so I am closing this bug as WORKSFORME.

Thanks!

Comment 12 Ying Cui 2014-12-22 10:54:40 UTC
Since this was caused by an environment issue, I am closing it as NOTABUG. Thanks.

