|Summary:||CPUID driver does not support cpuid.4 and cpuid.0xb instruments|
|Product:||Red Hat Enterprise Linux 5||Reporter:||Song, Youquan <youquan.song>|
|Component:||kernel||Assignee:||Peter Martuccelli <peterm>|
|Status:||CLOSED ERRATA||QA Contact:||Red Hat Kernel QE team <kernel-qe>|
|Version:||5.2||CC:||andriusb, austin.zhang, chaohong.guo, cward, davej, dzickus, jane.lv, jvillalo, keve.a.gabbert, luyu, mgahagan, mjenner, peterm|
|Fixed In Version:||Doc Type:||Bug Fix|
|Last Closed:||2009-09-02 08:51:50 UTC||Type:||---|
Description Song, Youquan 2008-07-11 09:07:41 UTC
Description of problem:
On the RHEL5 series, on all Intel platforms with a cpuid level > 4, the cpuid.4 leaf returns all zeros when CPU information is read through the cpuid driver. The Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2A: Instruction Set Reference says that the cpuid.4 leaf output depends on the initial ECX value, but the RHEL5.2 x86_64 cpuid driver does not implement this. (The RHEL5.2 i386 cpuid driver does not have this bug.)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Download the test_cpuid4.c test case from the attachment.
2. gcc -o test_cpuid4 test_cpuid4.c
3. ./test_cpuid4 0
4. It always reports all 0.
5. Another way to reproduce the bug is to run Dave's x86info, http://www.codemonkey.org.uk/projects/x86info/
6. tar; make; ./x86info
7. It reports the wrong number of cores per package: it always reports "Number of cores per physical package=1".

Actual results: All zeros are returned in registers eax, ebx, ecx, edx.

Expected results: The CPU cache information is returned in registers eax, ebx, ecx, edx.

Additional info:
Comment 1 Song, Youquan 2008-07-11 09:12:16 UTC
Created attachment 311558 [details] Test case test_cpuid4.c which get cpuid.4 information by cpuid driver
Comment 2 Song, Youquan 2008-07-11 09:19:25 UTC
Created attachment 311559 [details] The patch to fix cpuid.4 bug.
Comment 3 Brian Maly 2008-08-07 05:03:09 UTC
The patch in comment #2 just turns the cpuid() call into the cpuid_count() call; it makes the two functions identical. Since we already have this functionality, it makes more sense to change the code to call cpuid_count() instead of cpuid(). More investigation reveals quite a few updates to arch/i386/kernel/cpuid.c upstream that are not in RHEL5. It seems likely that this is the code that should be fixed instead. That being said, can we confirm that a newer upstream kernel (such as 2.6.25) works properly on this hardware? Can this be tested? Ideally this is fixed upstream and can just be backported. Worst case, we can just call cpuid_count() instead of cpuid().
Comment 4 Song, Youquan 2008-08-07 10:04:18 UTC
Yes. This bug exists only on the RHEL5 series for x86_64. Upstream has since merged the x86_64 and i386 code, so the bug does not exist in the upstream kernel. Upstream commit 2347d933b158932cf2b8aeebae3e5cc16b200bd1 fixes the bug, but it needs more backport effort.
Comment 5 Brian Maly 2008-08-07 22:26:14 UTC
Created attachment 313758 [details] test patch Test patch (backport from upstream). Can you please test this patch on the affected hardware and let me know if it helps any? Thanks.
Comment 6 Song, Youquan 2008-08-11 07:03:32 UTC
After adding the patch, the following warning appears in dmesg:
BUG: warning at arch/x86_64/kernel/smp.c:379/smp_call_function_single() (Tainted: G )
Call Trace:
[<ffffffff8004c9fb>] smp_call_function_single+0x4f/0x10e
[<ffffffff8002203b>] __up_read+0x19/0x7f
[<ffffffff800668a2>] do_page_fault+0x4fe/0x830
[<ffffffff8005009f>] cpuid_read+0x80/0xbd
[<ffffffff8000b3d2>] vfs_read+0xcb/0x171
[<ffffffff800117bf>] sys_read+0x45/0x6e
[<ffffffff8005d28d>] tracesys+0xd5/0xe0
Comment 7 Song, Youquan 2008-08-25 04:38:24 UTC
The root cause of this trace is that the backported patch calls smp_call_function_single, and the implementation of smp_call_function_single in the 2.6.18 kernel differs from the one in upstream kernel 2.6.26. The attached cpuid.patch fixes this bug.
Comment 8 Song, Youquan 2008-08-25 04:43:47 UTC
Created attachment 314897 [details] cpuid.patch which can fix the crash bug for backport patch RHEL5-cpuid-update.patch
Comment 9 Song, Youquan 2008-09-03 14:50:35 UTC
Changing the severity to high, for the following reason: the cpuid driver cannot support the CPUID.4, CPUID.0xB, CPUID.0xD leaves, etc., because these leaves depend on the ECX input in addition to EAX. Without the patch, the cpuid driver provides wrong values for some important CPU information, such as the maximum number of addressable IDs in a CPU package, cache type, cache level, x2APIC ID, and processor topology relationships. I have validated the patch and found no issues.
Comment 10 Brian Maly 2008-09-03 17:15:28 UTC
Created attachment 315667 [details] rework of initial test patch Here is a rework which combines the last 2 patches. Can this just be tested on the affected hardware for sanity sake? If results look good I will get the patch posted ASAP so it can be included in RHEL5.
Comment 11 Song, Youquan 2008-09-04 10:17:12 UTC
Yes. I tested it on a different machine, and it fixes the bug there as well.
Comment 12 Song, Youquan 2008-09-04 10:19:40 UTC
Here is the updated test case for it. Run it with the command "test_cpuid [cpu] [eax] [ecx]".
Comment 13 Song, Youquan 2008-09-04 10:21:15 UTC
Created attachment 315732 [details] test_cpuid.c test case.
Comment 14 Song, Youquan 2008-09-09 09:32:01 UTC
Attachment 315667 [details] (RHEL5-cpuid-update.patch) calls "smp_call_function_single", which exists only on x86_64 and is not defined on i386, so it fails to compile in an i386 kernel. I reworked the patch and validated it on x86_64 and i386 platforms; the patch RHEL5_cpuid_new.patch fixes all the issues.
Comment 15 Song, Youquan 2008-09-09 09:44:11 UTC
Created attachment 316164 [details] Rework patch RHEL5_cpuid_new.patch
Comment 16 Song, Youquan 2008-09-23 07:12:36 UTC
Hi bmaly, This patch has passed testing on different platforms for both x86 and x86_64. Please integrate it into RHEL5.3.
Comment 17 Song, Youquan 2008-10-09 09:41:07 UTC
bmaly, What's the status for this bug?
Comment 18 Song, Youquan 2008-10-16 15:50:55 UTC
Has the patch been posted?
Comment 19 John Villalovos 2008-11-12 02:11:23 UTC
Brian, Any status on this? We at Intel would love to know. Thanks, John
Comment 20 Jane Lv 2008-12-05 06:33:43 UTC
(In reply to comment #10)
> Created an attachment (id=315667) [details]
> rework of initial test patch
>
> Here is a rework which combines the last 2 patches. Can this just be tested on
> the affected hardware for sanity sake?
>
> If results look good I will get the patch posted ASAP so it can be included in
> RHEL5.

Bmaly, We have tested the patch and the result is good. Will you put this patch into RHEL5.3? Thanks.
Comment 21 Brian Maly 2008-12-09 00:03:06 UTC
I will push the patch into RHEL5 (z-stream and 5.4). Since this isn't upstream, it should go into RHEL at the very beginning of the development cycle so we have enough testing coverage. Pushing this in at the end of the cycle (i.e. the last possible moment) greatly increases the risk of regressions.
Comment 24 RHEL Product and Program Management 2009-02-16 15:34:47 UTC
Updating PM score.
Comment 25 Luming Yu 2009-02-24 03:17:46 UTC
Brian, Are you going to post Youquan's patch? If so, please mark the bug POST after posting. If not, please assign the bug to me. I will help with review, testing, and posting. Thanks, Luming
Comment 27 RHEL Product and Program Management 2009-03-02 07:41:03 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Comment 29 Don Zickus 2009-04-27 15:57:52 UTC
in kernel-2.6.18-141.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5 Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.
Comment 32 Chris Ward 2009-06-14 23:15:27 UTC
~~ Attention Partners RHEL 5.4 Partner Alpha Released! ~~ RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner! If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager. Thanks!
Comment 33 Luming Yu 2009-06-26 06:08:15 UTC
Verified on RHEL 5.4 alpha; it is fixed.
Comment 34 Chris Ward 2009-07-03 18:04:30 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~ RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner! If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value. Questions can be posted to this bug or your customer or partner representative.
Comment 36 errata-xmlrpc 2009-09-02 08:51:50 UTC
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1243.html