Bug 1517057 - Hercules missing some architecture features, can no longer boot Fedora 27
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: hercules
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Dan Horák
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-24 06:10 UTC by Dan Callaghan
Modified: 2017-11-27 15:05 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-24 09:51:54 UTC


Description Dan Callaghan 2017-11-24 06:10:25 UTC
Description of problem:


Version-Release number of selected component (if applicable):
hercules-3.13-1.fc26.x86_64

How reproducible:
with a bit of practice

Steps to Reproduce:
1. Try to boot the Fedora 27 s390x installer in Hercules

Actual results:
[...]
The Linux kernel requires more recent processor hardware
Detected machine-type number: 2097
Missing facilities: 49,52
See Principles of Operations for facility bits
HHCCP011I CPU0000: Disabled wait state
          PSW=00020001 80000000 000000008BADCCCC

Expected results:
Should boot.

Additional info:
Fedora 26+ switched the minimum target architecture from z9 to zEC12, according to the wiki. From what I can tell, Hercules 3.13 is *supposed* to emulate zEC12 but I guess there are some architecture features which it doesn't actually support that the kernel is now requiring.

I also noticed that when I try to install a Fedora 27 mock chroot on RHEL7 on Hercules, I get some mysterious signal 11 crashes which the kernel reports as "User process fault: interruption code 0x40011 in ld-2.26.9000.so[3fffd492000+24000]". I'm assuming it's the same underlying cause: the newer packages are built expecting some CPU features that are not fully implemented in Hercules.

Comment 1 Dan Callaghan 2017-11-24 06:11:29 UTC
I dug up the "Principles of Operations" document mentioned by the kernel. The two facility bits in question are apparently:

49
The execution-hint, load-and-trap, miscellaneous-instruction-extensions and processor-assist facilities are installed in the z/Architecture architectural mode.

52
The interlocked-access facility 2 is installed.
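
As a rough illustration of how the numbers in the kernel message map onto facility bits, here is a minimal Python sketch. It assumes the facility list is available as a sequence of 64-bit integers, with bit 0 being the leftmost (most significant) bit of the first doubleword. The function names, the name table, and the sample data are all hypothetical; the descriptions are abbreviated from the text quoted above.

```python
# Hypothetical helper names; descriptions abbreviated from the quoted text above.
FACILITY_NAMES = {
    49: "execution-hint, load-and-trap, miscellaneous-instruction-extensions, processor-assist",
    52: "interlocked-access facility 2",
}


def has_facility(words, bit):
    """Return True if facility `bit` is set in a list of 64-bit doublewords.

    Facility bits are numbered from the left: bit 0 is the most
    significant bit of words[0].
    """
    index, offset = divmod(bit, 64)
    if index >= len(words):
        return False  # bits beyond the stored list count as not installed
    return bool((words[index] >> (63 - offset)) & 1)


def missing_facilities(words, required):
    """Return the required facility bits that are not set in `words`."""
    return [bit for bit in required if not has_facility(words, bit)]


if __name__ == "__main__":
    # Made-up facility list for illustration: only bits 0 and 49 are set.
    sample = [(1 << 63) | (1 << (63 - 49))]
    for bit in missing_facilities(sample, [49, 52]):
        print(bit, FACILITY_NAMES.get(bit, "unknown"))
```

With a real facility list in place of the made-up sample, the same check would show which of the two bits the emulated CPU does not report.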

Comment 2 Dan Horák 2017-11-24 09:51:54 UTC
I suspect the stable Hercules 3.x doesn't emulate enough for zEC12. The development 4.x is also missing the implementation of some facilities, but it can "fake" them with the ARCHLVL option in the config file. That is not safe, though, because the wrong code path can be taken based on a facility bit that is set but not actually implemented.

What is your use case for Hercules?

Comment 3 Dan Callaghan 2017-11-26 22:09:22 UTC
I use Hercules as a koji builder for s390x on a small, third-party Koji instance. There is no way I can get a real mainframe guest for that.

I would disagree that this is NOTABUG. It seems like a fairly serious bug, in the sense that eventually Hercules will be unable to run any modern distros as they start switching over to require zEC12+ features. At that point we will be back to the situation where there is no way to test, fix, or experiment with s390x distros if you don't own a mainframe, which is not very friendly to potential contributors.

I did notice that the Hercules 4.x branch (fork?) seemed to have implemented the missing facilities, although I didn't get as far as successfully building it yet. Will you switch the Fedora package to that in future? Is it possible to backport the missing facilities?

Do you have any more details about what exactly is not "safe" in case someone involved in Hercules development is reading this bug in future?

Comment 4 Dan Horák 2017-11-27 15:05:57 UTC
If your project is open source, there is a chance of getting access to real hardware. Is your project a Red Hat project? I guess we could even make Koji run more builder daemons so that one guest is shared among several projects.

I agree the current state is not good, but we follow the stable upstream Hercules branch. I don't know all the details, but the 4.x branch should be the development branch for the next version, not a fork. Once there is an official release, I will rebase the Fedora package to the new version. Until then I could try to package it for COPR. Backporting is not feasible.

An unsafe scenario would be, for example, setting a fake "Transactional execution" facility bit: the kernel then sets the HWCAP_S390_TE bit in the ELF hardware capabilities, glibc starts using the transactional-execution instructions, and the application crashes very early with a SIGILL.
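
One way to spot this situation from inside a guest is to look at what the kernel advertises to user space. The Python sketch below checks for the `te` flag on the `features` line of `/proc/cpuinfo`. The function names are hypothetical, and the exact layout of the `features` line is an assumption that should be verified on a real s390x guest.

```python
def advertised_features(cpuinfo_text):
    """Return the set of flags on the 'features' line of /proc/cpuinfo text.

    Assumes the s390x layout 'features : flag1 flag2 ...'; returns an
    empty set if no such line is present.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "features":
            return set(value.split())
    return set()


def advertises_te(cpuinfo_text):
    """Return True if the transactional-execution flag 'te' is advertised."""
    return "te" in advertised_features(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    print("transactional execution advertised:", advertises_te(text))
```

If `te` is advertised while the emulator does not actually implement the transactional-execution instructions, user-space code that trusts the flag can hit the SIGILL described above.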

