Vector Facility for z/Architecture #650
Not necessarily! What I would like to do is determine which way -- yours or mine -- is more efficient. Thus I would be inclined to leave your current implementation as-is for the time being. Implementing my shared registers proposal might be less efficient. I don't know. Maybe. Maybe not. I proposed it only because I thought (believed) it would be more efficient, but that remains to be seen! I would prefer to see BOTH designs implemented (controlled via a temporary #define build option) so that we could then compare the performance of each one. It might well be that your current design is more efficient! I don't know! It might be. It might not be. It remains to be seen. The idea here is I don't want to paint ourselves into a corner. I don't want to commit to one technique or the other until we know which one is best. |
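A temporary build option of the kind described above might look roughly like the sketch below. Everything in it is hypothetical and chosen only for illustration: the macro name `DEMO_SHARED_VFP`, the `DEMO_REGS` layouts, and the accessor names are not taken from the Hercules source or from either actual proposal. The point is only that both variants expose the same accessors, so the rest of the code (and any timing run) is unchanged when the switch is flipped.

```c
#include <stdint.h>

/* Hypothetical temporary build option: define it to select the
   shared-register layout, leave it undefined for separate arrays. */
/* #define DEMO_SHARED_VFP */

#if defined(DEMO_SHARED_VFP)

/* Variant 1: one array; FPR n is the leftmost doubleword of VR n. */
typedef struct {
    uint64_t vr[32][2];     /* [n][0] = bits 0-63, [n][1] = bits 64-127 */
} DEMO_REGS;

static inline uint64_t demo_get_fpr(const DEMO_REGS *regs, int n)
{
    return regs->vr[n][0];
}

static inline void demo_set_fpr(DEMO_REGS *regs, int n, uint64_t value)
{
    regs->vr[n][0] = value;
}

#else

/* Variant 2: FPRs kept in their own array, separate from the VRs.
   A complete version would also have to keep vr[n][0] consistent. */
typedef struct {
    uint64_t fpr[16];
    uint64_t vr[32][2];
} DEMO_REGS;

static inline uint64_t demo_get_fpr(const DEMO_REGS *regs, int n)
{
    return regs->fpr[n];
}

static inline void demo_set_fpr(DEMO_REGS *regs, int n, uint64_t value)
{
    regs->fpr[n] = value;
}

#endif
```

Instruction code would call only `demo_get_fpr` and `demo_set_fpr`, so the two builds can be compared with the same test workload.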
That's something else we will need to eventually do too: verify the correctness of each implemented instruction on real hardware. I seem to recall that one of our developers (I forget who) has access to a real mainframe. We will eventually need to test/debug our implementation on a real machine. Then, once verified, we will of course ALSO need to develop a QA (Quality Assurance) runtime test ("runtest"). So there is definitely enough "meat" in this project for multiple people to bite off their own piece of it. The more people we have contributing, the greater the chance of our succeeding in our effort. |
@Fish-Git I agree with the double implementation design.
Yes, this is how it was proposed. The conversion is not necessary for byte-sized instructions, some logical operations (AND/OR), moves, and so on. But in general you always have to do BIG -> LIT -> BIG.
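As an illustration of the BIG -> LIT -> BIG pattern, here is a minimal sketch of an element-wise 32-bit add over two 16-byte big-endian vectors, next to a byte-wise AND that needs no conversion at all. The `demo_` function names are hypothetical and are not the helpers used in Hercules.

```c
#include <stdint.h>

/* BIG -> host: assemble a big-endian 32-bit element into a host integer. */
static uint32_t demo_fetch_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* host -> BIG: store a host integer back as four big-endian bytes. */
static void demo_store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >>  8);
    p[3] = (uint8_t)v;
}

/* Word-wise add: every element is converted, added in host order,
   and converted back (carries do not cross element boundaries). */
static void demo_vadd_word(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    for (int i = 0; i < 16; i += 4)
        demo_store_be32(dst + i, demo_fetch_be32(a + i) + demo_fetch_be32(b + i));
}

/* Byte-wise AND: byte order is irrelevant, so no conversion is needed. */
static void demo_vand(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    for (int i = 0; i < 16; i++)
        dst[i] = a[i] & b[i];
}
```

Because the fetch and store helpers work byte by byte, the same code gives the same result on big-endian and little-endian hosts; a tuned version would more likely use a host byte-swap primitive.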
I think FP must continue with the current behaviour with regard to endianness. It will be easier to adapt zVector to the double treatment proposed by Fish.
Yes! it's a common short name for Salvador. Ian, tell me what you think! Regards, salva. |
Can you please have a look at the attached proposal of the Hercules changes for shared zVector/FP registers. All comments, suggestions, etc. are welcome. |
Nice! I like it! |
Can you please have a look at the revised attached proposal of the Hercules changes for shared zVector/FP registers. I had forgotten we would need to move data between the instruction processors variables and the zVector registers preserving host endianness. Again, all comments, suggestions, etc. are welcome: p.s. Fish, how did you add the bullet point before the link? I can't see it in the GitHub formatting syntax. |
I too am interested to participate in the Vector Facility and have read the proposal text with interest. So far I only have some probably very basic questions which I'm seeking an answer to:
Thanks ! Cheers, Peter P.S.: I'll be off-line next week. |
@mcisho: Can you explain to me why this line for LITTLE-ENDIAN?
|
@salva-rczero: Whoops, confusion on my part, taking little endian way too far! You are quite right, the register number does not need to be flipped. Well spotted.
|
Can you please have a look at the revised attached proposal of the Hercules changes for shared zVector/FP registers, with the corrections for the errors pointed out by @salva-rczero. Yet again, all comments, suggestions, etc. are welcome. p.s. Fish, how did you add the bullet point before the link? I can't see it in the GitHub formatting syntax. |
@mcisho While I appreciate your effort, I really don't understand the need for all these macros.
We would only need to add a lit-endian mode:
|
@Peter-J-Jansen The first goal is to get it working, but yes, I have thought about using x86 SIMD for performance. In fact, a couple of Galois arithmetic instructions already use it. |
Asterisk or dash (minus sign) followed by a blank, which is the markdown code for an unordered list:
|
I believe Steve Orso (@srorso) would probably be the best person to answer this question, but as I recall, it was basically because of 2 things:
But those are just guesses. The truth is, I don't remember what the real reason(s) was/were. Ask Steve. He might remember the details better than me since I believe he did a lot of work on our SoftFloat code. |
I proposed the macros as an aid for endianness, but if you think they are superfluous that's fine, I'll forget about them. The most important thing is we all agree on how the shared VR/FPR are defined in REGS. |
Can you please have a look at the fourth and hopefully final revision of the proposal of the Hercules changes for shared zVector/FP registers. The superfluous stuff has been removed, and the suggestions from @salva-rczero have been incorporated. As always, all comments, suggestions, etc. are welcome. |
As it appears that no one disagrees with the proposal I will proceed. In the next few days I will branch the SDL-Hercules-390 hyperion develop branch into a branch named sharedvfp, where the changes to the floating-point instructions will be implemented. The z/Architecture Principles of Operation says:
However, in a March 2015 presentation to SHARE titled "z13 Vector Extension Facility (SIMD)", IBM said:
Empirical evidence from instructions executed on a z15 shows that use of a FPR changes bits 64-127 of the corresponding VR to zero. So should Hercules set bits 64-127 of the corresponding VR to zero, or leave them unchanged (i.e. unpredictable), when an instruction writes to a FPR? Leaving the bits unchanged is simpler and less prone to coding error, but Hercules wouldn't be emulating the actions of real machines (or at least the machines to date). |
@mcisho Great! As soon as you make the branch and push the changes to On 64-127 bits, I would prefer to leave them unchanged. IMHO, Hercules should mimic z/Arch not real machines. Regards, salva. |
I agree 100% with Salva. Hercules does not -- and indeed IMHO should not -- try to emulate any particular model of mainframe, whether manufactured by IBM or anyone else. Its sole responsibility is to only try to accurately emulate the published mainframe architecture as defined in the Principles of Operation. The behavior of mainframes varies from model to model. The behavior of the architecture does not. Stick to the architecture. |
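To make the two choices concrete, here is a small sketch of an FPR write that leaves bits 64-127 of the corresponding VR unchanged, alongside the alternative that zeroes them as observed on the z15. The `DEMO_VR` type and both function names are hypothetical; the only architectural fact assumed is that an FPR occupies bits 0-63 of the same-numbered VR.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit register held as 16 bytes in big-endian order:
   bytes 0-7 are bits 0-63 (the FPR), bytes 8-15 are bits 64-127. */
typedef struct {
    uint8_t b[16];
} DEMO_VR;

/* Write the FPR part only; bits 64-127 keep whatever they held before. */
static void demo_set_fpr_keep(DEMO_VR *vr, uint64_t value)
{
    for (int i = 7; i >= 0; i--) {
        vr->b[i] = (uint8_t)value;   /* lowest-order byte goes to byte 7 */
        value >>= 8;
    }
}

/* Alternative: write the FPR part and also clear bits 64-127. */
static void demo_set_fpr_zero(DEMO_VR *vr, uint64_t value)
{
    demo_set_fpr_keep(vr, value);
    memset(vr->b + 8, 0, 8);
}
```

The first variant touches half as much state per FPR write, which is part of why leaving the bits unchanged is the simpler option.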
The sharedvfp branch has been created, and the esa390.h and hstructs.h changes have been pushed. Please note that the REGS structure still contains the old U32 fpr[32] variable. It will be removed when the numerous references to it have all been changed to the new shared QW vfp[32] variable. |
I've just created a pull request for the changes needed for vector instructions (E7xx). |
@Fish-Git Will you provide the changes for U128, vfetch16, vstore16... from swap128 or should I do it myself? Thanks in advance. |
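For readers unfamiliar with what such helpers do, here is a rough sketch of a 128-bit byte swap with fetch and store routines built on it. The names `demo_swap128`, `demo_fetch16` and `demo_store16` are hypothetical stand-ins; the signatures of the real swap128, vfetch16 and vstore16 are not shown in this thread and may differ considerably.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit container; not the Hercules U128 type. */
typedef struct {
    uint8_t b[16];
} DEMO_U128;

/* Reverse the byte order of a 128-bit value. */
static DEMO_U128 demo_swap128(DEMO_U128 v)
{
    DEMO_U128 r;
    for (int i = 0; i < 16; i++)
        r.b[i] = v.b[15 - i];
    return r;
}

/* Returns 1 when the host stores the low-order byte first. */
static int demo_host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Fetch 16 big-endian bytes from guest storage into host byte order. */
static DEMO_U128 demo_fetch16(const uint8_t *storage)
{
    DEMO_U128 v;
    memcpy(v.b, storage, 16);
    return demo_host_is_little_endian() ? demo_swap128(v) : v;
}

/* Store a host-order 128-bit value back to guest storage as big-endian. */
static void demo_store16(uint8_t *storage, DEMO_U128 v)
{
    if (demo_host_is_little_endian())
        v = demo_swap128(v);
    memcpy(storage, v.b, 16);
}
```

A fetch followed by a store always reproduces the original guest bytes, whichever byte order the host uses.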
I will have to review my original implementation. What I originally coded might no longer be correct/appropriate for our current design. Maybe it is. Maybe it isn't. I don't know. I'll have to brush off the dust and take a look at it. If you want to do it, please feel free to do so! You might actually be able to do it faster than me. Personal issues have been affecting my ability to contribute as of late. (Don't worry, it's nothing serious.) |
The FP instructions using the shared zVR/FPR are complete, and the tests that we have pass. All of the changes have been committed to the sharedvfp branch. |
After several days of testing I haven't discovered any problem with FP instructions using the shared VR/FPR. I would like to pull the FP changes into the develop branch, so that the changes can be exposed to a wider range of environments than I have available. Does anyone object, feel it's premature, etc? |
No objection here! Sounds like a good plan to me! |
I have attached my changes to |
QUICK QUESTION: Is the |
Looks okay to me, Ian! And IMO yes, it seems to be a valid working path that we should probably continue on. I'm thinking the bulk of the Vector instructions should of course continue to be in |
No. The develop branch doesn't have zVector support. If you want to try zVector you need to use the sharedvfp branch, and the latest commit of progress by @salva-rczero was to the sharedvfp branch. |
For my part, I believe that my contribution to this project has come to an end. I have already warned that I do not have the necessary skills and I find everything related to the discussion/design very difficult. It is better to leave that task to those of you who know it. Farewell and thank you very much for your time and advice (especially to @Fish-Git). Good luck and long live Hercules! |
I'm working on the E6 z/vector instructions, which require a lot of changes to the infrastructure, just as the E7 z/vector instructions did. My work is based on the The E6 instructions will be in Do we have a consistent type definition for Jim |
Thank you, James! I still say you should consider becoming an official Hercules developer. Your contributions over the past many months (past year?) have been invaluable.
AFAIK, type |
We will miss you, Salva!
You are VERY welcome, Salva! We all thank you from the bottom of our hearts for all of the tremendous contributions you have made to Hercules! You are a true Herculean in my book! If you send me your full real name, I will be very happy to add you to our Herculeans list.
Abso-fricking-lutely! |
That's a pity, I thought you were doing a great job.
Don't worry, you're not alone there. |
As part of pull request [https://github.com//pull/661], I have enabled the following features in feat900.h:
#define FEATURE_134_ZVECTOR_PACK_DEC_FACILITY
#define FEATURE_135_ZVECTOR_ENH_FACILITY_1
#define FEATURE_148_VECTOR_ENH_FACILITY_2
#define FEATURE_152_VECT_PACKDEC_ENH_FACILITY
#define FEATURE_165_NNET_ASSIST_FACILITY
#define FEATURE_192_VECT_PACKDEC_ENH_2_FACILITY
as all/most of the E6 instructions are defined as part of or enhanced with these facilities. I suspect that is causing some of the Windows build problems, as you are referencing Hope I haven't caused too many problems, but I wanted to get the basics in for the E6 instructions to minimize merge conflicts. Jim |
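The numbers in these FEATURE_nnn names appear to correspond to facility bit numbers in the facility list that a guest sees via STORE FACILITY LIST EXTENDED. As a sketch of how such a bit is addressed, assuming only the usual convention that bit 0 is the leftmost bit of byte 0, the helpers below set and test a bit in a plain byte array. The function names are hypothetical and this is not how Hercules itself manages its facility table.

```c
#include <stdint.h>
#include <stddef.h>

/* Test facility bit 'bit' in a facility list where bit 0 is the
   leftmost (most significant) bit of byte 0. Bits beyond the end
   of the list are reported as not installed. */
static int demo_facility_enabled(const uint8_t *list, size_t len, unsigned bit)
{
    size_t byte = bit / 8;
    if (byte >= len)
        return 0;
    return (list[byte] & (0x80u >> (bit % 8))) != 0;
}

/* Set facility bit 'bit'; requests past the end of the list are ignored. */
static void demo_facility_set(uint8_t *list, size_t len, unsigned bit)
{
    size_t byte = bit / 8;
    if (byte < len)
        list[byte] |= (uint8_t)(0x80u >> (bit % 8));
}
```

With this numbering, bit 134 lives in byte 16 under mask 0x02, so a 32-byte list is enough to cover all six facilities named above.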
FYI: James's changes to the |
The z/vector E6 instructions, for example VECTOR FP CONVERT TO NNP, reference the NNP-data-type-1 format. The z/Architecture Principles of Operation, SA22-7832-13, page 26-1, states:
But the NNP-data-type-1 format is not described. Does anyone have additional reference information on the format? The closest that I've found is a DLFLOAT presentation: https://pdfs.semanticscholar.org/5359/1b203af986668ca6586f80d30257d3ee52d7.pdf Thanks, |
I'm not aware of any, no. But then I haven't tried looking for it either.
THAT looks to me like that's probably it! Great find, James! I say go with it! |
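If NNP-data-type-1 really does match the DLFLOAT format from the linked presentation, which is still only a guess at this point in the thread, a decoder could be sketched as below. The layout assumed here (1 sign bit, 6 exponent bits with bias 31, 9 fraction bits, no subnormals) is taken from the DLFloat description and has not been confirmed against the Principles of Operation. The function name is hypothetical, and any special encodings such as NaN or infinity are deliberately not handled.

```c
#include <stdint.h>

/* Decode a 16-bit value under the ASSUMED DLFloat layout:
   bit 0 (leftmost) = sign, next 6 bits = exponent (bias 31),
   last 9 bits = fraction with an implied leading 1.
   Assumption: exponent 0 with a non-zero fraction is a normal number.
   Special encodings (NaN/infinity) are not recognized here. */
static double demo_dlfloat16_to_double(uint16_t bits)
{
    int sign = (bits >> 15) & 1;
    int exp  = (bits >> 9) & 0x3F;
    int frac = bits & 0x1FF;
    double value;

    if (exp == 0 && frac == 0)
        return sign ? -0.0 : 0.0;

    value = 1.0 + frac / 512.0;          /* 1.fraction */

    /* Scale by 2^(exp - 31) without needing the math library. */
    for (int e = exp; e < 31; e++)
        value /= 2.0;
    for (int e = exp; e > 31; e--)
        value *= 2.0;

    return sign ? -value : value;
}
```

Under these assumptions 0x3E00 decodes to 1.0 and 0x3F00 to 1.5, which makes a quick sanity check once the real format definition turns up.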
Fish, I've coded initial versions of the five E6 vector "neural network processing assist" instructions (VCNF, ...). As part of this implementation, I use SoftFloat
Whoa... Got to be my problem! Yep, the routines are in the source, the Why these routines? These vector instructions convert to/from Tiny (F16) Binary Floats. I would appreciate if the softfloat libraries could be refreshed to include Thanks, |
10-4. I'll get right on it. Can you provide for me your Hercules changes in the form of a patch, so I can test my softfloat changes before actually committing them? That is to say, I'd like to try building Hercules with your changes for myself, so I can see (recreate) your reported link error, and then temporarily make my softfloat changes and then rebuild Hercules (with your changes again), to verify that the problem is now fixed. Then I can commit my changes with absolute confidence to the softfloat repository. Thanks. |
Fish, I'll post a patch tomorrow. I'm in the middle of moving the zvector instructions to a new file nnpa.c which will include NNPA: Function Code 0: NNPA-QAF (Query Available Functions). All the NNPA stuff will then be in one source file. Jim |
Fish, As requested, here is a patch with my current As always, comments / suggestions are appreciated. Jim |
Thanks. I'm on it! It looks like this "simple" change is going to take me longer than originally expected though. My first attempt to just move the
So now I'm going to have to do the same thing for the source files containing those functions too. I'm hoping this "simple" change doesn't end up snowballing into some huge complicated mess! In any case, I'll let you know when I eventually have something for you to test with. |
SoftFloat fix committed! "Fix for GitHub z/Arch Issue #650" Tested on both Windows and Linux (with your You should now be good to go! |
NOTE: You will of course need to git update your SoftFloat external package repo and rebuild it in order for your Or you can simply use Bill's Hercules Helper, of course. |
Fish, Thank you for the SoftFloat update. I'm currently just using the SoftFloat X64 libraries that are part of the 'develop' branch. I have built the external packages but it has been a while. Just to be clear, the 'develop' branch does not have updated SoftFloat libraries. Once I commit the nnpa code, everyone doing X86-64 development on the 'develop' branch will have to update their SoftFloat libraries with a new version from Jim |
Then you should be okay. The last commit I made was to update those lib files. So for Windows, you should be okay, as well as any x86 Linux user that is able to use the Herc libs. It's just for some Linux users that might have to update and rebuild their softfloat repo/libs if they're unable to use the ones that come with Herc, such as those who have a non-x86 system (such as ARM for example). Make sense? |
That was true, but is now no longer true as of a couple hours ago, since, as I said, Herc's libraries have since been updated:
|
ONLY if they're running on non-x86 hardware (or otherwise are unable to use the libs that come with Hercules).
Possibly. If they build Herc themselves the hard way, then yes, they will have to update and rebuild their softfloat external package libraries. If they build Herc using Bill's Hercules Helper however, then probably not. I believe Bill's Hercules Helper builds Hercules just fine for most all non-x86 systems. @wrljet Bill? Is that true? Does your script always refresh (git pull) for all of the external package repos each time? (and rebuild them if they've changed?) But if they, like you, simply link with the libs that come delivered with Hercules, then no, they should be unaffected. |
Fish, Hercules-Helper rebuilds the extpkgs from source, with a fresh git clone, on all systems except Windows. Bill |
Fish, Thank you. Thank you for your last commit to update the SoftFloat libraries! My nnpa.c code compiles and links. Now to work on some tests. Whenever I've used hercules-helper to install hercules on my Raspberry Pi 5, all the external packages are built. Jim |
I came across this patent from IBM that describes the whole workings of the neural networks assist processing. It seems it also explains the NNP-data-type-1. https://patents.justia.com/patent/11669331 Edit: This links to a pdf version that also has the images: |
I downloaded the develop branch today and reinstalled it on RHEL 9.4, and java -version under z/OS 2.5 finally returns a successful result. |
I have confirmed that z/OS V3.1 runs on Ubuntu 24.04 with the develop branch. |
This issue was created for discussing development of the z/Architecture Vector Facility.
All discussion regarding this effort should take place HERE, in THIS GitHub Issue, and not in Issue #77, which is a generic GitHub Issue regarding all yet-to-be-developed z/Architecture facilities.
Please refrain from discussing z/Architecture Vector Facility development anywhere else, and discuss it here instead.
Thank you.