s390x: vectorize crc32 #1057
Way better than the last (first) version, although I would prefer to have just the option and an add_subdirectory call for contrib/s390x, with everything else in contrib/s390x/CMakeLists.txt. set_source_files_properties should work there as well, but the sources would need to be added to the targets themselves if you don't want to deal with PARENT_SCOPE (you don't want to), so the add_subdirectory call will need to go after the targets are created.
I decided to go with an object library for the target-specific code. It seemed to be the cleanest way. However, I had to add it to the install target for the static library. :/
You added it to the link, not the install, but yes, this is the way to use it. If you want to use it like this, a cleaner way would be
My bad, we're not linking against anything at the moment. But it looks much cleaner, and we are prepared for linking to other stuff. I'm happy.
I mean this line: background: if I do however, with I don't really know what that install line actually does, as this is an OBJECT library and nothing at all should be installed. Should I remove it?
It can be that on some platforms object libs are just static libs, and then linking one static lib to the other fails (or better The easiest way would be to move add_subdirectory to line 233(ish) and then call target_sources for both libs instead of using the object lib. Then they will be appended to the other sources. As all sources are compiled twice anyway, two more files shouldn't be that big a problem. If you set the compile_definitions along with the compile_options for the file, this could also be shorter.
Open points before merge
Personally, I think that functable.h is a good method to hook in contrib code. I'll let the POWER guys know about it, and I guess they'll jump on the same train for their implementation. It might be an idea to check whether the ARM optimization should also be moved out of crc32.c into its own contrib directory.
From my side, this code is ready for final review and to be merged.
Force-push: 3cafbb7 to 7ccadea
I have to say, with that force-pushing it's hard to track your changes, as it's always one big patch. E.g., CMakeLists.txt lines 232 and 233 could be one, and I can't tell if this came with the last commits or if I missed it before.
I am sorry; I try to create a nice, consistent history, therefore I force-push on my branches. The reason for splitting the dependency into two lines was to prepare for the accelerated deflate/inflate patch, where I introduce the accelerator as another contrib dependency.
It seemed easier and more logical to make the functable its own entity rather than handling all cases and taking care not to duplicate the symbols.
The current state of the dfltcc patch (which builds upon this crcvx patch) is here. I split it up into consistent commits that can be built and tested independently and hopefully are easier to review. My current plan is to finish it up, do some extensive testing, and then, if everything goes all right, open the PR.
I'm not judging the split (I don't even read the code). It's just target_link_libraries(zlib PRIVATE $<TARGET_NAME_IF_EXISTS:zlib_s390x_functable>) vs. target_link_libraries(zlib PRIVATE $<TARGET_NAME_IF_EXISTS:zlib_s390x_functable> But if it's just that the functable is needed by the other two and needs to be present in the link, it's better to declare it as PUBLIC (maybe INTERFACE, didn't check) on those and remove it here completely.
Acceptable goal; that's what squash-merge was invented for. For reviewing work in progress, it's a pain to keep track of what has changed. And yes, getting interrupted by reviews while not finished is also a PITA.
@fneddy: Good job!
Use vector extensions when compiling for s390x and binutils knows about them. At runtime, check whether kernel supports vector extensions (it has to be not just the CPU, but also the kernel) and choose between the regular and the vectorized implementations. Co-authored-by: Eduard Stefes <[email protected]>
Force-push: 3c3b435 to 36594a1
It seems that the PR does not resonate well, so here is a new attempt. I minimized the changes to the old code.
I think @madler is just waiting for you to tell him you're really done and ready to merge.
Use vector extensions when compiling for s390x and binutils knows about them. At runtime, check whether kernel supports vector extensions (it has to be not just the CPU, but also the kernel) and choose between the regular and the vectorized implementations.
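For readers following the thread, the runtime selection described in the commit message can be sketched roughly as below. This is not the code from the PR: the names sketch_crc32, crc32_generic, crc32_vx_placeholder, kernel_has_vx and SKETCH_HWCAP_S390_VX are made up for illustration, the "vectorized" routine is only a stand-in that calls the portable one, and the sketch assumes the kernel reports vector support through the AT_HWCAP auxiliary vector with the bit value glibc publishes as HWCAP_S390_VX.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__s390x__) && defined(__linux__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#endif

/* Assumption: bit value glibc publishes as HWCAP_S390_VX on s390x. */
#define SKETCH_HWCAP_S390_VX 2048UL

typedef uint32_t (*crc32_fn)(uint32_t crc, const unsigned char *buf, size_t len);

/* Portable bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_generic(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical stand-in for the vectorized routine; a real one would use
 * s390x vector instructions and live in its own contrib source file. */
static uint32_t crc32_vx_placeholder(uint32_t crc, const unsigned char *buf, size_t len)
{
    return crc32_generic(crc, buf, len);
}

/* CPU support alone is not enough: the kernel has to enable the vector
 * facility too, which is why the check goes through the auxiliary vector. */
static int kernel_has_vx(void)
{
#if defined(__s390x__) && defined(__linux__)
    return (getauxval(AT_HWCAP) & SKETCH_HWCAP_S390_VX) != 0;
#else
    return 0;   /* other platforms always take the portable path */
#endif
}

static crc32_fn crc32_resolved;   /* NULL until the first call */

/* Resolve the implementation once, then dispatch through the pointer.
 * Not synchronized: a real function table would need to handle threads. */
uint32_t sketch_crc32(uint32_t crc, const unsigned char *buf, size_t len)
{
    if (crc32_resolved == NULL)
        crc32_resolved = kernel_has_vx() ? crc32_vx_placeholder : crc32_generic;
    return crc32_resolved(crc, buf, len);
}
```

Calling sketch_crc32(0, (const unsigned char *)"123456789", 9) yields the standard CRC-32 check value 0xCBF43926 on either path. Keeping the choice behind a single function pointer is what lets target-specific code sit in a contrib directory without touching the callers.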