LibCrypto: Fix up stuff from the SHA PR I can't live with #24668

DanShaders · 2024-07-05T05:47:38Z

No description provided.

Hendiadyoin1

Just a few small nits, otherwise lgtm.

Userland/Libraries/LibCrypto/Hash/SHA1.cpp

Userland/Libraries/LibCrypto/Hash/SHA2.cpp

AK/SIMDExtras.h

AK/CPUFeature.h

Userland/Libraries/LibCrypto/Hash/SHA1.cpp

Userland/Libraries/LibCrypto/Hash/SHA2.cpp

nico

First three commits are fine.

From what I understand, the motivation for the 4th commit is to make the "gcc and clang currently can't do sha resolvers, so we have to do that manually" bug reports into a general framework, which a) seems like a somewhat questionable motivation to me b) I don't love the implementation, see below.

As always, it's well possible I'm misunderstanding things though!

AK/CPUFeature.h

This warning is triggered when a function that is not marked with [[gnu::target(...)]] accepts or returns vectors that would otherwise have been passed in registers had the current translation unit been compiled with more permissive instruction selection flags (i.e. if one adds -mavx2 to the command line). This will never be a problem for us since (a) we never use different instruction selection options across ABI boundaries and (b) most of the affected functions are actually TU-local. Moreover, even if we somehow properly annotated all of the SIMD helpers, calling them across ABI (or target) boundaries would still be very dangerous because of inconsistent and bogus handling of [[gnu::target(...)]] across compilers. See llvm/llvm-project#64706 and https://www.reddit.com/r/cpp/comments/17qowl2/comment/k8j2odi .
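For illustration only, here is a minimal sketch of the kind of signature the message above is talking about. The names `example_u32x8` and `example_add` are hypothetical and not taken from AK; the sketch assumes the GCC/Clang vector extension. On x86-64 without AVX enabled for the translation unit, a compiler may report -Wpsabi for this function because the 256-bit vector would be passed differently if AVX were enabled.

```cpp
#include <cstdint>

// Hypothetical 256-bit vector type, declared with the GCC/Clang vector extension.
using example_u32x8 = std::uint32_t __attribute__((vector_size(32)));

// Takes and returns 256-bit vectors without any [[gnu::target(...)]] annotation.
// How these values are passed depends on whether AVX is enabled for this
// translation unit, which is what the psabi diagnostic is about.
inline example_u32x8 example_add(example_u32x8 a, example_u32x8 b)
{
    return a + b; // element-wise addition, lowered to whatever the target allows
}
```

As long as every caller is compiled with the same instruction selection flags (or the function is TU-local and inlined), the two sides always agree on the calling convention.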

This necessitates marking bit_cast as ALWAYS_INLINE since emitting it as a function call there will create an unnecessary potential SSE registers -> plain registers/memory round-trip.
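A rough sketch of the idea, assuming GCC or Clang: `example_bit_cast`, `ExampleBytes4`, and `example_byte_reverse` are hypothetical stand-ins rather than the actual AK code, and `[[gnu::always_inline]]` plays the role of the ALWAYS_INLINE macro.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bit_cast helper. Forcing inlining avoids an
// out-of-line call through which the value would have to travel.
template<typename To, typename From>
[[gnu::always_inline]] inline To example_bit_cast(From const& from)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Plain byte view of a 32-bit value, used only for this illustration.
struct ExampleBytes4 {
    unsigned char b[4];
};

// Reverses the byte order of a 32-bit value via two bit casts.
inline std::uint32_t example_byte_reverse(std::uint32_t value)
{
    auto bytes = example_bit_cast<ExampleBytes4>(value);
    ExampleBytes4 reversed { { bytes.b[3], bytes.b[2], bytes.b[1], bytes.b[0] } };
    return example_bit_cast<std::uint32_t>(reversed);
}
```

The sketch uses a scalar instead of a SIMD vector so that it builds on any target; the inlining concern in the commit message applies to vector operands.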

This also removes `[[gnu::target("sse4.2")]]` from nested functions: since we don't explicitly use any of the SSE4.2 intrinsics there, that target is unneeded. Note that this won't prevent the compiler from choosing SSE4.2 instructions, as the affected functions are always inlined into `transform_impl_sha1`, which has `[[gnu::target("sse4.2")]]`.

These replace `#if ARCH(...)` macros that were added to conditionally include different hand-vectorized SIMD-implementations.

The helper doesn't use __builtin_cpu_supports (and instead makes raw cpuid calls) for three reasons:
- __builtin_cpu_supports only works on x86_64, so its usage needs to be guarded with the preprocessor, similarly to the current code. Moreover, we will have to use custom mechanisms to detect features on ARM, since there is no such thing as cpuid there (and __builtin_cpu_* are not provided).
- __builtin_cpu_supports doesn't support the "sha" feature on all targeted toolchains currently.
- And, of course, NIH.
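As a hedged sketch of what a raw-cpuid helper can look like (this is not the code from AK/CPUFeatures.cpp): the enum and function names below are hypothetical, and the sketch assumes GCC or Clang, whose `<cpuid.h>` provides `__get_cpuid` and `__get_cpuid_count`. On non-x86 targets it simply reports no features.

```cpp
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#    include <cpuid.h>
#endif

// Hypothetical feature flags; the real AK enum may look different.
enum ExampleCPUFeatures : std::uint32_t {
    ExampleNone = 0,
    ExampleSSE42 = 1u << 0,
    ExampleSHA = 1u << 1,
};

inline std::uint32_t example_detect_cpu_features()
{
    std::uint32_t features = ExampleNone;
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    // CPUID leaf 1: ECX bit 20 reports SSE4.2.
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (ecx & (1u << 20)))
        features |= ExampleSSE42;
    // CPUID leaf 7, subleaf 0: EBX bit 29 reports the SHA extensions.
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) && (ebx & (1u << 29)))
        features |= ExampleSHA;
#endif
    return features;
}
```

The preprocessor guard is confined to this one function, so callers can test the returned flags without any architecture checks of their own.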

Note: AVX target clone does not bring any significant (>0.5%) performance change.

It turns out we cannot use function multi-versioning with the "sha" feature, or even just plain ifunc resolvers, without preprocessor guards. So, instead of feeding the ifdef-soup monster, we just use a static member function pointer. Moving the kernel into the SHA1 class makes it possible to not pass class members to it as parameters. This, however, requires us to disambiguate the different target "clones" of the kernel using some kind of template.
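The pattern described above could be sketched roughly as follows. Everything here is hypothetical (`ExampleHasher`, `ExampleFeature`, `example_cpu_has_sha`) and the "kernel" is a toy computation; a real clone would carry a `[[gnu::target(...)]]` attribute and use intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tags used only to tell the kernel "clones" apart.
enum class ExampleFeature { None, SHA };

class ExampleHasher {
public:
    void update(std::uint32_t word)
    {
        assert(s_transform != nullptr);
        s_transform(*this, word); // plain indirect call, no ifunc involved
    }
    std::uint32_t state() const { return m_state; }

private:
    // One instantiation per target clone. Both variants compute the same
    // thing here so that the sketch runs on any machine.
    template<ExampleFeature feature>
    static void transform_impl(ExampleHasher& self, std::uint32_t word)
    {
        self.m_state = self.m_state * 31u + word;
    }

    // Stand-in for runtime CPU detection (an assumption of this sketch).
    static bool example_cpu_has_sha() { return false; }

    using TransformFn = void (*)(ExampleHasher&, std::uint32_t);

    static TransformFn select_transform()
    {
        if (example_cpu_has_sha())
            return &transform_impl<ExampleFeature::SHA>;
        return &transform_impl<ExampleFeature::None>;
    }

    // Chosen once during static initialization.
    static TransformFn const s_transform;

    std::uint32_t m_state { 0 };
};

ExampleHasher::TransformFn const ExampleHasher::s_transform = ExampleHasher::select_transform();
```

Because the kernel is a static member, it can reach the hasher's private state directly, and the template parameter gives each clone a distinct name without resorting to the preprocessor.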

See previous commit for rationale.

These do not bring any noticeable (>0.5%) performance improvements.

Hendiadyoin1 · 2024-07-11T09:28:33Z

AK/CMakeLists.txt

    CircularBuffer.cpp
    ConstrainedStream.cpp
    CountingStream.cpp
    DOSPackedTime.cpp
    DeprecatedFlyString.cpp
-    ByteString.cpp


How did that get there?

This is called "expand selection to parenthesis", "sort lines".

Hendiadyoin1 · 2024-07-11T09:29:46Z

AK/CPUFeatures.cpp

Maybe cross-check/coalesce with the Kernel CPUID file

I think this will be a bit too much for a file that should be gone when fmv eventually matures.

kleinesfilmroellchen · 2024-07-12T20:59:57Z

Haven't reviewed in detail, but this fixes the current build failures on Alpine Linux, since ifunc is not supported there.

DanShaders requested a review from alimpfard as a code owner July 5, 2024 05:47

github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Jul 5, 2024

DanShaders requested review from gmta and Hendiadyoin1 July 5, 2024 05:47

DanShaders force-pushed the sha-actually-okay branch from ee3a6d9 to 7670fcf Compare July 5, 2024 05:58

Hendiadyoin1 reviewed Jul 5, 2024

Userland/Libraries/LibCrypto/Hash/SHA1.cpp

Userland/Libraries/LibCrypto/Hash/SHA1.cpp

Userland/Libraries/LibCrypto/Hash/SHA2.cpp

AK/SIMDExtras.h

Hendiadyoin1 mentioned this pull request Jul 5, 2024

LibCrypto: Implement AES by using x86 intrinsics #24538

Open

MarekKnapek reviewed Jul 5, 2024

AK/CPUFeature.h

Userland/Libraries/LibCrypto/Hash/SHA1.cpp

Userland/Libraries/LibCrypto/Hash/SHA2.cpp

DanShaders force-pushed the sha-actually-okay branch 2 times, most recently from 5c7b3e8 to e5edeb1 Compare July 5, 2024 17:22

nico reviewed Jul 6, 2024

AK/CPUFeature.h

DanShaders force-pushed the sha-actually-okay branch 2 times, most recently from 21f3149 to 27b93f4 Compare July 7, 2024 20:43

DanShaders added 9 commits July 11, 2024 02:01

AK: Use bit_cast in SIMDExtras.h/AK::Detail::byte_reverse_impl

a4305ff

This necessitates marking bit_cast as ALWAYS_INLINE since emitting it as a function call there will create an unnecessary potential SSE registers -> plain registers/memory round-trip.

AK: Introduce AK_CAN_CODEGEN_FOR_<FEATURE> macros

490b25c

These replace `#if ARCH(...)` macros that were added to conditionally include different hand-vectorized SIMD-implementations.

LibCrypto: Use AK::detect_cpu_features in ifunc resolvers

07941a8

Note: AVX target clone does not bring any significant (>0.5%) performance change.

LibCrypto: Use static function pointer to choose SHA256 SIMD kernel

ff4fcc2

See previous commit for rationale.

LibCrypto: Remove FIXMEs regarding possible optimizations in SHA{1,256}

828ece1

These do not bring any noticeable (>0.5%) performance improvements.

DanShaders force-pushed the sha-actually-okay branch from 27b93f4 to 828ece1 Compare July 11, 2024 06:04

Hendiadyoin1 reviewed Jul 11, 2024

nico merged commit 9bbadf7 into SerenityOS:master Jul 12, 2024
15 checks passed

github-actions bot removed the 👀 pr-needs-review PR needs review from a maintainer or community member label Jul 12, 2024

DanShaders deleted the sha-actually-okay branch July 12, 2024 22:31
