WIP: Implement OpenSSL's AES GCM acceleration for armv8 #16601
This is still WIP (but it works), and if you agree I'd like to split this PR into smaller PRs before merging:
`kfpu_allowed()` for newer kernels to make use of NEON registers
Important TODO (putting this high up here so I don't forget):
I'll post some benchmarks on #12171 next.
Motivation and Context
Encrypted ZFS is currently very slow on aarch64 devices.
This is due to missing hardware acceleration support: #12171
In addition, the use of NEON is disabled on newer kernels: #14555
Description
Assembly generated from these files in the OpenSSL source:
(ASM diffs at time of creation, will try to update those)
with slight modifications to support the ICP interface.
For the chunked AES-GCM codepath, the AVX code was reused by renaming `avx` -> `hardware`.
Kernel FPU support was re-added by naively storing/restoring vector registers.
And a bunch of support/wiring code around that.
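To illustrate the idea behind the avx -> hardware rename mentioned above, here is a minimal sketch, not the actual ICP code, of a chunked driver that calls through an architecture-neutral ops table. Every identifier (`gcm_hardware_ops_t`, `sketch_gcm_hardware_update`, `SKETCH_CHUNK_SIZE`) is a hypothetical placeholder, and a simple XOR stands in for the accelerated AES-GCM routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk size; the value used in the ICP may differ. */
#define	SKETCH_CHUNK_SIZE	32

/* Architecture-neutral ops table: named "hardware" rather than "avx". */
typedef struct {
	const char *name;
	/* Processes len bytes and returns the number of bytes handled. */
	size_t (*encrypt_chunk)(const uint8_t *in, uint8_t *out, size_t len);
} gcm_hardware_ops_t;

/* Placeholder only: XOR stands in for the real accelerated routine. */
static size_t
sketch_hw_encrypt_chunk(const uint8_t *in, uint8_t *out, size_t len)
{
	for (size_t i = 0; i < len; i++)
		out[i] = in[i] ^ 0xAA;
	return (len);
}

/* The backend is chosen per architecture; the driver below is shared. */
static const gcm_hardware_ops_t sketch_hardware_ops = {
#if defined(__aarch64__)
	.name = "armv8",
#else
	.name = "avx",
#endif
	.encrypt_chunk = sketch_hw_encrypt_chunk,
};

/*
 * Chunked driver: feeds the input to the backend in pieces of at most
 * SKETCH_CHUNK_SIZE bytes and returns the number of chunks processed.
 */
static size_t
sketch_gcm_hardware_update(const gcm_hardware_ops_t *ops,
    const uint8_t *in, uint8_t *out, size_t len)
{
	size_t nchunks = 0;

	while (len > 0) {
		size_t n = len < SKETCH_CHUNK_SIZE ? len : SKETCH_CHUNK_SIZE;
		size_t done = ops->encrypt_chunk(in, out, n);

		in += done;
		out += done;
		len -= done;
		nchunks++;
	}
	return (nchunks);
}
```

The point of the sketch is only that the chunked loop does not need to know which instruction set sits behind the table, which is why a neutral name fits better than `avx` once a second backend exists.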
How Has This Been Tested?
Definitely not well enough yet, do not merge yet.
The code was tested on a Raspberry Pi 5. I've created an encrypted fs, benchmarked it, swapped AES implementations, benchmarked it again and vice versa.
I did not test the changes on x86_64 yet, which is also affected by this PR.