compiler-rt: memmove optimisation #22606

dweiller · 2025-01-25T13:23:04Z

This PR seeks to improve memmove performance and fix some issues with generated code size of the current compiler-rt memmove.

I haven't yet benchmarked this implementation, though I expect the impact to be similar to #18912.

Here is a table of code sizes for ReleaseFast (targets chosen somewhat randomly, feel free to suggest additions/removals from the list):

| target | cpu | master (B) | `3642e26` (B) |
| --- | --- | ---: | ---: |
| `thumb-freestanding-eabihf` | `cortex_m3` | 16362 | 438 |
| `thumb-freestanding-eabihf` | `cortex_m4` | 16362 | 438 |
| `thumb-freestanding-eabihf` | `cortex_m33` | 16362 | 438 |
| `thumb-freestanding-eabihf` | `cortex_m52` | 2644 | 420 |
| `aarch64-linux` | `cortex_a53` | 1472 | 380 |
| `aarch64-linux` | `cortex_a75` | 832 | 568 |
| `aarch64-linux` | `cortex_x1` | 836 | 584 |
| `aarch64-linux` | `cortex_x4` | 832 | 584 |
| `x86_64-linux` | `x86_64` | 1402 | 564 |
| `x86_64-linux` | `x86_64_v2` | 1402 | 564 |
| `x86_64-linux` | `x86_64_v3` | 1348 | 826 |
| `x86_64-linux` | `x86_64_v4` | 1348 | 826 |

I've marked this as ready for review as I'm not sure when I'll get to benchmarking in earnest, and I think this should be merged before 0.14. I think there's no problem merging this as-is (modulo any reviews) and doing the following todos in a follow-up after 0.14 if I don't get them done beforehand.

Resolves #22603 (at least for the target discussed there, but presumably for any others as well).

Todo:

- benchmark memmove implementation
- investigate sharing parts of implementation with memcpy

andrewrk · 2025-01-25T20:45:21Z

lib/compiler_rt/memcpy.zig

@@ -18,7 +18,7 @@ comptime {
    }
 }

-const Element = if (std.simd.suggestVectorLength(u8)) |vec_size|
+pub const Element = if (std.simd.suggestVectorLength(u8)) |vec_size|


maybe just put them back in the same file, like it was before?

dweiller · 2025-01-26

I could, especially if it ends up making sense to share significant parts of their implementations. I was actually planning to move `Element` into `common.zig` (with a more descriptive name) since `memset` is going to want it as well.

dweiller · 2025-01-30

I've moved `Element`'s definition to `PreferredLoadStoreElement` in `common.zig` - I anticipate using it in `memset` and possibly `memcmp` in the future.
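To illustrate the idea in this thread, here is a minimal sketch of how a shared load/store element type might be defined and used. It is not the PR's code: only the first line of the definition (`std.simd.suggestVectorLength(u8)`) is visible in the diff above. The `@Vector` branch, the `usize` fallback, and the `copyForwards` helper are all assumptions made for illustration, and the real definition in `common.zig` may differ.

```zig
const std = @import("std");

// Hypothetical stand-in for the shared element type discussed above.
// Assumption: use a byte vector of the suggested length when the target
// has one, otherwise fall back to a machine word.
pub const PreferredLoadStoreElement = if (std.simd.suggestVectorLength(u8)) |vec_size|
    @Vector(vec_size, u8)
else
    usize;

// Hypothetical helper (not from the PR): a plain forward copy that moves
// one element per iteration using unaligned loads and stores, then
// finishes the remaining tail one byte at a time.
fn copyForwards(dest: [*]u8, src: [*]const u8, len: usize) void {
    const size = @sizeOf(PreferredLoadStoreElement);
    var i: usize = 0;
    while (i + size <= len) : (i += size) {
        const from: *align(1) const PreferredLoadStoreElement = @ptrCast(src + i);
        const to: *align(1) PreferredLoadStoreElement = @ptrCast(dest + i);
        to.* = from.*;
    }
    while (i < len) : (i += 1) {
        dest[i] = src[i];
    }
}
```

Keeping the type in one place would let `memcpy`, `memmove`, and later `memset` agree on a chunk width without each file repeating the target query.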

alexrp · 2025-01-29T11:56:07Z

Are you aiming to get this one in for 0.14.0?

dweiller · 2025-01-29T12:24:03Z

> Are you aiming to get this one in for 0.14.0?

Yes, I'd say it's basically mergeable as is (there are one or two small things I can think of that I'd change first), which would fix the code size issue we currently have. The thing that will take more time is benchmarking and fine-tuning things based on benchmarks; that might leave things a bit close to the release date. Benchmarking could always be spun off into follow-up work if we're happy to merge without proper benchmarking.

marnix mentioned this pull request Jan 25, 2025

STM32 embedded debug binaries much larger with 0.14.0-dev.2851+b074fb7dd #22603

Open

andrewrk reviewed Jan 25, 2025


dweiller changed the title ~~Memmove opt~~ compiler-rt: memmove optimisation Jan 25, 2025

dweiller mentioned this pull request Jan 26, 2025

compiler-rt memcpy followup #22615

Merged

alexrp added this to the 0.14.0 milestone Jan 29, 2025

dweiller force-pushed the memmove-opt branch from 1985545 to 3642e26 January 30, 2025 08:21

dweiller added 4 commits January 30, 2025 19:56

compiler-rt: optimize memmove

4db01eb

compiler-rt: workaround miscompilation in memmove

63f5a80

compiler-rt: remove manual unroll code from memmove

7e7c36f

compiler-rt: only check dest/src start address in memmove

3294ef7

dweiller force-pushed the memmove-opt branch from 3642e26 to 3294ef7 January 30, 2025 08:57

dweiller marked this pull request as ready for review January 30, 2025 09:27
