Mark critical, stable, functions as unroll_safe.#124
Merged
ptersilie merged 1 commit intoykjit:mainfrom May 14, 2025
Merged
Conversation
These functions contain loops and (1) tend to appear in tight loops (2) are largely or wholly consistent from one iteration to the next. On our benchmark suite, we see the following speed-ups (as percentages, where > 0 is better): ``` BigLoop 0.22 Bounce 6.61 CD 4.84 DeltaBlue 2.75 HashIds 9.81 Havlak 2.72 Heightmap 0.48 Json 3.85 List -2.08 LuLPeg Mandelbrot -0.74 NBody -0.73 Permute -6.45 Queens 8.74 Richards 8.01 Sieve 1.35 Storage 1.02 Towers 2.87 ``` Pretty much only bigloop isn't impacted (quelle surprise). 4 benchmarks get slower, but only Permute noticeably so (List a little bit, but less so). 5 benchmarks get much faster (5-10%), 3-4 get fairly faster (2.5-5%). Overall, this seems meaningfully better overall to me. However! LuLPeg borks with: ``` LuLPeg/lulpeg.lua:979: bad argument ykjit#2 to 's_byte' (number expected, got string) ``` This commit shouldn't be introducing a new bug (per se), but it probably is highlighting an existing bug that's been previously hidden.
Contributor
Author
|
Assuming we agree (and we should at least wait until it merges!) that ykjit/yk#1747 is the fix holding up this PR, I think this PR is now ready for review. |
Contributor
|
What's the nature of (e.g.) Permute's slowdown? Does this change cause a "trace too long" (or similar) for that benchmark? |
Contributor
Author
|
I no longer remember (if I ever knew). But I think I would put this down as a "we don't expect every optimisation to benefit every program" thing. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
These functions contain loops and (1) tend to appear in tight loops (2) are largely or wholly consistent from one iteration to the next. On our benchmark suite, we see the following speed-ups (as percentages, where > 0 is better):
Pretty much only bigloop isn't impacted (quelle surprise). 4 benchmarks get slower, but only Permute noticeably so (List a little bit, but less so). 5 benchmarks get much faster (5-10%), 3-4 get fairly faster (2.5-5%). Overall, this seems meaningfully better overall to me.
However! LuLPeg borks with:
This commit shouldn't be introducing a new bug (per se), but it probably is highlighting an existing bug that's been previously hidden.
@ptersilie Do you think you can have a look to see what the cause of this might be? Nothing obvious has jumped out at me TBH. The "critical" thing seems to be the annotation to luaD_precall -- remove that and the bug goes away. However, that might not tell us very much because the problem is probably somewhere in (or beneath)
precallC(but I might be wrong).