Optimize shadow stack instruction sequences #928

Closed. Robbepop wants to merge 51 commits.

@Robbepop Robbepop commented Feb 6, 2024

Closes #920.

TODOs

  • Fuse i32.add_imm with global.set 0
  • Fuse global.get 0 with i32.add_imm
  • Fuse fused global.get 0 + i32.add_imm with i32.local_tee and global.set 0
  • Support fusion of i32.add with a function local constant rhs register.
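To make the fusion items above concrete, here is a minimal peephole sketch over a toy instruction set. It covers only the `global.get 0` + `i32.add_imm` fusion and the follow-up fusion with `local.tee` and `global.set 0`. Everything in it is an assumption for illustration: the `Instr` enum, the fused variant names, and the `fuse` function are hypothetical and are not Wasmi's actual bytecode or translator API. It also assumes the subtraction in a typical compiler-generated shadow stack prologue has already been turned into an add with a negated immediate.

```rust
/// Toy instruction set for illustration only (not Wasmi's actual bytecode).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Instr {
    GlobalGet(u32),
    GlobalSet(u32),
    LocalTee(u32),
    I32AddImm(i32),
    /// Hypothetical fused form of `global.get 0` + `i32.add_imm`.
    Global0AddImm(i32),
    /// Hypothetical fused form of the above + `local.tee` + `global.set 0`.
    Global0AddImmTeeSet { local: u32, imm: i32 },
}

/// Peephole pass: after appending each instruction, check whether the tail
/// of the output matches a known shadow stack pattern and fuse it.
pub fn fuse(input: &[Instr]) -> Vec<Instr> {
    let mut out: Vec<Instr> = Vec::with_capacity(input.len());
    for instr in input {
        out.push(instr.clone());
        // Each entry is (number of tail instructions to replace, fused instruction).
        let fused = match out.as_slice() {
            [.., Instr::GlobalGet(0), Instr::I32AddImm(imm)] => {
                Some((2, Instr::Global0AddImm(*imm)))
            }
            [.., Instr::Global0AddImm(imm), Instr::LocalTee(local), Instr::GlobalSet(0)] => {
                Some((3, Instr::Global0AddImmTeeSet { local: *local, imm: *imm }))
            }
            _ => None,
        };
        if let Some((len, instr)) = fused {
            out.truncate(out.len() - len);
            out.push(instr);
        }
    }
    out
}
```

In this sketch, a prologue such as `global.get 0`, `i32.add_imm -16`, `local.tee 1`, `global.set 0` collapses into a single fused instruction, while sequences on any global other than 0 are left untouched. Note that the second pattern depends on the intermediate fused instruction produced by the first one.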

codecov-commenter commented Feb 6, 2024

Codecov Report

Attention: Patch coverage is 77.72727% with 49 lines in your changes missing coverage. Please review.

Project coverage is 80.49%. Comparing base (978a58f) to head (27264e3).
Report is 102 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rates/wasmi/src/engine/translator/instr_encoder.rs | 72.41% | 16 Missing ⚠️ |
| crates/wasmi/src/engine/executor/instrs/global.rs | 27.77% | 13 Missing ⚠️ |
| crates/wasmi/src/engine/translator/visit.rs | 55.00% | 9 Missing ⚠️ |
| crates/wasmi/src/engine/executor/instrs.rs | 33.33% | 4 Missing ⚠️ |
| ...ates/wasmi/src/engine/translator/visit_register.rs | 0.00% | 3 Missing ⚠️ |
| crates/wasmi/src/engine/translator/mod.rs | 90.00% | 2 Missing ⚠️ |
| ...rates/wasmi/src/engine/translator/relink_result.rs | 33.33% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #928      +/-   ##
==========================================
+ Coverage   80.48%   80.49%   +0.01%     
==========================================
  Files         270      270              
  Lines       25079    25273     +194     
==========================================
+ Hits        20184    20343     +159     
- Misses       4895     4930      +35     


@Robbepop Robbepop mentioned this pull request Feb 8, 2024
This fixes a bug with the edge-case value rhs=i16::MIN: its negation, -i16::MIN, does not fit inside an i16 value, so the optimization failed for rhs=i16::MIN. By always compiling isub as iadd with a negated rhs value we avoid this situation gracefully.
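The asymmetry of the i16 range can be checked directly. The sketch below is an assumption about how such a rewrite could look, not the code from this PR: `sub_rhs_as_add_imm16` is a hypothetical helper that negates the rhs in the full i32 domain first and only then tries to narrow it to an i16 immediate, so a value that does not fit simply yields `None` and a caller could fall back to a non-immediate encoding.

```rust
/// Hypothetical helper (not part of Wasmi): rewrite `x - rhs` as `x + (-rhs)`
/// and try to encode the negated rhs as a 16-bit immediate.
pub fn sub_rhs_as_add_imm16(rhs: i32) -> Option<i16> {
    // Negate in i32 with wrapping semantics, matching Wasm's i32 arithmetic,
    // then check whether the result fits into an i16.
    i16::try_from(rhs.wrapping_neg()).ok()
}

/// `x - c` and `x + (-c)` agree for every pair of i32 values under
/// wrapping (two's complement) arithmetic.
pub fn sub_equals_add_of_negation(x: i32, c: i32) -> bool {
    x.wrapping_sub(c) == x.wrapping_add(c.wrapping_neg())
}
```

The i16 range is -32768 to 32767, so `i16::MIN.checked_neg()` is `None`, and a rhs of -32768 has no 16-bit add immediate in this sketch. In the other direction, a rhs of 32768 does not fit an i16 itself, but its negation does.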
Robbepop commented Mar 14, 2024

A comment on why this has not yet been merged despite working and passing all CI tests: it introduces a lot of complexity into the translation pipeline compared to the gains we can see and verify. The runtime gains are mostly single-digit percentages and restricted to benchmarks that actually make use of the shadow stack. The global_bump benchmark improves by roughly 30%, but it is also very artificial. Memory consumption improves as well, but again only by single-digit percentages for practical Wasm binaries.

All in all, the question is whether the gains are worth the added complexity.

Robbepop commented

The miri CI job fails because of this issue: rust-lang/miri#3404

Robbepop commented

We might want to block this until the Wasmi bytecode translator has a multiple look-back translation feature. That would allow us to get rid of the intermediate optimized instructions.

Robbepop commented Oct 4, 2024

I am closing this now: so much has changed in the code base that a full rewrite would make more sense. Beyond that, I am very uncertain that this optimization is a clear improvement to Wasmi as a whole.

@Robbepop Robbepop closed this Oct 4, 2024
Successfully merging this pull request may close these issues.

Optimization: Special handling for common shadow stack instruction sequences