Cranelift: implement "precise store traps" in presence of store-tearing hardware. #8221
New file added by this PR (Rust source for the pass; exact path not shown in the excerpt, 109 lines):

```rust
//! Precise-store-traps pass.
//!
//! On some instruction-set architectures, a store that crosses a page
//! boundary such that one of the pages would fault on a write can
//! sometimes still perform part of its memory update on the other
//! page. This becomes relevant, and problematic, when page
//! protections are load-bearing for Wasm VM semantics: see [this
//! issue], where a partially-out-of-bounds store in Wasm is currently
//! defined to perform no side effect, but with a common lowering on
//! several ISAs and on some microarchitectures does actually perform
//! a "torn write".
//!
//! [this issue]: https://github.com/WebAssembly/design/issues/1490
//!
//! This pass performs a transform on CLIF that should avoid "torn
//! partially-faulting stores" by performing a throwaway *load* before
//! every store, of the same size and to the same address. This
//! throwaway load will fault if the store would have faulted due to
//! not-present pages (it still does nothing for faults on read-only
//! pages). Because the load happens before the store in program
//! order, if it faults, any ISA that guarantees precise exceptions
//! (all ISAs that we support) will ensure that the store has no
//! side effects. (Microarchitecturally, once the faulting
//! instruction retires, the later not-yet-retired entries in the
//! store buffer will be flushed.)
//!
//! This is not on by default and remains an "experimental" option
//! while the Wasm spec resolves this issue; for now it serves to
//! allow collecting data on overheads and experimenting on affected
//! machines.

use crate::cursor::{Cursor, FuncCursor};
use crate::ir::types::*;
use crate::ir::*;

fn covering_type_for_value(func: &Function, value: Value) -> Type {
    match func.dfg.value_type(value).bits() {
        8 => I8,
        16 => I16,
        32 => I32,
        64 => I64,
        128 => I8X16,
        _ => unreachable!(),
    }
}

/// Perform the precise-store-traps transform on a function body.
pub fn do_precise_store_traps(func: &mut Function) {
    let mut pos = FuncCursor::new(func);
    while let Some(_block) = pos.next_block() {
        while let Some(inst) = pos.next_inst() {
            match &pos.func.dfg.insts[inst] {
                &InstructionData::StackStore {
                    opcode: _,
                    arg: data,
                    stack_slot,
                    offset,
                } => {
                    let ty = covering_type_for_value(&pos.func, data);
                    let _ = pos.ins().stack_load(ty, stack_slot, offset);
                }
                &InstructionData::DynamicStackStore {
                    opcode: _,
                    arg: data,
                    dynamic_stack_slot,
                } => {
                    let ty = covering_type_for_value(&pos.func, data);
                    let _ = pos.ins().dynamic_stack_load(ty, dynamic_stack_slot);
                }
                &InstructionData::Store {
                    opcode,
                    args,
                    flags,
                    offset,
                } => {
                    let (data, addr) = (args[0], args[1]);
                    let ty = match opcode {
                        Opcode::Store => covering_type_for_value(&pos.func, data),
                        Opcode::Istore8 => I8,
                        Opcode::Istore16 => I16,
                        Opcode::Istore32 => I32,
                        _ => unreachable!(),
                    };
                    let _ = pos.ins().load(ty, flags, addr, offset);
                }
                &InstructionData::StoreNoOffset {
                    opcode: Opcode::AtomicStore,
                    args,
                    flags,
                } => {
                    let (data, addr) = (args[0], args[1]);
                    let ty = covering_type_for_value(&pos.func, data);
                    let _ = pos.ins().atomic_load(ty, flags, addr);
                }
                &InstructionData::AtomicCas { .. } | &InstructionData::AtomicRmw { .. } => {
                    // Nothing: these already perform a read before the write.
                }
                &InstructionData::NullAry {
                    opcode: Opcode::Debugtrap,
                } => {
                    // Marked as `can_store`, but no concerns here.
                }
                inst => {
                    assert!(!inst.opcode().can_store());
                }
            }
        }
    }
}
```

Review comments on this file:

- On the `load` inserted before `InstructionData::Store`: "Can this be skipped if …" (comment truncated in the source).
- On the `AtomicStore` arm: "I think atomics can be exempted here because they are already required to be aligned? At least at the wasm layer that's validated; I'm not sure about the Cranelift layer. This isn't needed for the wasm-level problem, though."
New file added by this PR (CLIF filetest; exact path not shown in the excerpt, 41 lines). Note the `store.i64` in `%f1` is written here as a plain `store` so the type is inferred from the `i8x16` operand, matching the expected `load.i8x16` check:

```
test optimize
set opt_level=speed
set ensure_precise_store_traps=true
target x86_64

function %f0(i64) {
block0(v0: i64):
    v1 = iconst.i64 0
    store.i64 v1, v0
    return
}

; check: load.i64 v0
; check: store v1, v0

function %f1(i64, i8x16) {
block0(v0: i64, v1: i8x16):
    store v1, v0
    return
}

; check: load.i8x16 v0
; check: store v1, v0

function %f2(i64, i64) {
block0(v0: i64, v1: i64):
    istore8 v1, v0
    return
}

; check: load.i8 v0
; check: istore8 v1, v0

function %f3(i64, i64) {
block0(v0: i64, v1: i64):
    atomic_store.i64 v1, v0
    return
}

; check: atomic_load.i64 v0
; check: atomic_store v1, v0
```
Review comment on the stack-store handling: "Stack stores are guaranteed to succeed when stack probing (or some other kind of stack size check) is used."