Check for use of `InstId`s from the wrong `SemIR::File` #5997
…#5998)

Found by WIP validation for this type of issue ongoing in #5997

I'm not entirely sure how the one test update falls out of this change - but it is from the same test that I originally reduced the problem from, which is reassuring.

The reduced test case I investigated the issue with was this:

`a.carbon`:
```
library "lib";
interface I1(Other:! type) {
  let Result:! type;
}
```
`b.carbon`:
```
import library "lib";
class T1 { }
impl T1 as I1(Self) where .Result = Self { }
```
The SemIR dump diff looked like this:
```
89c89
< %Main.import_ref.b6f = import_ref Main//lib, inst28 [no loc], unloaded
---
> %Main.import_ref.b6f = import_ref Main//lib, inst27 [no loc], unloaded
96c96
< %Main.import_ref.f7b: @i1.%I1.type (%I1.type.e87) = import_ref Main//lib, inst28 [no loc], loaded [symbolic = @i1.%Self (constants.%Self.c47)]
---
> %Main.import_ref.f7b: @i1.%I1.type (%I1.type.e87) = import_ref Main//lib, inst27 [no loc], loaded [symbolic = @i1.%Self (constants.%Self.c47)]
```
Which is a difference, but given the `inst28`/`inst27` don't appear anywhere else than these two lines, it doesn't give a terribly meaningful diff/story about what changed - but perhaps it's sufficient...

Not sure if this test ^ is sufficiently more interesting than the diff update already in this patch. If so, happy to add the above as a new test case. Open to ideas.
Otherwise invalid `InstId`s passed to constants would cause arbitrary/large memory allocation, as the constant storage would be grown to fit the giant invalid index.
… of any context useful for dumping given only an id without the container or SemIR::File it might've come from
The more-informative check failure (reproducing the original bug that motivated this extra checking) looks like:

CHECK failure at ./toolchain/base/value_store.h:295: index < size_: Untagged index was outside of container range. Possibly tagged index 2113929262. Best-effort decomposition: CheckIRId: 30, index: 46. The CheckIRIdTag for this container is: 29

Not sure how this should be phrased - it all feels a bit awkward. Maybe something like "CheckIRId was 30, should be 29. Index is 46"? Maybe print the tagged index as hex?
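To illustrate the "print as hex" idea, here is a small sketch of what the reported value looks like in hexadecimal. `ToHex` is a hypothetical helper written for this example, not something from the toolchain. The value 2113929262 is `0x7E00002E`: the low byte `0x2E` is the reported index 46, and all the other set bits sit in the top byte, which is much easier to see in hex than in decimal.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the toolchain): formats a 32-bit id as
// fixed-width, zero-padded hexadecimal so high and low bits are easy to read.
inline std::string ToHex(uint32_t id) {
  char buf[11];  // "0x" + 8 hex digits + terminating NUL.
  std::snprintf(buf, sizeof(buf), "0x%08X", static_cast<unsigned>(id));
  return buf;
}

// The low 16 bits of a value; for the id in the CHECK message above this is
// the reported index.
inline uint32_t LowBits(uint32_t id) { return id & 0xFFFFu; }
```

For example, `ToHex(2113929262u)` yields `"0x7E00002E"` and `LowBits(2113929262u)` yields `46`.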
…urate Add back in the original `IdToChunkIndices` but as a non-static member (since it's needed to un-swizzle the CheckIRIdTag back out of the Id to get to the raw index)
Ping @zygoloid - further thoughts?
Thanks, a couple of comments but basically LGTM.
(Looks like it's worth making some updates to the first comment on the PR before that becomes the commit message.)
I forget -- have you done any performance testing here? I'm not really expecting much difference but value stores are pretty hot so it seems worth checking.
> Thanks, a couple of comments but basically LGTM.
> (Looks like it's worth making some updates to the first comment on the PR before that becomes the commit message.)

Ah, yep - updated.

> I forget -- have you done any performance testing here? I'm not really expecting much difference but value stores are pretty hot so it seems worth checking.

I have not - any recommended benchmarks/mechanisms for performance testing changes?
yeah, got some performance regressions I'll need to look into :/

FWIW, those are pretty small regressions, and there were up to 5% swings in the parse benchmarks, so there might be some noise in them too. Can look for anything obvious, but it may not be worth stressing too much about this. It's not like a 25% or 50% regression.

Fair - yeah, I now better understand the perf report, how the difference went down for higher numbers of iterations, and the noise. Ran some perf diffs and didn't find anything really standout. So going ahead with this as-is for now.
Use the `CheckIRId` as a unique identifier for the scope of an `InstId` - if an `InstId` is created within the scope of one `CheckIRId` it must not be used in the scope of a different `CheckIRId`.

This is achieved without extra storage, but with false negatives for large inputs.

When an `InstId` is created, the original index of the `Inst` is XORed with a tag derived from the `CheckIRId` to produce the final `InstId`. When the `InstId` is used, the expected tag is XORed with the `InstId` to get back to the original index - if the tags don't match, the resulting index will be corrupted, likely too large - resulting in an out of bounds index CHECK-failure.

(the tag value is derived as such:
`InstId` has a tag combined into it)

In this way, the tag is unlikely to overlap with the index for small test cases - making it possible to separate out the `CheckIRId` from the index in these cases to provide more meaningful debugging/CHECK messages, and more informative `SemIR` textual dumping that can now include the `CheckIRId` along with the `Inst`'s index in the name of an `inst`)

The test churn here is improved printing as tagged `InstId`s can now, with best effort (more likely for small test cases where the `CheckIRId` and the `Inst` index aren't at risk of overlapping from the high and low bits), render the `CheckIRId` as part of the inst's name. Going from `instNN` to `irMM.instNN`.
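The XOR tagging described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the toolchain's code: `ReverseBits`, `TagFor`, `MakeTaggedId`, `UntagId`, and `IsInRange` are hypothetical names, and the tag derivation used here (the `CheckIRId` plus one, bit-reversed so it lands in the high bits of a non-negative 32-bit value) is an assumption chosen only to be consistent with the "high and low bits" remark.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reverses the bit order of a 32-bit value.
constexpr uint32_t ReverseBits(uint32_t v) {
  uint32_t r = 0;
  for (int i = 0; i < 32; ++i) {
    r = (r << 1) | (v & 1u);
    v >>= 1;
  }
  return r;
}

// ASSUMPTION: the tag is (check_ir_id + 1), bit-reversed into the high bits,
// then shifted right by one so the top (sign) bit stays clear.
constexpr uint32_t TagFor(uint32_t check_ir_id) {
  return ReverseBits(check_ir_id + 1) >> 1;
}

// Creating an id: XOR the raw index with the owning IR's tag.
constexpr uint32_t MakeTaggedId(uint32_t index, uint32_t check_ir_id) {
  return index ^ TagFor(check_ir_id);
}

// Using an id: XOR with the tag the container expects. With a matching tag
// this recovers the raw index; with a mismatched tag, high bits stay set.
constexpr uint32_t UntagId(uint32_t tagged, uint32_t expected_check_ir_id) {
  return tagged ^ TagFor(expected_check_ir_id);
}

// The bounds check that turns a mismatched tag into a detectable failure.
constexpr bool IsInRange(uint32_t tagged, uint32_t check_ir_id, uint32_t size) {
  return UntagId(tagged, check_ir_id) < size;
}

static_assert(UntagId(MakeTaggedId(46, 29), 29) == 46,
              "matching tags round-trip to the raw index");
static_assert(!IsInRange(MakeTaggedId(46, 30), 29, 1000),
              "an id from another IR untags to an out-of-range index");
```

No extra storage is needed because the tag shares the id's existing bits. The cost is the false negatives mentioned above: once indices grow large enough to reach the bits the tag occupies, a mismatched id can untag to a value that still happens to be in range.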