Check for use of `InstId`s from the wrong `SemIR::File` #5997

dwblaikie · 2025-08-28T18:19:33Z

Use the CheckIRId as a unique identifier for the scope of an InstId - if an InstId is created within the scope of one CheckIRId it must not be used in the scope of a different CheckIRId.

This is achieved without extra storage, but with false negatives for large inputs.

When an InstId is created, the original index of the Inst is XORed with a tag derived from the CheckIRId to produce the final InstId. When the InstId is used, the expected tag is XORed with the InstId to recover the original index. If the tags don't match, the resulting index is corrupted and likely too large, which results in an out-of-bounds index CHECK failure.
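A minimal sketch of the XOR round trip described above, assuming 32-bit unsigned ids. The helper names (`ApplyTag`, `RemoveTag`, `IsInBounds`) and the example tag values are hypothetical and are not taken from the toolchain:

```cpp
#include <cstdint>

// Hypothetical helper: combine a raw index with the tag of the owning IR.
constexpr uint32_t ApplyTag(uint32_t raw_index, uint32_t tag) {
  return raw_index ^ tag;
}

// Hypothetical helper: recover the raw index using the tag the container
// expects. If `expected_tag` differs from the tag used at creation, the
// result keeps the differing bits set and is no longer the raw index.
constexpr uint32_t RemoveTag(uint32_t tagged_id, uint32_t expected_tag) {
  return tagged_id ^ expected_tag;
}

// Stand-in for the bounds check that would fire on a corrupted index.
constexpr bool IsInBounds(uint32_t index, uint32_t size) {
  return index < size;
}
```

For example, index 46 tagged with `0xA0000000` and untagged with `0x60000000` yields `0xC000002E`, which is far outside a container of 100 elements, so the bounds check fails.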

The tag value is derived as follows:

1. Take the CheckIRId.
2. Left shift by one bit, padding with 0.
3. Left shift by another bit, padding with 1. This bit signifies that the resulting InstId has a tag combined into it.
4. Reverse the bits, which moves the tag into the high bits.

In this way, the tag is unlikely to overlap with the index for small test cases. That makes it possible to separate the CheckIRId from the index in those cases, which gives more meaningful debugging/CHECK messages and more informative SemIR textual dumps that can now include the CheckIRId along with the Inst's index in the name of an inst.
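The derivation steps can be sketched as follows. This follows the steps literally on a 32-bit unsigned value; the names `ReverseBits32` and `MakeTag` are hypothetical, and the real toolchain code may differ in integer width or in how it treats the sign bit and invalid ids, so values produced here need not match real toolchain output:

```cpp
#include <cstdint>

// Reverses the order of all 32 bits (bit 0 becomes bit 31, and so on).
constexpr uint32_t ReverseBits32(uint32_t value) {
  uint32_t result = 0;
  for (int i = 0; i < 32; ++i) {
    result = (result << 1) | (value & 1u);
    value >>= 1;
  }
  return result;
}

// Hypothetical tag derivation following the listed steps.
constexpr uint32_t MakeTag(uint32_t check_ir_id) {
  uint32_t value = check_ir_id << 1;  // Shift in a 0 bit.
  value = (value << 1) | 1u;          // Shift in a 1 bit as the "tagged" marker.
  return ReverseBits32(value);        // Move everything to the high bits.
}
```

Under these assumptions a small CheckIRId only touches the top few bits of the tag, leaving the low bits free for the index.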

The test churn here comes from improved printing: tagged InstIds can now render the CheckIRId as part of the inst's name on a best-effort basis (more likely for small test cases, where the CheckIRId in the high bits and the Inst index in the low bits aren't at risk of overlapping). Names go from `instNN` to `irMM.instNN`.
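A best-effort name rendering along these lines might look like the sketch below. It assumes the literal tag layout from the description (marker bit, padding bit, then the CheckIRId, all bit-reversed into the high bits) and assumes that a "small" index fits in the low 16 bits. `BestEffortInstName` and `kIndexMask` are hypothetical names, not the toolchain's:

```cpp
#include <cstdint>
#include <string>

// Reverses the order of all 32 bits.
constexpr uint32_t ReverseBits32(uint32_t value) {
  uint32_t result = 0;
  for (int i = 0; i < 32; ++i) {
    result = (result << 1) | (value & 1u);
    value >>= 1;
  }
  return result;
}

// Assumption: indices below 2^16 do not collide with the tag bits.
constexpr uint32_t kIndexMask = 0xFFFFu;

// Splits an id into tag and index when the marker bit is present;
// otherwise prints the id as a plain, untagged index.
std::string BestEffortInstName(uint32_t id) {
  uint32_t index = id & kIndexMask;
  uint32_t reversed_tag = ReverseBits32(id & ~kIndexMask);
  if ((reversed_tag & 1u) == 0) {
    return "inst" + std::to_string(id);
  }
  uint32_t check_ir_id = reversed_tag >> 2;  // Drop marker and padding bits.
  return "ir" + std::to_string(check_ir_id) + ".inst" + std::to_string(index);
}
```

The split is only a guess: once an index grows into the bits used by the tag, the decomposition is no longer reliable, which matches the "best effort" caveat above.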


…#5998)

Found by WIP validation for this type of issue ongoing in #5997

I'm not entirely sure how the one test update falls out of this change - but it is from the same test that I originally reduced the problem from, which is reassuring.

The reduced test case I investigated the issue with was this:

`a.carbon`:
```
library "lib";
interface I1(Other:! type) {
  let Result:! type;
}
```
`b.carbon`:
```
import library "lib";
class T1 { }
impl T1 as I1(Self) where .Result = Self { }
```
The SemIR dump diff looked like this:
```
89c89
< %Main.import_ref.b6f = import_ref Main//lib, inst28 [no loc], unloaded
---
> %Main.import_ref.b6f = import_ref Main//lib, inst27 [no loc], unloaded
96c96
< %Main.import_ref.f7b: @i1.%I1.type (%I1.type.e87) = import_ref Main//lib, inst28 [no loc], loaded [symbolic = @i1.%Self (constants.%Self.c47)]
---
> %Main.import_ref.f7b: @i1.%I1.type (%I1.type.e87) = import_ref Main//lib, inst27 [no loc], loaded [symbolic = @i1.%Self (constants.%Self.c47)]
```
Which is a difference, but given the `inst28`/`inst27` don't appear anywhere else than these two lines, it doesn't give a terribly meaningful diff/story about what changed - but perhaps it's sufficient...

Not sure if this test ^ is sufficiently more interesting than the diff update already in this patch. If so, happy to add the above as a new test case. Open to ideas.


The more-informative check failure (reproducing the original bug that motivated this extra checking) looks like:

CHECK failure at ./toolchain/base/value_store.h:295: index < size_: Untagged index was outside of container range. Possibly tagged index 2113929262. Best-effort decomposition: CheckIRId: 30, index: 46. The CheckIRIdTag for this container is: 29

Not sure how this should be phrased - it all feels a bit awkward. Maybe something like "CheckIRId was 30, should be 29. Index is 46"? Maybe print the tagged index as hex?
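On the hex question, a small sketch shows what the same value looks like in hex. `ToHex` is a hypothetical helper, not part of the toolchain; the only input is the tagged index quoted in the message above:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 32-bit value as fixed-width uppercase hex, e.g. "0x0000002E".
std::string ToHex(uint32_t value) {
  char buffer[11];  // "0x" + 8 hex digits + terminating NUL.
  std::snprintf(buffer, sizeof(buffer), "0x%08X",
                static_cast<unsigned>(value));
  return buffer;
}
```

`ToHex(2113929262)` gives `0x7E00002E`: the low bits `0x2E` are the index 46 from the message, and the set high bits are visibly separate from it, which is harder to see in decimal.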


dwblaikie · 2025-09-24T21:11:45Z

Ping @zygoloid - further thoughts?

zygoloid

Thanks, a couple of comments but basically LGTM.

(Looks like it's worth making some updates to the first comment on the PR before that becomes the commit message.)

I forget -- have you done any performance testing here? I'm not really expecting much difference but value stores are pretty hot so it seems worth checking.


dwblaikie

> Thanks, a couple of comments but basically LGTM.
>
> (Looks like it's worth making some updates to the first comment on the PR before that becomes the commit message.)

Ah, yep - updated.

> I forget -- have you done any performance testing here? I'm not really expecting much difference but value stores are pretty hot so it seems worth checking.

I have not - any recommended benchmarks/mechanisms for performance testing changes?


dwblaikie · 2025-09-25T23:49:26Z

yeah, got some performance regressions I'll need to look into :/

 Benchmark                                        ┃          CPU Time           ┃          Bytes          ┃          Lines          ┃         Tokens
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━
 BM_CompileApiFileDenseDecls<Phase::Lex>/256..... │      ??          p=0.291    │ ~                       │ ~                       │ ~
                                        baseline: │    206.5   µs  ±   1.400%   │    25.51  M ±   1.420%  │   944.5   k ±   1.420%  │     5.333 M ±   1.420%
                                      experiment: │    207     µs  ±   1.032%   │    25.44  M ±   1.022%  │   942     k ±   1.022%  │     5.319 M ±   1.022%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Lex>/1024.... │      ??          p=0.26     │ ~                       │ ~                       │ ~
                                        baseline: │    315.5   µs  ±   2.879%   │   103     M ±   2.798%  │     3.103 M ±   2.798%  │    18.23  M ±   2.798%
                                      experiment: │    319.7   µs  ±   0.702%   │   101.6   M ±   0.707%  │     3.062 M ±   0.707%  │    17.99  M ±   0.707%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Lex>/4096.... │ 👎   1.086%      p=0.0137   │ ~                       │ ~                       │ ~
                                        baseline: │    751.3   µs  ±   1.788%   │   188.9   M ±   1.820%  │     5.361 M ±   1.820%  │    31.76  M ±   1.820%
                                      experiment: │    759.5   µs  ±   2.398%   │   186.8   M ±   2.341%  │     5.304 M ±   2.341%  │    31.42  M ±   2.341%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Lex>/16384... │      ??          p=0.0782   │ ~                       │ ~                       │ ~
                                        baseline: │      2.584 ms  ±   4.567%   │   236.4   M ±   4.370%  │     6.312 M ±   4.370%  │    37.45  M ±   4.370%
                                      experiment: │      2.656 ms  ±   2.913%   │   230     M ±   2.831%  │     6.141 M ±   2.831%  │    36.44  M ±   2.831%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Lex>/65536... │      ??          p=0.181    │ ~                       │ ~                       │ ~
                                        baseline: │     10.4   ms  ±   6.726%   │   242.7   M ±   6.305%  │     6.294 M ±   6.305%  │    37.37  M ±   6.305%
                                      experiment: │     10.99  ms  ±   5.948%   │   229.7   M ±   6.324%  │     5.957 M ±   6.324%  │    35.37  M ±   6.324%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Lex>/262144.. │ 👎   4.676%      p=0.0166   │ ~                       │ ~                       │ ~
                                        baseline: │     46.85  ms  ±  10.449%   │   218.2   M ±   9.467%  │     5.596 M ±   9.467%  │    33.22  M ±   9.467%
                                      experiment: │     49.04  ms  ±  10.336%   │   208.4   M ±   9.368%  │     5.345 M ±   9.368%  │    31.74  M ±   9.368%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Parse>/256... │      ??          p=0.438    │ ~                       │ ~                       │ ~
                                        baseline: │    238.4   µs  ±   5.045%   │    22.09  M ±   4.803%  │   817.9   k ±   4.803%  │     4.618 M ±   4.803%
                                      experiment: │    240     µs  ±   1.518%   │    21.94  M ±   1.495%  │   812.4   k ±   1.495%  │     4.587 M ±   1.495%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Parse>/1024.. │      ??          p=0.0572   │ ~                       │ ~                       │ ~
                                        baseline: │    461.1   µs  ±   1.573%   │    70.46  M ±   1.549%  │     2.123 M ±   1.549%  │    12.47  M ±   1.549%
                                      experiment: │    464.8   µs  ±   0.727%   │    69.89  M ±   0.732%  │     2.106 M ±   0.732%  │    12.37  M ±   0.732%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Parse>/4096.. │ 👎   1.825%      p=0.000249 │ ~                       │ ~                       │ ~
                                        baseline: │      1.302 ms  ±   0.605%   │   109     M ±   0.601%  │     3.093 M ±   0.601%  │    18.32  M ±   0.601%
                                      experiment: │      1.326 ms  ±   1.662%   │   107     M ±   1.635%  │     3.038 M ±   1.635%  │    18     M ±   1.635%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Parse>/16384. │ 👎   1.712%      p=0.0166   │ ~                       │ ~                       │ ~
                                        baseline: │      4.897 ms  ±   2.576%   │   124.8   M ±   2.644%  │     3.331 M ±   2.644%  │    19.76  M ±   2.644%
                                      experiment: │      4.981 ms  ±   2.989%   │   122.7   M ±   2.902%  │     3.275 M ±   2.902%  │    19.43  M ±   2.902%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Parse>/65536. │      ??          p=0.398    │ ~                       │ ~                       │ ~
                                        baseline: │     19.87  ms  ±   3.185%   │   127.1   M ±   3.087%  │     3.295 M ±   3.087%  │    19.56  M ±   3.087%
                                      experiment: │     20.36  ms  ±   4.434%   │   124     M ±   4.640%  │     3.216 M ±   4.640%  │    19.09  M ±   4.640%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Parse>/262144 │ 👎   5.161%      p=0.00388  │ ~                       │ ~                       │ ~
                                        baseline: │     83.73  ms  ±   2.650%   │   122.1   M ±   2.599%  │     3.131 M ±   2.599%  │    18.59  M ±   2.599%
                                      experiment: │     88.05  ms  ±   4.060%   │   116.1   M ±   4.232%  │     2.977 M ±   4.232%  │    17.67  M ±   4.232%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Check>/256... │ 👎   9.785%      p=0.000249 │ ~                       │ ~                       │ ~
                                        baseline: │     17.92  ms  ±   0.850%   │   293.9   k ±   0.857%  │    10.88  k ±   0.857%  │    61.43  k ±   0.857%
                                      experiment: │     19.68  ms  ±   0.994%   │   267.7   k ±   1.003%  │     9.911 k ±   1.003%  │    55.96  k ±   1.003%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Check>/1024.. │ 👎   9.446%      p=0.000557 │ ~                       │ ~                       │ ~
                                        baseline: │     19.28  ms  ±   0.896%   │     1.685 M ±   0.888%  │    50.77  k ±   0.888%  │   298.3   k ±   0.888%
                                      experiment: │     21.1   ms  ±   0.452%   │     1.539 M ±   0.450%  │    46.39  k ±   0.450%  │   272.6   k ±   0.450%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Check>/4096.. │ 👎   7.604%      p=0.000428 │ ~                       │ ~                       │ ~
                                        baseline: │     24.8   ms  ±   0.958%   │     5.722 M ±   0.967%  │   162.4   k ±   0.967%  │   962.2   k ±   0.967%
                                      experiment: │     26.68  ms  ±   0.828%   │     5.317 M ±   0.821%  │   151     k ±   0.821%  │   894.2   k ±   0.821%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Check>/16384. │ 👎   5.282%      p=0.000328 │ ~                       │ ~                       │ ~
                                        baseline: │     47.99  ms  ±   1.808%   │    12.73  M ±   1.776%  │   339.9   k ±   1.776%  │     2.017 M ±   1.776%
                                      experiment: │     50.53  ms  ±   2.543%   │    12.09  M ±   2.480%  │   322.8   k ±   2.480%  │     1.916 M ±   2.480%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Check>/65536. │ 👎   3.614%      p=0.00153  │ ~                       │ ~                       │ ~
                                        baseline: │    146.8   ms  ±   1.072%   │    17.2   M ±   1.061%  │   446     k ±   1.061%  │     2.648 M ±   1.061%
                                      experiment: │    152.1   ms  ±   1.689%   │    16.6   M ±   1.661%  │   430.5   k ±   1.661%  │     2.556 M ±   1.661%
                                                  │                             │                         │                         │
 BM_CompileApiFileDenseDecls<Phase::Check>/262144 │ 👎   3.955%      p=0.00743  │ ~                       │ ~                       │ ~
                                        baseline: │    586.6   ms  ±   1.675%   │    17.42  M ±   1.647%  │   446.9   k ±   1.647%  │     2.653 M ±   1.647%
                                      experiment: │    609.8   ms  ±   1.902%   │    16.76  M ±   1.867%  │   429.9   k ±   1.867%  │     2.552 M ±   1.867%
                                                  │                             │                         │                         │

chandlerc · 2025-09-26T01:03:37Z

FWIW, those are pretty small regressions, and there were up to 5% swings in the parse benchmarks so might be some noise in them too. Can look for anything obvious, but may not be worth stressing too much about this. It's not like a 25% or 50% regression.

dwblaikie · 2025-10-02T22:37:17Z

> FWIW, those are pretty small regressions, and there were up to 5% swings in the parse benchmarks so might be some noise in them too. Can look for anything obvious, but may not be worth stressing too much about this. It's not like a 25% or 50% regression.

Fair - yeah, I now better understand the perf report, including the noise and how the difference went down for higher numbers of iterations. Ran some perf diffs and didn't find anything that really stood out. So going ahead with this as-is for now.

github-actions bot added the toolchain label Aug 28, 2025

dwblaikie mentioned this pull request Aug 28, 2025

Fix a use of an imported InstId used where a local InstId is required #5998

Merged

dwblaikie force-pushed the debug_checks_checkirid_matching_value_store branch from e89bd87 to 9c3465d Compare August 28, 2025 23:14

dwblaikie force-pushed the debug_checks_checkirid_matching_value_store branch from 9c3465d to d0734b3 Compare September 8, 2025 18:51

dwblaikie added 6 commits September 9, 2025 18:44

WIP

574bd45

Add InstStore to Constants so the id can be validated

a6d3926

Otherwise invalid InstIds passed to constants would cause arbitrary/large memory allocation as the constant storage would be grown to fit the giant invalid index.
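The failure mode can be illustrated with a plain C++ stand-in: a store that grows on demand must validate the index against the number of known insts before resizing. `GuardedStore` is a hypothetical class, and it returns `false` where the toolchain would presumably CHECK-fail instead:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical store indexed by inst index that grows lazily.
class GuardedStore {
 public:
  explicit GuardedStore(size_t inst_count) : inst_count_(inst_count) {}

  // Rejects indices that cannot refer to a known inst, so a corrupted
  // (mis-tagged) index never triggers a multi-gigabyte resize.
  bool Set(uint32_t index, int value) {
    if (index >= inst_count_) {
      return false;
    }
    if (index >= values_.size()) {
      values_.resize(static_cast<size_t>(index) + 1);
    }
    values_[index] = value;
    return true;
  }

  size_t size() const { return values_.size(); }

 private:
  size_t inst_count_;
  std::vector<int> values_;
};
```

Without the first check, a mismatched-tag index such as `0xC000002E` would make `resize` attempt to allocate billions of elements.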

Fix

f4956c1

Best effort attempt to decompose an id into unit+index in the absence…

dff470f

… of any context useful for dumping given only an id without the container or SemIR::File it might've come from

Minor updates/fixes from rebase

ea8fb53

dwblaikie force-pushed the debug_checks_checkirid_matching_value_store branch from d0734b3 to ea8fb53 Compare September 9, 2025 19:47

Non-automatic test updates

c95770b

dwblaikie requested a review from zygoloid September 9, 2025 22:15

zygoloid reviewed Sep 11, 2025


dwblaikie added 4 commits September 16, 2025 23:17

Rename IdToChunkIndices to RawIndexToChunkIndices to be more acc…

728a04b

…urate Add back in the original `IdToChunkIndices` but as a non-static member (since it's needed to un-swizzle the CheckIRIdTag back out of the Id to get to the raw index)

Rename CheckIRIdTag to IdTag

8ae0288

CARBON_CHECK that used ConstantValueStores have an associated InstStore

280ed8c

Change best-effort inst name printing to irN.instM

6649f47

dwblaikie requested a review from zygoloid September 17, 2025 20:28

dwblaikie marked this pull request as ready for review September 17, 2025 21:07

github-actions bot requested a review from josh11b September 17, 2025 21:07

dwblaikie removed the request for review from josh11b September 17, 2025 21:08

zygoloid reviewed Sep 24, 2025


dwblaikie and others added 4 commits September 25, 2025 10:22

Simplify IdTag default construction

9ad3946

Co-authored-by: Richard Smith <[email protected]>

Add missing ')' and format

acbb584

Put stub GetIdTag in SpecificStore, removing ADL-based GetIdTag

306fcc6

Add a ConstantValueStore ctor that explicitly makes the store unusable

dd1911f

dwblaikie commented Sep 25, 2025


dwblaikie changed the title ~~WIP check for use of InstIds from the wrong SemIR File~~ Check for use of InstIds from the wrong SemIR::File Sep 25, 2025

dwblaikie requested a review from zygoloid September 25, 2025 20:30

zygoloid approved these changes Sep 25, 2025


Merge branch 'trunk' into debug_checks_checkirid_matching_value_store

5589970

dwblaikie requested a review from a team as a code owner October 2, 2025 22:32

dwblaikie requested review from josh11b and removed request for a team October 2, 2025 22:32

dwblaikie enabled auto-merge October 2, 2025 22:37

dwblaikie added this pull request to the merge queue Oct 2, 2025

Merged via the queue into carbon-language:trunk with commit 12fa65e Oct 2, 2025
8 checks passed

dwblaikie deleted the debug_checks_checkirid_matching_value_store branch October 2, 2025 23:47
