146 changes: 146 additions & 0 deletions text/0000-target-stages.md
@@ -0,0 +1,146 @@
- Feature Name: `target_stage`
- Start Date: 2025-10-25
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Allow compiler activities (such as `check` and `build`) to communicate through the incremental system, avoiding more recompilations than necessary.

# Motivation
[motivation]: #motivation

The current model for incremental recompilation doesn't share progress between compiler activities, leading to unnecessary rebuilds. Users notice these redundant compilations:
"Changes in workspaces trigger unnecessary rebuilds" was reported as [a big complaint in compiler performance][perf-survey].

Introducing truly incremental compilation for activities that share a common base would help with:

- `target` directory size; artifacts are shared between activities with a common base (e.g. `check` and `build` on the same crate and same flags)
- Compilation times, as e.g. `check` -> `build` allows building to start right from type-checking, and allows `check` to be skipped altogether.

Member

Only if we change check builds to actually store MIR, slowing down check builds in the process.

@celinval, Nov 25, 2025

Do you know / remember what the overhead would be?

Member @bjorn3, Nov 25, 2025

Supposedly up to 10% faster: rust-lang/rust#49433 (comment). The perf results were wiped when moving to another perf server in 2019 though. In addition it may well be a bit larger nowadays given that we also skip calculating some other things like reachable_non_generics and exported_symbols.

Member Author

Actually store MIR? check builds already compute MIR for borrowchecking and CTFE, is it that slow to serialize?

Member

Could also be partly from skipping MIR optimization steps.

Member

Doing a perf run at rust-lang/rust#149457 to find out.

Member

The results are in and the perf regression for removing all should_codegen() checks outside of codegen (aside from the one to determine if object files are necessary for dependencies) is pretty bad: https://perf.rust-lang.org/compare.html?start=0ad2cd3f1357da47c3eb4acc8224a8f10dd87d0f&end=fbd25dde0af118f932dcb48e3fb6ff322b35e2ff&stat=instructions:u shows 70%+ in some cases. The perf hit seems to be caused in large part by collect_and_partition_mono_items now being called to get the list of generic functions to encode in the crate metadata for use by -Zshare-generics. But even just the extra MIR encoding takes on the order of 100ms extra time out of a 1s compile for the image crate.

Member Author

Can't we store the MIR state in a partial state and wait for the codegen state to actually develop it? If collect_and_partition_mono_items is only used at codegen (e.g. emit-bin), then declare it as a possibility with the necessary arguments from the earliest stage that can produce it (THIR) but don't actually run it until it is necessary.

Member

collect_and_partition_mono_items is used to skip encoding MIR that will never be used. Skipping usage of collect_and_partition_mono_items would be effectively equivalent to -Zalways-encode-mir. I don't think THIR can be stored in the crate metadata. It contains various references to the HIR, which we don't store in the crate metadata.

Member

This might be nonsense, but could we store analysis mir on check, then store optimized mir on build?

(Or is unoptimized analysis MIR so much bigger that it'd be slower just from that?)


Alternatively, I wonder about adding another phase, where we could have a "cleaned up mir" which doesn't do things like inlining, but has removed obvious silliness, then full optimized mir with inlining and prepare-for-codegen could be later.


The proposed solution is to add a notion of "compiler target stages", which would be passed to `rustc`, usually via Cargo, with a `--target-stage=<ast|macro-expansion|hir|analyze-hir|...>`
flag. This flag would be given a special role in the compiler and a special place in the incremental system.

The flag denotes which stage the compiler can go up to, and serves as an early indicator of how the build was performed. This moves the burden of stage indication away from the other compiler flags, and gives future compilations a reliable way to evaluate whether they should run, load a dependency graph into memory, or start from scratch altogether.

Disconnecting compiler flags from an early recompilation judgement allows us to almost completely overlap the various
compiler activities of different levels, and allow for more flexibility in activities other than `build` and `check`,
such as `clippy` or other existing and future projects that may need to partially build with the Rust compiler.

---

While not all cases overlap with "lower" compiler-adjacent activities (`check` would be lower than `build`) and those
cases would need a recompilation anyway, this allows us to [perform further optimizations][#further-optimizations] than just holding the two workflows.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation


The `--target-stage` flag is an option that instructs the compiler to only go so far when compiling your code. This
allows the compiler to perform numerous optimizations to avoid redundant recompilations.

Users don't usually have to worry about this flag, as it's handled automatically by Cargo via `cargo check` and `cargo build`.
For an explanation for RFC readers, see [Motivation](#motivation).

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

There are two parts to this feature, one handled in the compiler itself and another handled by the incremental system
and heavily interacting with Cargo.

Contributor

I'm a bit unclear on what Cargo will do with this new compiler feature.

Member Author

The RFC assumes involvement from Cargo for restructuring target, as I couldn't get any target/ documentation from rustc, only from Cargo. So I assume that the target directory layout comes from Cargo, and its contents themselves come from rustc_incremental::*.

Contributor

That says that cargo needs to be involved but doesn't help me understand what that involvement would be.

Member Author

Edited that portion, should I add a dedicated section about this?

Contributor

So is this cargo check opting into an alternative behavior for the check compilation and the rest is handled inside of the incremental cache?

Looking over comments from @bjorn3 at #t-cargo > Build cache and locking design @ 💬, this would mean that so long as the -Cmetadata is the same, other builds would be able to take advantage of this incremental cache. Looking over fingerprinting, -Cmetadata would then change for different compilation modes (check vs build) and different rustc binaries (rustc vs rustc-clippy). So to leverage this, other changes would need to be made but those would likely have ripple effects in cargo.

Member

While same -Cmetadata means the incr comp session ends up in the same location, changing any cli option marked as TRACKED (which includes --emit) will invalidate the incr comp cache right now. Implementing target stages will require actually tracking query dependencies for them or some way to deny accessing them from within queries.

Contributor

So this is about improving the tracking of what inputs invalidates the incremental cache? How does callers being able to specify arbitrary target stages and cargo check specifying a specific one fit into that?

Member Author @blyxyas, Dec 2, 2025

> So is this cargo check opting into an alternative behavior for the check compilation and the rest is handled inside of the incremental cache?

Yes, instead of Cargo handling it with compiler flags, the more complex parts would be handled by this new flag.

> Implementing target stages will require actually tracking query dependencies for them or some way to deny accessing them from within queries.

Indeed, we would need to stop tracking "unnecessary" fields for queries that don't depend on them, and only track fields which they actually depend on.

> How does callers being able to specify arbitrary target stages and cargo check specifying a specific one fit into that?

The target stages do not directly change query behaviour (we could just make queries depend on emit and end at that); they are directives for other compiler sessions to know if the given cache can be used.

A compiler would check if there is any cache to be found, if there is, check if it can be built upon (because its current target stage is greater than the target stage in the cache), if it can, check up to where query dependencies are the same and perform the current mark red-green algorithm.

If the query dependencies are the same all over, just start from the end of the last compilation.
If the query dependencies differ at some point, create a new cache starting from it and designate that last compilation cache as the base for this one, so we don't store duplicates.
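
A minimal Rust sketch of the decision procedure described in this comment, added for illustration only. Every name here (`TargetStage`, `CachedSession`, `Reuse`, `plan`) is hypothetical, and query dependencies are flattened into a list of hashes, which is a large simplification of the real dependency graph. The thread does not say what happens when the requested stage is not greater than the cached one, so that case is returned as its own variant.

```rust
/// Hypothetical, coarse list of stages, ordered from earliest to latest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TargetStage {
    Ast,
    Hir,
    Mir,
    Codegen,
}

/// What a previous compiler session left behind (assumed shape).
pub struct CachedSession {
    pub stage: TargetStage,
    /// Stand-in for the query dependencies recorded by that session.
    pub dep_hashes: Vec<u64>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Reuse {
    /// No cache was found.
    FromScratch,
    /// The requested stage is not greater than the cached one; not covered by the comment.
    NothingToExtend,
    /// Dependencies are identical: continue from the end of the last compilation.
    ResumeAfter(TargetStage),
    /// Dependencies diverge at this index: start a new cache layered on the old one.
    ForkAt(usize),
}

pub fn plan(cache: Option<&CachedSession>, requested: TargetStage, current_deps: &[u64]) -> Reuse {
    let cache = match cache {
        Some(cache) => cache,
        None => return Reuse::FromScratch,
    };
    if requested <= cache.stage {
        return Reuse::NothingToExtend;
    }
    // Length of the shared prefix of recorded and current dependencies.
    let common = cache
        .dep_hashes
        .iter()
        .zip(current_deps)
        .take_while(|(a, b)| a == b)
        .count();
    if common == cache.dep_hashes.len() {
        Reuse::ResumeAfter(cache.stage)
    } else {
        Reuse::ForkAt(common)
    }
}
```

In the `ForkAt` case the existing mark red-green algorithm would still decide what is re-executed; the sketch only models which cache serves as the base.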

These two concepts go in the same RFC because they both modify the incremental system in pretty major ways, so splitting them would be unhelpful and inefficient.

Contributor

> Yes, instead of Cargo handling it with compiler flags, the more complex parts would be handled by this new flag.

The part in emphasis is new to me. What is Cargo no longer supposed to do? Why does the RFC not talk about this?

> These two concepts go in the same RFC because they both modify the incremental system in pretty major ways, so splitting them would be unhelpful and inefficient.

From reading the RFC, it was not clear to me that the proposal is for two different features that happen to both touch incremental compilation.

Member Author

> From reading the RFC, it was not clear

In that case, I'll make the separation clearer. I didn't separate it before because I don't consider the notion of "query specifying dependencies" to be enough "RFC-worthy" to mention with its own paragraph.



The easiest part is handled by the compiler, where "safe points to exit" are designated (such as between parsing
to AST and macro-expansion, or lowering THIR to MIR). Depending on the value passed to the `--target-stage` option,
we exit at one point or another.
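
As a rough illustration of "safe points to exit", the stages could be modelled as an ordered enum that the driver consults after each phase. This is only a sketch: `TargetStage`, its `FromStr` impl and `should_exit_after` are hypothetical names, and only the `ast`, `macro-expansion`, `hir` and `analyze-hir` values appear in this RFC; the later variants are assumptions.

```rust
use std::str::FromStr;

/// Hypothetical target stages, ordered from earliest to latest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TargetStage {
    Ast,
    MacroExpansion,
    Hir,
    AnalyzeHir,
    Mir,     // assumed, not named in the RFC
    Codegen, // assumed, not named in the RFC
    Link,    // assumed, not named in the RFC
}

impl FromStr for TargetStage {
    type Err = String;

    /// Parses the value given to `--target-stage`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "ast" => Ok(Self::Ast),
            "macro-expansion" => Ok(Self::MacroExpansion),
            "hir" => Ok(Self::Hir),
            "analyze-hir" => Ok(Self::AnalyzeHir),
            "mir" => Ok(Self::Mir),
            "codegen" => Ok(Self::Codegen),
            "link" => Ok(Self::Link),
            other => Err(format!("unknown target stage `{other}`")),
        }
    }
}

/// Returns `true` when the safe point reached after finishing `completed`
/// is where a compilation targeting `target` should exit.
pub fn should_exit_after(completed: TargetStage, target: TargetStage) -> bool {
    completed >= target
}
```

Deriving `Ord` keeps "is this stage lower than that one" a plain comparison, which is also what a later session would need when deciding whether an existing cache can be built upon.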

Along with the public portion of the RFC, we also have a new private construct, called "stage dependencies".
Stage dependencies are designated fields that, while handled by `Session`, are not taken into account
when hashing, and thus their values are not always load-bearing.

This means that stage dependencies only affect the hash of the stages that actually depend on them.
Stage dependencies also overlap in the majority of cases; the exceptions are covered in [Non-overlapping cases](#non-overlapping-cases).

For example, let's say that the user calls `cargo check` two times, with different `-Cstrip` values.
`cargo check` (in the local module) will not be affected by this change, as `-Cstrip` is a stage dependency on `linking` and beyond.
Since `cargo check` doesn't change behaviour depending on the linking behaviour, it should **not** be recompiled.
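
The `-Cstrip` example can be sketched as a hash that only covers the options relevant up to the requested stage. This is an illustrative model, not the proposed implementation: `Stage`, `StageDependency` and `options_hash` are hypothetical names, and the three-stage split is an assumption made to keep the example short.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, coarse stages, ordered from earliest to latest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stage {
    Analysis,
    Codegen,
    Linking,
}

/// An option together with the earliest stage whose output it can affect.
pub struct StageDependency<'a> {
    pub name: &'a str,
    pub value: &'a str,
    pub first_affected: Stage,
}

/// Hashes only the options that can affect a stage at or before `target`.
pub fn options_hash(deps: &[StageDependency<'_>], target: Stage) -> u64 {
    let mut hasher = DefaultHasher::new();
    for dep in deps.iter().filter(|dep| dep.first_affected <= target) {
        dep.name.hash(&mut hasher);
        dep.value.hash(&mut hasher);
    }
    hasher.finish()
}
```

With `strip` registered as a dependency of `Stage::Linking`, two sessions that differ only in `-Cstrip` produce the same hash for `Stage::Analysis`, so a `check`-level cache would stay valid, while the hash for `Stage::Linking` changes.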
Comment on lines +63 to +65
Contributor

Is this talking purely in terms of the incremental compilation cache? For cargo, we write new cache entries for changes in profile settings, see https://doc.rust-lang.org/nightly/nightly-rustc/cargo/core/compiler/fingerprint/index.html#fingerprints-and-unithashs

Member Author

> Is this talking purely in terms of the incremental compilation cache?

Yes, it's talking from the perspective of rustc's cache specifically.

Is there a reason why fingerprints aren't based purely on the final flags passed to rustc (including RUSTFLAGS)? A rustc invocation with two different profiles but with manually set-up flags to be identical should be ~identical for cache purposes.

Contributor

> Is there a reason why fingerprints aren't based purely on the final flags passed to rustc (including RUSTFLAGS)?

Note that "fingerprint" could mean

  • fingerprint (no new cache entry on change)
  • -Cextra-filename (new cache entry on change)
  • -Cmetadata
  • or the Unit ID

A simple answer is that we want the hashes for the different fingerprints to be independent of the workspace location for build reproducibility and to ensure cache reuse when it is moved (particularly a problem for CI). I could likely come up with more reasons but that is likely sufficient that this is not something we are likely to reconsider.


Extracting compilation flags from hashing (and thus, from recompilation algorithms) allows us to also perform some post-processing,
like unifying lints or handling the reorder of flags in a command line.

---

Stage dependencies are mixed with target stages and the overlapping nature of the new incremental system, allowing us to re-use bases
and only process/store the parts that actually differ.

## Non-overlapping cases

Not all cases overlap. For example, while `-Cdebug-assertions` is marked as "codegen", it's a commonly used option for
detecting the profile of the compilation in `cfg`s. That is the reason that some stage dependencies can affect
several stages and be enabled/disabled for some stages independently.

In the example of `-Cdebug-assertions`, the attribute parser would check for the presence of `#[cfg(debug_assertions)]`,
before and after macro expansion, and "enable" the stage dependency for all following stages if it's present at any
time during that parsing.
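
One way to picture this is a stage dependency that has a default starting stage but can be switched on earlier once a matching `cfg` is observed. The sketch below is an assumption-heavy model: `Stage`, `StageDependency`, `observe_cfg` and `affects` are hypothetical names, and "all following stages" is read as "stages strictly after the one where the `cfg` was seen".

```rust
/// Hypothetical, coarse stages, ordered from earliest to latest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stage {
    Parse,
    MacroExpansion,
    Analysis,
    Codegen,
}

/// A stage dependency that normally starts mattering at `default_stage`,
/// but can be enabled for earlier stages when a `cfg` on it is observed.
pub struct StageDependency {
    default_stage: Stage,
    /// Earliest stage at which a matching `cfg` was seen, if any.
    observed_at: Option<Stage>,
}

impl StageDependency {
    pub fn new(default_stage: Stage) -> Self {
        Self { default_stage, observed_at: None }
    }

    /// Records that e.g. `#[cfg(debug_assertions)]` was found while running `at`.
    pub fn observe_cfg(&mut self, at: Stage) {
        self.observed_at = Some(match self.observed_at {
            Some(previous) => previous.min(at),
            None => at,
        });
    }

    /// Whether the value of this dependency must be tracked for `stage`.
    pub fn affects(&self, stage: Stage) -> bool {
        stage >= self.default_stage || self.observed_at.map_or(false, |seen| stage > seen)
    }
}
```

For `-Cdebug-assertions`, a dependency created with `Stage::Codegen` would not affect `Stage::Analysis` until `observe_cfg(Stage::Parse)` is called, after which every stage past parsing tracks it.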

Note that stage dependencies are tracked even if `--target-stage` dictates to stop at that step, because future compilations
with higher target-stages can benefit from those stage dependencies being tracked.

# Drawbacks
[drawbacks]: #drawbacks

Why should we *not* do this?
- Potentially buggy release artifacts. If the compiler isn't careful enough, it might reuse an artifact that it shouldn't and produce a buggy binary without any error on the user's part.
- Coordination between teams and sub-projects is lengthy and a serious effort. I expect that both Cargo and the compiler will be heavily involved, at least in some portion.
- A potential for bigger `target` sizes / quicker rise in `target` size, if the user uses multiple "bases", using each one with multiple, different "data dependencies". This is only theoretical and depends on the final algorithm used, as the intent is for `target` sizes to be reduced.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

- The current incremental system is an alternative, but users find that unnecessary rebuilds are a problem.
- This flag would override the currently unstable and untracked `no_analysis` option. Although this flag currently
exists, it doesn't see any real use inside the compiler, nor does it have a tracking issue. It doesn't operate with
Cargo, and isn't designed to be as load-bearing as this RFC proposes.


# Prior art
[prior-art]: #prior-art

Incremental compilation efforts in other compilers and in Rust include:

- [GCC IncrementalCompiler Project][gccincr], currently without a Delivery Date (last update; 2008).
- Clang has the [Clang-Repl project][clang-repl], a C++ interpreter which supports incremental compilation
- [Zig Incremental compilation][zig-incr], doesn't have a functional incremental compilation system for many use-cases.
- [Go (incremental?) compiler][go-page], supposedly incremental but I cannot find documentation on this or an implementation.
- [OCaml incremental compiler][ocaml-incr], the code is available, but not documented to an extensive degree.
- [This Project Goal][project-goal], that talks about a similar concept, but with a more primitive approach.

It seems that Rust currently has the most documented incremental compilation system.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

- New design of the incremental directory.
- How will these target stages be handled when saving a dependency graph?
- Where will the stage dependencies be stored? Should we store them along the dependency graph?
- How does this effort interact with the [Relink, Don't Rebuild][rdr] project goal?

# Future possibilities
[future-possibilities]: #future-possibilities

This effort helps push forward towards a query-driven compiler, where all big "stages" of a compilation are
querified.

It also allows for better flexibility when interacting with 3rd party programs and ad-hoc
`rustc_driver` programs like Clippy, allowing them to share the cache.

<!-- More information will be provided for this section in the future -->

[perf-survey]: https://blog.rust-lang.org/2025/09/10/rust-compiler-performance-survey-2025-results/#incremental-rebuilds
[rdr]: https://rust-lang.github.io/rust-project-goals/2025h2/relink-dont-rebuild.html?highlight=relink%2C%20don#relink-dont-rebuild
[gccincr]: https://gcc.gnu.org/wiki/IncrementalCompiler
[clang-repl]: https://clang.llvm.org/docs/ClangRepl.html
[zig-incr]: https://github.com/ziglang/zig/issues/21165
[go-page]: https://go.dev
[ocaml-incr]: https://ocaml.org/manual/5.4/api/compilerlibref/type_CamlinternalMenhirLib.IncrementalEngine.html
[project-goal]: https://github.com/rust-lang/rust-project-goals/pull/367