-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
-Z call-metadata: add call metadata to LLVM IR #59777
Conversation
this implements eRFC rust-lang#59412
!1 = !{!"dyn", !"Trait", !"method", !"fn", "fn(&Type) -> bool"} becomes !1 = !{!2, !3} !2 = !{!"dyn", !"Trait", !"method"} !3 = !{!"fn", !"fn(&Type) -> bool"} and !1 = !{!"drop", !"Trait1", !"drop", !"Trait2", !"drop", !"Trait3"} becomes !1 = !{!2, !3, !4} !2 = !{!"drop", !"Trait1"} !3 = !{!"drop", !"Trait2"} !4 = !{!"drop", !"Trait3"}
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you unconditionally enable the flag, we can test it out on perfbot.
If we do (b) we could compile
the std facade with -Z call-metadata so end users don't need to recompile it
themselves (which requires a tool like Xargo).
We should try this out, both compilation speed of libstd as well as the dist size. Although I'm not sure what an ok regression would be
query collect_and_partition_mono_items(_: CrateNum) | ||
-> (Arc<DefIdSet>, Arc<Vec<Arc<CodegenUnit<'tcx>>>>) { | ||
query collect_and_partition_mono_items(_: CrateNum) -> ( | ||
Arc<DefIdSet>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has collected quite the list of output arguments. I think it would be better to use a struct with named fields here
src/librustc/ty/instance.rs
Outdated
} | ||
|
||
AssociatedItemContainer::ImplContainer(impl_def_id) => { | ||
if let Some(trait_ref) = tcx.impl_trait_ref(impl_def_id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use the ?
operator here
@@ -54,6 +55,38 @@ impl<'a, 'tcx> Instance<'tcx> { | |||
) | |||
} | |||
|
|||
pub fn trait_ref_and_method( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document this method
@@ -282,21 +285,36 @@ impl<'tcx> InliningMap<'tcx> { | |||
} | |||
} | |||
|
|||
pub fn collect_crate_mono_items<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like how this file has been affected by the change. It causes some significant readability overhead that I feel is irrelevant to what the collector actually does. Implementation wise I think I'd be fine with it if you created a new type that has all the logic and datastructures in a single place (in another file preferrably). If there are just some calls of the call_metadata.foo()
sort flying around this file that seems much better to me.
@@ -122,6 +123,14 @@ pub enum ModuleKind { | |||
Allocator, | |||
} | |||
|
|||
#[derive(Copy, Clone, Debug, PartialEq)] | |||
pub enum CallKind<'tcx> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document this type. It should be clear both here and at the use sites that this is solely for debug info and does not actually affect codegen
src/librustc_codegen_llvm/common.rs
Outdated
dyn_: Option<(ExistentialTraitRef<'tcx>, Symbol)>, | ||
fn_: Option<FnSig<'tcx>>, | ||
llfn: &'ll Value) { | ||
unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the llvm::*
functions here should have safe wrappers on CodegenCx
instead of requiring this code to use unsafe so freely.
it wasn't being included in the case it has zero methods this commit only fixes the non-const case
don't create a tuple if there's a single node (e.g. `!{!1}}`) don't emit an empty tuple (i.e. `{}`)
We should probably also wait on a change like this: #59412 (comment). |
I have added the rustc_metadata encoding / decoding (it's still missing a small Consider these two crates: // crate: a
pub fn foo() {
// ..
} // crate: b
fn main() {
let f: fn() = a::foo();
// ..
} In this example we want to attach "fn" metadata to (*) For some reason I though that generating LLVM IR / bitcode only happened at I'm a bit stumped now. Some ideas: a. Is it possible to override the b. Would it be feasible to add new c. Discard the idea of d. Something else? |
Am I understanding it correctly, that you'll need to know all the call metadata when codegening to llvm? If so, we could delay codegen to the final crate (so basically MIR-only rlibs). |
@oli-obk that's correct.
can we implement that in this PR? Or do you mean to say that this is blocked on MIR-only rlibs? |
Only the "delay codegen to the final crate" solution is blocked on MIR-only rlibs. I don't remember what the blockers were. IIRC it was just that the codegen was slower or worse than with the current setup. I don't know of another way to do this nicely. |
@japaric I think there's a flag you can use on nightly ( |
That flag will only encode the MIR for everything. It doesn't change the way llvm is invoked. |
I have limited bandwidth right now so I'll come back to this PR at a later date. |
This PR implements eRFC #59412 in spirit but there are some
differences between the implementation and the original proposal.
What
Here's a summary of what the
-Z call-metadata
feature does:receives a single
!rust
metadata node when translated into LLVM IR (turnsout you can't attach several
!rust
metadata nodes as I originally thought).The node describes in which of those categories the function falls into (it may
belong to more than one category).
As an example consider this piece of code:
Using the
-Z call-metadata
flag produces the following (unoptimized) IR:Functions that may be called via a function pointer receive metadata of the
form:
!"fn" !"fn() -> i32"
, where the second node is the signature of thefunction. Example:
!363
Trait methods that may be called via dynamic dispatch receive metadata of the
form:
!"dyn" !"Bar<bool>" !"bar"
, where the second node is a concrete traitand the third node is the name of the method. Example:
!377
Note that a single function can receive both
"fn"
and"dyn"
metadata asit's the case of
<Baz as Bar<i32>::bar
in the example above (!370
).Destructors that may be invoked by trait object drop glue receive metadata of
the form:
!"drop" !"Bar<i32>"
, where the second node is a concrete trait.Example:
!180
A single destructor may be invoked by different trait objects (because a type
can implement several traits); in that case the "drop" metadata will contain
one node for each trait, as it's the case of
Baz
's destructor in the aboveexample (
!190
).any of these groups:
Using our previous example, this is the part of the IR that shows the call site
metadata added by the
-Z call-metadata
flag:A function invocation may only belong to a single category. The syntax used for
call site metadata is as follows:
!"fn" "fn() -> i32"
for function pointer calls. The second node is thesignature of the function being invoked. Example:
!363
!"dyn" !"Bar<bool>" !"bar"
for dynamic dispatch. The second node is aconcrete trait and the third node is the name of the method being dispatched.
Examples:
!377
and!430
!"drop"
!"Bar" for drop glue. The second node is a concrete trait. Inthe case of call site "drop" metadata only a single trait will listed. Example:
!211
How
And here's a high level description of how the flag is implemented.
In the MIR monomorphize collector pass we take note of:
Functions converted into function pointers. These occur in two places: in
non-const context (e.g.
let x: fn() = foo
), and in const-context (static X: fn() = foo
). These hit two different code paths in the compiler (whilewalking the MIR of functions and when looking into MIR values of static variables) but in
both cases we keep track of which
Instance
s (rustc
type used to trackconcrete functions) get converted into a function pointer in a
FxHashSet
.Traits that can get converted into trait objects. Again these occur in two
places: non-const context (e.g.
let x: &dyn Foo = &Bar
) and in const-context(
static X: &'static dyn Foo = &Bar
). As in the previous bullet these hit twodifferent code paths in the compiler but in both cases we keep track of all
methods (
Instance
s) that belong to these traits in aFxHashSet
.Drop glue of trait objects. While performing bullet 2 we also note down the
types that implement traits that get converted into trait objects. In this
case we store in the information in a
FxHashMap
that goes from theimplementer destructor (the
Instance
e.g.real_drop_inplace(&mut Bar)
)to the set of concrete traits (
ExistentialTraitRef
s) implemented by theimplementer type (only the ones that get converted into trait objects).
(Now I realize that for the second bullet it would use less memory to keep
a set of
ExistentialTraitRef
s (traits) instead of a set ofInstance
s (traitmethods)).
Then in the LLVM codegen pass:
When declaring functions (lowering them to
define
items in LLVM IR) weattach "define"-level metadata if the function (
Instance
) is in one of thesets / maps we previously collected. The functions get either "drop"
metadata, a mixture of "dyn" and "fn" metadata or no
!rust
metadata.When lowering function invocations to
call
/invoke
instructions we look upthe callee (
Instance
) in the sets / maps we previously collected and ifthere's a match we add either "fn", "dyn" or "drop" metadata.
TODO / questions
This doesn't fully work cross crate. Call site metadata is always complete, even
if the function being invoked comes from a different crate; and in the case of
LTO, invocations in functions defined in dependencies will also get call site
metadata. The "define"-level metadata, however, is incomplete; only functions
defined (or monomorphize) in the top crate get this kind of metadata.
To get complete "define"-level metadata I think we would have to store the call
metadata ("is this function converted into function pointer?", "is this trait
converted into a trait object?", etc.) in the rlib metadata (like we do for
types and traits). However, that would mean either (a) always building and
storing the call metadata, which would impact everyone's compile times and
increase rlib size, or (b) only storing the metadata when the
-Z call-metadata
flag is used, which would worsen end-users' experience as they would need to
recompile
core
/std
to get full information. If we do (b) we could compilethe
std
facade with-Z call-metadata
so end users don't need to recompile itthemselves (which requires a tool like Xargo).
My gut feeling (and hope) is that always building and storing the extra maps /
sets won't have much impact on compile time given that it's the existing MIR
pass plus a few extra trait lookups and map/set insertions per indirect function
call. So perhaps we can always build and store the call metadata?
r? @eddyb or @oli-obk