-
-
Notifications
You must be signed in to change notification settings - Fork 14.4k
Do not deduplicate captured args while expanding format_args!
#149926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Some changes occurred in src/tools/clippy cc @rust-lang/clippy Some changes occurred in compiler/rustc_ast_lowering/src/format.rs cc @m-ou-se |
|
r? @spastorino rustbot has assigned @spastorino. Use |
This comment has been minimized.
This comment has been minimized.
|
@rustbot author |
fb523ee to
3ebdaa4
Compare
|
@rustbot ready |
|
Nominating as per #145739 (comment) |
|
It'd be worth adding a test for the drop behavior. |
3ebdaa4 to
af89685
Compare
|
Given that this makes more sense for the language, along with the clean crater results and the intuition that it'd be surprising if anything actually leaned on this, I propose: @rfcbot fcp merge lang |
|
Team member @traviscross has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
|
I don't think we should do this. It will make the generated code for I don't want to end up in a situation where it would make sense for Clippy to suggest something like: Adding @rust-rfcbot concern equivalence |
|
I shared @m-ou-se's concern from above:
@rfcbot concern equivalence |
|
@BurntSushi, @the8472, perhaps you could speak to the suggestion that we can optimize things at other layers, e.g. as @dianne described:
|
|
Also, could you both speak to whether, for these optimization reasons, you'd similarly expect that we would desugar |
This comment has been minimized.
This comment has been minimized.
|
It's more that my expectation is that these ought to be pure and we should optimize accordingly. And I haven't seen anyone making the argument that real-world code actually expects or relies on side-effects happening in print expressions, so why should we go out of our way to make code less efficient to enable something nobody needs?
I guess that'd be fine, yeah. |
|
We already enabled that -- you can have drop glue in consts, and you can print consts. So it's not a question of whether we allow side-effects during printing, that ship has sailed. The question is, do we have nice regular behavior where these side-effects work the way they do everywhere else, or do we add weird exceptions to the semantics of the language? |
|
Arguably macros are not semantics of the language, but arbitrary stuff a macro does to rewrite your code, no? |
|
Insofar as format_args is a builtin macro, its behavior is part of the language. |
|
Finished benchmarking commit (0282858): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 2.1%, secondary -2.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeResults (primary -0.0%, secondary -0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 474.946s -> 472.809s (-0.45%) |
|
First, I think @m-ou-se's concern above is worth considering. It's not uncommon to use a variable multiple times in a format string, and it would be unfortunate if we were forced to include the same reference multiple times, since you would never do that with hand-written assembly. With that said, @dianne's suggestion resolves the concern for me. We can optimize away the common case of taking a reference to the same variable multiple times. What pushes me over the edge toward thinking we should do this is not the const example by itself; it's an odd edge case for sure. What I find more motivating is to think about what should happen in a generalization of the current behavior. For example, if we ever accept expressions with function calls, I would not expect Regarding the wording in the RFC, it's useful as a fallback option, but the only reason I would find it persuasive is if we had discussed this case explicitly and decided to go with the current behavior. We could still change our minds with an FCP, but knowing that would be important context. With all that said, @rfcbot reviewed |
For this we need to know whether the |
I believe so. My understanding is that |
|
We discussed this in the @rust-lang/libs-api meeting. The consensus was that the semantics should ideally be to evaluate each placeholder separately since this presents the most opportunities for forwards-compatibility with arbitrary expressions. However many people were concerned about the performance regression and we would want to re-visit this decision if the optimization turns out to not be possible. |
|
All righty, I'm happy to resolve my concern given libs-api meeting consensus. But should that be blocked on fixing the perf problem here? |
Oh god, please no. We shouldn't do such magical things in format_args lowering. That will make it much harder to maintain and refactor. I don't want to get stuck again in a place where we can't update/improve format_args!() because nobody knows how it fully works and there is too much fragile magic involved. |
|
This still causes the problem that @m-ou-se cited: The performance reasoning is undermined, therefore code authors who actually care about performance must change their code. The ones who are guaranteed to benefit from the optimization are the ones who don't care. The equivalence is still gone for the people who care the most about it.
...Okay, but what is persuasive then? How should authors of RFCs approach the exercise? Are we agreeing on doing something, just making a vague suggestion, or something else? |
|
I think it would be a worrying trend for Rust to add odd non-compositional special cases to the observable runtime semantics (essentially undermining correctness reasoning) in the name of performance. At the very least that should come with overwhelming evidence that the performance benefit is worth such a nontrivial cost.
|
|
If people can't rely on the behavior of whatever RFCs specify the moment T-lang has a mood about it, then we should just slam in the |
This is why I argued earlier that this is, as far as users are concerned, a macro invocation and macros usually are a library things where the implementation can swizzle your code however it likes. So if it is not specified what the macro does... or the RFC even explicitly does mention deduplication then users should not apply "correctness reasoning" based on unrelated concepts such as function argument evaluation. That's incorrect correctness reasoning. If this conflicts with desired future extensions to the format_args syntax (supporting full expression substitution was brought up in the libs-meeting) then we could consider syntactically distinguishing them from things where we do ignore the potential for side-effects. |
|
I don't see how being a macro exempts anything we offer in the default Rust distribution from the expectation of having consistent, compositional semantics without unexpected traps. The technical vehicle (macro or keyword or whatever) does not matter for why such traps are problematic. |
|
Well, I tried to address the compositional part by suggesting we give such future extensions a different syntax, then nobody should expect those being related things that ought to compose, they're just different kinds of substitutions. I do think future extensions that we expect to do have non-negligible effects do need careful design. I'm skeptical about the corner-cases we're looking at right now. Tangentially, this isn't the first time where things would be a lot easier if we could require or check purity. |
The semantics we have today break compositionality. When I write a const twice I expect it to be instantiated twice. It seems people derive this as expected behavior from the RFC. Taking a look, I don't think I agree. The RFC has no example where a variable is captured twice. Even the part of the RFC which people now interpret to specify the deduplication fails to make this explicit. Nowhere does it say that this changes program behavior compared to other situations where a constant is mentioned multiple times. Nothing in the RFC points at this being a deliberate decision, it looks more like an oversight. This is not t-lang "having a mood", this is someone showing up with an interpretation of the RFC that is new at least to me. |
But macros don't guarantee constants being evaluated, they might just be referred by name, incorporated into some generated functions that are never called or completely dropped or perhaps deduplicated with some horrendous stringification-gymnastics or something. A trivial adaption from the linked issue... unsurprisingly it prints nothing. #![feature(decl_macro)]
use std::fmt::{self, Display, Formatter};
use std::cell::Cell;
struct PrintCounter(Cell<usize>);
impl Display for PrintCounter {
fn fmt(&self, f: &mut Formatter) -> fmt::Result {
let counter = 1 + self.0.get();
self.0.set(counter);
write!(f, "{counter}")
}
}
macro foo($a:expr, $b:expr) {
{}
}
const ZERO: PrintCounter = PrintCounter(Cell::new(0));
fn main() {
foo!(ZERO, ZERO);
} |
As I already tried to express before, I really don't get this argument. Why does it matter that a macro could do arbitrary nonsense with its tokens? The same is true for every keyword in the language, for the parser itself! That doesn't mean it's reasonable to do that. We hold the native syntax of the language to a higher standard than "we can do with your tokens whatever we please so we will"; I would expect the same for macros that we ship with the Rust toolchain. |
This change improves the consistency of behavior between parts of Rust. Write something twice, it gets evaluated twice. If you write
|
|
@BurntSushi wrote:
I think that we shouldn't block on the performance optimization going in, but we should wait for confirmation that it is as straightforward to implement as suggested. It sounds like @dianne has a proposed strategy that sounds potentially straightforward to implement, so it'd be helpful to see a PR. And if it's that straightforward, there's also not much harm in blocking, either. |
|
I think I've personally been swayed by the arguments in favor of not deduplicating captured arguments. In particular, even if we end up with a Clippy lint that fires for @rfcbot resolve equivalence |
Resolves #145739
I ran crater with #149291.
While there are still a few seemingly flaky, spurious results, no crates appear to be affected by this breaking change.
The only hit from the lint was
https://github.com/multiversx/mx-sdk-rs/blob/813927c03a7b512a3c6ef9a15690eaf87872cc5c/framework/meta-lib/src/tools/rustc_version_warning.rs#L19-L30,
which performs formatting on consts of type
::semver::Version. These constants contain a nested::semver::Identifier(Version.pre.identifier) that has a custom destructor. However, this case is not impacted by the change, so no breakage is expected.