Implement trailing_semicolon = "Preserve"
#6149
src/expr.rs
Outdated
let shape = if expr_type == ExprType::Statement
    && semicolon_for_expr_extra_hacky_double_counted_spacing(context, expr)
{
I cannot remember what the bug policy here is, but ideally we'd just remove this adjustment to `shape` since it's already accounted for in `format_stmt`. That causes the self-formatting test to fail because some expressions actually fit onto one line.
You can fix the `shape` adjustment if the fix is version gated to `version=Two`. You could update the logic so that we only sub 1 when `version=One` is set.
I fixed it in Version=Two, added a test for one&two, and fixed the cases that changed rustfmt's own formatting (since rustfmt apparently uses Two! 😆)
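A minimal sketch of what such a version gate could look like, assuming a toy `Version` enum and a plain `usize` width in place of rustfmt's real `Shape` and config types. The function name `expr_width` and its parameters are hypothetical and are not taken from the PR.

```rust
/// Toy stand-in for rustfmt's `Version` config value (assumption for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Version {
    One,
    Two,
}

/// `stmt_width` is the width left after the statement formatter has already
/// reserved one column for the trailing `;`.
///
/// Under `Version::One` the legacy second subtraction is kept so existing
/// formatting does not change; under `Version::Two` the width is used as-is.
fn expr_width(
    stmt_width: usize,
    version: Version,
    is_statement: bool,
    takes_semicolon: bool,
) -> Option<usize> {
    if version == Version::One && is_statement && takes_semicolon {
        // Legacy behavior: reserve the semicolon a second time.
        stmt_width.checked_sub(1)
    } else {
        Some(stmt_width)
    }
}
```

With a statement width of 95, this model yields 94 for `Version::One` and 95 for `Version::Two`, which is the one-column difference the thread is about.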
responded to this in #6149 (comment)
ast::StmtKind::Expr(ref ex) | ast::StmtKind::Semi(ref ex) => {
ast::StmtKind::Semi(ref ex) => {
Was this the cause of the double semicolon counting?
See the comment below; I pointed out the reason for the double semicolon counting in more detail.
I split these two branches because we need to preserve semicolons only if we have a `StmtKind::Semi`, and not if we have a `StmtKind::Expr`.
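The split between the two branches can be pictured with a simplified model. This is only a sketch under stated assumptions: `StmtKind` here is a toy two-variant enum rather than `ast::StmtKind`, `jump_suffix` is a hypothetical helper, and the model covers only jump expressions (`return`/`break`/`continue`), ignoring the single-line block exception and the last-expression check that the real code performs.

```rust
/// Simplified model of the option discussed in this PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TrailingSemicolon {
    Always,
    Never,
    Preserve,
}

/// Toy stand-in for `ast::StmtKind`, reduced to the two variants being split.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StmtKind {
    /// Expression statement written without a trailing semicolon.
    Expr,
    /// Expression statement written with a trailing semicolon.
    Semi,
}

/// Suffix to emit after a jump expression (hypothetical helper).
fn jump_suffix(kind: StmtKind, config: TrailingSemicolon) -> &'static str {
    match (config, kind) {
        (TrailingSemicolon::Always, _) => ";",
        (TrailingSemicolon::Never, _) => "",
        // `Preserve` is the only mode where the source form matters,
        // which is why the two statement kinds need separate handling.
        (TrailingSemicolon::Preserve, StmtKind::Semi) => ";",
        (TrailingSemicolon::Preserve, StmtKind::Expr) => "",
    }
}
```

In this model `Always` and `Never` ignore the statement kind entirely, so a shared match arm would be enough for them; only `Preserve` forces the arms apart.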
src/utils.rs
Outdated
/// Previously, we used to have `trailing_semicolon = always` enabled, and due to
/// a bug between `format_stmt` and `format_expr`, we used to subtract the size of
/// `;` *TWICE* from the shape. This means that an expr that would fit onto a line
/// of, e.g. 99 (limit 100) after subtracting one for the semicolon would still be
/// wrapped.
///
/// This function reimplements the old heuristic of double counting the "phantom"
/// semicolon that should have already been accounted for, to not break existing
/// formatting with the `trailing_semicolon = preserve` behavior.
Would you mind showing an example where things are wrapping early, maybe even adding a `version=One` test case if we go the route of fixing things for `version=Two`?
fn main() {
return hellooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo(arg);
}
The return statement line is exactly 100 chars long. When formatting the expression, we first subtract one from the shape here:
Lines 119 to 128 in 7289391
ast::StmtKind::Expr(ref ex) | ast::StmtKind::Semi(ref ex) => {
    let suffix = if semicolon_for_stmt(context, stmt, is_last_expr) {
        ";"
    } else {
        ""
    };
    let shape = shape.sub_width(suffix.len())?;
    format_expr(ex, expr_type, context, shape).map(|s| s + suffix)
}
(See `let shape = shape.sub_width(suffix.len())?;`)
And then once again here:
Lines 67 to 71 in 7289391
let shape = if expr_type == ExprType::Statement && semicolon_for_expr(context, expr) {
    shape.sub_width(1)?
} else {
    shape
};
(in `format_expr`).
In practice, for example, we need to format `return helloo...o(arg)`, which is an expression that is 95 characters long (100 - 4 indentation - 1 semicolon), but after subtracting from the shape twice, we only have a budget of 94 characters, even though the whole expression should fit in the line.
I can add it as a `version=One` test.
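The arithmetic above can be checked with a small model. This is a sketch only: `Budget` is a toy stand-in that tracks nothing but the remaining width, and `expr_budget` and `fits` are hypothetical helpers, not rustfmt functions.

```rust
/// Toy stand-in for rustfmt's `Shape`; only the remaining width is modelled.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Budget {
    width: usize,
}

impl Budget {
    /// Shrinks the budget by `n` columns, failing if that would underflow.
    fn sub_width(self, n: usize) -> Option<Budget> {
        Some(Budget {
            width: self.width.checked_sub(n)?,
        })
    }
}

/// Width left for the expression after removing the indentation and
/// reserving one column for `;` `semicolon_reservations` times.
fn expr_budget(max_width: usize, indent: usize, semicolon_reservations: usize) -> Option<usize> {
    let mut budget = Budget { width: max_width }.sub_width(indent)?;
    for _ in 0..semicolon_reservations {
        budget = budget.sub_width(1)?;
    }
    Some(budget.width)
}

/// Returns true if an expression of `expr_len` columns fits in `budget`.
fn fits(expr_len: usize, budget: Option<usize>) -> bool {
    budget.map_or(false, |width| expr_len <= width)
}
```

With a limit of 100 and an indent of 4, reserving the semicolon once leaves 95 columns, so the 95-character expression fits; reserving it twice leaves 94, so the same expression is wrapped.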
For the record, I discovered this because it happens twice in practice in the rustfmt codebase, so the self-formatting test failed.
Yeah, we use `version=Two` within rustfmt as a way to dogfood unstable formatting (at least that's my assumption).
/// (`return`/`break`/`continue`).
pub enum TrailingSemicolon {
    /// Always rewrite `return;` and `break;` expressions to have a trailing semicolon,
    /// unless the block is a single-line block, e.g. `let PAT = e else { return }`.
This is kind of weird, but preexisting behavior, I believe.
I don't exactly understand what rustfmt does to avoid adding a semicolon for single-line-style blocks.
There are different kinds of blocks. What kinds of blocks are you referencing? Just `let-else` blocks?
To the best of my knowledge all `let-else` blocks are handled when rewriting `ast::Local` nodes. The code removes 1 from the shape to account for the trailing semicolon (line 127). Maybe it shouldn't?
Lines 49 to 188 in 4b1596f
impl Rewrite for ast::Local {
    fn rewrite(&self, context: &RewriteContext<'_>, shape: Shape) -> Option<String> {
        debug!(
            "Local::rewrite {:?} {} {:?}",
            self, shape.width, shape.indent
        );
        skip_out_of_file_lines_range!(context, self.span);
        if contains_skip(&self.attrs) {
            return None;
        }
        let attrs_str = self.attrs.rewrite(context, shape)?;
        let mut result = if attrs_str.is_empty() {
            "let ".to_owned()
        } else {
            combine_strs_with_missing_comments(
                context,
                &attrs_str,
                "let ",
                mk_sp(
                    self.attrs.last().map(|a| a.span.hi()).unwrap(),
                    self.span.lo(),
                ),
                shape,
                false,
            )?
        };
        let let_kw_offset = result.len() - "let ".len();
        // 4 = "let ".len()
        let pat_shape = shape.offset_left(4)?;
        // 1 = ;
        let pat_shape = pat_shape.sub_width(1)?;
        let pat_str = self.pat.rewrite(context, pat_shape)?;
        result.push_str(&pat_str);
        // String that is placed within the assignment pattern and expression.
        let infix = {
            let mut infix = String::with_capacity(32);
            if let Some(ref ty) = self.ty {
                let separator = type_annotation_separator(context.config);
                let ty_shape = if pat_str.contains('\n') {
                    shape.with_max_width(context.config)
                } else {
                    shape
                }
                .offset_left(last_line_width(&result) + separator.len())?
                // 2 = ` =`
                .sub_width(2)?;
                let rewrite = ty.rewrite(context, ty_shape)?;
                infix.push_str(separator);
                infix.push_str(&rewrite);
            }
            if self.kind.init().is_some() {
                infix.push_str(" =");
            }
            infix
        };
        result.push_str(&infix);
        if let Some((init, else_block)) = self.kind.init_else_opt() {
            // 1 = trailing semicolon;
            let nested_shape = shape.sub_width(1)?;
            result = rewrite_assign_rhs(
                context,
                result,
                init,
                &RhsAssignKind::Expr(&init.kind, init.span),
                nested_shape,
            )?;
            if let Some(block) = else_block {
                let else_kw_span = init.span.between(block.span);
                // Strip attributes and comments to check if newline is needed before the else
                // keyword from the initializer part. (#5901)
                let init_str = if context.config.version() == Version::Two {
                    &result[let_kw_offset..]
                } else {
                    result.as_str()
                };
                let force_newline_else = pat_str.contains('\n')
                    || !same_line_else_kw_and_brace(init_str, context, else_kw_span, nested_shape);
                let else_kw = rewrite_else_kw_with_comments(
                    force_newline_else,
                    true,
                    context,
                    else_kw_span,
                    shape,
                );
                result.push_str(&else_kw);
                // At this point we've written `let {pat} = {expr} else' into the buffer, and we
                // want to calculate up front if there's room to write the divergent block on the
                // same line. The available space varies based on indentation so we clamp the width
                // on the smaller of `shape.width` and `single_line_let_else_max_width`.
                let max_width =
                    std::cmp::min(shape.width, context.config.single_line_let_else_max_width());
                // If available_space hits zero we know for sure this will be a multi-lined block
                let assign_str_with_else_kw = if context.config.version() == Version::Two {
                    &result[let_kw_offset..]
                } else {
                    result.as_str()
                };
                let available_space = max_width.saturating_sub(assign_str_with_else_kw.len());
                let allow_single_line = !force_newline_else
                    && available_space > 0
                    && allow_single_line_let_else_block(assign_str_with_else_kw, block);
                let mut rw_else_block =
                    rewrite_let_else_block(block, allow_single_line, context, shape)?;
                let single_line_else = !rw_else_block.contains('\n');
                // +1 for the trailing `;`
                let else_block_exceeds_width = rw_else_block.len() + 1 > available_space;
                if allow_single_line && single_line_else && else_block_exceeds_width {
                    // writing this on one line would exceed the available width
                    // so rewrite the else block over multiple lines.
                    rw_else_block = rewrite_let_else_block(block, false, context, shape)?;
                }
                result.push_str(&rw_else_block);
            };
        }
        result.push(';');
        Some(result)
    }
}
I'm pretty sure all block rewriting eventually calls into `rewrite_block_inner`, which eventually delegates to a few other block rewrite functions, e.g. `rewrite_empty_block` or `rewrite_single_line_block`, and I'm fairly certain none of the block rewriting code handles trailing semicolons.
/// Always rewrite `return;` and `break;` expressions to have a trailing semicolon,
/// unless the block is a single-line block, e.g. `let PAT = e else { return }`.
Always,
/// Always return `return` and `break` expressions to remove the trailing semicolon.
Just want to avoid any confusion with the docs. Since this is the `Never` variant it might be better to not use the word "Always" in the description. One potential rewording:
/// Never add a trailing semicolon. This removes trailing semicolons from `return` and `break` expressions.
Also, this won't remove them from `continue;`?
```rust
fn foo() -> usize {
    return 0;
}

fn bar() -> usize {
    return 0
}
```
These aren't changes we necessarily need to make, but it might also be nice to include examples with `break;` and `continue;` to further demonstrate how this will work. Possibly adding a `let-else` example.
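One possible shape for such examples, offered as a sketch rather than as text from the PR: the functions below are hypothetical and mix jump expressions with and without trailing semicolons. Going by the PR description, `Preserve` would be expected to leave each of them as written; how `Always` and `Never` treat each case is not asserted here.

```rust
/// Returns the first even, non-negative value, if any.
fn first_even_non_negative(values: &[i32]) -> Option<i32> {
    let mut found = None;
    for &v in values {
        if v < 0 {
            // Written without a semicolon.
            continue
        }
        if v % 2 == 0 {
            found = Some(v);
            // Written with a semicolon.
            break;
        }
    }
    found
}

/// Parses a port number, falling back to 0 on invalid input.
fn parse_port(s: &str) -> u16 {
    // Single-line `let-else` block with no semicolon after `return 0`.
    let Ok(port) = s.parse::<u16>() else { return 0 };
    port
}
```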
/// semicolon that should have already been accounted for, to not break existing
/// formatting with the `trailing_semicolon = Preserve` behavior for Version = One.
#[inline]
pub(crate) fn semicolon_for_expr_extra_hacky_double_counted_spacing(
Do we still need this function if we're version gating the fix and only applying the old behavior for `version=One`?
@@ -0,0 +1,49 @@
// rustfmt-trailing_semicolon: Never
The value should be `Preserve` here, or the filename should be different if not.
Overall lgtm, my impression is that there's only a couple outstanding nits so this is definitely a change we can and will make for 2024.

From a sequencing pov, I'd like to make sure we get the style guide text tweaked and (once merged) this PR in a subtree sync so that the style and formatter changes are as together as possible. Doesn't have to literally be the same day but close proximity would be ideal. AFAIK we're still blocked on the next sync (upstream issue in a dependency) so this will be on-hold for a short period.

The off-by-one bug fix is something we should capture on #5577 as well to ensure we're tracking and can announce it as part of 2024 style edition changes, but I'd also like to see how this changes other codebases on the diff check to get a feel for the impact.

Related comment, albeit not salient to merging this PR: the original off-by-one issue looks like another case where one part of the rustfmt codebase was doing something it probably shouldn't have had the responsibility for doing in the first place, and then another part of the codebase was trying to reuse it and needed some internal logic accounting for that extra step. There are too many cases where functions are doing things like adding/removing spaces or other tokens that other functions are then trying to account for or peel back, so let's use this as a reminder to ourselves to try to avoid taking routes that appear more expedient in the moment. If it feels/smells wrong, then it probably is, and some broader refactoring is likely warranted.
@calebcartwright I think you might have missed this #6140 (comment), but we're not blocked on dependencies anymore.
Implement `trailing_semicolon = "Preserve"` to preserve existing semicolons but also don't add new ones even if, e.g., we break a block.

I discovered some bugs in the formatting of `return REALLY_LONG_EXPR;` where the expression itself would fit onto one line, but due to double counting (subtracting the size of the semicolon in both `format_stmt` AND in `format_expr`) we end up breaking up the statement at LIMIT - 1 rather than LIMIT.

See the comment on `semicolon_for_expr_extra_hacky_double_counted_spacing`. Bikeshed that name if you want.