-
Notifications
You must be signed in to change notification settings - Fork 2.7k
feat(build-analysis): JSONL-based logging infra #16150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/cargo/util/logger.rs
Outdated
| for json_line in rx { | ||
| let _ = writer.write_all(json_line.as_bytes()); | ||
| let _ = writer.write_all(b"\n"); | ||
| } | ||
| let _ = writer.flush(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to flush after each json_line to avoid problems if this doesn't get cleaned up properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it depends on whether we want to flush every single jsonline. If yes, they we probably also want to remove the BufWriter. Otherwise I believe BufWriter will flush every 8KiB.
I don't think the log atm is critical to always flush though. If cargo crashes they might probably want to look at something else than this log. However, this log can expand to serve more purposes.
src/cargo/util/log_message.rs
Outdated
| use jiff::Timestamp; | ||
| use serde::Serialize; | ||
|
|
||
| pub trait LogMessage: Serialize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nice thing about an enum is we have a single place to look for and handle the schema for processing these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I don't understand. What change you're suggesting/looking for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW this is modeled after machine_message.rs. If we're talking about make a giant enum for all kind of log message schema, I am totally okay with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Let me know if it is desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if there is a "right" answer here. We can try something and see how it works.
For example, if we decide to forward all content from machine_message.rs, then we'll run into issues with composing that with the enum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're fine duplicating them a bit.
- Once we have this logging mechanism, the unstable timing JSON message may go away
- We should be cautious not changing any stable message schema during the experiment of build-analysis
deb8758 to
0ba73bc
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Remove HTML timing output generation in favor of the upcoming JSONL logging system.
The logging infrastructure for `-Zbuild-analysis`. Some highlights: - JSONL logs stored in `~/.cargo/log/`. - Unique log file for each cargo invocation, with a run ID based on workspace hash and timestamp. - Uses background thread for log writes and serialization. Open to use other battle-tested crates in the future.
The build-started message ought to be pretty much the same as the header in timings HTML report, but some data is only available after the entire build or dependency resolution. Let's collect these data first.
Should produce one logfile per call
|
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
Update cargo submodule 11 commits in 344c4567c634a25837e3c3476aac08af84cf9203..6c1b6100343691341b9e76c5acc594e78220f963 2025-10-15 15:01:32 +0000 to 2025-10-28 16:27:52 +0000 - feat(build-analysis): JSONL-based logging infra (rust-lang/cargo#16150) - feat: support array of any types in Cargo config (rust-lang/cargo#16103) - test(git): add more fetch-index backend interop (rust-lang/cargo#16162) - feat(git): support shallow fetch for Git CLI backend (rust-lang/cargo#16156) - Fix mdman to not incorrectly strip `<p>` tags (rust-lang/cargo#16158) - chore(triagebot): enable range-diff and review-changes-since (rust-lang/cargo#16152) - Avoid specifying which version will change behavior (rust-lang/cargo#16153) - Make shell completion variables private. (rust-lang/cargo#16144) - More warning conversions (rust-lang/cargo#16143) - Bump openssl-src to 300.3.5.4+3.5.4 (rust-lang/cargo#16140) - build: remove duplicate dependency, consolidate over unicode-ident (rust-lang/cargo#16137)
What does this PR try to resolve?
This adds A new JSONL-based logging infrastructure for
-Zbuild-analysis.Some highlights:
~/.cargo/log/.with a run ID based on workspace hash and timestamp.
Open to use some other battle-tested crates in the future.
build-startedlog message is added.Note that this is completely different than the original SQLite based design.
I realized this is better as we do writes and zero read during the build. We shouldn't pay the cost if we don't read those metrics.
See the design doc https://hackmd.io/K5-sGEJeR5mLGsJLXqsHrw and #15844
How to test and review this PR?
CARGO_BUILD_ANALYSIS_ENABLED=true cargo -Zbuild-analysis check