-
Notifications
You must be signed in to change notification settings - Fork 1.3k
pkg/report: get only the thread local reports #6376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
d565a9a
to
1e097c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this code should also handle reports from the CPU context.
vm/vm.go
Outdated
func getReportChain(reps []*report.Report, threadID string) []*report.Report { | ||
var res []*report.Report | ||
for _, rep := range reps { | ||
if rep.ThreadID == threadID { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't pkg/report already filtered reports by context?
If yes, then why do we need this change here?
If no, then how have pkg/report parsed output with intermixed reports from different context? Are we careful enough during parsing to properly split all intermixed reports from different contexts? Or if not, then isn't our parsing broken?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't pkg/report already filtered reports by context?
No. It extracts the reports one-by-one paying attention to the "current report ThreadID".
If no, then how have pkg/report parsed output with intermixed reports from different context? Are we careful enough during parsing to properly split all intermixed reports from different contexts? Or if not, then isn't our parsing broken?
Current logic is trying to get report assuming it may be starting at any line.
Suppressed/broken reports are ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. It extracts the reports one-by-one paying attention to the "current report ThreadID".
Then isn't it broken for what you want to do?
What will it extract from the following log?
[C1] report0_first_line
[C1] report0_middle_line
[C1] report0_last_line
[C2] report1_first_line
[C2] report1_middle_line
[C1] report2_first_line
[C1] report2_middle_line
[C1] report2_last_line
[C2] report1_last_line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will work correctly because report1 will be parsed from report1_first_line
to report1_last_line
ignoring the lines from another context.
Once done, we'll start looking for the next report from the line [C2] report1_middle_line
.
It is a bit ~quadratish algorithm, but the correctness should be Ok.
Let me add the synthetic test to cover this case. It should be helpful to double check your concern and prevent these problems in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer (see pkg/report/testdata/linux/parse_all/3
) is:
- report0 concatenated with report2.
- report1.
- report2.
Point 1 is a bug, but I don't think it is critical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main purpose of report is being included in every bug email inline. I think we generally want to include only one report to not make emails way too long. Will anything here affect emails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will anything here affect emails?
After this PR and #6332 the plan is to disable panic_on_warn.
We'll start receiving bloated first report after that. That is the problem.
Thanks. It means point 1 is critical and should be fixed.
Please extend the commit/PR description, I have only read what you provided here (I have no other context) and thus I struggle to understand the problem it's supposed to solve. |
d932998
to
003ba11
Compare
After this change we'll see only the reports coming from the same thread.
003ba11
to
712a9bf
Compare
// TestParseAll's focus points are the ability to: | ||
// 1. parse multiple reports | ||
// 2. extract correct ThreadID/CpuID. | ||
func TestParseAll(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we parse all of our existing reports with this? How large will be the diff for existing reports? I think they generally did not include multiple reports, so perhaps not much will change. That may be better than creating 2 parallel test suites and 2 different tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This testing suite may benefit from the refactoring.
Do you feel concerned only about the parse_all
or also about the guilty
, guilty_raw
, symbolize
etc.?
I think all this transformation/extraction testing may be simplified a lot if we'll agree to have 2 files per test:
The code will be: Transform(input) -> serialize what we care about -> compare to the output file.
Current -update
flag will work the same way allowing us to do the large scale tuning.
In this case we don't need any parsing logic. Only the function_under_test + logging the parts we care about.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you feel concerned only about the parse_all or also about the guilty, guilty_raw, symbolize etc.?
guilty accepts different inputs (symbolized and cleaned output, basically reports).
I dunno what is symbolize, and why it's separated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One file per test was intentional, it makes diff reviews possible.
For example, if we split them into 2 files, if somebody changes something that changes output, PR will contain only the output files, but it will be hard to understand if these are legit changes or not, b/c you won't see the input files in the PR.
After this change we'll see only the reports coming from the same thread.