fix: Use correct parsing for stackframes #6908
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

| Coverage Diff | main | #6908 | +/- |
|---|---|---|---|
| Coverage | 85.057% | 85.017% | -0.040% |
| Files | 454 | 455 | +1 |
| Lines | 27739 | 27692 | -47 |
| Branches | 12162 | 12157 | -5 |
| Hits | 23594 | 23543 | -51 |
| Misses | 4100 | 4105 | +5 |
| Partials | 45 | 44 | -1 |

... and 4 files with indirect coverage changes. Continue to review the full report in Codecov by Sentry.
Performance metrics 🚀

Startup times
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 762a701 | 1225.95 ms | 1244.49 ms | 18.54 ms |
| 61414e8 | 1225.49 ms | 1254.28 ms | 28.79 ms |
| 588dd7c | 1235.11 ms | 1241.76 ms | 6.65 ms |
| f5666e7 | 1227.08 ms | 1260.18 ms | 33.10 ms |
| b9aacb6 | 1230.42 ms | 1251.00 ms | 20.58 ms |
| 16f6edc | 1234.02 ms | 1269.67 ms | 35.65 ms |
| b709887 | 1193.52 ms | 1216.74 ms | 23.22 ms |
| 5fc3364 | 1222.36 ms | 1252.33 ms | 29.96 ms |
| 1fe932f | 1231.92 ms | 1253.44 ms | 21.52 ms |
| 80538ca | 1216.70 ms | 1253.92 ms | 37.22 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 762a701 | 23.75 KiB | 1022.78 KiB | 999.03 KiB |
| 61414e8 | 23.75 KiB | 867.69 KiB | 843.94 KiB |
| 588dd7c | 23.75 KiB | 938.33 KiB | 914.58 KiB |
| f5666e7 | 23.75 KiB | 963.18 KiB | 939.43 KiB |
| b9aacb6 | 23.75 KiB | 913.64 KiB | 889.89 KiB |
| 16f6edc | 23.74 KiB | 1022.94 KiB | 999.19 KiB |
| b709887 | 23.75 KiB | 1.01 MiB | 1016.14 KiB |
| 5fc3364 | 23.75 KiB | 1.00 MiB | 1006.00 KiB |
| 1fe932f | 23.75 KiB | 913.63 KiB | 889.88 KiB |
| 80538ca | 23.75 KiB | 989.99 KiB | 966.24 KiB |
Previous results on branch: mxRewriting
Startup times
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| bf3e245 | 1222.17 ms | 1257.09 ms | 34.92 ms |
| 44da6c5 | 1234.22 ms | 1266.02 ms | 31.80 ms |
| c5b3f7a | 1220.18 ms | 1259.44 ms | 39.25 ms |
| 8824c94 | 1226.98 ms | 1262.65 ms | 35.67 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| bf3e245 | 24.14 KiB | 1.03 MiB | 1.00 MiB |
| 44da6c5 | 24.14 KiB | 1.03 MiB | 1.00 MiB |
| c5b3f7a | 24.14 KiB | 1.01 MiB | 1013.45 KiB |
| 8824c94 | 24.14 KiB | 1.01 MiB | 1014.88 KiB |
    // that just contains the most common callstack.
    func sentryMXBacktrace(inAppLogic: SentryInAppLogic?, handled: Bool) -> [SentryThread] {
        callStacks.map { callStack in
            let thread = SentryThread(threadId: 0)
Bug: All threads incorrectly assigned thread ID 0
In sentryMXBacktrace(), all threads are assigned threadId: 0 regardless of their position in the callStacks array. The old Objective-C code used an incrementing counter to assign unique thread IDs (@(i)). When callStackPerThread is true (e.g., for crash diagnostics), multiple call stacks represent different threads and need distinct IDs. Using callStacks.enumerated().map would provide the index needed for unique thread IDs.
h: Yes, that's valid feedback. I think the backend and frontend might get confused when there are multiple threads with threadId 0. If this is intended, we should add a comment explaining why all threads use 0.
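A minimal sketch of the fix the bug comment suggests, using `enumerated()`; the types here are hypothetical stand-ins, since the surrounding SDK code isn't shown in the diff:

```swift
// Hypothetical stand-in for the SDK's thread type, just to illustrate the idea.
struct ExampleThread {
    let threadId: Int
}

let callStacks: [[String]] = [["main", "handleTap"], ["worker", "parse"]]

// enumerated() supplies the index needed for distinct thread IDs, mirroring
// the incrementing counter (@(i)) the old Objective-C code used.
let threads = callStacks.enumerated().map { index, _ in
    ExampleThread(threadId: index)
}
// threads[0].threadId == 0, threads[1].threadId == 1
```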
philipphofmann left a comment
Thank you @noahsmartin, I added a few comments.
    // let firstEvent = try XCTUnwrap(invocations.first).event
    // let secondEvent = try XCTUnwrap(invocations.element(at: 1)).event
    // let thirdEvent = try XCTUnwrap(invocations.element(at: 2)).event
m: There is still some test code commented out. Please either fix the tests or remove them if they aren't relevant anymore.
    // it represents data that is sampled across many threads and aggregated, we do not
    // know which samples came from which thread. Instead we create just one fake thread
    // that just contains the most common callstack.
m: Maybe worth adding info that we plan to change this by adding a flamegraph to app hang events. If we don't have a GH issue for it, I think we could create one and link to the Android implementation and/or the frontend work that makes this possible.
    @@ -1,3 +1,4 @@
    @_implementationOnly import _SentryPrivate
m: It feels a bit strange to me that we now have a dependency on _SentryPrivate here. This file is now a bit mixed: we have both the MetricKitCallStack JSON parsing with the JSONDecoder and also toDebugMeta and public func prepare(event: Event, inAppLogic: SentryInAppLogic?, handled: Bool).
I would prefer keeping this file focused on only parsing JSON data and moving the conversion logic to a different file. It can be an extension, but doing this in a different file feels a bit cleaner to me.
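A rough sketch of the split this comment suggests, with hypothetical file names and illustrative placeholder members; the real properties and signatures are the ones in the PR:

```swift
// MetricKitCallStack.swift — hypothetical layout: this file stays focused on
// decoding the MetricKit JSON payload with JSONDecoder.
import Foundation

struct MetricKitCallStack: Decodable {
    // Decoding-only properties would live here; this field is only illustrative.
    let callStackPerThread: Bool
}

// MetricKitCallStack+Conversion.swift — hypothetical second file that owns the
// conversion to Sentry events and would be the only place importing _SentryPrivate:
// @_implementationOnly import _SentryPrivate
extension MetricKitCallStack {
    // func prepare(event: Event, inAppLogic: SentryInAppLogic?, handled: Bool) { ... }
}
```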
    public let callStackRootFrames: [SentryMXFrame]
    // A Sample is the standard data format for a flamegraph taken from https://github.com/brendangregg/FlameGraph
    // It is less compact than Apple's MetricKit format, but contains the same data and is easier to work with
    struct Sample {
m: I think plenty of the structs and methods here could be fileprivate. We should hide that complexity and only make public what's required.
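For context, a simplified, hypothetical sketch of how a nested MetricKit-style frame tree can be flattened into FlameGraph-style samples; the names and fields here are illustrative, not the PR's actual types:

```swift
// Hypothetical node shape mirroring MetricKit's nested call-stack tree,
// where each frame records how often it was sampled and its sub-frames.
struct FrameNode {
    let address: UInt64
    let sampleCount: Int
    let subFrames: [FrameNode]
}

// Flat, FlameGraph-style sample: one root-to-leaf path plus its weight.
struct FlatSample {
    let frames: [UInt64]
    let weight: Int
}

// Walk the tree and emit one flat sample per leaf path.
func flatten(_ node: FrameNode, path: [UInt64] = []) -> [FlatSample] {
    let newPath = path + [node.address]
    if node.subFrames.isEmpty {
        return [FlatSample(frames: newPath, weight: node.sampleCount)]
    }
    return node.subFrames.flatMap { flatten($0, path: newPath) }
}
```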
Fixes #4040
The problem was that for some events, such as app hangs and CPUDiagnostics, the MetricKit payload contains an entire flamegraph. This change parses the flamegraph into samples and then picks the longest-duration sample as the one we use for the stack trace. It also fixes over-counting of CPUDiagnostics, which previously emitted more than one event per exception; identifying the longest-duration sample fixes that as well.
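A minimal sketch of the selection step described above, assuming a simplified flat-sample type with a weight (duration or sample count); the PR's actual types will differ:

```swift
// Hypothetical flat sample: a call-stack path plus how heavily it was sampled.
struct FlatSample {
    let frames: [String]
    let weight: Int
}

// Pick the single heaviest sample to use for the event's stack trace instead of
// emitting one event per path in the flamegraph.
func heaviestSample(in samples: [FlatSample]) -> FlatSample? {
    samples.max { $0.weight < $1.weight }
}
```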
Eventually we should support showing the full flame graph on the frontend, but with this PR the bug is fixed and we can consider that a future improvement.
I tested this on TestFlight by triggering crashes and verifying they are still reported correctly, as well as by manually replaying hang reports received in production.