Skip to content

Commit 8acac5f

Browse files
philipphofmannsilent1mezzowmak
authored
rfc: SDKs report file I/O on the main thread (#40)
This RFC aims to revisit the decision that Ingest creates file I/O on the main thread performance issues and not the SDKs. Co-authored-by: Adam McKerlie <[email protected]> Co-authored-by: William Mak <[email protected]>
1 parent 64fd295 commit 8acac5f

File tree

2 files changed

+95
-0
lines changed

2 files changed

+95
-0
lines changed

Diff for: README.md

+1
Original file line numberDiff line numberDiff line change
@@ -23,4 +23,5 @@ This repository contains RFCs and DACIs. Lost?
2323
- [0036-auto-instrumentation-ui-thread](text/0036-auto-instrumentation-ui-thread.md): auto-instrumentation UI thread
2424
- [0037-anr-rates](text/0037-anr-rates.md): Calculating accurate ANR rates
2525
- [0038-scrubbing-sensitive-data](text/0038-scrubbing-sensitive-data.md): Scrubbing sensitive data - how to improve
26+
- [0039-sdks-report-file-IO-on-main-thread](text/0039-sdks-report-file-IO-on-main-thread): SDKs report file I/O on the main thread
2627
- [0042-gocd-succeeds-freight-as-our-cd-solution](text/0042-gocd-succeeds-freight-as-our-cd-solution.md): Plan to replace freight with GoCD

Diff for: text/0039-sdks-report-file-IO-on-main-thread.md

+94
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,94 @@
1+
* Start Date: 2022-11-28
2+
* RFC Type: decision
3+
* RFC PR: [#33](https://github.com/getsentry/rfcs/pull/33)
4+
* RFC Status: Decided
5+
* RFC Driver: [Philipp Hofmann](https://github.com/philipphofmann)
6+
* RFC Approver: [Adam McKerlie](https://github.com/silent1mezzo)
7+
8+
# Summary
9+
10+
This RFC aims to clarify if SDKs should report file I/O on the main thread (__FIOMT__) as errors or performance issues.
11+
12+
# Motivation
13+
14+
On June 21, 2022, we decided with [DACI](https://www.notion.so/sentry/Performance-Issue-Creation-POC-e521772ebccb482b83b08f4f8a3db2cb) to create performance issues in Ingest. While implementing the FIOMT for Android, the question arose as to why SDKs don't report FIOMT, as they could add more context to make the issue more actionable.
15+
16+
# Background <a name="background"></a>
17+
18+
Performance issues differ from error issues because they don't cause an exception or stop the code from running and have a perceived performance impact on end-users.
19+
20+
# Option Chosen
21+
22+
The best option for File I/O on the main thread performance issues is __Option 3: Ingest reports FIOMT as performance issues.__
23+
24+
We decided against __Option 1__ as FIOMT is not an error issue; it's a performance issue, as pointed out in [background](#background).
25+
We chose __Option 3__ over __Option 2__ because we can easily detect it via spans ([PR is already complete](https://github.com/getsentry/sentry/pull/41646/)), and the
26+
problem exists across multiple SDKs (iOS, Android, RN, Flutter, .NET, and possibly more). There's also a likely future where FIOMT is detected via profiles, and
27+
running detection in the ingestion pipeline will help combine Performance and Profiling issues.
28+
29+
# Options Considered
30+
31+
* [Option 1: SDKs report FIOMT as errors](#option-1)
32+
* [Option 2: SDKs report FIOMT as performance issues](#option-2)
33+
* [Option 3: Ingest reports FIOMT as performance issues](#option-3)
34+
35+
## Option 1: SDKs report FIOMT as errors <a name="option-1"></a>
36+
37+
SDKs report FIOMT as errors with a stacktrace.
38+
39+
To clarify the threshold and configuration, an experimental feature phase can help to get feedback.
40+
41+
### Pros
42+
43+
1. SDKs can capture a stack trace which will helps actionability and fingerprinting/grouping. It is worth noting that they are not mandatory as spans also point people to the issue. Attaching them to transactions would be too expensive.
44+
2. Users don't have to enable performance monitoring.
45+
3. No running transaction required.
46+
4. Sentry can tie together transaction and error far more easily since both objects exist at the point of time the performance issue will be created.
47+
48+
### Cons
49+
50+
1. Double dipping quotas, sending the transaction and the error created within that transaction.
51+
2. [Cons 1-3 of option 2](#option-2-cons).
52+
3. The opposit of [pros of 2-5 of option 2](#option-2-pros).
53+
54+
## Option 2: SDKs report FIOMT as performance issues <a name="option-2"></a>
55+
56+
SDKs detect and report FIOMT as a performance issue. To achieve this we need to:
57+
58+
1. answer billing questions.
59+
2. make changes in Ingest to accept performance issues.
60+
61+
### Pros <a name="option-2-pros"></a>
62+
63+
1. No double dipping of quotas.
64+
2. Sentry correctly categorizes FIOMT as a performance issue and returns it when searching inside Sentry for performance issues.
65+
3. Sentry presents FIOMT as a performance issue, highlighting root causes and resources to fix the issue.
66+
4. Performance-issue-specific quotas and thresholds apply to FIOMT.
67+
5. Planned UX and workflow changes specific to performance issues also apply to FIOMT issues.
68+
69+
### Cons <a name="option-2-cons"></a>
70+
71+
1. Need for per-SDK rollout.
72+
2. Changing the algorithm or thresholds requires SDK rollout.
73+
3. Detection is mixed between ingest and SDK.
74+
4. [Cons 1-3 of option 3](#option-3-cons).
75+
76+
## Option 3: Ingest reports FIOMT as performance issues <a name="option-3"></a>
77+
78+
This option leaves the detection of FIOMT to Ingest, which can detect FIOMT for any SDK sending a span with `blocked_main_thread=true`, as outlined in [#36](https://github.com/getsentry/rfcs/pull/36).
79+
80+
### Pros
81+
82+
1. No need for per-SDK rollout. It's impossible to scale all performance issues across the SDKs we support.
83+
2. Changing the algorithm or thresholds doesn't require SDK rollout.
84+
3. Consistent location for detector logic.
85+
86+
### Cons <a name="option-3-cons"></a>
87+
88+
1. Users have to enable performance monitoring.
89+
2. A transaction is required to be running while a performance issue is happening.
90+
3. We can't attach a stacktrace to the span that caused the issue, so identifying the root cause is more challenging.
91+
92+
## Unresolved questions
93+
94+
1. For [option 2](#option-2) we need to answer billing questions.

0 commit comments

Comments
 (0)