Description
Current spans and events don't provide enough context to tell you why, for example, an AppStart or activity Created / Resumed span is taking a long time. Currently, the only reliable way to find out why a span has an elevated duration is to instrument additional sub-spans yourself.
In a very generic sense, any of your application's spans could be delayed for the following reasons:
- You have complex or deeply-nested view hierarchies that bog down the Android UI pipeline [Android App quality article].
  - In reality this would mostly affect draw times, but that's a separate issue altogether.
- There are expensive operations happening in the process / UI thread that are slowing things down. The easiest way to get a measure of this is to monitor CPU performance (through a CPU utilization gauge or metric).
We could benefit from a SpanProcessor that can sample relative, average CPU utilization for the duration of a span and append the following attributes:
- process.cpu.avg_utilization
  - The relative, average CPU utilization for the app process during the span
- process.cpu.elapsed_time_start_millis
  - The elapsed CPU time at the start of this span
- process.cpu.elapsed_time_end_millis
  - The elapsed CPU time at the end of this span
How do we do this?
The calculation is very straightforward:
cpuUtilizationAvg = 100 * (spanElapsedCpuDuration / spanDuration) / numCores
The calculation and overall strategy are taken from a Lyft Engineering article on this subject.
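As a rough illustration of the formula above, here is a minimal Java sketch. The class and method names (CpuUtilization, averagePercent) are hypothetical and not part of any library, and returning 0 for a zero-length span is an assumption made for this sketch rather than something decided in this issue. On Android, the inputs could plausibly come from android.os.Process.getElapsedCpuTime(), SystemClock.elapsedRealtime() and Runtime.getRuntime().availableProcessors(), but that choice is also an assumption.

```java
/** Hypothetical helper illustrating the formula; not part of any library. */
final class CpuUtilization {
    private CpuUtilization() {}

    /**
     * cpuUtilizationAvg = 100 * (spanElapsedCpuDuration / spanDuration) / numCores
     *
     * @param spanElapsedCpuMillis CPU time consumed by the process during the span
     * @param spanDurationMillis   wall-clock duration of the span
     * @param numCores             number of cores available to the process
     */
    static double averagePercent(long spanElapsedCpuMillis, long spanDurationMillis, int numCores) {
        if (numCores <= 0) {
            throw new IllegalArgumentException("numCores must be positive");
        }
        if (spanDurationMillis <= 0) {
            // Assumption for this sketch: a zero-length span reports 0% rather than dividing by zero.
            return 0.0;
        }
        // Cast to double before dividing so the ratio is not truncated by integer division.
        return 100.0 * ((double) spanElapsedCpuMillis / spanDurationMillis) / numCores;
    }
}
```

For example, a span that lasts 1000 ms, during which the process consumed 500 ms of CPU time on a 4-core device, gives averagePercent(500, 1000, 4) = 12.5.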
We would also need to take advantage of the experimental ExtendedSpanProcessor introduced in an upstream PR (open-telemetry/opentelemetry-java#6367) to append the new attributes before a span ends, via the onEnding callback provided in the new SpanProcessor type.
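To make the start/end sampling flow concrete, here is a minimal sketch that deliberately avoids OpenTelemetry types so it stays self-contained. SpanCpuSampler, its methods, and the use of a span ID string as the key are all hypothetical; a real processor would call the equivalent logic from its onStart and onEnding callbacks and set the returned values on the span. The clocks are injected as LongSupplier instances, which is an assumption made here so the logic can be exercised without an Android device.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical helper; stands in for the body of a span processor. */
final class SpanCpuSampler {
    /** CPU and wall-clock readings captured when a span starts. */
    private static final class StartSample {
        final long cpuMillis;
        final long wallMillis;

        StartSample(long cpuMillis, long wallMillis) {
            this.cpuMillis = cpuMillis;
            this.wallMillis = wallMillis;
        }
    }

    private final LongSupplier cpuClock;  // elapsed process CPU time, in ms
    private final LongSupplier wallClock; // monotonic wall-clock time, in ms
    private final int numCores;
    private final Map<String, StartSample> starts = new ConcurrentHashMap<>();

    SpanCpuSampler(LongSupplier cpuClock, LongSupplier wallClock, int numCores) {
        if (numCores <= 0) {
            throw new IllegalArgumentException("numCores must be positive");
        }
        this.cpuClock = cpuClock;
        this.wallClock = wallClock;
        this.numCores = numCores;
    }

    /** Intended to run where a processor's onStart callback would run. */
    void onStart(String spanId) {
        starts.put(spanId, new StartSample(cpuClock.getAsLong(), wallClock.getAsLong()));
    }

    /** Intended to run where onEnding would run; returns the attributes to append. */
    Map<String, Object> onEnding(String spanId) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        StartSample start = starts.remove(spanId);
        if (start == null) {
            return attrs; // the span was never sampled at start, so there is nothing to report
        }
        long cpuEnd = cpuClock.getAsLong();
        long wallDuration = wallClock.getAsLong() - start.wallMillis;
        attrs.put("process.cpu.elapsed_time_start_millis", start.cpuMillis);
        attrs.put("process.cpu.elapsed_time_end_millis", cpuEnd);
        if (wallDuration > 0) {
            // Same formula as in this issue: 100 * (cpu delta / span duration) / numCores.
            double avg = 100.0 * ((double) (cpuEnd - start.cpuMillis) / wallDuration) / numCores;
            attrs.put("process.cpu.avg_utilization", avg);
        }
        return attrs;
    }
}
```

Removing the start sample in onEnding keeps the map from growing when spans end normally; spans that never end would still need some eviction policy, which this sketch leaves out.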