What Autometrics provides
Currently, Autometrics approaches application performance and reliability monitoring at a “high level”: we observe the behaviour of individual functions to infer the performance of a deployed application. As raised in a conversation with @XAMPPRocky (feel free to correct me if anything’s off here!), this approach works really well for creating alerts or observing the symptoms of performance issues (e.g. “my /route endpoint is slow because the writeToFs function it calls is too slow”), but it still leaves developers having to dig into individual functions to diagnose the root cause (e.g. “why is writeToFs slow, and can I really do something about it?”).
Adding (and then exposing) profiling information could make Autometrics a bit more proactive at pointing out possible performance issues before they become the cause of an alert firing, or the CPU/memory bottleneck that forces you to add another pod or vCPU.
A concrete idea for having profiling information
Collecting profiler data
One possible way to do this would be an extra compilation/runtime flag for Autometrics that starts a profiler in the “pre” phase of the function and stops it when exiting the function, saving the generated report. The report generation would preferably happen on another thread to minimize overhead, and the profile would include a span ID and a trace ID where possible.
Visualizing profiler data
Once those reports are created, we need a way to get them out of the running app and into a location where you can visualize them. I did a quick search for a way to collect and view profiling data remotely (not traces 1), and I didn’t find much besides Grafana Pyroscope. So, assuming this is the standard tool to use, Autometrics libraries would take an extra argument at initialization to activate a “Pyroscope backend” with an ingestion URL, and all profiling information would automatically go there whenever the Autometrics “profiling” runtime flag is active.
Open questions
There are still a lot of unknowns around this problem before we decide to start working on profiling support, and it would be immensely useful if you could share your opinions on the matter, either here or by pinging me on Discord if you’re more comfortable there:
Is this profiling information something that you currently use?
Is it something that you would want to use more if available?
If you have used (or currently use) profiling information, how did you visualize it? When I wanted to do memory profiling I used special builds with bytehound and heaptrack, but that’s not practical to embed in production deployments. I assume that for CPU profiling, using pprof to collect data and then flamegraph or Perfetto to visualize it is common? Has anyone tried Grafana Pyroscope for that purpose? I’m curious about everyone’s experience there.
Footnotes
For visualizing traces, there are tools like Jaeger or Grafana Tempo, and most tracing libraries handle both creating spans and sending them to the collector of choice, so it’s a bit out of Autometrics’ scope for now. ↩