There may be other scenarios in which `:application_controller` is busy, but the one I've seen is draining connections during shutdown with a library like https://hexdocs.pm/plug_cowboy/Plug.Cowboy.Drainer.html. Without connection draining, the endpoint shuts down immediately on receiving a `SIGTERM`, killing any current connections. With connection draining, the listeners on the port are suspended so that no new connections are opened, the existing connections are allowed to drain, and then (and only then) the endpoint proceeds with shutting down.
This means `:application_controller` asks the application containing the endpoint to shut down and waits for it to finish. While it's waiting, it's completely blocked and can't respond to messages. Depending on how long your draining timeout is, this can be a long time. `Prometheus.Metric` uses `Application.started_applications`, which sends a message to `:application_controller` and waits up to `timeout` (5 seconds) for a response. While connections are draining this call will always fail, causing `Prometheus.Metric` to blow up (which also means `Prometheus.PlugExporter` blows up when Prometheus tries to scrape). If it's helpful, I can set up a repo that reproduces this.
Is it possible to avoid calling `Application.started_applications`? Or to catch the failure?
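To illustrate what I mean by catching the failure, here is a minimal sketch. `SafeApps` is a hypothetical module name, not something in the library, and falling back to an empty list is only an assumption about what would be acceptable for the caller. It relies on the call exiting with a `{:timeout, _}` reason when `:application_controller` does not reply in time.

```elixir
defmodule SafeApps do
  # Hypothetical helper, not part of Prometheus.Metric.
  # Wraps Application.started_applications/1 so that a timeout does not
  # crash the caller.
  @spec started_applications(timeout()) :: [{atom(), charlist(), charlist()}]
  def started_applications(timeout \\ 5_000) do
    Application.started_applications(timeout)
  catch
    # :application_controller did not reply within `timeout`, for example
    # because it is blocked waiting for an application to stop.
    # Returning [] is an assumption; the caller may need a different fallback.
    :exit, {:timeout, _} -> []
  end
end
```

Calling `SafeApps.started_applications()` would then return the usual list during normal operation and `[]` while the controller is blocked, instead of exiting.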
I may also be missing something, but why is the on_load being called each time a request hits the Prometheus.PlugExporter?