Monitoring & Alerting
Goose emits metrics on job execution timings, success rates, latency, and queue sizes. A metrics plugin receives these metrics and forwards them to its backend. Goose provides a StatsD plugin as a sample metrics backend.
For Prometheus or other metrics backends, a custom plugin can be injected by following the Guide to Custom Metrics Backend.
Metric | Type | Description |
---|---|---|
enqueued_jobs.<my-queue>.count | gauge | Count of jobs in my-queue |
total_enqueued_jobs.count | gauge | Total count of all jobs |
scheduled_jobs.count | gauge | Count of scheduled jobs |
batches.count | gauge | Count of batches |
cron_jobs.count | gauge | Count of cron jobs |
dead_jobs.count | gauge | Count of dead jobs |
jobs.processed | count | Count of processed jobs |
jobs.succeeded | count | Count of successful jobs |
jobs.failed | count | Count of failed jobs |
jobs.recovered | count | Count of orphan jobs which were recovered |
batch.success | count | Count of batches where all jobs succeeded |
batch.dead | count | Count of batches where all jobs died |
batch.partial-success | count | Count of batches with a mix of successful and dead jobs |
execution.latency | timing | Latency (in ms) between enqueue -> start of execution |
scheduled.latency | timing | Latency (in ms) between theoretical schedule time -> start of execution |
cron_scheduled.latency | timing | Latency (in ms) between theoretical schedule time -> start of execution |
retry.latency | timing | Latency (in ms) between theoretical retry time -> start of execution |
job.execution_time | timing | Time taken to execute a job (in ms) |
batch.completion_time | timing | Time taken to complete a batch (in ms) |
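
The three batch counters in the table are mutually exclusive outcomes of a finished batch. The sketch below only illustrates those definitions; it is not Goose's implementation, and the function name `batch-metric` and the `:succeeded`/`:dead` keys are hypothetical. It assumes the batch contains at least one job.

```clojure
(ns batch-metric-sketch)

(defn batch-metric
  "Returns the metric name a finished batch would count towards,
  given how many of its jobs succeeded and how many died.
  Hypothetical helper; assumes at least one job in the batch."
  [{:keys [succeeded dead]}]
  (cond
    (zero? dead)      "batch.success"           ; every job succeeded
    (zero? succeeded) "batch.dead"              ; every job died
    :else             "batch.partial-success")) ; mix of both

(batch-metric {:succeeded 5 :dead 0}) ;; => "batch.success"
(batch-metric {:succeeded 0 :dead 5}) ;; => "batch.dead"
(batch-metric {:succeeded 3 :dead 2}) ;; => "batch.partial-success"
```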
The StatsD plugin accepts the following configuration keys:
Key | Description |
---|---|
:enabled? | Boolean flag for enabling/disabling metrics |
:host | Host of StatsD Aggregator |
:port | Port of StatsD Aggregator |
:prefix | Prefix for all metrics. Can be a generic term like "goose." or specific to a microservice's name |
:sample-rate | Sample rate of metric collection |
:tags | Map of key-value pairs to be attached to every metric |
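
To see how `:prefix`, `:sample-rate`, and `:tags` typically show up on the wire, here is a minimal sketch of a StatsD datagram formatter. It follows the common StatsD line format with the DogStatsD-style `|#` tag suffix; whether clj-statsd emits exactly this string is an assumption, and `statsd-line` and `format-tags` are hypothetical helpers, not part of Goose or clj-statsd.

```clojure
(ns statsd-line-sketch
  (:require [clojure.string :as str]))

(defn format-tags
  "Renders a tag map as `k:v` pairs, sorted for stable output."
  [tags]
  (->> tags
       (map (fn [[k v]]
              (str (name k) ":" (if (keyword? v) (name v) (str v)))))
       sort
       (str/join ",")))

(defn statsd-line
  "Builds one datagram: <prefix><metric>:<value>|<type>[|@<rate>][|#<tags>].
  `type` is \"c\" (counter), \"g\" (gauge) or \"ms\" (timing)."
  [{:keys [prefix sample-rate tags]} metric value type]
  (str prefix metric ":" value "|" type
       ;; The sample rate is only annotated when sampling is in effect.
       (when (and sample-rate (< sample-rate 1.0))
         (str "|@" sample-rate))
       (when (seq tags)
         (str "|#" (format-tags tags)))))

(statsd-line {:prefix "maverick." :sample-rate 0.9 :tags {:top :gun}}
             "jobs.processed" 1 "c")
;; => "maverick.jobs.processed:1|c|@0.9|#top:gun"
```

Because the prefix is prepended verbatim, it should end with a separator such as `.`, as in `"goose."`.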
Note: Goose uses clj-statsd, which uses agents internally. After stopping a worker, `(shutdown-agents)` must be called in order for the program to exit.
```clojure
(ns statsd-metrics
  (:require
    [goose.metrics.statsd :as statsd]
    [goose.worker :as w]))

(let [statsd-opts {:enabled? true
                   :host "localhost"
                   :port 8125
                   :prefix "maverick."
                   :sample-rate 0.9
                   :tags {:top :gun}}
      statsd (statsd/new statsd-opts)
      ;; The `worker-opts` being assoc'd onto is assumed to be defined
      ;; elsewhere (broker, queue, etc.); it is not shown here.
      worker-opts (assoc worker-opts :metrics-plugin statsd)
      worker (w/start worker-opts)]
  ;; When shutting down worker...
  (w/stop worker)
  ;; clj-statsd uses agents internally. Call (shutdown-agents) to exit the program.
  (shutdown-agents))
```
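
In a long-running service, the stop-then-`(shutdown-agents)` sequence is often wired into a JVM shutdown hook. The following is a minimal sketch of that pattern, not something prescribed by Goose; `add-shutdown-hook!` is a hypothetical helper that takes any zero-argument stop function.

```clojure
(ns shutdown-hook-sketch)

(defn add-shutdown-hook!
  "Registers a JVM shutdown hook that calls `stop-fn`, then `(shutdown-agents)`.
  Returns the hook Thread so it can be removed later if needed."
  [stop-fn]
  (let [hook (Thread. ^Runnable (fn []
                                  (stop-fn)            ; e.g. stop the worker
                                  (shutdown-agents)))] ; let the JVM exit
    (.addShutdownHook (Runtime/getRuntime) hook)
    hook))

;; Hypothetical usage, assuming `w` aliases goose.worker and `worker`
;; is the value returned by (w/start worker-opts):
;; (add-shutdown-hook! #(w/stop worker))
```

Passing the stop function as an argument keeps the helper independent of Goose, so the same hook can also close other resources before the agents are shut down.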
Need help? Open an issue or ping us on #goose @Clojurians slack.