Right now the data is stored in memory (correct me if I'm wrong), which means we lose all of it after a system outage. For most developers, the most important metrics are the ones captured when the system is having issues (which may lead to a fault or error). If all the data is lost once the system shuts down, we miss a big chance to help users improve their systems. So I wonder whether we can store the data persistently, which would not only help with recovery but also reduce memory load (memory is more expensive than disk).
Replies: 1 comment 1 reply
The metrics are exposed in the Prometheus format. That means the data is stored in the Prometheus instance that polls the system, not in the system being monitored.
If the system fails catastrophically, there is no way to guarantee that a "graceful shutdown" would happen and save logs or metrics to persistent storage. That's why you usually want to store the data in a system other than the one you are monitoring (here, that other system is the Prometheus instance), and why information around system downtime is always "best effort".
At runtime, the system being monitored only stores the current values of the metrics, not the time series, so the memory footprint is as low as possible.
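To make the split concrete, here is a minimal sketch in plain Python. It is not this project's actual code and it does not use a real Prometheus client library: `CurrentValueRegistry`, `scrape`, and the metric names are all hypothetical, made up for illustration. The monitored side keeps one number per metric, and the history only builds up on the side that polls it.

```python
import threading


class CurrentValueRegistry:
    """Hypothetical stand-in for the monitored system's metrics state.

    It keeps only the latest value of each metric, never a history.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def inc(self, name, amount=1.0):
        # Overwrite the single stored value; nothing is appended.
        with self._lock:
            self._values[name] = self._values.get(name, 0.0) + amount

    def render(self):
        # One "name value" line per metric, in the style of the
        # Prometheus text exposition format (labels and HELP/TYPE omitted).
        with self._lock:
            items = sorted(self._values.items())
        return "".join(f"{name} {value}\n" for name, value in items)


def scrape(registry, timestamp, history):
    """Hypothetical poller: this is where the time series accumulates."""
    for line in registry.render().splitlines():
        name, value = line.split(" ")
        history.append((timestamp, name, float(value)))


registry = CurrentValueRegistry()
history = []  # lives on the monitoring side, not in the monitored system

for _ in range(1000):
    registry.inc("requests_total")
scrape(registry, timestamp=1, history=history)

registry.inc("requests_total")
registry.inc("errors_total")
scrape(registry, timestamp=2, history=history)

print(registry.render(), end="")
print(len(history), "samples held by the poller")
```

After a thousand increments the registry still holds a single entry for `requests_total`; only `history`, which belongs to the poller, grows with each scrape. If the monitored process dies, whatever was scraped up to that point survives on the other side.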