Metrics server dead after pushing metrics on GKE #168

Closed
yfried opened this issue Oct 28, 2021 · 9 comments

yfried commented Oct 28, 2021

Hi,
I deployed the graphite exporter to k8s (basically copied from the statsd-exporter helm chart) and tried to push metrics using the prometheus python client.

This works fine locally, but on k8s, whenever I push, the log shows the incoming metrics (parsed with labels) and then the metrics server starts returning 500 errors until the liveness probe kills it.

I'm using this script:

    import logging
    import random
    import time

    from prometheus_client import Counter
    from prometheus_client.bridge.graphite import GraphiteBridge

    _metric_sent_messages = Counter('test_metric', 'test sent messages')

    logging.getLogger().setLevel(logging.DEBUG)


    def process_request(t):
        """A dummy function that takes some time."""
        _metric_sent_messages.inc()
        print(f"sleep({t})")
        time.sleep(t)


    if __name__ == '__main__':
        # Create the bridge that pushes metrics to the graphite exporter.
        gb = GraphiteBridge(('prometheus-graphite-exporter.monitoring', 8090), tags=True)

        # Generate some requests.
        for i in range(10):
            process_request(random.random())
        print("push metrics")
        gb.push(prefix="demo.test")

My graphite mapping config:

    mappings:
    - match: "*.*.*"
      name: "$3"
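
With this mapping, the counter above should arrive as demo.test.test_metric_total (the Python client appends _total to counter samples), match *.*.*, and be exported under its third component, test_metric_total.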

And my input arguments:

    - --web.listen-address=:8080
    - --web.telemetry-path=/metrics
    - --graphite.listen-address=:8090
    - --graphite.mapping-config=/etc/prometheus-graphite-exporter/graphite-mapping.yaml
    - --log.level=debug

I'll just add that when the metrics don't match the mapping (*.*.*), they show up fine at the metrics server with no issues.

@matthiasr (Contributor)

deployed the graphite exporter […] and tried to push metrics using the prometheus python client

Unfortunately this does not work, and I am disinclined to support it; see the discussion in #165 (I'm going to add a note to the README). Are you getting the same errors when you open or curl /metrics?

Throwing 500s on the metrics endpoint is not a great thing either, though; this is covered in #79.

matthiasr added a commit that referenced this issue Oct 29, 2021
cf. #165 #168

Using this exporter together with the Graphite bridge (at least the one
in the Python client) does not work. Discourage trying this; point at
alternatives instead.

Signed-off-by: Matthias Rampke <[email protected]>
matthiasr added a commit that referenced this issue Oct 29, 2021
Closes #165 #168.

Using this exporter together with the Graphite bridge (at least the one
in the Python client) does not work. Discourage trying this; point at
alternatives instead.

Signed-off-by: Matthias Rampke <[email protected]>

yfried commented Oct 29, 2021

Assuming I can rewrite the graphite bridge, is there an easy fix?
Like prefixing all the metrics with a specific string?

@matthiasr (Contributor)

You can disable the process collector; this should stop the Python client from emitting the conflicting metrics.
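
For reference, a minimal sketch of that workaround, assuming the client's default REGISTRY and a recent prometheus_client (the GC collector export may vary by version):

    from prometheus_client import (
        GC_COLLECTOR,
        PLATFORM_COLLECTOR,
        PROCESS_COLLECTOR,
        REGISTRY,
    )

    # Unregister the default collectors. Their metric names (process_*,
    # python_*) can collide, after mapping, with the metrics the exporter
    # already serves about itself, which is what breaks /metrics.
    REGISTRY.unregister(PROCESS_COLLECTOR)
    REGISTRY.unregister(PLATFORM_COLLECTOR)
    REGISTRY.unregister(GC_COLLECTOR)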

However, I would strongly recommend instead configuring Prometheus to scrape your application directly, or to use the Grafana Agent to scrape-and-push. The Graphite bridge and exporter are not a good way to build a push-based collection system.

@matthiasr (Contributor)

See also #170 where I try to clarify this in the README.


yfried commented Oct 29, 2021

I want to use this for batch job instrumentation, and keeping the Prometheus client with the Graphite bridge means we can use the same client throughout the entire code base.

@matthiasr (Contributor)

Ah! For that case, the pushgateway is a better choice; I will add that to #170, thank you!
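
For reference, a minimal sketch of the pushgateway approach with the same Python client (the gateway address and job name are placeholders):

    from prometheus_client import CollectorRegistry, Counter, push_to_gateway

    # A dedicated registry so only the job's own metrics are pushed, not
    # the client's default process/platform collectors.
    registry = CollectorRegistry()
    sent_messages = Counter('test_metric', 'test sent messages', registry=registry)

    sent_messages.inc()

    # Pushes (PUT) all metrics in the registry under the job name "demo-batch".
    push_to_gateway('pushgateway.monitoring:9091', job='demo-batch', registry=registry)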


yfried commented Oct 29, 2021

I still hope to make it work. The graphite exporter is so much like the statsd one that I practically get the helm chart out of the box, and it gives me the ability to tag all of the metrics it filters by their source.
It also has a built-in expiry limit.
None of these features are available in the push gateway, as far as I can tell.

@matthiasr (Contributor)

In the pushgateway, both are solved by using an appropriate grouping key. Make sure that it identifies the target, and PUT your metrics. If a metric is no longer submitted for a given group, it will disappear.
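
Sketched with the Python client's push_to_gateway, which issues a PUT (the address and key values are placeholders):

    from prometheus_client import CollectorRegistry, Counter, push_to_gateway

    registry = CollectorRegistry()
    Counter('test_metric', 'test sent messages', registry=registry).inc()

    # The grouping key identifies the source. A later PUT with the same
    # job + grouping key replaces the whole group, so any metric that is
    # no longer submitted disappears.
    push_to_gateway(
        'pushgateway.monitoring:9091',
        job='demo-batch',
        grouping_key={'instance': 'worker-1'},
        registry=registry,
    )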

@matthiasr (Contributor)

I'm going to close this issue, as changing the default metrics would be a breaking change for existing users, and I don't want to do that for a use case that has better alternatives.
