Metrics server dead after pushing metrics on GKE #168

Closed
yfried opened this issue Oct 28, 2021 · 9 comments

yfried commented Oct 28, 2021

Hi,
I deployed the graphite exporter to k8s (basically copied from the statsd-exporter helm chart) and tried to push metrics using the prometheus python client.

This works fine locally, but on k8s, whenever I push, the log shows the incoming metrics (parsed with labels) and then the metrics server starts returning 500 errors until the liveness probe kills it.

I'm using this script:

    import logging
    import random
    import time

    from prometheus_client import Counter
    from prometheus_client.bridge.graphite import GraphiteBridge

    _metric_sent_messages = Counter('test_metric', 'test sent messages')

    logging.getLogger().setLevel(logging.DEBUG)


    def process_request(t):
        """A dummy function that takes some time."""
        _metric_sent_messages.inc()
        print(f"sleep({t})")
        time.sleep(t)


    if __name__ == '__main__':
        # Create the bridge that pushes metrics to the graphite exporter.
        gb = GraphiteBridge(('prometheus-graphite-exporter.monitoring', 8090), tags=True)

        # Generate some requests.
        for i in range(10):
            process_request(random.random())
        print("push metrics")
        gb.push(prefix="demo.test")

My graphite mapping config:

    mappings:
    - match: "*.*.*"
      name: "$3"
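
With this mapping, the counter above should arrive as demo.test.test_metric_total (the Python client appends _total to counter samples), match *.*.*, and be exported under its third component, test_metric_total.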

And my input arguments:

    - --web.listen-address=:8080
    - --web.telemetry-path=/metrics
    - --graphite.listen-address=:8090
    - --graphite.mapping-config=/etc/prometheus-graphite-exporter/graphite-mapping.yaml
    - --log.level=debug

I'll just add that when the metrics don't match the mapping (*.*.*), they show up fine at the metrics server with no issues.

@matthiasr (Contributor)

deployed the graphite exporter […] and tried to push metrics using the prometheus python client

Unfortunately this does not work, and I am disinclined to support it; see the discussion in #165 (I'm going to add a note to the README). Are you getting the same errors when you open or curl /metrics?

Throwing 500s on the metrics endpoint is not a great thing either, though; this is covered in #79.

matthiasr added a commit that referenced this issue Oct 29, 2021
cf. #165 #168

Using this exporter together with the Graphite bridge (at least the one
in the Python client) does not work. Discourage trying this; point at
alternatives instead.

Signed-off-by: Matthias Rampke <[email protected]>
matthiasr added a commit that referenced this issue Oct 29, 2021
Closes #165 #168.

Using this exporter together with the Graphite bridge (at least the one
in the Python client) does not work. Discourage trying this; point at
alternatives instead.

Signed-off-by: Matthias Rampke <[email protected]>

yfried commented Oct 29, 2021

Assuming I can rewrite the graphite bridge, is there an easy fix?
Like prefixing all the metrics with a specific string?

@matthiasr (Contributor)

You can disable the process collector; this should stop the Python client from emitting the conflicting metrics.
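
For reference, a minimal sketch of that workaround, assuming the client's default REGISTRY and a recent prometheus_client (the GC collector export may vary by version):

    from prometheus_client import (
        GC_COLLECTOR,
        PLATFORM_COLLECTOR,
        PROCESS_COLLECTOR,
        REGISTRY,
    )

    # Unregister the default collectors. Their metric names (process_*,
    # python_*) can collide, after mapping, with the metrics the exporter
    # already serves about itself, which is what breaks /metrics.
    REGISTRY.unregister(PROCESS_COLLECTOR)
    REGISTRY.unregister(PLATFORM_COLLECTOR)
    REGISTRY.unregister(GC_COLLECTOR)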

However, I would strongly recommend instead configuring Prometheus to scrape your application directly, or to use the Grafana Agent to scrape-and-push. The Graphite bridge and exporter are not a good way to build a push-based collection system.

@matthiasr (Contributor)

See also #170 where I try to clarify this in the README.


yfried commented Oct 29, 2021

I want to use this for batch job instrumentation, and keeping the Prometheus client with the Graphite bridge means we can use the same client throughout the entire code base.

@matthiasr (Contributor)

Ah! For that case, the pushgateway is a better choice; I will add that to #170, thank you!
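
For reference, a minimal sketch of the pushgateway approach with the same Python client (the gateway address and job name are placeholders):

    from prometheus_client import CollectorRegistry, Counter, push_to_gateway

    # A dedicated registry so only the job's own metrics are pushed, not
    # the client's default process/platform collectors.
    registry = CollectorRegistry()
    sent_messages = Counter('test_metric', 'test sent messages', registry=registry)

    sent_messages.inc()

    # Pushes (PUT) all metrics in the registry under the job name "demo-batch".
    push_to_gateway('pushgateway.monitoring:9091', job='demo-batch', registry=registry)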


yfried commented Oct 29, 2021

I still hope to make it work. The graphite exporter is so much like the statsd one that I practically get the helm chart out of the box, and it gives me the ability to tag all of the metrics it filters by their source.
It also has a built-in expiry limit.
None of these features are available in the push gateway, as far as I can tell.

@matthiasr (Contributor)

In the pushgateway, both are solved by using an appropriate grouping key. Make sure that it identifies the target, and PUT your metrics. If a metric is no longer submitted for a given group, it will disappear.
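
Sketched with the Python client's push_to_gateway, which issues a PUT (the address and key values are placeholders):

    from prometheus_client import CollectorRegistry, Counter, push_to_gateway

    registry = CollectorRegistry()
    Counter('test_metric', 'test sent messages', registry=registry).inc()

    # The grouping key identifies the source. A later PUT with the same
    # job + grouping key replaces the whole group, so any metric that is
    # no longer submitted disappears.
    push_to_gateway(
        'pushgateway.monitoring:9091',
        job='demo-batch',
        grouping_key={'instance': 'worker-1'},
        registry=registry,
    )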

@matthiasr (Contributor)

I'm going to close this issue, as changing the default metrics would be a breaking change for existing users, and I don't want to do that for a use case that has better alternatives.
