Using thanos query components as federation layer for multiple mimir clusters #8220
Replies: 1 comment
Hello @sarthaktyagi-505! I'm Dosu, your friendly neighborhood bot here to help you tackle bugs, answer questions, and dive into contributions while we wait for a human maintainer. Let's make this journey smoother together! 🚀 I found some relevant information that might help address your questions:
Additionally, there is ongoing work in Thanos to support native histograms, which might be relevant to your use case since you mentioned promxy's lack of support for native histograms. Thanos is actively testing and providing experimental support for native histograms, which could be beneficial for your setup [2][3][4][5]. These insights might help you optimize your current setup or guide your contributions to the project.
Hello community,
Intro
We operate a very large Mimir cluster. Although it currently supports more than a billion active series, we are growing at a rate that may push past what a single Mimir cluster can handle. So we started exploring sharding the data across more than one Mimir cluster. We could expose a separate read endpoint per cluster, but then we would lose the unified view in our dashboards and the ability to look up all the data in one place. To address this, we considered federating the data using Thanos.
Thanos Read components for federation
We did an initial POC and were able to connect two Mimir clusters through a federation layer built on Thanos, with results aggregated correctly across both clusters. But as we started running heavier queries, we saw query latency increase and ran into some limitations of the remote read API. Here are the limitations we found:
The first two we can live with, since the second has already been suggested as a user-configurable flag. But the third point, making the querier smart enough to use the query and query_range APIs wherever appropriate, could make things a lot faster. This is available in the latest version of promxy and yields better query latency, but promxy still doesn't support native histograms.
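To make the third point concrete, here is a minimal Python sketch of the idea: send the same query_range request to each cluster and merge the returned series, instead of pulling raw samples over remote read. It is an illustration only, not how Thanos or promxy implement it. The cluster base URLs are hypothetical, and the merge is only valid for queries that do not aggregate across clusters (a `sum()` over both shards would need to be re-aggregated).

```python
from urllib.parse import urlencode


def build_query_range_url(base_url, query, start, end, step):
    """Build a Prometheus-compatible query_range URL for one cluster."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def merge_matrix_results(results):
    """Merge 'matrix' results (the data.result lists) from several clusters.

    Series with identical label sets are combined; when two clusters return
    a sample for the same timestamp, the first cluster in the list wins.
    """
    merged = {}
    for result in results:
        for series in result:
            key = tuple(sorted(series["metric"].items()))
            samples = merged.setdefault(key, {})
            for ts, value in series["values"]:
                samples.setdefault(ts, value)
    return [
        {"metric": dict(key), "values": [[ts, v] for ts, v in sorted(samples.items())]}
        for key, samples in sorted(merged.items(), key=lambda kv: kv[0])
    ]


# Hypothetical read endpoints for the two Mimir clusters (assumed names).
CLUSTERS = ["http://mimir-a.example/prometheus", "http://mimir-b.example/prometheus"]
urls = [build_query_range_url(c, "up", 0, 60, "15s") for c in CLUSTERS]
```

Each URL would be fetched with any HTTP client (adding the `X-Scope-OrgID` tenant header that Mimir expects), and the `data.result` list from each response passed to `merge_matrix_results`.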
Questions
Thanks in advance; hopefully this also helps others in the community.
-Sarthak