Query Frontend fails readiness probe #8392
-
I have been playing with different helm chart installs (RD K3s cluster) of thanos and had moved away from bitnami chart to one maintained by stevehipwell. Everything was working, but after one reinstall, the query-frontend service/pod got stuck in a readiness loop. Essentially these same two logs over and over again: {"caller":"intrumentation.go:67","level":"warn","msg":"changing probe status","reason":"roundtripping to downstream URL http://thanos-query.cluster-monitoring.svc.cluster.local:10902/-/ready: context deadline exceeded","status":"not-ready","ts":"2025-07-24T01:42:25.286080439Z"} I can't see anything wrong with any of the downstream systems from the logs and the thanos-query service is reporting being ready and healthy. What is strange is that if I do the same helm install on a different cluster (EKS vs local k3s), everything works as expected. I have tried resetting the cluster and rancher desktop, but with no success, and as mentioned it used to work just fine. Any suggestions on how to debug a readiness problem that might point me in the right direction? |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment
-
Figured it out. Seems my coredns config was messed up which meant that full addresses that included the cluster.local portion never resolved. Not sure what changed on my system but was able to correct it. |
Beta Was this translation helpful? Give feedback.
Figured it out. Seems my coredns config was messed up which meant that full addresses that included the cluster.local portion never resolved. Not sure what changed on my system but was able to correct it.