thanos-query-frontend: Enable Thanos Query Stats Propagation & cache response headers #10
base: monzo-master-v0.35.0-rc-0.65
Thanos Query gives us some really nice detailed internal stats - but thanos-query-frontend annoyingly blats them by trying to parse the response as a Prometheus response, which has different fields and a different structure.
In addition, thanos-query-frontend doesn't even pass the &stats=all parameter through if you try to request it.
This change fixes this by propagating the stats request parameter to the downstream, and decoding/encoding the Thanos Query stats structure properly.
As an extra, we also set a response header if we get a cache hit so that upstreams can use this.
Signed-off-by: milesbryant <[email protected]>
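For illustration only, the parameter propagation described above might look roughly like the sketch below. It is not the code in this PR: `downstreamQuery` is a hypothetical helper, and the only detail taken from the description is that the `stats` parameter is copied to the downstream request when the caller supplies it.

```go
package main

import (
	"fmt"
	"net/url"
)

// downstreamQuery is a hypothetical helper: it builds the query string sent
// downstream from the incoming raw query, keeping "query" and forwarding
// "stats" only when the caller asked for it.
func downstreamQuery(rawIncoming string) (string, error) {
	in, err := url.ParseQuery(rawIncoming)
	if err != nil {
		return "", err
	}
	out := url.Values{}
	out.Set("query", in.Get("query"))
	if s := in.Get("stats"); s != "" {
		out.Set("stats", s) // propagate instead of silently dropping it
	}
	return out.Encode(), nil // Encode sorts parameters by key
}

func main() {
	q, err := downstreamQuery("query=up&stats=all&unrelated=1")
	if err != nil {
		panic(err)
	}
	fmt.Println(q) // query=up&stats=all
}
```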
@@ -145,8 +145,6 @@ func registerQueryFrontend(app *extkingpin.App) {

	cmd.Flag("query-frontend.log-queries-longer-than", "Log queries that are slower than the specified duration. "+
		"Set to 0 to disable. Set to < 0 to enable on all queries.").Default("0").DurationVar(&cfg.CortexHandlerConfig.LogQueriesLongerThan)
	cmd.Flag("query-frontend.query-stats-enabled", "True to enable query statistics tracking. "+
I've removed this now that we return query stats directly in the response
@@ -84,12 +84,22 @@ message Matrix {
}

message PrometheusResponseStats {
  PrometheusResponseSamplesStats samples = 1 [(gogoproto.jsontag) = "samples"];
}
message Timings {
This is the difference between Thanos stats and Prometheus stats responses
for _, existing := range headers {
	if existing.Name == newHeader.Name {
		// if headers match, overwrite with the new header
		existing = newHeader
Should this be *existing = *newHeader?
Also, below we sort the responses by response time and then take the first non-null explanation rather than the last one. Is consistency between these important?
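To make the first question concrete, here is a small self-contained sketch of the range-variable behaviour. `Header` is a hypothetical stand-in for `PrometheusResponseHeader`, and the three function names are made up for the example. Assigning to the loop variable only rebinds a local copy of the pointer, so the slice is left untouched; writing through the pointer or assigning by index both take effect.

```go
package main

import "fmt"

// Header is a hypothetical stand-in for PrometheusResponseHeader.
type Header struct {
	Name   string
	Values []string
}

// overwriteBroken mirrors the loop under review: it only rebinds the loop variable.
func overwriteBroken(headers []*Header, newHeader *Header) {
	for _, existing := range headers {
		if existing.Name == newHeader.Name {
			existing = newHeader // local copy of the pointer changes; the slice does not
		}
	}
}

// overwriteInPlace copies the new contents into the struct the slice points at.
func overwriteInPlace(headers []*Header, newHeader *Header) {
	for _, existing := range headers {
		if existing.Name == newHeader.Name {
			*existing = *newHeader
		}
	}
}

// overwriteByIndex replaces the slice element itself.
func overwriteByIndex(headers []*Header, newHeader *Header) {
	for i, existing := range headers {
		if existing.Name == newHeader.Name {
			headers[i] = newHeader
		}
	}
}

func main() {
	fresh := func() []*Header {
		return []*Header{{Name: "X-Example", Values: []string{"old"}}}
	}
	repl := &Header{Name: "X-Example", Values: []string{"new"}}

	a, b, c := fresh(), fresh(), fresh()
	overwriteBroken(a, repl)
	overwriteInPlace(b, repl)
	overwriteByIndex(c, repl)
	fmt.Println(a[0].Values[0], b[0].Values[0], c[0].Values[0]) // old new new
}
```

Note that `*existing = *newHeader` is a shallow copy, so the `Values` slice would be shared with `newHeader` afterwards; whether that matters depends on how the headers are used later.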
func (resp *PrometheusInstantQueryResponse) AddHeader(key, value string) {
	resp.Headers = append(resp.Headers, &PrometheusResponseHeader{Name: key, Values: []string{value}})
}

I think we'll end up with duplicate headers if the same header key is added twice - should this check for the header already existing and either replace it or append to its values?
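One possible shape for the "append to its values" option, as a sketch rather than a proposed patch: `Header` and `Response` are hypothetical stand-ins for the generated types, and `X-Cache-Hit` is a made-up header name. Names are compared with `strings.EqualFold` because HTTP header names are case-insensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// Header and Response are hypothetical stand-ins for the generated
// PrometheusResponseHeader and PrometheusInstantQueryResponse types.
type Header struct {
	Name   string
	Values []string
}

type Response struct {
	Headers []*Header
}

// AddHeader appends value to an existing header with the same name
// (compared case-insensitively) or adds a new header entry if none exists.
func (r *Response) AddHeader(key, value string) {
	for _, h := range r.Headers {
		if strings.EqualFold(h.Name, key) {
			h.Values = append(h.Values, value) // h is a pointer, so this updates the stored header
			return
		}
	}
	r.Headers = append(r.Headers, &Header{Name: key, Values: []string{value}})
}

func main() {
	resp := &Response{}
	resp.AddHeader("X-Cache-Hit", "true")
	resp.AddHeader("x-cache-hit", "false")
	fmt.Println(len(resp.Headers), resp.Headers[0].Values) // 1 [true false]
}
```

If replacing is preferred over appending, the body of the match branch becomes `h.Values = []string{value}`.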
@@ -217,9 +227,25 @@ func (prometheusCodec) MergeResponse(_ Request, responses ...Response) (Response
	// we need to pass on all the headers for results cache gen numbers.
	var resultsCacheGenNumberHeaderValues []string

	var headers []*PrometheusResponseHeader
	// merge headers
	for _, res := range responses {
		promResponses = append(promResponses, res.(*PrometheusResponse))
		resultsCacheGenNumberHeaderValues = append(resultsCacheGenNumberHeaderValues, getHeaderValuesWithName(res, ResultsCacheGenNumberHeaderName)...)
I doubt this matters given what was happening before, but this could be out of sync with the actual headers we return in the response due to the merging below.
// Copy Prometheus headers into http response
for _, h := range a.Headers {
	for _, v := range h.Values {
		if strings.HasPrefix(h.Name, "X-") {
Are the headers always upper case?
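If the casing cannot be relied on, one defensive option is to canonicalise the name before the prefix check. The sketch below uses the standard library's `http.CanonicalHeaderKey`; `isExtensionHeader` is a hypothetical helper name and `x-thanos-example` is a made-up header. Header names that Go's `net/http` parses off the wire are already canonicalised, but names constructed by hand or decoded from a cached response may not be, which is an assumption about where these values come from.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// isExtensionHeader reports whether name starts with "X-" regardless of the
// casing it was stored with. It is a hypothetical helper for illustration.
func isExtensionHeader(name string) bool {
	return strings.HasPrefix(http.CanonicalHeaderKey(name), "X-")
}

func main() {
	fmt.Println(strings.HasPrefix("x-thanos-example", "X-")) // false: plain prefix check is case-sensitive
	fmt.Println(isExtensionHeader("x-thanos-example"))       // true
	fmt.Println(isExtensionHeader("Content-Type"))           // false
}
```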