Feature: Add new metric slow_request_throughput #619

Open
wants to merge 2 commits into
base: main

tinitiuset

What this PR does:

This PR adds two configuration parameters, server.throughput-config.slow-request-cutoff and server.throughput-config.unit, and exposes a new metric, slow_request_server_throughput. The metric reports throughput in units/s, calculated from the information in the Server-Timing header.

If server.throughput-config.slow-request-cutoff is 0, no throughput is calculated.

The implementation is very similar to this branch by @krajorama, but adds some flexibility to measure throughput based on different signals. It defaults to total_samples, as processed samples are easier to explain to users. Discussed with @dimitarvdimitrov.

Checklist

  • Tests updated
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@CLAassistant

CLAassistant commented Nov 14, 2024

CLA assistant check
All committers have signed the CLA.

Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Thanks for opening!

The only thing I'm not sure about is the cutoff in milliseconds. I don't think that's what the original issue discussed, can you double-check?

It would also be nice to see some tests; the logic is getting a bit more complicated and shouldn't stay untested.

@@ -105,6 +110,17 @@ func (i Instrument) Wrap(next http.Handler) http.Handler {
			labelValues = append(labelValues, tenantID)
			instrument.ObserveWithExemplar(r.Context(), i.PerTenantDuration.WithLabelValues(labelValues...), respMetrics.Duration.Seconds())
		}
		if i.SlowRequestCutoff > 0 && respMetrics.Duration > i.SlowRequestCutoff {
Contributor

In the original (private :( ) issue, the cutoff discussed was the volume ("N samples") rather than the latency. Should this be the other way around?

@@ -105,6 +110,17 @@ func (i Instrument) Wrap(next http.Handler) http.Handler {
			labelValues = append(labelValues, tenantID)
			instrument.ObserveWithExemplar(r.Context(), i.PerTenantDuration.WithLabelValues(labelValues...), respMetrics.Duration.Seconds())
		}
		if i.SlowRequestCutoff > 0 && respMetrics.Duration > i.SlowRequestCutoff {
			parts := strings.Split(w.Header().Get("Server-Timing"), ", ")
Contributor

Can you add some tests for this parsing? I see that there is no instrument_test.go, but I think it's important to cover in a test how we handle edge cases, especially when parsing these headers.

Since the output here is in Prometheus metrics, it may not be immediately obvious how to write tests against them (it definitely wasn't to me when I started). Here's an example that asserts on the metrics the code generates.

func TestRateLimitedLoggerLogs(t *testing.T) {
	buf := bytes.NewBuffer(nil)
	c := newCounterLogger(buf)
	reg := prometheus.NewPedanticRegistry()
	r := NewRateLimitedLogger(c, 1, 1, reg)

	level.Error(r).Log("msg", "error will be logged")
	assert.Equal(t, 1, c.count)

	logContains := []string{"error", "error will be logged"}
	c.assertContains(t, logContains)

	require.NoError(t, testutil.GatherAndCompare(reg, strings.NewReader(`
		# HELP logger_rate_limit_discarded_log_lines_total Total number of discarded log lines per level.
		# TYPE logger_rate_limit_discarded_log_lines_total counter
		logger_rate_limit_discarded_log_lines_total{level="info"} 0
		logger_rate_limit_discarded_log_lines_total{level="debug"} 0
		logger_rate_limit_discarded_log_lines_total{level="warn"} 0
		logger_rate_limit_discarded_log_lines_total{level="error"} 0
	`)))
}

@tinitiuset tinitiuset changed the title Feature: Add new metric slow_request_server_throughput Feature: Add new metric slow_request_throughput Nov 15, 2024
3 participants