Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/telemetry: add support for histogram counters and charts #68207

Open
findleyr opened this issue Jun 26, 2024 · 1 comment
Open

x/telemetry: add support for histogram counters and charts #68207

findleyr opened this issue Jun 26, 2024 · 1 comment
Labels
NeedsFix The path to resolution is known, but the work has not been done. telemetry x/telemetry issues
Milestone

Comments

@findleyr
Copy link
Contributor

findleyr commented Jun 26, 2024

The telemetry documentation at https://go.dev/doc/telemetry#basic-counters describes how, by convention, telemetry counters can represent a histogram. For example, we currently record latency in gopls using predefined bucket names like "<10ms", "<50ms", "<1s", but in the telemetry infrastructure these are treated as unstructured strings. On the other hand, https://research.swtch.com/telemetry-design#configuration suggests histograms as a predefined chart type.

We've decided that histograms should be supported natively as both counter and chart types. Specifically, histogram counters should record scalar observations, bucketing should be handled by the counter package, and the telemetry chart rendering should be knowledgeable of units.

Here's a hypothetical new API:

package counter

// NewHistogram creates a new histogram counter for a scalar type.
//
// Bucket names are computed in terms of the provided unit.
// For example, given a unit time.Millisecond and buckets
// []time.Duration{1*time.Millisecond, 10*time.Millisecond, 100*time.Millisecond},
// the bucket names will be {0,1,10,100}.
//
// The provided unit is also used as the basis for the default bucketing.
func NewHistogram[T Integer| Float](name string, unit T) *Histogram[T]

type Histogram[T Integer | Float] struct{ /* ... */}

// SetBuckets configures the the histogram to bucket observations into the following
// half-open intervals:
//  - [0, bounds[0])
//  - [bounds[i], bounds[i+1]), for each i < len(bounds) - 1
//  - [bounds[len(bounds-1)], ∞)
//
// SetBuckets must be called before the first call to [Record].
func (h *Histogram[T]) SetBuckets(bounds []T)

// Record increments the bucket counter corresponding to the observed value.
// The given value must be non-negative.
func (h *Histogram[T]) Record(value T)

While I'm including a SetBuckets method for completeness, I'm not up to date on the latest theory for bucketing. My hope is that the default bucketing should do the right thing in most cases.

This would be used like so:

completionLatency := counter.NewHistogram("gopls/completion/latencyms", 1*time.Millisecond)
completionLatency.Record(42*time.Millisecond)

// ...

cacheHitRatio := counter.NewHistogram("build/cache/hitpercent", 0.01)
cacheHitRatio.Record(0.12)

In the telemetry chart configuration, we'd support the histogram chart type, and collected buckets would have to be explicitly enumerated, as in https://research.swtch.com/telemetry-design#configuration. I'd also suggest including a "unit" field in the chart configuration.

For example

counter: gopls/completion/latencyms:{0,10,20,50,100,200,500,1000,10000}
title: Completion Latency
description: Measured latency of successful completion requests.
type: histogram
unit: milliseconds
program: golang.org/x/tools/gopls

Open Questions

  • How can we keep buckets consistent between instrumented counter and chart?
  • What should we use as the default bucketing?
  • Do we need to support floating point units, or can we restrict to integral units?
  • Should we avoid generics, and just provide constructors for the common types of histogram we expect (latency, ratio, etc.)
  • Relatedly, should the chartconfig unit field be an enum of supported types, so that for example 1000ms can be simplified as 1s?

CC @golang/telemetry @rsc @adonovan

@gopherbot gopherbot added the telemetry x/telemetry issues label Jun 26, 2024
@gopherbot gopherbot added this to the Unreleased milestone Jun 26, 2024
@gabyhelp
Copy link

Similar Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@joedian joedian added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsFix The path to resolution is known, but the work has not been done. telemetry x/telemetry issues
Projects
None yet
Development

No branches or pull requests

4 participants