feat: trace self-observability - otlptracegrpc exporter metrics #7142
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #7142 +/- ##
======================================
Coverage 82.9% 82.9%
======================================
Files 264 265 +1
Lines 24628 24754 +126
======================================
+ Hits 20423 20535 +112
- Misses 3822 3832 +10
- Partials 383 387 +4
We are missing a description similar to this: https://github.com/open-telemetry/opentelemetry-go/pull/7027/files#diff-f49e6c5c367e6599dcbde1bd811bc47159e72e190936055e27f5d65739eb12d8R9-R11
The following indicators are missing some attributes (including unit tests):
Co-authored-by: Flc゛ <[email protected]>
Pull Request Overview
This PR adds experimental self-observability metrics to the OTLP trace gRPC exporter. When enabled via the OTEL_GO_X_SELF_OBSERVABILITY environment variable, the exporter emits metrics that track span export operations: in-flight spans, exported span count, and operation duration.
- Introduces an experimental feature flag system for enabling self-observability
- Adds metric instrumentation to track span export operations in the gRPC exporter
- Updates module dependencies to include required metric packages
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
exporters/otlp/otlptrace/otlptracegrpc/internal/x/x.go | Implements experimental feature flag system for self-observability |
exporters/otlp/otlptrace/otlptracegrpc/internal/x/x_test.go | Tests for the experimental feature flag functionality |
exporters/otlp/otlptrace/otlptracegrpc/client.go | Adds metric instrumentation to the gRPC client for tracking export operations |
exporters/otlp/otlptrace/otlptracegrpc/client_test.go | Comprehensive tests for self-observability metrics functionality |
exporters/otlp/otlptrace/otlptracegrpc/go.mod | Updates dependencies to include metric packages |
CHANGELOG.md | Documents the new experimental feature |
These are all added now.
One last remaining issue.
Also, please resolve the conflicts.
Finally: Thank you for your contribution.
m := mp.Meter("go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc",
    metric.WithInstrumentationVersion(sdk.Version()),
    metric.WithSchemaURL(semconv.SchemaURL))
Suggested change:
- m := mp.Meter("go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc",
-     metric.WithInstrumentationVersion(sdk.Version()),
-     metric.WithSchemaURL(semconv.SchemaURL))
+ m := mp.Meter(
+     "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc",
+     metric.WithInstrumentationVersion(sdk.Version()),
+     metric.WithSchemaURL(semconv.SchemaURL),
+ )
c.initSelfObservability()
Flattening initSelfObservability seems appropriate. This is the only call site, and this function is scoped to setting up a new client, which includes telemetry.
defer func() {
    duration := time.Since(start)
    durationAttrs := make([]attribute.KeyValue, 0, len(c.selfObservabilityAttrs)+2)
This is allocated every call. A pool should be used to amortize the slice allocation.
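One way to read the pooling suggestion is sketched below with sync.Pool. This is an illustration under assumptions, not the change that was merged: `keyValue` stands in for attribute.KeyValue so the example needs only the standard library, and `attrPool`, `withStatus`, and the initial capacity of 8 are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// keyValue stands in for attribute.KeyValue (assumption) so that the sketch
// compiles with the standard library alone.
type keyValue struct{ Key, Value string }

// attrPool stores pointers to slices, so returning a buffer to the pool does
// not allocate a new interface value for the slice header.
var attrPool = sync.Pool{
	New: func() any {
		s := make([]keyValue, 0, 8) // assumed typical attribute count
		return &s
	},
}

// withStatus is a hypothetical helper. It copies base into a pooled buffer,
// appends a status-code attribute, and hands the result to record.
// record must not retain the slice: the buffer is reused after it returns.
func withStatus(base []keyValue, code string, record func([]keyValue)) {
	p := attrPool.Get().(*[]keyValue)
	attrs := append((*p)[:0], base...)
	attrs = append(attrs, keyValue{"rpc.grpc.status_code", code})
	record(attrs)
	*p = attrs[:0] // keep any grown backing array for the next caller
	attrPool.Put(p)
}

func main() {
	base := []keyValue{{"otel.component.type", "otlp_grpc_span_exporter"}}
	withStatus(base, "0", func(attrs []keyValue) {
		fmt.Println(len(attrs), "attributes recorded")
	})
}
```

The pool only pays off if the recording call does not hold on to the slice, which is why the callback contract is spelled out in the comment.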
for _, ss := range ps.ScopeSpans {
    spanCount += len(ss.Spans)
Handle nil values.
Suggested change:
- for _, ss := range ps.ScopeSpans {
-     spanCount += len(ss.Spans)
+ for _, ss := range ps.GetScopeSpans() {
+     spanCount += len(ss.GetSpans())
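The getter form is nil-safe because Go methods with pointer receivers may be called on a nil pointer, and generated protobuf getters check for that case. The sketch below uses hand-written stand-in types (not the real OTLP proto types) to show the pattern; `countSpans` is a hypothetical helper.

```go
package main

import "fmt"

// Span, ScopeSpans, and ResourceSpans are simplified stand-ins for the
// generated OTLP protobuf types (assumption: only the fields used here).
type Span struct{ Name string }

type ScopeSpans struct{ Spans []*Span }

// GetSpans mirrors the nil check that generated protobuf getters perform.
func (s *ScopeSpans) GetSpans() []*Span {
	if s == nil {
		return nil
	}
	return s.Spans
}

type ResourceSpans struct{ ScopeSpans []*ScopeSpans }

// GetScopeSpans is safe to call on a nil *ResourceSpans.
func (r *ResourceSpans) GetScopeSpans() []*ScopeSpans {
	if r == nil {
		return nil
	}
	return r.ScopeSpans
}

// countSpans is a hypothetical helper that tolerates nil entries at every level.
func countSpans(rs []*ResourceSpans) int {
	n := 0
	for _, ps := range rs {
		for _, ss := range ps.GetScopeSpans() {
			n += len(ss.GetSpans())
		}
	}
	return n
}

func main() {
	rs := []*ResourceSpans{
		nil, // direct field access would panic on this entry
		{ScopeSpans: []*ScopeSpans{nil, {Spans: []*Span{{Name: "a"}, {Name: "b"}}}}},
	}
	fmt.Println(countSpans(rs)) // prints 2
}
```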
durationAttrs := make([]attribute.KeyValue, 0, len(c.selfObservabilityAttrs)+2)
durationAttrs = append(durationAttrs, c.selfObservabilityAttrs...)
durationAttrs = append(durationAttrs,
    c.operationDurationMetric.AttrRPCGRPCStatusCode(otelconv.RPCGRPCStatusCodeAttr(status.Code(err))))

exportedAttrs := make([]attribute.KeyValue, 0, len(c.selfObservabilityAttrs)+1)
exportedAttrs = append(exportedAttrs, c.selfObservabilityAttrs...)

if err != nil {
    // Try to extract the underlying gRPC status error, if there is one
    rootErr := err
    if s, ok := status.FromError(err); ok {
        rootErr = s.Err()
    }
Exact allocations can be made here at the cost of a few more branches, which is worth it.
Suggested change:
- durationAttrs := make([]attribute.KeyValue, 0, len(c.selfObservabilityAttrs)+2)
- durationAttrs = append(durationAttrs, c.selfObservabilityAttrs...)
- durationAttrs = append(durationAttrs,
-     c.operationDurationMetric.AttrRPCGRPCStatusCode(otelconv.RPCGRPCStatusCodeAttr(status.Code(err))))
- exportedAttrs := make([]attribute.KeyValue, 0, len(c.selfObservabilityAttrs)+1)
- exportedAttrs = append(exportedAttrs, c.selfObservabilityAttrs...)
- if err != nil {
-     // Try to extract the underlying gRPC status error, if there is one
-     rootErr := err
-     if s, ok := status.FromError(err); ok {
-         rootErr = s.Err()
-     }
+ rootErr := err
+ // Extract the underlying gRPC status error, if there is one.
+ if s, ok := status.FromError(err); ok {
+     rootErr = s.Err()
+ }
+ n := len(c.selfObservabilityAttrs)
+ var durationAttrs, exportedAttrs []attribute.KeyValue
+ if rootErr != nil {
+     durationAttrs = make([]attribute.KeyValue, n, n+2)
+     exportedAttrs = make([]attribute.KeyValue, n, n+1)
+ } else {
+     durationAttrs = make([]attribute.KeyValue, n, n+1)
+     exportedAttrs = make([]attribute.KeyValue, n, n)
+ }
+ _ = copy(durationAttrs, c.selfObservabilityAttrs)
+ scAttr := c.operationDurationMetric.AttrRPCGRPCStatusCode(otelconv.RPCGRPCStatusCodeAttr(status.Code(err)))
+ durationAttrs = append(durationAttrs, scAttr)
+ _ = copy(exportedAttrs, c.selfObservabilityAttrs)
+ if err != nil {
// nextExporterID returns a new unique ID for an exporter.
// the starting value is 0, and it increments by 1 for each call.
nit
Suggested change:
- // nextExporterID returns a new unique ID for an exporter.
- // the starting value is 0, and it increments by 1 for each call.
+ // nextExporterID returns a monotonically increasing int64 starting at 0
DataPoints: []metricdata.HistogramDataPoint[float64]{
    {
        Attributes: attribute.NewSet(
            semconv.OTelComponentName("otlp_grpc_span_exporter/1"),
This relies on test execution order. It is brittle and will break when tests are run in parallel or new cases are added. The generator needs to be reset per test case, or this needs to not be evaluated as strictly.
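A minimal sketch of the "reset per test case" option follows, assuming the ID generator is a package-level atomic counter. `resetExporterID` and `componentName` are hypothetical names for illustration and are not taken from the PR; the "otlp_grpc_span_exporter/<id>" format is inferred from the expected attribute quoted above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// exporterID backs the generator. Assumption: one counter per package.
var exporterID atomic.Int64

// nextExporterID returns a monotonically increasing int64 starting at 0.
func nextExporterID() int64 { return exporterID.Add(1) - 1 }

// resetExporterID is a hypothetical test-only helper. Calling it at the start
// of each test case makes the expected component name independent of how many
// exporters earlier cases created.
func resetExporterID() { exporterID.Store(0) }

// componentName formats the name the way the test expectation above does.
func componentName(id int64) string {
	return fmt.Sprintf("otlp_grpc_span_exporter/%d", id)
}

func main() {
	for _, tc := range []string{"case A", "case B"} {
		resetExporterID()
		// Each case sees the same first ID regardless of execution order.
		fmt.Println(tc, componentName(nextExporterID()))
	}
}
```

A reset like this still shares global state, so it does not make cases safe to run with t.Parallel; for that, the looser assertion the reviewer mentions is the alternative.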
if tt.enabled {
    t.Setenv("OTEL_GO_X_SELF_OBSERVABILITY", "true")
}
There are two test cases and this conditional splits them. They should be made into their own tests to remove the complexity being added to accommodate everything here.
}

original := otel.GetMeterProvider()
defer otel.SetMeterProvider(original)
Suggested change:
- defer otel.SetMeterProvider(original)
+ t.Cleanup(func() { otel.SetMeterProvider(original) })
@@ -286,3 +289,106 @@ func TestWithEndpointWithEnv(t *testing.T) {
        })
    }
}

func Test_getServerAttrs(t *testing.T) {
Suggested change:
- func Test_getServerAttrs(t *testing.T) {
+ func TestGetServerAttrs(t *testing.T) {
@Mojachieee PTAL #7272
Hi, since this process involves specification adjustments and historical review records (which may contain invalid review suggestions), it is hard to tell which items still need attention amid the large volume of information. Could we create a new PR based on the current branch before preparing for the review, so that subsequent reviews can proceed more smoothly? Thanks~
fixes #7007