Proposal - HTTP request count semantic convention #1362

nerdondon · 2022-11-23T21:58:12Z

What are you trying to achieve?

Add instruments for HTTP client and server request counts.

Additional context.

The current HTTP semantic conventions only have an instrument for active requests (http.server.active_requests). This proposal is to add counters like http.server.request.count and http.client.request.count. This seems to have been part of the original PR for HTTP semantic conventions as http.{type}.requests but was lost somehow. I think this is a prime candidate to be codified as a semantic convention because of very similar instrumentation across service meshes:

This metric is also important for capturing QPS and deriving error rate.

If this is desirable, I think I would be willing to take a crack at adding it.


mateuszrzeszutek · 2022-11-24T08:56:22Z

The http.server.duration metric is a histogram, and histograms capture the total count of observations. Does that not work for you?
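To illustrate the point above, here is a minimal sketch of an explicit-bucket histogram that tracks the total number of observations alongside its buckets. `ToyHistogram` and its bucket boundaries are hypothetical and are not the OpenTelemetry SDK's types; the sketch only shows why a duration histogram already implies a request count.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class ToyHistogram:
    """Toy explicit-bucket histogram (an assumption, not an OTel SDK type)."""

    bounds: list[float]  # upper bucket boundaries, ascending
    bucket_counts: list[int] = field(default_factory=list)
    count: int = 0  # total number of recorded observations
    total: float = 0.0  # sum of all recorded values

    def __post_init__(self) -> None:
        # One bucket per boundary plus a final overflow bucket.
        self.bucket_counts = [0] * (len(self.bounds) + 1)

    def record(self, value: float) -> None:
        # A value equal to a boundary lands in that boundary's bucket.
        self.bucket_counts[bisect_left(self.bounds, value)] += 1
        self.count += 1
        self.total += value


hist = ToyHistogram(bounds=[0.1, 0.5, 1.0])
for seconds in (0.05, 0.2, 0.2, 0.7, 3.0):
    hist.record(seconds)

# Every recorded duration is one request, so count doubles as a request count.
print(hist.count, hist.bucket_counts)
```

The count always equals the sum of the bucket counts, which is why a backend can read a request count off the duration histogram without a separate instrument.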

tsloughter · 2022-11-24T16:50:52Z

A count aggregator would be useful for instances like this.

I've been meaning to send a PR to suggest removing this from the metrics SDK spec:

Customize the aggregation - if the default aggregation associated with the Instrument does not meet the needs of the user. For example, an HTTP client library might expose HTTP client request duration as Histogram by default, but the application developer might only want the total count of outgoing requests.

Because it is not currently supported to aggregate to just a count -- I guess except for a 1 bucket histogram, which isn't really intuitive or what this is really saying.

nerdondon · 2022-11-28T19:38:51Z

The http.server.duration metric is a histogram, and histograms capture the total count of observations. Does that not work for you?

@mateuszrzeszutek that seems to be fine, if somewhat non-ergonomic, as @tsloughter is pointing out. A developer in this case would have to report histogram measurements exactly or override and use a 1 bucket histogram.

@tsloughter I'm not sure what the exact issue is regarding supporting aggregation to a count though. Can you elaborate? I thought the histogram should just expose a property count that represents the total population of points.

Edit: Also, I wanted to ask what it would look like for a client that wanted to report different sets of attributes between duration and count. Would they have to keep two different histogram measures of duration? It still seems more ergonomic to have a dedicated count instrument.

Additionally, from the telemetry consumption side, it seems that there should be a defined count instrument. That way, clients claiming compliance with HTTP semantic conventions would need to provide a request count instrument.

cc: @jsuereth

tsloughter · 2022-11-29T17:19:16Z

@nerdondon no issue, I think a count aggregator should be proposed. I guess my side note about the spec muddied that :)

nerdondon · 2022-12-20T18:09:05Z

@SergeyKanzhelev or @jsuereth sorry for the bump, but I just wanted to ask if there was something I could do to get more movement on this?

chameleon82 · 2023-02-02T09:46:45Z

My preference is to name that metric as a rate: http.server.rate, http.client.rate

As a user, I'm interested in the current success rate and response rate. This means every metric should send both values, one with tag status=success and one with tag status=fail, corresponding to the Success Rate and Error Rate metrics that I can visualize with tools like Grafana.

As an alternative, the naming can follow http.{server|client}.rate.{success|error}

However, the definition of an error may vary (timeouts, 5xx errors, or status >= 400) and must be specified as well.
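A rough sketch of why the error definition matters: the same set of responses yields different success/fail splits depending on the policy. Both policy functions and the `split_by_outcome` helper are hypothetical names for illustration, and timeouts (which have no status code) are not modeled here.

```python
from collections import Counter
from typing import Callable, Iterable


# Two hypothetical error policies; which one a convention should adopt
# is exactly the open question.
def server_errors_only(status: int) -> bool:
    return status >= 500


def any_4xx_or_5xx(status: int) -> bool:
    return status >= 400


def split_by_outcome(
    statuses: Iterable[int], is_error: Callable[[int], bool]
) -> Counter:
    """Count requests under a status tag of 'success' or 'fail'."""
    return Counter("fail" if is_error(s) else "success" for s in statuses)


observed = [200, 200, 404, 500, 503, 201]
strict = split_by_outcome(observed, server_errors_only)
broad = split_by_outcome(observed, any_4xx_or_5xx)

print(strict["fail"], broad["fail"])
```

With these six responses, the 404 counts as a failure only under the broader policy, so the reported error rate changes from 2 of 6 to 3 of 6.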

Example of metrics with Grafana tooling: https://grafana.com/grafana/plugins/novatec-sdg-panel/

Response Time            | in_timesum
Request Rate             | in_count
Error Rate               | error_in
Response Time (Outgoing) | out_timesum
Request Rate (Outgoing)  | out_count
Error Rate (Outgoing)    | error_out

nerdondon · 2023-02-03T01:36:43Z

In my understanding of the current structure of semantic conventions, a proposal would involve the standardization of a particular instrument. In the case of my proposed count, this is just a counter from the OTel API. I'm not sure what that would look like with a rate. In any case, rate is an aggregation over the instrument. Having the base instrument would allow other aggregations as desired. Also, note the prior art that I linked with regard to this proposal. Using a count would enable an easier path to adoption of the conventions.
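As a minimal sketch of "rate is an aggregation over the instrument", a rate can be derived from two samples of a cumulative counter. The `Sample` type and `requests_per_second` function are hypothetical and not part of the OTel API; the reset handling is one simple assumption about how a consumer might treat a restarted process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    timestamp: float  # seconds
    value: int  # cumulative request count at that time


def requests_per_second(earlier: Sample, later: Sample) -> float:
    elapsed = later.timestamp - earlier.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = later.value - earlier.value
    if delta < 0:
        # Counter went backwards (e.g. process restart): count only
        # what has accrued since the reset.
        delta = later.value
    return delta / elapsed


qps = requests_per_second(Sample(0.0, 100), Sample(60.0, 400))
after_reset = requests_per_second(Sample(0.0, 100), Sample(10.0, 30))

print(qps, after_reset)
```

The counter stays the base measurement; the rate, or any other aggregation, is computed by whoever consumes it.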

I don't want to muddy the waters here by bringing in a discussion on an attribute for status class (there's already another issue regarding that IIRC). This is specifically about an operations count instrument that can be used to fulfill part of the use case you mentioned and others.

tsloughter · 2023-02-03T18:49:46Z

I don't think an instrument is needed, only a Count aggregation.

RangelReale · 2023-02-09T18:40:08Z

+1 for this, I'm using Datadog and I'm seeing no way of extracting the count of the duration to do this metric.

trask · 2023-02-09T21:34:24Z

+1 for this, I'm using Datadog and I'm seeing no way of extracting the count of the duration to do this metric.

it may be worth asking Datadog about this, since I believe other backends are getting the request count from the http.server.duration metric #1362

chameleon82 · 2024-08-26T07:51:16Z

With current conventions most of our concerns are solved with:

Metric            | PromQL example
Latency, P99      | `histogram_quantile(0.99, http.server.request.duration{})`
Request Rate      | `count(http.server.request.duration{})`
Rate Increase     | `rate(http.server.request.duration{})`
Error Rate        | `(count(http.server.request.duration{ http.response.status_code =~ "5.*"}) or vector(0)) / count(http.server.request.duration{})`
Inflight requests | `http.server.active_requests{}`

I think it is worth having some documentation on how OTel metrics can be converted to operational metrics.
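As one example of what such documentation might cover, here is a rough sketch of estimating a latency quantile from explicit histogram buckets by linear interpolation, which is roughly the idea behind a backend-side quantile query. The function name, the bucket layout, and the choice to use 0 as the lower edge of the first bucket are all assumptions for illustration.

```python
def estimate_quantile(
    q: float, bounds: list[float], bucket_counts: list[int]
) -> float:
    """Estimate quantile q from per-bucket counts.

    bounds: ascending upper boundaries (at least one).
    bucket_counts: len(bounds) + 1 entries; the last is the overflow bucket.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be within [0, 1]")
    total = sum(bucket_counts)
    if total == 0:
        raise ValueError("no observations")
    rank = q * total
    cumulative = 0
    for i, n in enumerate(bucket_counts):
        if n == 0:
            continue  # empty buckets cannot contain the target rank
        if cumulative + n >= rank:
            if i == len(bounds):
                # Overflow bucket has no upper edge: report the highest bound.
                return bounds[-1]
            lower = bounds[i - 1] if i > 0 else 0.0
            upper = bounds[i]
            # Assume observations are spread evenly inside the bucket.
            return lower + (upper - lower) * (rank - cumulative) / n
        cumulative += n
    return bounds[-1]


bounds = [0.1, 0.5, 1.0]
counts = [1, 2, 1, 1]
p50 = estimate_quantile(0.5, bounds, counts)
p99 = estimate_quantile(0.99, bounds, counts)
request_count = sum(counts)  # the same data also yields the request count
```

The estimate is only as precise as the bucket boundaries, so a quantile that falls in the overflow bucket can only be reported as the highest finite boundary.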

github-actions bot assigned SergeyKanzhelev Nov 23, 2022

arminru added the enhancement New feature or request label Nov 28, 2022

SergeyKanzhelev removed their assignment Feb 18, 2023

lmolkova transferred this issue from open-telemetry/opentelemetry-specification Aug 22, 2024

github-actions bot assigned AlexanderWert Aug 22, 2024
