Add client semantic conventions for socket connections #756

Closed

lmolkova wants to merge 6 commits into open-telemetry:main from lmolkova:client-connection-semconv

Contributor

lmolkova commented Feb 17, 2024 (edited)

A first stab at client socket connection conventions (#454).

Changes

Defines connection spans and metrics

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
Change log entry added, according to the guidelines in When to add a changelog entry.
- If your PR does not need a change log, start the PR title with [chore]
~~[ ] schema-next.yaml updated with changes to existing conventions.~~

lmolkova added 2 commits

February 17, 2024 14:56


          Add semantic conventions for socker connection client

d5cf2f0


          Add examples, fix lint

266b691

lmolkova changed the title ~~Add semantic conventions for socker connection client~~ Add client semantic conventions for socket connections

lmolkova added 4 commits

February 17, 2024 15:19


          nits and lint

8d9c320


          more nits

7add745

toc

5914b2a


          add connection state (in the pool

a54e1ed

lmolkova mentioned this pull request

Move HTTP client metrics out of .NET conventions #801

Merged

3 tasks

lmolkova mentioned this pull request

[API Proposal]: Allow adding links after activity creation dotnet/runtime#101146

Closed

samsp-msft commented Apr 17, 2024

In the case of an http connection that technically is built on top of a socket connection, would you expect that to parent a socket connection, or should that just be modelled as its own variation - an http.connection (or connection.http ?) span?
I think that they would share a lot of the same attributes.

Contributor Author

lmolkova commented Apr 17, 2024

> In the case of an http connection that technically is built on top of a socket connection, would you expect that to parent a socket connection, or should that just be modelled as its own variation - an http.connection (or connection.http ?) span? I think that they would share a lot of the same attributes.

My proposal is to identify a common set of connection-related concepts and model HTTP, AMQP, DB, and other connections in the same way. The conventions would effectively apply to the socket-level APIs, which are the same everywhere.

Pools are more interesting: this is where we might need separate conventions for HTTP connection pools, DB connection pools, and so on.

I don't believe there is consensus on this in the community, though; see #703 for the DB discussion.

samsp-msft reviewed

docs/connection/connection-spans.md


		# Semantic Conventions for Connection Spans

		This document defines semantic conventions to apply when instrumenting client side of socket connections with spans.

samsp-msft Apr 17, 2024

With HTTP/3 over QUIC, the connection is virtual and may span multiple UDP packets. The client IP/network may even change during the lifetime of the connection, for example when a mobile client moves out of Wi-Fi range and switches to cellular.
Rather than tying this directly to a socket, the connection type can be tracked with an additional type attribute. The same concept can then be used for database, HTTP, and a range of other scenarios, with optional attributes depending on the scenario.

Contributor Author

lmolkova Apr 18, 2024 (edited)

HTTP/3 still operates on top of UDP sockets.
I'm not an expert, but I believe that from the socket's perspective, different connections are still established when QUIC connection migration happens. The only thing migration saves is the TLS handshake, which won't happen again.

It's a good question how to represent the QUIC logical connection, but given that it's such a long-lived thing, I don't see why we can't have a span for it and spans for all the underlying real socket connections it creates.

docs/connection/connection-spans.md

+              - `connect` span: describes the process of establishing a connection. It corresponds to `connect` function ([Linux or other POSIX systems](https://man7.org/linux/man-pages/man2/connect.2.html) /
+              [Windows](https://docs.microsoft.com/windows/win32/api/winsock2/nf-winsock2-connect)).
+              - `connection` span: describes the connection lifetime: it starts right after the connection is successfully established and ends when connection terminates.

samsp-msft Apr 17, 2024

I'm not sure we need both connect and connection, or whether all the data can be represented in connection.
In the HTTP case, the equivalent of connect is a wire request, since the typical HTTP request span tracks a logical operation rather than what happens on the wire. If the HTTP library follows redirects automatically, for example, one HTTP request may actually result in multiple wire requests: one that retrieves the redirect and a subsequent one for the data.

Contributor Author

lmolkova Apr 18, 2024

It's important to know how long it takes to establish a connection, and also whether a connection was ever established and then terminated.

We could potentially have one span for the connection and mark the moment it was established with an event, but I'd still argue that we need two separate metrics.
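The two-span proposal can be illustrated with a minimal sketch. It uses only the Python standard library; `SpanRecord` and `traced_connect` are hypothetical names invented for illustration and are not part of any OpenTelemetry API. Deriving `error.type` from the lowercased errno name is an assumption based on the `econnrefused` example in the PR.

```python
import errno
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpanRecord:
    """Hypothetical stand-in for a tracing span; not an OpenTelemetry type."""
    name: str
    start: float
    end: Optional[float] = None
    attributes: dict = field(default_factory=dict)


def traced_connect(connect_fn, clock=time.monotonic):
    """Call connect_fn() and return (connect_span, connection_span or None)."""
    connect_span = SpanRecord("connect", start=clock())
    try:
        connect_fn()
    except OSError as exc:
        connect_span.end = clock()
        # Assumption: error.type is the lowercased errno name, e.g. "econnrefused".
        name = errno.errorcode.get(exc.errno, type(exc).__name__)
        connect_span.attributes["error.type"] = name.lower()
        return connect_span, None
    connect_span.end = clock()
    # The connection span starts right after the connection is established
    # and stays open (end=None) until the caller records termination.
    return connect_span, SpanRecord("connection", start=connect_span.end)
```

Under this split, `connect_span.end - connect_span.start` would feed a connect-duration metric, while the lifetime of the `connection` span would feed a separate connection-duration metric.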

docs/connection/connection-spans.md

+              | [`network.peer.port`](../attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | Conditionally Required: when applicable |
+              | [`network.transport`](../attributes-registry/network.md) | string | [OSI transport layer](https://osi-model.com/transport-layer/) or [inter-process communication method](https://wikipedia.org/wiki/Inter-process_communication). [3] | `tcp`; `udp` | Recommended |
+              | [`network.type`](../attributes-registry/network.md) | string | [OSI network layer](https://osi-model.com/network-layer/) or non-OSI equivalent. [4] | `ipv4`; `ipv6` | Recommended |
+              | [`server.address`](../attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [5] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | Conditionally Required: if available without reverse DNS lookup |

samsp-msft Apr 17, 2024

Is this where we would add network.protocol.name, network.protocol.version, tls.version?

Contributor Author

lmolkova Apr 18, 2024

Do you think they'd be useful on connection spans/metrics?

network.protocol.* describes the application-level protocol, not a transport-level thing.
You can send AMQP or HTTP over the socket connection; the connection does not care and does not need to know.

For TLS and DNS we'll need new spans, which are not described in this PR.
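To illustrate the transport-level scope described here, the following is a minimal sketch that derives connection attributes from the socket alone, assuming a Python client. `socket_attributes` is a hypothetical helper, and the `unix` value for `network.transport` is an assumption not shown in the quoted table. Nothing about the application protocol is recorded.

```python
import socket

# Mappings from socket-level constants to attribute values.
_NETWORK_TYPE = {socket.AF_INET: "ipv4", socket.AF_INET6: "ipv6"}
_NETWORK_TRANSPORT = {socket.SOCK_STREAM: "tcp", socket.SOCK_DGRAM: "udp"}


def socket_attributes(family, kind, peer_port=None, server_address=None):
    """Build transport-level attributes; hypothetical helper for illustration."""
    attrs = {}
    if family in _NETWORK_TYPE:
        attrs["network.type"] = _NETWORK_TYPE[family]
    if family == getattr(socket, "AF_UNIX", None):
        # Assumption: Unix domain sockets are reported as transport "unix".
        attrs["network.transport"] = "unix"
    elif kind in _NETWORK_TRANSPORT:
        attrs["network.transport"] = _NETWORK_TRANSPORT[kind]
    if peer_port is not None:
        attrs["network.peer.port"] = peer_port
    if server_address is not None:
        # Only set when known without a reverse DNS lookup.
        attrs["server.address"] = server_address
    return attrs
```

For example, `socket_attributes(socket.AF_INET, socket.SOCK_STREAM, 65123, "example.com")` yields `network.type`, `network.transport`, `network.peer.port`, and `server.address`, and no `network.protocol.*` keys.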

docs/connection/connection-spans.md

+              | `network.type`         | `"ipv4"`            |
+              | `error.type`           | `econnrefused`      |
+              ### Relationship with application protocols such as HTTP

samsp-msft Apr 17, 2024

HTTP becomes interesting with connection pooling and the ability to make either sequential (HTTP/1.1) or parallel (HTTP/2, HTTP/3) requests over the same connection.
We would probably want some form of event for when wire requests are put onto a specific connection. Similarly, DNS lookup and TLS handshake occur as part of connection establishment and are important to capture in some way for deeper diagnostics. Should they be events as defined by tracing, or log messages tagged with the same traceid/spanid?

Contributor Author

lmolkova Apr 18, 2024 (edited)

Why events? We have links to correlate a request with a connection, and we can put attributes on them if any are necessary.

Do you want to capture the moment in time when the request is associated with the connection? We don't capture it on links yet, but we can start. Recording both a link and an event is overkill.

I wonder if DNS and TLS should be spans or events. Since they involve the network and have non-zero duration, spans would work better (but would be slightly less performant).
Given that connections live much longer than requests, the volume of such spans would be low, so performance should not be a big deal.
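The link-based correlation can be sketched as follows. This is a minimal illustration using plain Python objects; `Span`, `add_link`, and `send_request` are hypothetical stand-ins rather than OpenTelemetry types. It shows several short-lived request spans linking to one long-lived connection span.

```python
import itertools
from dataclasses import dataclass, field

_next_id = itertools.count(1)


@dataclass
class Span:
    """Hypothetical stand-in for a tracing span; not an OpenTelemetry type."""
    name: str
    span_id: int = field(default_factory=lambda: next(_next_id))
    links: list = field(default_factory=list)  # (linked span_id, link attributes)

    def add_link(self, target, attributes=None):
        self.links.append((target.span_id, dict(attributes or {})))


def send_request(method, connection):
    """Create a request span and link it to the connection that carries it."""
    request = Span(method)
    request.add_link(connection)
    return request


# One long-lived connection span serves several requests.
connection = Span("connection")
first = send_request("GET", connection)
second = send_request("POST", connection)
```

Because the link points from the request to the connection, the connection span can end long after the requests without any parent/child relationship between them.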

samsp-msft Apr 18, 2024

I am thinking that this needs to be a dial that ops can turn based on how much data they want to collect. Using the scenario of an outgoing HttpClient call:

- HttpClient spans (implemented today): these really track a logical request rather than what is physically happening on the wire.
- The next level would be a chain of physical requests: in the case of redirection, there could be multiple before the final request, or it could do a continue to resume collection of data.
- The connections themselves live longer than a single request. They have DNS and TLS as part of their initialization:
  - DNS lookup for a connection
  - TLS negotiation

In most cases you probably don't want to collect all of these all the time. However I can see ops turning them on when needed to collect more specific diagnostics data. So can we make it adaptive, and be able to correlate the data when applicable?

I am wondering if an "event" + optional link approach would be best. When a request is put on the wire for a connection, you'd get an event, so you'd know what the delay was before your request was processed. If connections are being tracked, that "event" would have a link to the connection span, so you could correlate them.
Similarly, a connection would have an "event" for when DNS resolution has occurred and when TLS negotiation is complete. If either is being tracked, that event would link to its respective span, although those could probably be parented to the connection.

I use "event" in quotes as I am told the future of events on spans is unclear; it could be done with a log message instead.
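The "dial" idea might look roughly like the sketch below. Everything in it is hypothetical: the `Detail` levels, the event names, and the `Span` class are invented for illustration, and the wire-request level is omitted for brevity. An event is always recorded, and it carries a reference to a more detailed span only when that level of tracking is enabled.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Detail(IntEnum):
    """Hypothetical verbosity dial for connection-level telemetry."""
    REQUESTS = 1     # logical request spans only
    CONNECTIONS = 2  # plus connection spans
    HANDSHAKES = 3   # plus DNS and TLS spans


@dataclass
class Span:
    """Hypothetical stand-in for a tracing span; not an OpenTelemetry type."""
    name: str
    events: list = field(default_factory=list)    # (event name, linked span name or None)
    children: list = field(default_factory=list)


def open_connection(detail):
    """Return a connection span, or None when connections are not tracked."""
    if detail < Detail.CONNECTIONS:
        return None
    conn = Span("connection")
    for step in ("dns", "tls"):
        tracked = detail >= Detail.HANDSHAKES
        if tracked:
            conn.children.append(Span(step))
        # The event is always recorded; it references a span only if one exists.
        conn.events.append((f"{step}.done", step if tracked else None))
    return conn


def send_request(connection):
    """Record when the request hits the wire, linking to the connection if tracked."""
    request = Span("request")
    request.events.append(("request.on_wire", connection.name if connection else None))
    return request
```

At `Detail.REQUESTS` the request still gets its timing event but with no link; turning the dial up adds the connection span and then the DNS and TLS spans without changing the shape of the request telemetry.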

docs/connection/connection-spans.md

+              ### Successful connect, but connection terminates with an error
+              Successful connection attempt to `example.com` results in the following span:
+              > Note: DNS lookup is outside of the scope of this semantic convention

samsp-msft Apr 17, 2024

While I agree that we shouldn't try to include all the DNS info in the connection, an event/marker for timing that indicates when the lookup completed would be helpful.
If DNS is being tracked with its own spans, a convention for linking from the connection span to the DNS span would make sense.

Contributor Author

lmolkova commented Jul 9, 2024

This is partially addressed in #1192 (specifically for .NET).

I'm going to close this PR with the intention of evolving connection-level observability via #1192 and follow-up PRs that generalize beyond .NET.

lmolkova closed this
