-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extends GraphQL Spans #562
base: main
Are you sure you want to change the base?
Changes from all commits
eade875
186ea98
19d80b7
43518fc
2d61ae4
481e414
9a72c98
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
groups: | ||
- id: graphql | ||
prefix: graphql | ||
- id: graphql.server.request | ||
prefix: graphql.request | ||
type: span | ||
brief: > | ||
This document defines semantic conventions to apply when instrumenting the GraphQL implementation. They map | ||
|
@@ -9,7 +9,7 @@ groups: | |
- id: operation.name | ||
brief: "The name of the operation being executed." | ||
type: string | ||
examples: 'findBookById' | ||
examples: "findBookById" | ||
- id: operation.type | ||
brief: "The type of the operation being executed." | ||
type: | ||
|
@@ -24,9 +24,63 @@ groups: | |
- id: subscription | ||
value: "subscription" | ||
brief: "GraphQL subscription" | ||
examples: ['query', 'mutation', 'subscription'] | ||
- id: document | ||
brief: "The GraphQL document being executed." | ||
examples: ["query", "mutation", "subscription"] | ||
- id: document.content | ||
brief: "The content of the GraphQL document being executed." | ||
type: string | ||
note: The value may be sanitized to exclude sensitive information. | ||
examples: 'query findBookById { bookById(id: ?) { name } }' | ||
examples: "query findBookById { bookById(id: ?) { name } }" | ||
- id: document.sha256 | ||
brief: "The SHA256 hash of the document.content" | ||
type: string | ||
examples: "400483f38c08e8a3d3b972409c9dfb8e4a326e1b1940864932acd9f873d8664c" | ||
- id: document.id | ||
brief: > | ||
The id of the document beeing executed. This is a hash of the document provided by the user to identitfy | ||
persisted queries. | ||
type: string | ||
examples: "aa3e37c1bf54708e93f12c137afba004" | ||
- id: error.count | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't this be a generic "request count/duration" metric? With the error type attribute then one can find out how many errors occurred. https://github.com/open-telemetry/semantic-conventions/blob/v1.23.0/docs/http/http-metrics.md#metric-httpserverrequestduration |
||
brief: "The number of errors that occurred during the operation." | ||
type: int | ||
examples: 3 | ||
# TODO should we have something like outcome (success, failure) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was not sure if we should have a property that signals if the execution was a success or a failure. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If it's an exception we are talking about here, there's already https://github.com/open-telemetry/semantic-conventions/blob/v1.23.0/docs/exceptions/exceptions-spans.md In general if the request fails on the client side, the span status then is marked as error. See here for some example (for HTTP) https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#status |
||
# TODO how do we specify errors? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A GraphQL Operation can have multiple errors in the response: {
"data": null,
"errors": [
{
"message": "Cannot return null for non-nullable field User",
"locations": [
{
"line": 3,
"column": 5
}
],
"path": [
"createUser",
"user"
],
"extensions": {
"code": "USERNAME_INVALID",
"message": "username is invalid"
}
}
]
} What is the best way to represent this in OpenTelemetry? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd say maybe use the attributes under the |
||
# TODO Should we ref network.transport, network.type, server.address etc? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This would have to be done with caution though. as graphql is protocol agonstic There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But those are still very protocol agnostic, no? |
||
# TODO There are more spans like validation, parsing, variable coercion and response formatting. Should we add them as separate span types? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It provides a lot of value to know how long a specific operation is spending validating, parsing or formatting. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can model how the trace should like as we want - One thing to always keep in mind is verbosity. Spans costs money in the end. Creating too many may be a problem in some cases, so if it's not essential, maybe they can be turned off or opt-in. If so, please open a separated issue explaining the value/what it should do so we can tackle it in a separate PR. |
||
|
||
- id: graphql.server.resolver | ||
prefix: graphql | ||
type: span | ||
brief: > | ||
This document defines semantic conventions to apply when instrumenting the GraphQL implementation. | ||
They map GraphQL resolvers to attributes on a Span. | ||
attributes: | ||
- id: selection.name | ||
brief: "The name of the selection that is being resolved. Either the field name or an alias." | ||
type: string | ||
examples: "findBookById" | ||
- id: selection.type # selection.field.type ? | ||
brief: "The type of the field that is beeing resolved" | ||
type: string | ||
examples: "Book" | ||
- id: selection.path | ||
brief: "The path of the selection that is beeing resolved." | ||
type: string | ||
examples: "/foo/bar/0/baz" | ||
- id: selection.field.name | ||
brief: "The name of the field that is beeing resolved." | ||
type: string | ||
examples: "findBookById" | ||
- id: selection.field.declaringType | ||
brief: "The type that declares the field that is beeing resolved." | ||
type: string | ||
examples: "Query" | ||
- id: selection.field.coordinate | ||
brief: "The coordinate of the field that is beeing resolved." | ||
Comment on lines
+52
to
+80
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. All attributes should be defined in the attributes registry, not directly where it is used. https://github.com/open-telemetry/semantic-conventions/blob/main/CONTRIBUTING.md#1-modify-the-yaml-model Then here you use |
||
type: string | ||
examples: "Query.findBookById" | ||
- id: selection.field.isDeprecated | ||
brief: "Whether the field that is beeing resolved is deprecated." | ||
type: bool | ||
examples: true | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What do we do with validation errors? If someone sends a invalid GraphQL requests that e.g. could not be parsed. Which span should be emitted There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same span as you would do, but in the case of an error, you set the exception attributes where the error can be recorded (as well as the span status). See HTTP for example https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Another thing that is not yet specified are the subscriptions. A subscription is a long running operation that returns an event stream. Similar to websockets/signalR etc. Is there prior art in open telemetry how this could be handled? A subscription starts with a "Subscribe" call that returns a event stream. Each event in this event stream is mapped to a graphql result. This means that, the subscribe call and the execution of each event, give different insights. These insights are more important than a arbitrary long root spans that spans the whole graphql execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This are just the
graphql.server
spans. Should we also specific thegraphql.client
spans?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is desired, yes we should. But most likely not as part of this PR. Also, maybe it's good to create an issue so maybe discussions can happen there.