Refactoring SPEC for DB spans #420
I think this refactor is a great idea and it's making it immediately obvious where there are unknowns or inconsistencies. One question: what do the 🔴 icons mean?
Is there any value in linking to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/database.md for comparison or perhaps for inspiration?
Looks like a huge improvement, generally. I'm curious whether apm-server devs would think this adds a maintenance burden. As you said, we don't expect these specs to change much, so perhaps no real burden is added here.
@trentm why did you think it might be a maintenance burden for the server devs? I can't see it being one, and I'm not sure how it could be one either.
@axw In the sense that if we added or changed semantics of |
As discussed with @AlexanderWert ,
|
I'm good to merge this as-is. It "just" documents the status quo. Any additions can and should be handled in follow-up PRs.
Should we take this opportunity to align span naming conventions? Some of them include the service/technology, when it duplicates the
Unless we have a good reason, I think that removing the prefixes and aligning with what is done with relational databases is better. On the UI side, we should have a way to present the spans by using only the |
+1 on the suggestion, but I really think we should separate documenting the status quo from making actual changes.
Co-authored-by: Russ Cam <[email protected]>
Sorry I realize I am late and this is after the fact.
|`action`| `request` | | ||
| __**context.db._**__ |<hr/>|<hr/>| | ||
|`_.instance`| :heavy_minus_sign: | | ||
|`_.statement`| e.g. <pre lang="json">{"query": {"match": {"user.id": "kimchy"}}}</pre> | For Elasticsearch search-type queries, the request body may be recorded. Alternatively, if a query is specified in HTTP query parameters, that may be used instead. If the body is gzip-encoded, the body should be decoded first.| |
The Node.js (and I believe Python, because I copied from there) agents use a format that can include either or both of (a) query params (in URL query encoded form) and (b) the request body (JSON), separated by two newlines:
https://github.com/elastic/apm-agent-nodejs/blob/v3.14.0/lib/instrumentation/elasticsearch-shared.js#L21-L28
https://github.com/elastic/apm-agent-python/blob/master/elasticapm/instrumentation/packages/elasticsearch.py#L65-L78
Is that a form worth codifying?
Dale, who is using this field from Kibana APM traces for Elasticsearch performance work, brought up this discussion ticket to reconsider this format.
In general, with Elasticsearch, you need to consider the URL path, the query params, and the request body to fully understand the query.
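For illustration, the combined format described above (URL-encoded query params, then two newlines, then the JSON body) could be sketched roughly as follows. This is only a sketch under the assumptions stated in this comment, not the agents' actual code; `build_db_statement` and its parameters are hypothetical names.

```python
from urllib.parse import urlencode


def build_db_statement(params=None, body=None):
    """Hypothetical helper: join query params and request body.

    Either part may be missing; when both are present they are
    separated by two newlines, as described in the comment above.
    """
    parts = []
    if params:
        # Query params in URL query encoded form, e.g. "q=user.id%3Akimchy"
        parts.append(urlencode(params))
    if body:
        # Request body as a JSON string, recorded as-is
        parts.append(body)
    return "\n\n".join(parts)


statement = build_db_statement(
    params={"q": "user.id:kimchy"},
    body='{"query": {"match_all": {}}}',
)
```

Note that the URL path is not part of this string, which is why the path still has to be considered separately to fully understand the query.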
If we are just documenting the status quo here, then perhaps saying "that may be used as well" rather than "that may be used instead" is more accurate.
That seems to be something that's specific to the Node.js and Python agents, and it can and probably should change in the future. Personally, I think it'd make more sense to use `context.http.*` to store the URL and the query parameters than to add them to `db.statement`. Rally could then have a condition on whether the `db.statement` starts with `{`.
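A consumer-side check along those lines might look like the sketch below. It assumes the two-newline format currently used by the Node.js and Python agents; `split_db_statement` is a hypothetical name, not an existing function in any agent or tool.

```python
def split_db_statement(statement):
    """Hypothetical helper: return (query_params, body) from db.statement.

    Assumes a statement starting with "{" is a bare JSON body, and that
    anything else is URL-encoded query params, optionally followed by
    two newlines and a JSON body.
    """
    if statement.startswith("{"):
        # Body only: no query params were recorded
        return "", statement
    # partition() splits on the first blank line; body is "" if absent
    params, _sep, body = statement.partition("\n\n")
    return params, body
```

Checking for the leading `{` first means blank lines inside a body-only statement are never mistaken for the separator.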
| __**context.db._**__ |<hr/>|<hr/>| | ||
|`_.instance`| e.g. `us-east-1` | The AWS region where the table is. | | ||
|`_.statement`| :heavy_minus_sign: | | | ||
|`_.type`|`dynamodb`| | ||
|`_.user`| :heavy_minus_sign: | | ||
|`_.link`| :heavy_minus_sign: | | ||
|`_.rows_affected`| :heavy_minus_sign: | |
Do we want "db" fields for S3? If so... these values look like copypasta from DynamoDB.
Goals of refactoring: