Refactoring SPEC for DB spans #420

AlexanderWert · 2021-03-03T15:16:16Z

Preview

Goals of refactoring:

broader list of relevant fields and corresponding description (including destination fields)
make semantics of fields more explicit in general and for individual technologies
concrete examples for different technologies
align agents and reduce / eliminate inconsistencies

apmmachine · 2021-03-03T15:23:27Z

💚 Build Succeeded

Build stats

Build Cause: Pull request #420 updated
Start Time: 2021-04-29T10:40:11.321+0000
Duration: 3 min 43 sec
Commit: 35110c8

specs/agents/tracing-instrumentation-db.md

basepi · 2021-03-04T16:33:22Z

I think this refactor is a great idea and it's making it immediately obvious where there are unknowns or inconsistencies.

One question: what do the 🔴 icons mean?

trentm · 2021-03-05T22:52:25Z

Is there any value in linking to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/database.md for comparison or perhaps for inspiration?

trentm

Looks like a huge improvement, generally. I'm curious if apm-server devs would think this adds a maint burden. As you said, we don't expect these specs to change much, so perhaps no real burden added here.

specs/agents/tracing-instrumentation-db.md

axw · 2021-03-08T03:03:00Z

@trentm why did you think it might be a maintenance burden for the server devs? I can't see it being one, but I'm not sure how it could be one either.

specs/agents/tracing-instrumentation-db.md

trentm · 2021-03-08T15:59:06Z

why did you think it might be a maintenance burden for the server devs? I can't see it being one, but I'm not sure how it could be one either.

@axw In the sense that if we added or changed semantics of context.db.* fields, then there would be a maintenance task to update info in this document. However, that would not be a requirement on apm-server dev, so there is no concern. Thanks.

cyrille-leclerc · 2021-03-10T11:01:20Z

As discussed with @AlexanderWert:

Can we verify we are aligned with infrastructure monitoring to be successful at correlating a database span with the instrumentation of a database by "metricbeat/filebeat"?
Can we document the mapping with ECS and explain our path forward to be consistent with ECS? This mapping will be needed for the storage of the span documents in Elasticsearch

specs/agents/tracing-instrumentation-db.md

felixbarny

I'm good to merge this as-is. It "just" documents the status quo. Any additions can and should be handled in follow-up PRs.

specs/agents/tracing-instrumentation-db.md

SylvainJuge · 2021-04-28T08:55:35Z

Should we take this opportunity to align span naming conventions?

Some of them include the service/technology, when it duplicates the type / subtype combination. From the provided examples:

Elasticsearch: GET ... with the Elasticsearch: prefix with db / elasticsearch
DynamoDB ... for DynamoDB with db / dynamodb
S3 GetObject my-bucket for S3 with storage / s3
No prefix for MongoDB with db / mongodb
No prefix for Redis with db / redis
No prefix for relational databases with db / <db vendor name>

Unless we have a good reason to keep them, I think removing the prefixes and aligning with what is done for relational databases is better.
That might be a breaking change, but since we are not consistent today, going either way breaks something.

On the UI side, we should have a way to present spans using only the type / subtype, which should remove the need to rely on such prefixes. If we had such visual indicators on spans (labels, icons, ...), keeping prefixes in the name would create duplication. That would be a nice way to leverage this spec and the alignment across agents.
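To make the proposal concrete, here is a minimal sketch of what dropping the redundant prefixes could look like. It is not taken from any agent: the function name `strip_redundant_prefix` and the `REDUNDANT_PREFIXES` table are hypothetical, and the prefixes are only those visible in the examples listed above.

```python
# Hypothetical table: span name prefixes that duplicate the
# (type, subtype) pair, based only on the examples in this comment.
REDUNDANT_PREFIXES = {
    ("db", "elasticsearch"): "Elasticsearch: ",
    ("db", "dynamodb"): "DynamoDB ",
    ("storage", "s3"): "S3 ",
}


def strip_redundant_prefix(name: str, span_type: str, subtype: str) -> str:
    """Return the span name without a prefix that repeats type / subtype.

    Names without a known prefix (MongoDB, Redis, relational databases)
    are returned unchanged.
    """
    prefix = REDUNDANT_PREFIXES.get((span_type, subtype))
    if prefix and name.startswith(prefix):
        return name[len(prefix):]
    return name
```

With this sketch, `strip_redundant_prefix("S3 GetObject my-bucket", "storage", "s3")` would yield `GetObject my-bucket`, while a relational database span name passes through untouched.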

felixbarny · 2021-04-28T09:38:37Z

+1 on the suggestion but I really think we should separate documenting the status quo from making actual changes.

specs/agents/tracing-instrumentation-db.md

trentm

Sorry I realize I am late and this is after the fact.

trentm · 2021-04-29T18:16:40Z

specs/agents/tracing-instrumentation-db.md

+|`action`| `request` |
+| __**context.db._**__  |<hr/>|<hr/>|
+|`_.instance`| :heavy_minus_sign: |
+|`_.statement`| e.g. <pre lang="json">{"query": {"match": {"user.id": "kimchy"}}}</pre> | For Elasticsearch search-type queries, the request body may be recorded. Alternatively, if a query is specified in HTTP query parameters, that may be used instead. If the body is gzip-encoded, the body should be decoded first.|


The Node.js (and I believe Python, because I copied from there) agents use a format that can include either or both of (a) query params (in URL query encoded form) and (b) the request body (JSON), separated by two newlines:

https://github.com/elastic/apm-agent-nodejs/blob/v3.14.0/lib/instrumentation/elasticsearch-shared.js#L21-L28
https://github.com/elastic/apm-agent-python/blob/master/elasticapm/instrumentation/packages/elasticsearch.py#L65-L78

Is that a form worth codifying?
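For readers who do not want to follow the links, the format described above can be sketched roughly as follows. This is an illustration only, not the agents' actual code: the function name `build_db_statement` and its parameters are hypothetical. The gzip step reflects the note in the spec text that a gzip-encoded body should be decoded first.

```python
import gzip
from typing import Optional
from urllib.parse import urlencode


def build_db_statement(
    params: Optional[dict],
    body: Optional[bytes],
    gzipped: bool = False,
) -> str:
    """Combine query params and request body into one statement string.

    Either part may be missing. When both are present they are
    separated by two newlines.
    """
    parts = []
    if params:
        # (a) query params in URL query encoded form
        parts.append(urlencode(params))
    if body:
        if gzipped:
            # Decode a gzip-encoded body before recording it.
            body = gzip.decompress(body)
        # (b) the request body, assumed here to be UTF-8 JSON text
        parts.append(body.decode("utf-8", errors="replace"))
    return "\n\n".join(parts)
```

A request with only a body produces just the JSON text, and a request with only query params produces just the encoded query string.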

Dale, who is using this field from Kibana APM traces for Elasticsearch performance work, brought up this discussion ticket to reconsider this format.

In general, with Elasticsearch, you need to consider the URL path, the query params, and the request body to fully understand the query.

If we are just documenting the status quo here, then perhaps saying "that may be used as well" rather than "that may be used instead" is more accurate.

That seems to be specific to the Node.js and Python agents, and it can and probably should change in the future. Personally, I think it'd make more sense to use context.http.* to store the URL and the query parameters than to add them to db.statement. Rally could then have a condition on whether the db.statement starts with {.
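A consumer-side check along those lines might look like the sketch below. It assumes the current two-newline format discussed in this thread. The function name `split_db_statement` is hypothetical and is not part of Rally or any agent.

```python
from typing import Optional, Tuple


def split_db_statement(statement: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a db.statement into (query_string, body).

    If the statement starts with "{", it is treated as a request body
    only. Otherwise everything before the first blank line is taken as
    the URL-encoded query string and the remainder as the body.
    """
    if statement.lstrip().startswith("{"):
        return None, statement
    query_string, _, body = statement.partition("\n\n")
    return (query_string or None), (body or None)
```

The body is returned as raw text rather than parsed JSON, so the caller decides how to handle bodies that are not a single JSON object.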

trentm · 2021-04-29T18:20:33Z

specs/agents/tracing-instrumentation-db.md

+| __**context.db._**__  |<hr/>|<hr/>|
+|`_.instance`| e.g. `us-east-1` | The AWS region where the table is. |
+|`_.statement`| :heavy_minus_sign: |  |
+|`_.type`|`dynamodb`|
+|`_.user`| :heavy_minus_sign: |
+|`_.link`| :heavy_minus_sign: |
+|`_.rows_affected`| :heavy_minus_sign: |


Do we want "db" fields for S3? If so... these values look like copypasta from DynamoDB.

estolfo reviewed Mar 3, 2021

estolfo reviewed Mar 3, 2021

trentm reviewed Mar 5, 2021

axw reviewed Mar 8, 2021

trentm mentioned this pull request Mar 15, 2021

exclude the HTTP span under an Elasticsearch span elastic/apm-agent-nodejs#2000

Closed

cyrille-leclerc mentioned this pull request Mar 16, 2021

[OpenTelemetry] Map OpenTelemetry Semantic Attributes to Elastic Common Schema when ingesting OTLP data elastic/apm-server#4714

Closed

russcam reviewed Apr 19, 2021

AlexanderWert marked this pull request as ready for review April 27, 2021 08:39

AlexanderWert requested review from a team as code owners April 27, 2021 08:39

AlexanderWert requested a review from felixbarny April 27, 2021 08:39

AlexanderWert marked this pull request as draft April 27, 2021 08:52

felixbarny approved these changes Apr 27, 2021

SylvainJuge approved these changes Apr 28, 2021

Refactoring SPEC for DB spans

6dc1382

Resolved redundancy between AWS spec and AWS-specific db field tables…

3f255c2

… in the db-spec.

AlexanderWert force-pushed the master branch from 4fb26f4 to 3f255c2 Compare April 29, 2021 10:30

AlexanderWert marked this pull request as ready for review April 29, 2021 10:31

Apply suggestions from code review

a7c355f

Co-authored-by: Russ Cam <[email protected]>

AlexanderWert commented Apr 29, 2021

clarification on child spans

35110c8

AlexanderWert merged commit bcec2df into elastic:master Apr 29, 2021

trentm reviewed Apr 29, 2021

felixbarny mentioned this pull request May 3, 2021

Cardinality for Elasticsearch span names is too high #439

Open

AlexanderWert added the apm-agents label Jun 16, 2021

trentm mentioned this pull request Sep 13, 2022

Remove context.db from S3 spans: S3 isn't a database #683

Merged
