Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common config option for max length of capture_body and context.db.statement? #488

Closed
trentm opened this issue Aug 16, 2021 · 10 comments
Closed
Assignees

Comments

@trentm
Copy link
Member

trentm commented Aug 16, 2021

tl;dr: There are a few requests for configurability of the max-length of captured request bodies and context.db.statement. Would you favour a single config option for these types of long fields? Or separate config options for each?

Background

  • .NET has a request for a configurable RequestBodyMaxLength. Some recent Slack discussion is here
    • Java has a hardcoded 10k limit.
    • .NET recently bumped from 2k to 10k.
    • Node.js currently hardcods 2k.
  • Java has the equivalent feature request. This links to a discuss topic with use cases for capturing 1MB request bodies.
  • Node.js has a request for configuring the limit on captured span.context.db.statement, currently hardcoded to a 10k limit. The use case is for capturing large Elasticsearch queries where APM is being used to get a trace for later replay of ES queries for perf testing.
  • (If/when adding these config options, the docs should mention that APM server's max_event_size (default 300k) can come into play for these use cases.)
  • Felix recently suggested that perhaps there be one common config var for both the max length of captured request bodies and span.context.db.statement (and perhaps others, e.g. Node.js also has errorMessageMaxLength). Do you prefer a single config option for these, or separate options for each field?
  • If yes, do you have preferences or suggestions on naming this config option?
    • long_string_max_length
    • long_field_max_length
    • other suggestions?
@trentm trentm self-assigned this Aug 16, 2021
@trentm
Copy link
Member Author

trentm commented Aug 16, 2021

/cc @felixbarny @gregkalapos

@gregkalapos
Copy link
Contributor

gregkalapos commented Aug 16, 2021

I agree with the one common config approach. We already have 2 things to cover with this and I think it’s very likely we’ll have more later. If we’d have separate configs for each, it’d increase the number of configs without much benefit.

long_string_max_length sounds good to me, but no strong feelings on that.

@basepi
Copy link
Contributor

basepi commented Aug 17, 2021

Common config seems reasonable. No strong opinion on naming.

@russcam
Copy link
Contributor

russcam commented Aug 17, 2021

Common configuration value sounds like a pragmatic approach; maybe context_max_length, as the field is typically related to the context of the span?

@russcam
Copy link
Contributor

russcam commented Aug 17, 2021

Do we think it's likely users would want to vary the max length of different types of spans independently e.g. max length for DB statements but truncate http bodies?

@trentm
Copy link
Member Author

trentm commented Aug 18, 2021

Do we think it's likely users would want to vary the max length of different types of spans independently

I doubt it. I think the use cases will tend to be (a) rare and (b) independently for one of the few fields covered by the config var. If that were to happen, at least for the Node.js agent api, the user could set the _max_length to the larger value, and then use an APM event filter (apm.addFilter) to reduce the other field(s) to the lesser value. Granted that is obtuse, though.

maybe context_max_length, as the field is typically related to the context of the span?

That sounds a little bit like it might be a max-length on the total of all the {span,transaction}.context.

ECS calls them "fields" (https://www.elastic.co/guide/en/ecs/current/ecs-field-reference.html), so I think I'll go with long_field_max_length unless there are objections.

@russcam
Copy link
Contributor

russcam commented Aug 18, 2021

I'm not sure about long_field_max_length - it feels quite vague and not very self-descriptive. Documentation will undoubtedly help clarify its usage, but is there something succinct, yet self-descriptive? How about context_field_max_length? Naming is hard 🙂

@felixbarny
Copy link
Member

I think context_field_max_length can be confusing as the general limit of string fields is 1024 which is enforced at the intake API level, see
https://github.com/elastic/apm-server/blob/3708a02288342b3030765003d1360a4b8055b25a/docs/spec/v2/transaction.json#L659-L666

@trentm
Copy link
Member Author

trentm commented Aug 18, 2021

Naming: It is about setting the "max-length" on some "fields". The fields we are talking about are (I haven't trawled through the APM event schema to find more):

  • span.context.db.statement
  • {transaction,error}.context.request.body
  • perhaps {transaction,error}.context.message.body?
  • perhaps error.{log,exception}.message (if we connect this to the existing errorMessageMaxLength in the node.js agent)?

What are common traits of those fields?

  • They can be/may tend to be longer.
  • They aren't just a name or an identifier.

I agree "long_..." is unsatisfyingly generic, but I don't have a better idea.

@trentm
Copy link
Member Author

trentm commented Aug 23, 2021

Calling this done. A draft spec PR is here: #493

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants