feat(nodejs-autoinstrumentation): enable overriding default histogram buckets #3448

Open: CCOLLOT wants to merge 3 commits into base: main

@CCOLLOT CCOLLOT commented Nov 11, 2024

Description

Fixes #3436 (default metrics are in seconds but default buckets are in ms)

I implemented this patch to specify a new list of buckets (seconds) for the nodejs autoinstrumentation NodeSDK:

  • Enable overriding the default SDK buckets using the OTEL_METRICS_EXPLICIT_BUCKET_HISTOGRAM environment variable (comma-separated list)
  • Provide sane default values for the generated metrics (all expressed in seconds AFAIK), using the same set of buckets as the .NET library (0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10)
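The env-var handling described in the two bullets above could look roughly like the following. This is a minimal sketch, not the code from the PR: the helper name `parseBucketList` and the fallback-on-invalid-input behavior are assumptions made for illustration; only the default values come from the description.

```typescript
// Default boundaries in seconds, as listed in the PR description.
const DEFAULT_BUCKETS_SECONDS: number[] = [
  0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10,
];

// Hypothetical helper: parse a comma-separated list of bucket boundaries.
// Falls back to the defaults when the value is unset, empty, non-numeric,
// or not strictly increasing (assumed behavior, not confirmed by the PR).
function parseBucketList(raw: string | undefined): number[] {
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_BUCKETS_SECONDS;
  }
  const parsed = raw.split(",").map((part) => {
    const text = part.trim();
    // Number("") is 0, so treat empty items as invalid explicitly.
    return text === "" ? NaN : Number(text);
  });
  const valid = parsed.every(
    (value, i) => Number.isFinite(value) && (i === 0 || value > parsed[i - 1])
  );
  return valid ? parsed : DEFAULT_BUCKETS_SECONDS;
}
```

In the auto-instrumentation entry point, such a helper would presumably be called with `process.env.OTEL_METRICS_EXPLICIT_BUCKET_HISTOGRAM`, and the result passed to whatever view the SDK is configured with.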

Testing: tested with OpenTelemetry Operator 0.112.0 and got the expected bucket list (screenshot not reproduced here).

Documentation: I will add the necessary documentation if the change is accepted

@CCOLLOT CCOLLOT requested a review from a team as a code owner November 11, 2024 13:50

linux-foundation-easycla bot commented Nov 11, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@swiatekm
Contributor

@open-telemetry/javascript-maintainers could you have a look? I'm reluctant to introduce non-standard environment variables for instrumentations that only the operator supports.

@CCOLLOT
Author

CCOLLOT commented Nov 11, 2024

Yes, it would be nice to get an SDK maintainer's opinion. 👍

Fixing the unit mismatch is one thing, but we also need to allow specifying a custom list of buckets.

The nodejs SDK already supports providing a list of views.
Either it's the autoinstrumentation's concern to use the SDK's current way to customize the buckets (passing the view), or else the SDK needs to provide another way to customize the buckets that does not require changing the autoinstrumentation's code.

As you suggested in the issue, we could alternatively specify it as an attribute on the Instrumentation CR instead of an environment variable 👍

@pavolloffay
Member

@CCOLLOT is OTEL_METRICS_EXPLICIT_BUCKET_HISTOGRAM a non-standard env var? Is it documented anywhere?

@pavolloffay pavolloffay left a comment

Is the env var documented anywhere?

If it is operator-specific, it makes more sense to expose the setting as a CR option.

@CCOLLOT
Author

CCOLLOT commented Nov 12, 2024

Is the env var documented anywhere?

If it is operator-specific, it makes more sense to expose the setting as a CR option.

It is indeed operator-specific. In the case of nodejs we need autoinstrumentation.ts to pick up the custom bucket list to pass the view to the NodeSDK.

@pavolloffay
To be clear, are you suggesting specifying on the Instrumentation object a parameter like 'spec.nodejs.histogram_buckets' and then injecting (via the webhook) the bucket list as an environment variable into the init-container?
Did you have another approach in mind to pass the bucket list to autoinstrumentation.ts?

An alternative would be to just rename the environment variable to make it obvious that it is operator-specific. Maybe something like OTEL_OPERATOR_NODEJS_HISTOGRAM_BUCKETS or OTEL_OPERATOR_NODEJS_INIT_CONTAINER_HISTOGRAM_BUCKETS?

@pavolloffay
Member

To be clear, are you suggesting specifying on the Instrumentation object a parameter like 'spec.nodejs.histogram_buckets'

I don't like using env vars for operator-specific functionality; this should really be exposed as a standard CR option.

The env vars are there for upstream auto-instrumentation features or SDK, not features that the operator implements.

@CCOLLOT
Author

CCOLLOT commented Nov 12, 2024

@pavolloffay
Fair enough, how would you recommend passing the bucket list from the CR parameter to the autoinstrumentation image (init-container) to create the View and pass it to the NodeSDK then?

function getView() {
  const buckets = process.env.OTEL_METRICS_EXPLICIT_BUCKET_HISTOGRAM;
  const defaultHistogramBuckets = [
    0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10,

@pavolloffay
Member

Passing it through an env var is fine, but in the CR I would expect people to use CR properties.

Could you also please ping someone from the Node.js team to review?

@pichlermarc
Member

pichlermarc commented Nov 14, 2024

As others on this PR have said before, this is a non-standard env var.

If this PR were opened today against the OTel JS (or contrib) repos, I'd reject it for that reason. It would have to go through the specification process first; only then would we accept it.

It is somewhat common practice for implementation-specific, non-standard env vars to end up with names like OTEL_NODE_* or OTEL_PYTHON_*, so this one would need the OTEL_OPERATOR_ prefix, and then you can go from there. But IIRC env vars for this type of feature have been discussed in the spec in the past and were rejected because of a lack of granularity; you'd face the same problems here.

FWIW, the next release of @opentelemetry/instrumentation-http will include a functional env var OTEL_SEMCONV_STABILITY_OPT_IN that can be set to http, which will use the new (stable) bucket layout for HTTP metrics and a new unit (seconds) as well.

@pichlermarc
Member

Another thought: changing all histogram buckets with a view is likely not what users would want, because it overrides the instrument advisory parameter for each instrument. One will end up with a lot of nonsensical data for any histograms that specify non-default buckets, which I expect to become the norm now that the feature has become stable. Instrument advisory parameters for this are also implemented in OTel JS already.

There's plenty of histogram data that one could write that's not seconds (request sizes for instance). Any option to override all of them (like the View in the PR does) I'd consider quite harmful, actually.
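To make the concern concrete, here is a simplified model of how bucket boundaries get resolved. It is a sketch, not the SDK's code: the types and the function `effectiveBuckets` are hypothetical, and it assumes the precedence described in the metrics SDK specification (boundaries set by a matching view win over the instrument's advisory boundaries, which win over the SDK defaults).

```typescript
// Hypothetical, simplified stand-ins for SDK concepts.
interface InstrumentSketch {
  name: string;
  unit: string;
  advisoryBuckets?: number[]; // boundaries suggested by the instrumentation
}

interface ViewSketch {
  matches: (instrumentName: string) => boolean;
  buckets: number[];
}

// Default explicit-bucket boundaries from the OpenTelemetry specification.
const SDK_DEFAULT_BUCKETS: number[] = [
  0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000,
];

// Assumed precedence: matching view > advisory parameter > SDK default.
function effectiveBuckets(
  instrument: InstrumentSketch,
  views: ViewSketch[]
): number[] {
  const view = views.find((v) => v.matches(instrument.name));
  if (view !== undefined) {
    return view.buckets;
  }
  return instrument.advisoryBuckets ?? SDK_DEFAULT_BUCKETS;
}

// A catch-all view with second-based boundaries, like the one in the PR.
const catchAllSecondsView: ViewSketch = {
  matches: () => true,
  buckets: [0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10],
};

// Hypothetical size histogram that advertises byte-based boundaries.
const requestSize: InstrumentSketch = {
  name: "example.request.size",
  unit: "By",
  advisoryBuckets: [1024, 65536, 1048576],
};

const withView = effectiveBuckets(requestSize, [catchAllSecondsView]);
const withoutView = effectiveBuckets(requestSize, []);
```

With the catch-all view, the byte-sized histogram ends up with a top boundary of 10, so practically every recorded request size lands in the overflow bucket. Without the view, its advisory boundaries are kept.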

@CCOLLOT
Author

CCOLLOT commented Nov 14, 2024

Another thought: changing all histogram buckets with a view is likely not what users would want, because it overrides the instrument advisory parameter for each instrument. One will end up with a lot of nonsensical data for any histograms that specify non-default buckets, which I expect to become the norm now that the feature has become stable. Instrument advisory parameters for this are also implemented in OTel JS already.

There's plenty of histogram data that one could write that's not seconds (request sizes for instance). Any option to override all of them (like the View in the PR does) I'd consider quite harmful, actually.

Indeed, after taking a closer look at the spec I agree that overriding all of them is radical, to say the least.
What do you think about dropping the idea of trying to provide a default View with default values and instead just allow users of the Instrumentation CR to specify buckets for specific metrics? We could add a parameter to the Instrumentation CR like spec.nodejs.histogramBuckets such as:

spec:
  nodejs:
    histogramBuckets:
      - instrumentName: "http.server.duration.bucket"
        buckets: [0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10]
      - instrumentName: "some.string.*"
        buckets: [x,y,z]

We would then inject an OTEL_OPERATOR_NODEJS_HISTOGRAM_BUCKETS env var to pass it to autoinstrumentation.ts, and create a view for each element of this array and pass the views to the NodeSDK.

In that case, we would by default use the SDK's default values and it would be the end-user's responsibility and freedom to override what they need.
Even if the SDK improves the defaults, end-users might want to override buckets to reduce cardinality or increase resolution when using the operator's injection feature.
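The hand-off through OTEL_OPERATOR_NODEJS_HISTOGRAM_BUCKETS could be decoded on the Node.js side along these lines. This is a sketch under assumptions: it presumes the operator serializes the CR list as a JSON array of `{instrumentName, buckets}` objects, and the function name and the skip-invalid-entries policy are illustrative. Turning each entry into an SDK view is left out, since that depends on the SDK version in use.

```typescript
// Shape of one entry, mirroring the proposed spec.nodejs.histogramBuckets item.
interface HistogramBucketConfig {
  instrumentName: string;
  buckets: number[];
}

// Hypothetical decoder for the env var value. Malformed JSON yields an empty
// list and malformed entries are skipped, so the SDK defaults stay in effect.
function parseHistogramBucketsEnv(
  raw: string | undefined
): HistogramBucketConfig[] {
  if (!raw) {
    return [];
  }
  let decoded: unknown;
  try {
    decoded = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(decoded)) {
    return [];
  }
  const result: HistogramBucketConfig[] = [];
  for (const entry of decoded) {
    if (typeof entry !== "object" || entry === null) {
      continue;
    }
    const { instrumentName, buckets } = entry as Record<string, unknown>;
    if (
      typeof instrumentName === "string" &&
      instrumentName !== "" &&
      Array.isArray(buckets) &&
      buckets.every((b) => typeof b === "number" && Number.isFinite(b))
    ) {
      result.push({ instrumentName, buckets: buckets as number[] });
    }
  }
  return result;
}
```

Each returned entry would then become one view passed to the NodeSDK, so an empty result simply means no views are added.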

@pichlermarc
Member

@CCOLLOT Hmm, is there any instrument other than the http.server* and http.client* instruments that you're trying to override (for the specific use-case that you're trying to address)? 🤔

If http.* metrics are the ones you're interested in and the bucket layout you're expecting is the one from your example then there's no action necessary as it'll be included in the next release of @opentelemetry/instrumentation-http via the OTEL_SEMCONV_STABILITY_OPT_IN environment variable.

If that's not what you're looking for, OpenTelemetry Declarative configuration will eventually be the more sustainable solution to the issue at hand because it includes all that functionality and more.

@CCOLLOT
Author

CCOLLOT commented Nov 14, 2024

It's good news that the SDK will fix the unit mismatch with better bucket defaults. However, I still think that users of the auto instrumentation injection should be able to control the exact bucket list through the Instrumentation CR.

I updated the PR with the following changes:

  • Remove the default view with default values overriding all instruments.
  • Introduce spec.nodejs.histogramBuckets to the Instrumentation CR.
  • Marshal spec.nodejs.histogramBuckets into JSON and then inject it as an environment variable into the container.
  • In autoinstrumentation.ts, unmarshal the environment variable and pass a view per item in the array.

As a result, I've achieved the desired behavior with the following Instrumentation CR:

---
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default
spec:
  env:
    - name: OTEL_METRIC_EXPORT_INTERVAL
      value: "30000"
  exporter:
    endpoint: http://otel-collector.opentelemetry.svc:4317
  propagators:
    - tracecontext
    - baggage
  nodejs:
    histogramBuckets:
      - instrumentName: http*
        buckets: [0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10]
  sampler:
    type: parentbased_traceidratio
    argument: "1"

I believe this implementation is better than the one I proposed earlier. For default buckets, it's using the SDK's defaults, which is a better separation of concerns.
It therefore does not change the current default behavior, yet it still gives end-users of auto-instrumentation injection full flexibility to adjust the cardinality and resolution of any metric introduced by the SDK, either by providing the full name of the instrument or by using a pattern with the * wildcard.
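For readers unfamiliar with the * pattern mentioned above, the matching can be pictured as follows. This is an illustrative sketch only: `wildcardToRegExp` and `matchesInstrument` are hypothetical helpers, and the SDK's own view-selection logic may differ in details.

```typescript
// Hypothetical helper: turn a pattern such as "http*" into an anchored RegExp.
// Every "*" matches any run of characters; everything else is literal.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// True when the instrument name matches the whole pattern.
function matchesInstrument(pattern: string, instrumentName: string): boolean {
  return wildcardToRegExp(pattern).test(instrumentName);
}

const matchesHttpDuration = matchesInstrument("http*", "http.server.duration");
const matchesRpcDuration = matchesInstrument("http*", "rpc.server.duration");
```

Note that a broad pattern like http* selects every instrument whose name starts with "http", whatever its unit, so narrower patterns are the safer choice when the buckets are meant for durations only.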

@swiatekm swiatekm added the discuss-at-sig This issue or PR should be discussed at the next SIG meeting label Nov 14, 2024
Development

Successfully merging this pull request may close these issues.

[AutoInstrumentation] NodeJS histogram buckets should be configurable
5 participants