Suppressing autoinstrumentation properties do not have effect #3919

Closed
greatvovan opened this issue Oct 19, 2024 · 13 comments

@greatvovan

Expected behavior

The official documentation describes a number of parameters to suppress automatic instrumentation entirely or for specific libraries. It is fair to expect that instrumentation with Microsoft's agent follows the same logic and obeys the same configuration parameters.

Actual behavior

All suppression properties are ignored except otel.javaagent.enabled.

To Reproduce

I originally faced this problem in a Databricks cluster, where alternative instrumentation approaches don't work either. I assumed that it might be an incompatibility with the Spark runtime or the Databricks environment, so I decided to test it on a smaller program, but that only confirmed my fears.

The code: microsoft/ApplicationInsights-Java-Repros#13

% docker run -it --rm -v $(pwd)/ApplicationInsights-Java-Repros/SuppresAutoinstrumentation:/prj maven bash
# cd /prj
# mkdir /dl
# curl -sLOJ https://github.com/microsoft/ApplicationInsights-Java/releases/download/3.6.1/applicationinsights-agent-3.6.1.jar --output-dir /dl
# mvn package
...
# export APPLICATIONINSIGHTS_CONNECTION_STRING="..."
# java -Dotel.instrumentation.common.default-enabled=false -javaagent:/dl/applicationinsights-agent-3.6.1.jar -jar target/my-project-1.0-SNAPSHOT.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
2024-10-19 20:31:14.701Z INFO  c.m.a.a.i.c.ConfigurationBuilder - Some telemetry may be sampled out because a default sampling configuration was added in version 3.4.0 to reduce the default billing cost. You can set the sampling configuration explicitly: https://learn.microsoft.com/azure/azure-monitor/app/java-standalone-config#sampling
2024-10-19 20:31:16.744Z INFO  c.m.applicationinsights.agent - Application Insights Java Agent 3.6.1 started successfully (PID 160, JVM running for 2.861 s)
2024-10-19 20:31:16.744Z INFO  c.m.applicationinsights.agent - Java version: 21.0.4, vendor: Eclipse Adoptium, home: /opt/java/openjdk
2024-10-19 20:31:17 WARN  [main] root - Hello Azure
2024-10-19 20:31:18 INFO  [main] root - Log from a trace

Check your Application Insights instance: both log records still arrive there, despite the suppression parameter.
EXPECTED: Only the trace arrives, because it was sent through the OpenTelemetry API.

Let's compare this behavior with the vanilla agent from OpenTelemetry.

# curl -sLOJ https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v2.9.0/opentelemetry-javaagent.jar --output-dir /dl
# export OTEL_LOGS_EXPORTER=console OTEL_METRICS_EXPORTER=none OTEL_TRACES_EXPORTER=none
# java -javaagent:/dl/opentelemetry-javaagent.jar -jar target/my-project-1.0-SNAPSHOT.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-10-19 20:43:08:770 +0000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 2.9.0
2024-10-19T20:43:10.999Z WARN 'Hello Azure' : 00000000000000000000000000000000 0000000000000000 [scopeInfo: root:] {}
2024-10-19 20:43:11 WARN  [main] root - Hello Azure
2024-10-19T20:43:12.046Z INFO 'Log from a trace' : b8faff3761f8c71bba023991e6b7cb26 08f4ec2b7bf7b21b [scopeInfo: root:] {}
2024-10-19 20:43:12 INFO  [main] root - Log from a trace

Note the log records printed by the Console Exporter.
Now let's disable the default instrumentation:

# java -Dotel.instrumentation.common.default-enabled=false -javaagent:/dl/opentelemetry-javaagent.jar -jar target/my-project-1.0-SNAPSHOT.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-10-19 20:45:51:305 +0000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 2.9.0
2024-10-19 20:45:52 WARN  [main] root - Hello Azure
2024-10-19 20:45:53 INFO  [main] root - Log from a trace

The Console Exporter no longer prints anything.

Targeted suppression, such as otel.instrumentation.log4j-appender.enabled=false, behaves the same way: it disables the instrumentation in the vanilla agent but has no effect with Microsoft's agent.

# java -Dotel.instrumentation.log4j-appender.enabled=false -javaagent:/dl/applicationinsights-agent-3.6.1.jar -jar target/my-project-1.0-SNAPSHOT.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
2024-10-19 20:55:00.879Z INFO  c.m.a.a.i.c.ConfigurationBuilder - Some telemetry may be sampled out because a default sampling configuration was added in version 3.4.0 to reduce the default billing cost. You can set the sampling configuration explicitly: https://learn.microsoft.com/azure/azure-monitor/app/java-standalone-config#sampling
2024-10-19 20:55:02.888Z INFO  c.m.applicationinsights.agent - Application Insights Java Agent 3.6.1 started successfully (PID 336, JVM running for 2.804 s)
2024-10-19 20:55:02.889Z INFO  c.m.applicationinsights.agent - Java version: 21.0.4, vendor: Eclipse Adoptium, home: /opt/java/openjdk
2024-10-19T20:55:03.488Z WARN 'Hello Azure' : 00000000000000000000000000000000 0000000000000000 [scopeInfo: root:] {thread.id=1, thread.name="main"}
2024-10-19 20:55:03 WARN  [main] root - Hello Azure
2024-10-19T20:55:04.538Z INFO 'Log from a trace' : 8cae2db0a04ff7f4bffc047bf1d02c4c 9424ef32b83aeda9 [scopeInfo: root:] {applicationinsights.internal.operation_name="My-span", thread.id=1, thread.name="main"}
2024-10-19 20:55:04 INFO  [main] root - Log from a trace

The same applies to all other kinds of telemetry and other libraries.

As a result, in Databricks clusters I am blocked on telemetry on all fronts. The default autoinstrumentation generates huge amounts of telemetry from all imaginable Spark internals, which makes instrumentation impractical because of ingestion costs. With the Azure Exporter for Java, nothing works at all.

System information

Please provide the following information:

  • SDK Version: applicationinsights-agent-3.6.1.jar, OpenTelemetry SDK 1.43.0 (BOM v. 2.9.0)
  • OS type and version: any OS (observed on macOS, Linux container, Linux-like VM)
  • Application Server type and version (if applicable): N/A
  • Using spring-boot? No.
  • Additional relevant libraries (with version, if applicable): N/A

Logs

Maintainers, please note that the link provided by the template (Turn on SDK logs – https://docs.microsoft.com/en-us/azure/application-insights/app-insights-java-troubleshoot#debug-data-from-the-sdk) is no longer valid. The alternative document I found says that logging to a text file is enabled by default; I am attaching that file's contents below. If more logging is required, please provide guidance.

2024-10-19 20:31:14.701Z INFO  c.m.a.a.i.c.ConfigurationBuilder - Some telemetry may be sampled out because a default sampling configuration was added in version 3.4.0 to reduce the default billing cost. You can set the sampling configuration explicitly: https://learn.microsoft.com/azure/azure-monitor/app/java-standalone-config#sampling
2024-10-19 20:31:16.744Z INFO  c.m.applicationinsights.agent - Application Insights Java Agent 3.6.1 started successfully (PID 160, JVM running for 2.861 s)
2024-10-19 20:31:16.744Z INFO  c.m.applicationinsights.agent - Java version: 21.0.4, vendor: Eclipse Adoptium, home: /opt/java/openjdk
@jeanbisutti
Member

The OpenTelemetry way to suppress instrumentation is generally not supported today by the Application Insights Java agent, apart from a few cases: otel.javaagent.enabled, otel.instrumentation.logback-appender.enabled, otel.instrumentation.reactor.enabled, otel.instrumentation.reactor-netty.enabled.

With the Application Insights Java agent, you can suppress instrumentation in this way: https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-standalone-config#suppress-specific-autocollected-telemetry
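For illustration, the linked section describes per-library switches in applicationinsights.json. A minimal sketch, assuming the key names below match that document (only "logging" is confirmed in this thread; the others are shown as an assumed subset, so check the document for the full list):

```json
{
  "instrumentation": {
    "jdbc": { "enabled": false },
    "kafka": { "enabled": false },
    "micrometer": { "enabled": false },
    "logging": { "enabled": false }
  }
}
```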

Maintainers, please note that the link provided by the template

Thank you for having reported this.

You can enable Application Insights self-diagnostics in this way: https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-standalone-config#self-diagnostics
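A minimal sketch of the corresponding applicationinsights.json block, assuming the selfDiagnostics keys as described in the linked document; the DEBUG level and the file path are illustrative values, not recommendations:

```json
{
  "selfDiagnostics": {
    "destination": "file+console",
    "level": "DEBUG",
    "file": {
      "path": "applicationinsights.log",
      "maxSizeMb": 5,
      "maxHistory": 1
    }
  }
}
```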

@greatvovan
Author

greatvovan commented Oct 21, 2024

@jeanbisutti thank you for the useful link. I think I completely missed these configuration options. I have now tested them and, although the volume is substantially reduced, some unwanted telemetry is still coming:

  • Requests (from HTTP requests)
  • Dependencies (also from HTTP requests)

My current configuration file lists all instrumentations, as in the document, with false for each.
Is there a chance that something is still missed in the document? Or anything else that can be tweaked? Ideally, only our own telemetry should be coming, because we are not interested in Spark internals.

Also, is there any possibility (or any plans) to filter logs by logger name or in some other way? Currently I see two possibilities:

  • by logging library (Logback only)
  • by severity level

Neither is granular enough to pick only the logs of interest. I would like to disable a few particularly noisy loggers instead of applying these indiscriminate filters.

@jeanbisutti
Member

@greatvovan

Or anything else that can be tweaked?

You could also use the sampling overrides feature. An example to suppress HTTP requests: https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-standalone-sampling-overrides#example-suppress-collecting-telemetry-for-a-noisy-dependency-call
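A sketch along the lines of the linked example, assuming the sampling.overrides layout described in that document (older agent versions may place it elsewhere). The connection string and the db.statement value are placeholders. Setting the percentage to 0 drops every matching dependency:

```json
{
  "connectionString": "...",
  "sampling": {
    "overrides": [
      {
        "telemetryType": "dependency",
        "attributes": [
          {
            "key": "db.statement",
            "value": "select count(*) from abc",
            "matchType": "strict"
          }
        ],
        "percentage": 0
      }
    ]
  }
}
```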

Ideally, only our own telemetry should be coming, because we are not interested in Spark internals.

Could you please elaborate on what you mean by "own telemetry"?

About the logs: today you can turn off the Logback instrumentation by setting the otel.instrumentation.logback-appender.enabled property (or the OTEL_INSTRUMENTATION_LOGBACK_APPENDER_ENABLED environment variable) to false, or you can disable all the logging instrumentation (Logback, Log4j and java.util.logging) by updating the applicationinsights.json file in the following way:

{
  "instrumentation": {
    "logging": {
      "enabled": false
    }
  }
}

When you say "Logback only", does it mean that you want to only turn off the Logback instrumentation, or only keep the Logback logging instrumentation?

To filter by severity level, you could use this feature.

@greatvovan
Author

greatvovan commented Oct 22, 2024

Thank you @jeanbisutti,

what you mean by "own telemetry"?

Telemetry sent from our code through OpenTelemetry API. Not libraries and dependencies code.

When you say "Logback only"

When I said "Logback only" I just summarized what you said in the previous comment. For logging, you and the documentation mentioned:

  • otel.instrumentation.logback-appender.enabled – which is instrumentation of Logback?
  • severity level: "logging": {"level": "WARN"}
  • logging in entirety: "logging": {"enabled": false}

I was saying that these settings are indiscriminate: they enable or disable a very large category of logs without any ability to fine-tune. In particular, one can disable what is sent through Logback, but this is not very useful in my case.

Your sampling overrides suggestion, however, provides an interesting idea. It did not occur to me that sampling at 0% is basically a filter. The logger name is sent as an attribute, which, with a regexp match, could give me a way to shut off some loggers. I will give it a try.
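The idea could be sketched roughly as follows. This assumes the sampling.overrides layout from the linked documentation. The attribute key is a placeholder, because the thread does not state the exact key under which the logger name is exposed. The logger-name pattern is hypothetical as well:

```json
{
  "sampling": {
    "overrides": [
      {
        "telemetryType": "trace",
        "attributes": [
          {
            "key": "<logger-name-attribute>",
            "value": "org\\.apache\\.spark\\..*",
            "matchType": "regexp"
          }
        ],
        "percentage": 0
      }
    ]
  }
}
```

Here "trace" is the telemetry type for logs, and the 0 percentage turns the override into a filter for every log record whose logger name matches the pattern.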

@jeanbisutti
Member

@greatvovan

otel.instrumentation.logback-appender.enabled – which is instrumentation of Logback?

Yes

This comment may interest you.

@jeanbisutti
Member

@greatvovan

Telemetry sent from our code through OpenTelemetry API. Not libraries and dependencies code.

You could use https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/monitor/azure-monitor-opentelemetry-exporter

@greatvovan
Author

You could use https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/monitor/azure-monitor-opentelemetry-exporter

@jeanbisutti like I mentioned in the initial post, unfortunately, the Exporter does not work in Databricks due to a bug or library incompatibility.

@greatvovan
Author

greatvovan commented Oct 23, 2024

So I am going the way of sampling overrides and mostly it works great. Just wondering, are there other ways to match the overrides besides attributes? It would be great to have more fields to analyze with strict and regex comparison, e.g. dependency's type, name, result code, request's name, URL, response code, trace's message, etc.

@jeanbisutti
Member

The sampling override can be configured with a telemetry type (request, dependency, trace <=> log) and an attribute. Numerous attributes are available (see https://opentelemetry.io/docs/specs/semconv/http/http-spans/ for HTTP requests). The sampling override has some limitations: https://learn.microsoft.com/en-us/azure/azure-monitor/app/java-standalone-sampling-overrides#span-attributes-available-for-sampling
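As a rough sketch of combining a telemetry type with attributes: the override below targets requests and lists two attributes, both of which have to match for it to apply. The attribute keys assume the stable HTTP semantic-convention names from the linked specification (older agent versions may expose different keys), and the path pattern is hypothetical:

```json
{
  "sampling": {
    "overrides": [
      {
        "telemetryType": "request",
        "attributes": [
          {
            "key": "http.request.method",
            "value": "GET",
            "matchType": "strict"
          },
          {
            "key": "url.path",
            "value": "/internal/.*",
            "matchType": "regexp"
          }
        ],
        "percentage": 0
      }
    ]
  }
}
```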

@jeanbisutti
Member

If something you need does not seem possible, tell us.

@greatvovan
Author

Thank you for the guidance, @jeanbisutti. I thought only custom attributes set by the user were available, but now I can see that it is much more flexible. I think it is enough to cover my use case.

Now, regarding this bug report: I originally opened it because the agent does not obey the OpenTelemetry way of controlling instrumentation of libraries, but now I see that this is not a bug, since there was no intention to implement this logic. Still, do you see it as a functional gap? Maybe you have it on your roadmap? I was thinking about converting this into a feature request.

@jeanbisutti
Member

Still, do you see it as a functional gap? Maybe you have it on your roadmap? I was thinking about converting this into a feature request.

@greatvovan Yes, please convert this into a feature request. It may confuse other users.

@greatvovan
Author

Closing this as not a bug as per comments 1, 2.

Transitioned into a feature request.

@greatvovan closed this as not planned on Oct 28, 2024