[BUG] No telemetry exported from a Scala notebook when using AzureMonitorExporterBuilder #41856
Comments
@lmolkova could you please take a look?
Thank you for reporting this @greatvovan! I wonder if the problem could be around the Scala notebook having different shutdown hooks and exiting before anything is exported. Could you please try calling Otherwise, @jeanbisutti @heyams or @trask should be able to investigate further.
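The exact call the comment suggests was lost in the thread, but a minimal sketch of what an explicit flush could look like, assuming the standard OpenTelemetry Java SDK API (`forceFlush`/`shutdown` returning a `CompletableResultCode` with a `join` method), would be:

```scala
import java.util.concurrent.TimeUnit
import io.opentelemetry.sdk.OpenTelemetrySdk

// Assuming `openTelemetry` is the OpenTelemetrySdk built earlier in the notebook.
def flushAndShutdown(openTelemetry: OpenTelemetrySdk): Unit = {
  // Block until pending telemetry is exported (or 10 s elapse) instead of
  // relying on a JVM shutdown hook that the notebook runtime may skip.
  openTelemetry.getSdkTracerProvider.forceFlush().join(10, TimeUnit.SECONDS)
  openTelemetry.getSdkMeterProvider.forceFlush().join(10, TimeUnit.SECONDS)
  openTelemetry.shutdown().join(10, TimeUnit.SECONDS)
}
```

Calling this at the end of the notebook cell would rule out the theory that the process exits before the exporter's background threads get a chance to run.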
@lmolkova I tried
the following code snippet works in Scala under Linux, but does NOT work in a Databricks notebook/job with a compute cluster 15.4 LTS ML (includes Apache Spark 3.5.0, Scala 2.12). It seems the issue exists in the Databricks runtime.

```scala
import io.opentelemetry.api.OpenTelemetry
import io.opentelemetry.api.common.Attributes
import io.opentelemetry.api.common.AttributeKey
import io.opentelemetry.sdk.autoconfigure.AutoConfiguredOpenTelemetrySdk
import com.azure.monitor.opentelemetry.exporter.AzureMonitorExporter

object Main {
  def test(aiConnStr: String): String = {
    val sdkBuilder = AutoConfiguredOpenTelemetrySdk.builder()
    AzureMonitorExporter.customize(sdkBuilder, aiConnStr)
    val openTelemetry = sdkBuilder.build().getOpenTelemetrySdk()
    println("Auto-configured " + openTelemetry.toString())

    val attributes = Attributes.of[java.lang.String, java.lang.Long](
      AttributeKey.stringKey("foo"), "bar",
      AttributeKey.longKey("code"), 42L
    )

    val tracer = openTelemetry.getTracer("my-notebook")
    val span = tracer.spanBuilder("My-span-ScalaEA").startSpan()
    val scope = span.makeCurrent()
    try {
      span.addEvent("XScalaEA span event", attributes)
      Thread.sleep(1000L)
    }
    finally {
      scope.close()
      span.end()
    }
    println("trace sent")

    val meter = openTelemetry.getMeter("my-notebook")
    val gauge = meter.gaugeBuilder("my-gauge-ScalaEA").build()
    gauge.set(111, attributes)
    println("metric sent")

    openTelemetry.shutdown()
    Thread.sleep(5000L)
    "done"
  }

  def main(args: Array[String]): Unit = {
    test("...")
  }
}
```

```scala
libraryDependencies += "io.opentelemetry" % "opentelemetry-sdk" % "1.42.1",
libraryDependencies += "io.opentelemetry" % "opentelemetry-api" % "1.42.1",
libraryDependencies += "com.azure" % "azure-monitor-opentelemetry-exporter" % "1.0.0-beta.29"
```
Yes, that's exactly what I am saying in the initial post.
hi @greatvovan! I was able to reproduce your findings, and it took me a while, but I finally found where the logs go in Azure Databricks 😅 and saw this:

So it looks like Databricks is bringing their own (older) version of Netty, which is conflicting with the version used by the Azure SDK libraries. The good news is that the Azure SDK libraries support alternate HTTP client implementations, so I think you should be able to exclude
Hi @trask! Thank you for the investigation; at least now we understand the reason. In terms of fixing, how can we tell the library to stop using Netty and start using OkHttp? I installed

I don't quite understand how I can exclude a library in Databricks, and I am not sure it is possible, as it may be used internally. For better diagnostics, do you not think it would be better to check all necessary dependencies in the initialization stage, e.g. in Also, could you explain how to enable/find the logging you quoted, so that I could possibly try troubleshooting on my own?
so if this was a maven project, here's what I would do:
I didn't try it in Databricks, but it looks like they give you an option to add exclusions: so I'd try to install and then also install
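For sbt users, a hedged sketch of the same idea. The excluded module name `azure-core-http-netty` and the OkHttp client coordinates `azure-core-http-okhttp` are assumptions based on Azure SDK for Java packaging conventions, and the versions are illustrative, not verified against this thread:

```scala
// build.sbt sketch: pull in the exporter but keep its Netty-based HTTP client out,
// so it cannot conflict with the Netty version shipped by the Databricks runtime.
libraryDependencies += ("com.azure" % "azure-monitor-opentelemetry-exporter" % "1.0.0-beta.29")
  .exclude("com.azure", "azure-core-http-netty")

// Provide an alternate HTTP client implementation that does not depend on Netty.
// azure-core discovers it via the service-loader mechanism at runtime.
libraryDependencies += "com.azure" % "azure-core-http-okhttp" % "1.12.1"
```

In Databricks the equivalent would be the Maven-coordinate install dialog with its exclusions field, as the comment above suggests.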
yeah, open your cluster and go to the Driver logs tab, then look at the log4j logs
@trask you saved my day! Or maybe week... Indeed, there is an option to exclude, which worked as expected. Now traces and metrics are coming (which is the main thing for me), but logs are not. Not sure if it's still related to library conflicts; I assume the same client is responsible for exporting all kinds of telemetry, and now I don't see such complaints in the driver logs anyway. I activate logging this way:

```scala
import org.apache.logging.log4j.{LogManager, Logger, Level}
import org.apache.logging.log4j.core.LoggerContext
import org.apache.logging.log4j.core.appender.ConsoleAppender
import org.apache.logging.log4j.core.appender.ConsoleAppender.{Builder => ConsoleBuilder}
import org.apache.logging.log4j.core.layout.PatternLayout
import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory
import org.apache.logging.log4j.core.config.{AppenderRef, LoggerConfig}
import io.opentelemetry.instrumentation.log4j.appender.v2_17.OpenTelemetryAppender
import io.opentelemetry.instrumentation.log4j.appender.v2_17.OpenTelemetryAppender.{Builder => OtelBuilder}

val context = LogManager.getContext(false).asInstanceOf[LoggerContext]
val config = context.getConfiguration

val layout = PatternLayout.newBuilder().withPattern("%d{yyyy-MM-dd HH:mm:ss} %-5level %logger{36} - MARKER %msg%n").build()

val consoleAppender = ConsoleAppender.newBuilder().asInstanceOf[ConsoleBuilder[_]]
  .setName("MyConsoleAppender").asInstanceOf[ConsoleBuilder[_]]
  .setLayout(layout).asInstanceOf[ConsoleBuilder[_]]
  .setTarget(ConsoleAppender.Target.SYSTEM_OUT).asInstanceOf[ConsoleBuilder[_]]
  .build()
consoleAppender.start()

val otelAppender = OpenTelemetryAppender.builder().asInstanceOf[OtelBuilder[_]]
  .setName("MyOtelAppender").asInstanceOf[OtelBuilder[_]]
  .setCaptureMapMessageAttributes(true).asInstanceOf[OtelBuilder[_]]
  .setCaptureContextDataAttributes("*").asInstanceOf[OtelBuilder[_]]
  .build()
otelAppender.start()

val rootLoggerConfig = config.getRootLogger
rootLoggerConfig.addAppender(consoleAppender, null, null)
rootLoggerConfig.addAppender(otelAppender, null, null)
context.updateLoggers()
```

I connect a ConsoleAppender just to have an indication in stdout that my logs are handled.

```scala
val sdkBuilder = AutoConfiguredOpenTelemetrySdk.builder()
AzureMonitorExporter.customize(sdkBuilder, aiConnStr)
val openTelemetry = sdkBuilder.build().getOpenTelemetrySdk()
OpenTelemetryAppender.install(openTelemetry)
```

Cluster libraries:

Logging:

```scala
val logger = LogManager.getLogger("My-notebook")
logger.error("My test log")
```

I see the result in the cell output and the cluster's stdout (with the formatting I set), so I conclude that the logging system is set up fine. The Console and Otel appenders are added identically and should emit messages simultaneously, but not a single log record is coming to the cloud. How can we further troubleshoot the issue?
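One generic troubleshooting step for questions like this (an assumption on my part, not something suggested in the thread): the OpenTelemetry Java SDK reports export failures through `java.util.logging`, so raising its verbosity can surface errors that are otherwise swallowed. A sketch:

```scala
import java.util.logging.{ConsoleHandler, Level, Logger}

// Route FINE-level messages from the OpenTelemetry SDK's internal loggers
// (which use java.util.logging) to the console, so export failures and
// dropped-data warnings become visible in the driver logs.
val handler = new ConsoleHandler()
handler.setLevel(Level.FINE)

val otelLogger = Logger.getLogger("io.opentelemetry")
otelLogger.setLevel(Level.FINE)
otelLogger.addHandler(handler)
```

If the log exporter is failing silently, messages from the batch log record processor would be expected to show up here.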
I think I got it working. Just change It seems that Also, I found it helpful to add this to the top of the notebook:
Indeed, appender's To be more accurate, though, the difference seems to be not in static vs. dynamic use, but elsewhere. The following code, if put into a standalone Scala program,

```scala
val configBuilder = ConfigurationBuilderFactory.newConfigurationBuilder()
val configuration = configBuilder
  .add(
    configBuilder
      .newAppender("Console", "CONSOLE")
  )
  .add(
    configBuilder
      .newAppender("Otel", "OpenTelemetry")
      .addAttribute("captureMapMessageAttributes", true) // Capture extra MapMessage attributes
      .addAttribute("captureContextDataAttributes", "*") // Capture ThreadContext attributes
  )
  .add(
    configBuilder
      .newRootLogger(Level.INFO)
      .add(configBuilder.newAppenderRef("Console"))
      .add(configBuilder.newAppenderRef("Otel"))
  )
  .build(false)
Configurator.initialize(configuration)

val sdkBuilder = AutoConfiguredOpenTelemetrySdk.builder()
AzureMonitorExporter.customize(sdkBuilder, aiConnStr)
val openTelemetry = sdkBuilder.build().getOpenTelemetrySdk()
OpenTelemetryAppender.install(openTelemetry)
```

also configures logging at run time, but it is compatible with
I see different behavior, you can check out my Java repro:
No, I mean, if you use It is still dynamic, but Not sure to what extent it is useful, but I think it's enough to say that the problem lies deeper than the static vs. dynamic split.
btw, not sure if this might be interesting / relevant to follow: open-telemetry/opentelemetry-java-instrumentation#12468
@greatvovan just checking if it's ok to close this issue, or if you'd like to leave it open? thanks
My issue is resolved, but I am wondering if we can derive a moral from the story. Could the library handle the problem of a missing/incorrect dependency better? Following up on this comment, wouldn't it be better to fail more loudly than posting a quiet warning? Was it the intended design not to interfere with the application in case of problems, or did we get that as an undesired side effect? My thinking is that if the user added instrumentation, it becomes an integral part of the program and cannot be easily sacrificed.
Describe the bug
When running a Scala notebook in a Databricks cluster, instrumentation using `AzureMonitorExporterBuilder` does not produce any output in the App Insights backend.

Exception or Stack Trace
No errors.

To Reproduce
Install `com.azure:azure-monitor-opentelemetry-exporter:1.0.0-beta.30` in the cluster.

Code Snippet
Important: run the configuration code (till `sdkBuilder.build().getOpenTelemetrySdk()`) only once (see #41859 – fixed in version 1.0.0-beta.29). If you ran it and got `ConfigurationException`, restart the compute cluster (Run -> Restart compute resource).

The above code as a gist: https://gist.github.com/greatvovan/a2148bccf2c0e0c2305e8a35e0779dc3
Alternative (with manual configuration): https://gist.github.com/greatvovan/5e9e3d8ecaf4e210ade619c5b55455b3

Expected behavior
3 entries in the `traces` table (2 logs, 1 event), 1 entry in `customMetrics`, and 1 entry in the `dependencies` table (span) created in App Insights.

Screenshots
N/A

Additional context
Setup (please complete the following information):

Information Checklist
Kindly make sure that you have added all the following information above and checked off the required fields, otherwise we will treat the issue as an incomplete report.

EDIT: Updated the gists to look more like linear Scala programs.
EDIT: Updated the code for version 1.0.0-beta.29 (the gists remain for version 28).
EDIT: Bumped version to 1.0.0-beta.30.