-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
APM UI doesn't show connected traces sent from AWS Distro for OTel (Lambda) #458
Comments
@michaelhyatt would you be able to run the opentelemetry-collector with the logging exporter, and capture the logged spans? Then we can compare to the docs we're recording. One thing I can see is that the consumer transaction's |
@axw I have the debug logs of the embedded OTel collector that recorded the messages it sends. 2 logs are attached. Here is the example of the debug of the span that the consumer reports:
|
@michaelhyatt Can you try to disable X-Ray tracing in the API Gateway (https://docs.aws.amazon.com/xray/latest/devguide/xray-services-apigateway.html) and see whether it solves the observed problem? |
@AlexanderWert I disabled the API gateway tracing and now it doesn't appear on X-Ray service map or waterfall diagram: Kibana APM UI still doesn't show the trace. These are the documents recorded in Elasticsearch.Consumer transaction{
"_index": "apm-7.13.2-transaction-000001",
"_type": "_doc",
"_id": "D-ToRXoB6AbCXZTTT6Dt",
"_version": 1,
"_score": null,
"fields": {
"transaction.name.text": ["handler.consumer"],
"service.framework.version": ["0.16.dev0"],
"observer.name": ["instance-0000000009"],
"labels.faas_name": ["aws-python-api-worker-project-dev-consumer"],
"service.language.name": ["python"],
"labels.faas_id": ["arn:aws:lambda:ap-southeast-2:401722391821:function:aws-python-api-worker-project-dev-consumer"],
"transaction.sampled": [true],
"transaction.id": ["8d38b2d9bee1c20f"],
"trace.id": ["60d681143464a3ff1c61880768907b0a"],
"processor.event": ["transaction"],
"agent.name": ["opentelemetry/python"],
"labels.faas_execution": ["4bc4f6f3-71e3-4380-91d7-5277a79be05e"],
"event.outcome": ["unknown"],
"cloud.region": ["ap-southeast-2"],
"service.name": ["aws-python-api-worker-project-dev-consumer"],
"service.framework.name": ["opentelemetry.instrumentation.aws_lambda"],
"processor.name": ["transaction"],
"transaction.duration.us": [605],
"labels.faas_version": ["$LATEST"],
"observer.version_major": [7],
"observer.hostname": ["1e0706f92f11"],
"transaction.type": ["custom"],
"cloud.provider": ["aws"],
"event.ingested": ["2021-06-26T01:21:30.604Z"],
"timestamp.us": [1624670489577889],
"@timestamp": ["2021-06-26T01:21:29.577Z"],
"ecs.version": ["1.8.0"],
"observer.type": ["apm-server"],
"observer.version": ["7.13.2"],
"agent.version": ["1.2.0"],
"parent.id": ["e53f6b191e1d0281"],
"transaction.name": ["handler.consumer"]
},
"highlight": {
"trace.id": ["@kibana-highlighted-field@60d681143464a3ff1c61880768907b0a@/kibana-highlighted-field@"]
},
"sort": [1624670489577]
} Producer span{
"_index": "apm-7.13.2-span-000001",
"_type": "_doc",
"_id": "EeToRXoB6AbCXZTTT6Dt",
"_version": 1,
"_score": null,
"fields": {
"span.destination.service.type": ["external"],
"span.name": ["HTTP GET"],
"destination.port": [443],
"observer.name": ["instance-0000000009"],
"labels.faas_name": ["aws-python-api-worker-project-dev-producer"],
"trace.id": ["60d681143464a3ff1c61880768907b0a"],
"span.duration.us": [2386427],
"processor.event": ["span"],
"agent.name": ["opentelemetry/python"],
"destination.address": ["bydwx5ss24.execute-api.ap-southeast-2.amazonaws.com"],
"event.outcome": ["success"],
"cloud.region": ["ap-southeast-2"],
"service.name": ["aws-python-api-worker-project-dev-producer"],
"processor.name": ["transaction"],
"labels.faas_version": ["$LATEST"],
"span.id": ["e5edc4fb3045615a"],
"span.subtype": ["http"],
"observer.version_major": [7],
"observer.hostname": ["1e0706f92f11"],
"span.type": ["external"],
"cloud.provider": ["aws"],
"timestamp.us": [1624670487255663],
"@timestamp": ["2021-06-26T01:21:27.255Z"],
"ecs.version": ["1.8.0"],
"observer.type": ["apm-server"],
"observer.version": ["7.13.2"],
"agent.version": ["1.2.0"],
"parent.id": ["a64ec14bc1536181"],
"span.destination.service.name": ["https://bydwx5ss24.execute-api.ap-southeast-2.amazonaws.com"],
"span.destination.service.resource": ["bydwx5ss24.execute-api.ap-southeast-2.amazonaws.com:443"]
},
"highlight": {
"trace.id": ["@kibana-highlighted-field@60d681143464a3ff1c61880768907b0a@/kibana-highlighted-field@"]
},
"sort": [1624670487255]
} Producer transaction{
"_index": "apm-7.13.2-transaction-000001",
"_type": "_doc",
"_id": "EOToRXoB6AbCXZTTT6Dt",
"_version": 1,
"_score": null,
"fields": {
"transaction.name.text": ["handler.producer"],
"service.framework.version": ["0.21b0"],
"observer.name": ["instance-0000000009"],
"labels.faas_name": ["aws-python-api-worker-project-dev-producer"],
"service.language.name": ["python"],
"labels.faas_id": ["arn:aws:lambda:ap-southeast-2:401722391821:function:aws-python-api-worker-project-dev-producer"],
"transaction.sampled": [true],
"transaction.id": ["a64ec14bc1536181"],
"trace.id": ["60d681143464a3ff1c61880768907b0a"],
"processor.event": ["transaction"],
"agent.name": ["opentelemetry/python"],
"labels.faas_execution": ["4580f63e-2761-408e-99f3-046cb774d7e3"],
"event.outcome": ["unknown"],
"cloud.region": ["ap-southeast-2"],
"service.name": ["aws-python-api-worker-project-dev-producer"],
"service.framework.name": ["opentelemetry.instrumentation.requests"],
"processor.name": ["transaction"],
"transaction.duration.us": [2389360],
"labels.faas_version": ["$LATEST"],
"observer.version_major": [7],
"observer.hostname": ["1e0706f92f11"],
"transaction.type": ["custom"],
"cloud.provider": ["aws"],
"event.ingested": ["2021-06-26T01:21:30.605Z"],
"timestamp.us": [1624670487254602],
"@timestamp": ["2021-06-26T01:21:27.254Z"],
"ecs.version": ["1.8.0"],
"observer.type": ["apm-server"],
"observer.version": ["7.13.2"],
"agent.version": ["1.2.0"],
"parent.id": ["7ade6ef5e6220237"],
"transaction.name": ["handler.producer"]
},
"highlight": {
"trace.id": ["@kibana-highlighted-field@60d681143464a3ff1c61880768907b0a@/kibana-highlighted-field@"]
},
"sort": [1624670487254]
} |
@michaelhyatt I've transferred this to elastic/apm as (AFAICS) there's no bug with the translation of OpenTelemetry spans -- the server is translating them faithfully. It appears that API GW is injecting/mutating traceparent, but is not sending any OTel spans. When we receive an OTel span for Lambda, it has a parent span ID which points to nowhere. That missing API GW span would theoretically be the trace root, and we base certain features on the trace roots, such as the list of traces. If API GW spans were sent to APM Server, then I expect it would all just work. Seems like missing/broken functionality in API GW to me. |
@axw The latest test was done without API-GW tracing turned on. Are you saying the API-GW is still changing the trace header even though it is not part of X-Ray traces anymore? |
Yes, it looks like API GW is still adding a traceparent header before both Lambdas. The producer transaction has a field |
Could it be the Lambda extension that is adding the |
Possibly, I don't know. I should have said X-Ray header anyway, since there's no W3C traceparent headers involved here. The bottom line is that we're being told the producer span has a parent which does not exist, and so none of the spans is considered to be the trace root. |
I come to the same conclusion
The |
APM Server version (
apm-server version
): 7.13.1Description of the problem including expected versus actual behavior:
I managed to get the distributed tracing going with AWS X-Ray and the OTel collector to pick it up. The flow is
curl => API-GW => producer-lambda => API-GW => consumer-lambda
. The code (serverless): https://github.com/michaelhyatt/serverless-aws-python-api-worker-otelThe interesting thing is that we are recording the 2 transactions with 1 span correctly as having the same
trace.id
, however, the APM UI doesn’t really show a proper trace.AWS X-Ray showing the trace with 2 components:
Discover shows 3 records, 2 transactions and 1 span with the same
trace.id
:The APM UI is showing 2 services, but no traces:
Steps to reproduce:
As per README, requires the serverless framework.
JSON documents
Consumer transaction
Producer HTTP span
Producer transaction
The text was updated successfully, but these errors were encountered: