Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing of binary datadog headers in SQS #7324

Merged
merged 2 commits into from
Jul 18, 2024
Merged

Fix parsing of binary datadog headers in SQS #7324

merged 2 commits into from
Jul 18, 2024

Conversation

vandonr
Copy link
Contributor

@vandonr vandonr commented Jul 12, 2024

What Does This Do

The binary data we receive when the data type is marked as binary is not base64, it's the string itself, so we don't need to decode it.

Additional Notes

Jira ticket: AIDM-205

@vandonr vandonr requested review from a team as code owners July 12, 2024 16:29
@vandonr vandonr requested review from dougqh and ygree July 12, 2024 16:29
Copy link
Contributor

@dougqh dougqh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this change is fine.

But I'm wondering why did we assume that the headers were base64 encoded in the first place?

@vandonr
Copy link
Contributor Author

vandonr commented Jul 12, 2024

But I'm wondering why did we assume that the headers were base64 encoded in the first place?

I don't know, maybe because if you look at the message in json form, the binary value is printed as base64 ?

The PR introducing this was #5920 by @mcculls

@mcculls
Copy link
Contributor

mcculls commented Jul 12, 2024

But I'm wondering why did we assume that the headers were base64 encoded in the first place?

IIRC the code example originally shared with us did base64 decoding for the binary payload.

The odd thing is that I can still see base64 decoding used in various places when handling _datadog:

https://github.com/DataDog/datadog-lambda-python/blob/main/datadog_lambda/tracing.py#L263
https://github.com/DataDog/dd-trace-dotnet/blob/master/tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/AWS/SQS/AwsSqsHeadersAdapters.cs#L79
https://github.com/DataDog/datadog-lambda-js/blob/main/src/trace/context/extractors/sns-sqs.ts#L22
https://github.com/DataDog/dd-trace-js/blob/master/packages/datadog-plugin-aws-sdk/src/services/sqs.js#L161

Hopefully we don't have a situation where some tracers are sending the JSON string base64 encoded and others aren't - because then we may have to peek at the string to see if we need to base64 decode it.

Since the practical results in AIDM-205 show base64 decoding is not required we should remove it - but we should keep an eye open for any cases where the opposite is true

@PerfectSlayer
Copy link
Contributor

@vandonr @dougqh Can you set up the PR labels (like bug and the related instrumentation/component)? It will help @nayeem-kamal with the next release 😉

@vandonr vandonr added type: bug inst: aws sdk AWS SDK instrumentation labels Jul 17, 2024
@vandonr
Copy link
Contributor Author

vandonr commented Jul 18, 2024

IIRC the code example originally shared with us did base64 decoding for the binary payload.

The odd thing is that I can still see base64 decoding used in various places when handling _datadog:

https://github.com/DataDog/datadog-lambda-python/blob/main/datadog_lambda/tracing.py#L263 https://github.com/DataDog/dd-trace-dotnet/blob/master/tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/AWS/SQS/AwsSqsHeadersAdapters.cs#L79 https://github.com/DataDog/datadog-lambda-js/blob/main/src/trace/context/extractors/sns-sqs.ts#L22 https://github.com/DataDog/dd-trace-js/blob/master/packages/datadog-plugin-aws-sdk/src/services/sqs.js#L161

I'm not too sure about py and ts, but for dotnet, the base 64 decode is applied when the payload is a String.

Base64 is meant to carry bytes in a String format (in a less efficient way). If we already have a bytes, it really makes no sense to use base64.

Hopefully we don't have a situation where some tracers are sending the JSON string base64 encoded and others aren't - because then we may have to peek at the string to see if we need to base64 decode it.

yeah... thankfully, json strings start with a {, which is an invalid base64 character, so looking at the first character should be enough to discriminate.

Since the practical results in AIDM-205 show base64 decoding is not required we should remove it - but we should keep an eye open for any cases where the opposite is true

:nod:

@vandonr
Copy link
Contributor Author

vandonr commented Jul 18, 2024

Hopefully we don't have a situation where some tracers are sending the JSON string base64 encoded and others aren't - because then we may have to peek at the string to see if we need to base64 decode it.

yeah... thankfully, json strings start with a {, which is an invalid base64 character, so looking at the first character should be enough to discriminate.

Since it's pretty simple, and can help prevent potential uncomfortable escalations with clients, I added that mechanism to support both while we figure out if we actually need it.

@pr-commenter

This comment was marked as off-topic.

@vandonr vandonr merged commit 24b87f7 into master Jul 18, 2024
82 checks passed
@vandonr vandonr deleted the vandonr/fix2 branch July 18, 2024 12:03
@github-actions github-actions bot added this to the 1.38.0 milestone Jul 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inst: aws sdk AWS SDK instrumentation type: bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants