-
Notifications
You must be signed in to change notification settings - Fork 292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create S3 instrumentation + add span pointers #8075
base: master
Are you sure you want to change the base?
Conversation
BenchmarksStartupParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 53 metrics, 10 unstable metrics. Startup time reports for insecure-bankgantt
title insecure-bank - global startup overhead: candidate=1.44.0-SNAPSHOT~a913e1f361, baseline=1.45.0-SNAPSHOT~fd1f40f934
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.1 s) : 0, 1099854
Total [baseline] (8.672 s) : 0, 8671813
Agent [candidate] (1.093 s) : 0, 1093372
Total [candidate] (8.646 s) : 0, 8646128
section iast
Agent [baseline] (1.22 s) : 0, 1219828
Total [baseline] (9.212 s) : 0, 9211726
Agent [candidate] (1.221 s) : 0, 1220545
Total [candidate] (9.241 s) : 0, 9240858
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.225 s) : 0, 1224899
Total [baseline] (9.161 s) : 0, 9160978
Agent [candidate] (1.22 s) : 0, 1219833
Total [candidate] (9.173 s) : 0, 9173399
section iast_TELEMETRY_OFF
Agent [baseline] (1.229 s) : 0, 1228957
Total [baseline] (9.202 s) : 0, 9201846
Agent [candidate] (1.228 s) : 0, 1228282
Total [candidate] (9.22 s) : 0, 9220137
gantt
title insecure-bank - break down per module: candidate=1.44.0-SNAPSHOT~a913e1f361, baseline=1.45.0-SNAPSHOT~fd1f40f934
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (700.679 ms) : 0, 700679
BytebuddyAgent [candidate] (696.829 ms) : 0, 696829
GlobalTracer [baseline] (318.635 ms) : 0, 318635
GlobalTracer [candidate] (317.945 ms) : 0, 317945
AppSec [baseline] (55.321 ms) : 0, 55321
AppSec [candidate] (54.243 ms) : 0, 54243
Remote Config [baseline] (688.395 µs) : 0, 688
Remote Config [candidate] (693.353 µs) : 0, 693
Telemetry [baseline] (10.714 ms) : 0, 10714
Telemetry [candidate] (9.873 ms) : 0, 9873
section iast
BytebuddyAgent [baseline] (813.179 ms) : 0, 813179
BytebuddyAgent [candidate] (813.675 ms) : 0, 813675
GlobalTracer [baseline] (305.719 ms) : 0, 305719
GlobalTracer [candidate] (305.691 ms) : 0, 305691
AppSec [baseline] (58.025 ms) : 0, 58025
AppSec [candidate] (57.382 ms) : 0, 57382
IAST [baseline] (21.054 ms) : 0, 21054
IAST [candidate] (21.88 ms) : 0, 21880
Remote Config [baseline] (631.61 µs) : 0, 632
Remote Config [candidate] (647.391 µs) : 0, 647
Telemetry [baseline] (7.482 ms) : 0, 7482
Telemetry [candidate] (7.492 ms) : 0, 7492
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (817.725 ms) : 0, 817725
BytebuddyAgent [candidate] (812.925 ms) : 0, 812925
GlobalTracer [baseline] (305.815 ms) : 0, 305815
GlobalTracer [candidate] (305.765 ms) : 0, 305765
AppSec [baseline] (57.422 ms) : 0, 57422
AppSec [candidate] (58.245 ms) : 0, 58245
IAST [baseline] (22.043 ms) : 0, 22043
IAST [candidate] (20.977 ms) : 0, 20977
Remote Config [baseline] (621.738 µs) : 0, 622
Remote Config [candidate] (637.694 µs) : 0, 638
Telemetry [baseline] (7.441 ms) : 0, 7441
Telemetry [candidate] (7.517 ms) : 0, 7517
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (820.162 ms) : 0, 820162
BytebuddyAgent [candidate] (817.953 ms) : 0, 817953
GlobalTracer [baseline] (307.08 ms) : 0, 307080
GlobalTracer [candidate] (308.597 ms) : 0, 308597
AppSec [baseline] (57.301 ms) : 0, 57301
AppSec [candidate] (57.51 ms) : 0, 57510
IAST [baseline] (22.509 ms) : 0, 22509
IAST [candidate] (21.955 ms) : 0, 21955
Remote Config [baseline] (628.962 µs) : 0, 629
Remote Config [candidate] (651.287 µs) : 0, 651
Telemetry [baseline] (7.365 ms) : 0, 7365
Telemetry [candidate] (7.643 ms) : 0, 7643
Startup time reports for petclinicgantt
title petclinic - global startup overhead: candidate=1.44.0-SNAPSHOT~a913e1f361, baseline=1.45.0-SNAPSHOT~fd1f40f934
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.093 s) : 0, 1092568
Total [baseline] (10.49 s) : 0, 10489692
Agent [candidate] (1.093 s) : 0, 1093229
Total [candidate] (10.414 s) : 0, 10414035
section appsec
Agent [baseline] (1.226 s) : 0, 1225582
Total [baseline] (10.677 s) : 0, 10677021
Agent [candidate] (1.236 s) : 0, 1236148
Total [candidate] (10.741 s) : 0, 10740908
section iast
Agent [baseline] (1.228 s) : 0, 1227672
Total [baseline] (10.944 s) : 0, 10943859
Agent [candidate] (1.22 s) : 0, 1220195
Total [candidate] (11.015 s) : 0, 11015301
section profiling
Agent [baseline] (1.321 s) : 0, 1320961
Total [baseline] (10.876 s) : 0, 10876001
Agent [candidate] (1.318 s) : 0, 1318456
Total [candidate] (10.781 s) : 0, 10781340
gantt
title petclinic - break down per module: candidate=1.44.0-SNAPSHOT~a913e1f361, baseline=1.45.0-SNAPSHOT~fd1f40f934
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (695.567 ms) : 0, 695567
BytebuddyAgent [candidate] (696.892 ms) : 0, 696892
GlobalTracer [baseline] (316.503 ms) : 0, 316503
GlobalTracer [candidate] (317.99 ms) : 0, 317990
AppSec [baseline] (54.635 ms) : 0, 54635
AppSec [candidate] (54.687 ms) : 0, 54687
Remote Config [baseline] (689.461 µs) : 0, 689
Remote Config [candidate] (689.753 µs) : 0, 690
Telemetry [baseline] (11.428 ms) : 0, 11428
Telemetry [candidate] (9.174 ms) : 0, 9174
section appsec
BytebuddyAgent [baseline] (712.268 ms) : 0, 712268
BytebuddyAgent [candidate] (719.523 ms) : 0, 719523
GlobalTracer [baseline] (313.773 ms) : 0, 313773
GlobalTracer [candidate] (316.651 ms) : 0, 316651
AppSec [baseline] (167.87 ms) : 0, 167870
AppSec [candidate] (167.758 ms) : 0, 167758
Remote Config [baseline] (648.365 µs) : 0, 648
Remote Config [candidate] (646.75 µs) : 0, 647
Telemetry [baseline] (7.792 ms) : 0, 7792
Telemetry [candidate] (7.797 ms) : 0, 7797
IAST [baseline] (19.045 ms) : 0, 19045
IAST [candidate] (19.849 ms) : 0, 19849
section iast
BytebuddyAgent [baseline] (818.949 ms) : 0, 818949
BytebuddyAgent [candidate] (813.585 ms) : 0, 813585
GlobalTracer [baseline] (307.5 ms) : 0, 307500
GlobalTracer [candidate] (305.526 ms) : 0, 305526
AppSec [baseline] (58.116 ms) : 0, 58116
AppSec [candidate] (58.063 ms) : 0, 58063
Remote Config [baseline] (627.962 µs) : 0, 628
Remote Config [candidate] (647.887 µs) : 0, 648
Telemetry [baseline] (7.521 ms) : 0, 7521
Telemetry [candidate] (7.587 ms) : 0, 7587
IAST [baseline] (21.112 ms) : 0, 21112
IAST [candidate] (21.023 ms) : 0, 21023
section profiling
ProfilingAgent [baseline] (94.205 ms) : 0, 94205
ProfilingAgent [candidate] (93.679 ms) : 0, 93679
BytebuddyAgent [baseline] (691.375 ms) : 0, 691375
BytebuddyAgent [candidate] (689.852 ms) : 0, 689852
GlobalTracer [baseline] (434.013 ms) : 0, 434013
GlobalTracer [candidate] (433.534 ms) : 0, 433534
AppSec [baseline] (53.784 ms) : 0, 53784
AppSec [candidate] (53.847 ms) : 0, 53847
Remote Config [baseline] (676.767 µs) : 0, 677
Remote Config [candidate] (662.39 µs) : 0, 662
Telemetry [baseline] (7.677 ms) : 0, 7677
Telemetry [candidate] (7.721 ms) : 0, 7721
Profiling [baseline] (94.23 ms) : 0, 94230
Profiling [candidate] (93.703 ms) : 0, 93703
LoadParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 16 unstable metrics. Request duration reports for petclinicgantt
title petclinic - request duration [CI 0.99] : candidate=1.44.0-SNAPSHOT~a913e1f361, baseline=1.45.0-SNAPSHOT~fd1f40f934
dateFormat X
axisFormat %s
section baseline
no_agent (1.355 ms) : 1334, 1375
. : milestone, 1355,
appsec (1.75 ms) : 1726, 1773
. : milestone, 1750,
appsec_no_iast (1.74 ms) : 1715, 1765
. : milestone, 1740,
iast (1.495 ms) : 1471, 1518
. : milestone, 1495,
profiling (1.511 ms) : 1487, 1535
. : milestone, 1511,
tracing (1.471 ms) : 1446, 1496
. : milestone, 1471,
section candidate
no_agent (1.351 ms) : 1332, 1370
. : milestone, 1351,
appsec (1.749 ms) : 1725, 1773
. : milestone, 1749,
appsec_no_iast (1.745 ms) : 1719, 1770
. : milestone, 1745,
iast (1.498 ms) : 1475, 1522
. : milestone, 1498,
profiling (1.505 ms) : 1482, 1528
. : milestone, 1505,
tracing (1.48 ms) : 1455, 1506
. : milestone, 1480,
Request duration reports for insecure-bankgantt
title insecure-bank - request duration [CI 0.99] : candidate=1.44.0-SNAPSHOT~a913e1f361, baseline=1.45.0-SNAPSHOT~fd1f40f934
dateFormat X
axisFormat %s
section baseline
no_agent (378.29 µs) : 359, 398
. : milestone, 378,
iast (492.666 µs) : 471, 514
. : milestone, 493,
iast_FULL (657.225 µs) : 636, 679
. : milestone, 657,
iast_GLOBAL (518.246 µs) : 496, 540
. : milestone, 518,
iast_HARDCODED_SECRET_DISABLED (489.15 µs) : 468, 511
. : milestone, 489,
iast_INACTIVE (449.065 µs) : 428, 470
. : milestone, 449,
iast_TELEMETRY_OFF (479.694 µs) : 458, 502
. : milestone, 480,
tracing (449.542 µs) : 429, 470
. : milestone, 450,
section candidate
no_agent (385.616 µs) : 366, 406
. : milestone, 386,
iast (491.057 µs) : 469, 513
. : milestone, 491,
iast_FULL (644.918 µs) : 624, 666
. : milestone, 645,
iast_GLOBAL (513.064 µs) : 491, 535
. : milestone, 513,
iast_HARDCODED_SECRET_DISABLED (500.075 µs) : 478, 522
. : milestone, 500,
iast_INACTIVE (448.328 µs) : 427, 469
. : milestone, 448,
iast_TELEMETRY_OFF (473.909 µs) : 453, 495
. : milestone, 474,
tracing (451.936 µs) : 431, 473
. : milestone, 452,
DacapoParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics. Execution time for tomcatgantt
title tomcat - execution time [CI 0.99] : candidate=1.44.0-SNAPSHOT~a913e1f361, baseline=1.45.0-SNAPSHOT~fd1f40f934
dateFormat X
axisFormat %s
section baseline
no_agent (1.478 ms) : 1466, 1490
. : milestone, 1478,
appsec (2.351 ms) : 2309, 2393
. : milestone, 2351,
iast (2.092 ms) : 2040, 2145
. : milestone, 2092,
iast_GLOBAL (2.142 ms) : 2089, 2195
. : milestone, 2142,
profiling (1.95 ms) : 1908, 1992
. : milestone, 1950,
tracing (1.93 ms) : 1890, 1971
. : milestone, 1930,
section candidate
no_agent (1.469 ms) : 1458, 1481
. : milestone, 1469,
appsec (2.354 ms) : 2312, 2395
. : milestone, 2354,
iast (2.102 ms) : 2049, 2155
. : milestone, 2102,
iast_GLOBAL (2.139 ms) : 2086, 2192
. : milestone, 2139,
profiling (1.966 ms) : 1923, 2009
. : milestone, 1966,
tracing (1.942 ms) : 1901, 1982
. : milestone, 1942,
Execution time for biojavagantt
title biojava - execution time [CI 0.99] : candidate=1.44.0-SNAPSHOT~a913e1f361, baseline=1.45.0-SNAPSHOT~fd1f40f934
dateFormat X
axisFormat %s
section baseline
no_agent (15.513 s) : 15513000, 15513000
. : milestone, 15513000,
appsec (15.167 s) : 15167000, 15167000
. : milestone, 15167000,
iast (18.777 s) : 18777000, 18777000
. : milestone, 18777000,
iast_GLOBAL (18.285 s) : 18285000, 18285000
. : milestone, 18285000,
profiling (14.883 s) : 14883000, 14883000
. : milestone, 14883000,
tracing (15.009 s) : 15009000, 15009000
. : milestone, 15009000,
section candidate
no_agent (15.535 s) : 15535000, 15535000
. : milestone, 15535000,
appsec (14.924 s) : 14924000, 14924000
. : milestone, 14924000,
iast (19.014 s) : 19014000, 19014000
. : milestone, 19014000,
iast_GLOBAL (17.903 s) : 17903000, 17903000
. : milestone, 17903000,
profiling (14.692 s) : 14692000, 14692000
. : milestone, 14692000,
tracing (15.121 s) : 15121000, 15121000
. : milestone, 15121000,
|
bf87219
to
49a2179
Compare
AgentTracer.NoopContext zeroContext = AgentTracer.NoopContext.INSTANCE; | ||
span.addLink(SpanLink.from(zeroContext, AgentSpanLink.DEFAULT_FLAGS, "", attributes)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we can't pass trace context, we use our attributes (specifically ptr.hash) to link the two traces, and we just use a trace context with values of zero for the trace/span id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mhlidd Another case where span link can have an invalid span context 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments as preliminary review.
You might want another pair of eyes from the IDM team 😉
pass { | ||
group = "software.amazon.awssdk" | ||
module = "s3" | ||
versions = "[2.10.36,3)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The instrumentation module name is aws-java-s3-2.0
while muzzle only allows 2.10+
. Is it expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, we could support older 2.x versions by using the deprecated .bucket()
and .key()
as a fallback if destinationBucket()
fail in CopyObjectRequest. But I figured this was unnecessary, as 2.10.36 is over 4 years old now. Do you think we need to support older versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4y old does not feel that old but I would rather let people from IDM define the expectation here.
Any feedback @amarziali ?
...va-s3-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/s3/S3ClientInstrumentation.java
Outdated
Show resolved
Hide resolved
...ion/aws-java-s3-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/s3/S3Interceptor.java
Outdated
Show resolved
Hide resolved
…s' into nicholas.hulston/s3-span-pointers
MessageDigest.getInstance("SHA-256") | ||
.digest(String.join("|", components).getBytes(StandardCharsets.UTF_8)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pretty sure you can avoid the join
operation by "resuming" the hash from the previous return value, something like
var current_hash
foreach string in components
current_hash = hash(string, current_hash)
see how it was done in https://github.com/DataDog/dd-trace-java/blob/a8b33d5c49af8938d816f0faca5e5534d254297f/internal-api/src/main/java/datadog/trace/util/FNV64Hash.java
|
||
// Get bucket, key, and eTag for hash calculation. | ||
// https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3Client.html | ||
if (request instanceof PutObjectRequest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Computation for each s3 operation can be expensive. Can this be put behind a feature flag?
} | ||
String[] components = new String[] {bucket, key, eTag}; | ||
try { | ||
addSpanPointer(span, S3_PTR_KIND, components); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again on performance here, this computation should not be done on the application Thread. Instead, if we add the etag, bucket and key as a tag (they can be removed afterwards if not needed), we can lazily compute those values on a TagPostProcessor
that will run on the agent thread hence not affecting directly the performances of the user code itself
What Does This Do
Creates an instrumentation for S3 and adds span pointers in S3 for
putObject
,copyObject
, andcompleteMultipartUpload
requests.Span pointers are similar to Span Links, but for cases when it is impossible to pass the Trace ID and Span ID between the spans that need to be linked.
When the calculated hashes for the upstream and downstream lambdas match, the Datadog backend will automatically link the two traces together.
When clicking on the linked span, a new tab opens linking to the downstream Lambda function that was triggered by this S3 object update.
Motivation
Span pointers is a new feature being developed by the Serverless team. This feature already exists in Python & Node, and I'm working on adding it to other runtimes (.NET, Java, Golang).
Testing Guidelines
Easy: Checkout this span, which is an upstream Lambda that makes all three S3 requests and triggers 3 separate downstream Lambda invocations. Therefore, the 3 span pointers should link to 3 different traces. Each downstream trace should point back to this span.
More thorough: Run this Lambda function (only on the serverless AWS account) with the event payload
and change one of the bools to true. Check Datadog to ensure that the spans are properly linked.
I also added tests:
Additional Notes
Contributor Checklist
type:
and (comp:
orinst:
) labels in addition to any usefull labelsclose
,fix
or any linking keywords when referencing an issue.Use
solves
instead, and assign the PR milestone to the issueJira ticket: SVLS-6060