Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track payload size in Rabbit & SQS integrations #6067

Closed
wants to merge 3 commits into from
Closed

Conversation

vandonr
Copy link
Contributor

@vandonr vandonr commented Oct 19, 2023

follow up on #6045
adding the same capabilities for other messaging systems, as it was done in .net in DataDog/dd-trace-dotnet#4520

Base automatically changed from vandonr/payload to master October 31, 2023 10:17
@pr-commenter
Copy link

pr-commenter bot commented Nov 6, 2023

Benchmarks

Startup

Parameters

Baseline Candidate
commit 1.23.0-SNAPSHOT~f90d0e58a0 1.23.0-SNAPSHOT~970ce813a7
config baseline candidate
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 54 cases.

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.23.0-SNAPSHOT~970ce813a7, baseline=1.23.0-SNAPSHOT~f90d0e58a0

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.034 s) : 0, 1033861
Total [baseline] (9.299 s) : 0, 9298553
Agent [candidate] (1.052 s) : 0, 1052427
Total [candidate] (9.398 s) : 0, 9397559
section appsec
Agent [baseline] (1.121 s) : 0, 1121114
Total [baseline] (9.406 s) : 0, 9405954
Agent [candidate] (1.127 s) : 0, 1126786
Total [candidate] (9.434 s) : 0, 9434184
section iast
Agent [baseline] (1.159 s) : 0, 1158520
Total [baseline] (9.548 s) : 0, 9547870
Agent [candidate] (1.16 s) : 0, 1159598
Total [candidate] (9.594 s) : 0, 9594484
section profiling
Agent [baseline] (1.221 s) : 0, 1221322
Total [baseline] (9.552 s) : 0, 9552406
Agent [candidate] (1.222 s) : 0, 1221630
Total [candidate] (9.564 s) : 0, 9564078
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.034 s -
Agent appsec 1.121 s 87.252 ms (8.4%)
Agent iast 1.159 s 124.658 ms (12.1%)
Agent profiling 1.221 s 187.461 ms (18.1%)
Total tracing 9.299 s -
Total appsec 9.406 s 107.401 ms (1.2%)
Total iast 9.548 s 249.317 ms (2.7%)
Total profiling 9.552 s 253.853 ms (2.7%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.052 s -
Agent appsec 1.127 s 74.359 ms (7.1%)
Agent iast 1.16 s 107.172 ms (10.2%)
Agent profiling 1.222 s 169.204 ms (16.1%)
Total tracing 9.398 s -
Total appsec 9.434 s 36.625 ms (0.4%)
Total iast 9.594 s 196.925 ms (2.1%)
Total profiling 9.564 s 166.519 ms (1.8%)
gantt
    title petclinic - break down per module: candidate=1.23.0-SNAPSHOT~970ce813a7, baseline=1.23.0-SNAPSHOT~f90d0e58a0

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (645.342 ms) : 0, 645342
BytebuddyAgent [candidate] (656.279 ms) : 0, 656279
GlobalTracer [baseline] (293.423 ms) : 0, 293423
GlobalTracer [candidate] (299.074 ms) : 0, 299074
AppSec [baseline] (48.791 ms) : 0, 48791
AppSec [candidate] (49.925 ms) : 0, 49925
Remote Config [baseline] (692.134 µs) : 0, 692
Remote Config [candidate] (711.046 µs) : 0, 711
Telemetry [baseline] (11.235 ms) : 0, 11235
Telemetry [candidate] (11.483 ms) : 0, 11483
section appsec
BytebuddyAgent [baseline] (646.508 ms) : 0, 646508
BytebuddyAgent [candidate] (649.873 ms) : 0, 649873
GlobalTracer [baseline] (294.143 ms) : 0, 294143
GlobalTracer [candidate] (295.956 ms) : 0, 295956
AppSec [baseline] (138.474 ms) : 0, 138474
AppSec [candidate] (138.641 ms) : 0, 138641
Remote Config [baseline] (650.104 µs) : 0, 650
Remote Config [candidate] (650.004 µs) : 0, 650
Telemetry [baseline] (6.869 ms) : 0, 6869
Telemetry [candidate] (6.978 ms) : 0, 6978
section iast
BytebuddyAgent [baseline] (771.494 ms) : 0, 771494
BytebuddyAgent [candidate] (771.687 ms) : 0, 771687
GlobalTracer [baseline] (275.354 ms) : 0, 275354
GlobalTracer [candidate] (276.389 ms) : 0, 276389
AppSec [baseline] (46.915 ms) : 0, 46915
AppSec [candidate] (46.915 ms) : 0, 46915
IAST [baseline] (17.605 ms) : 0, 17605
IAST [candidate] (16.846 ms) : 0, 16846
Remote Config [baseline] (582.66 µs) : 0, 583
Remote Config [candidate] (564.731 µs) : 0, 565
Telemetry [baseline] (11.847 ms) : 0, 11847
Telemetry [candidate] (12.475 ms) : 0, 12475
section profiling
BytebuddyAgent [baseline] (657.515 ms) : 0, 657515
BytebuddyAgent [candidate] (658.225 ms) : 0, 658225
GlobalTracer [baseline] (359.346 ms) : 0, 359346
GlobalTracer [candidate] (358.981 ms) : 0, 358981
AppSec [baseline] (49.504 ms) : 0, 49504
AppSec [candidate] (49.37 ms) : 0, 49370
Remote Config [baseline] (644.199 µs) : 0, 644
Remote Config [candidate] (646.039 µs) : 0, 646
Telemetry [baseline] (11.299 ms) : 0, 11299
Telemetry [candidate] (11.293 ms) : 0, 11293
ProfilingAgent [baseline] (88.787 ms) : 0, 88787
ProfilingAgent [candidate] (88.826 ms) : 0, 88826
Profiling [baseline] (88.812 ms) : 0, 88812
Profiling [candidate] (88.85 ms) : 0, 88850
Loading
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.23.0-SNAPSHOT~970ce813a7, baseline=1.23.0-SNAPSHOT~f90d0e58a0

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.042 s) : 0, 1042054
Total [baseline] (8.787 s) : 0, 8786610
Agent [candidate] (1.037 s) : 0, 1036761
Total [candidate] (8.81 s) : 0, 8810497
section iast
Agent [baseline] (1.153 s) : 0, 1153066
Total [baseline] (9.316 s) : 0, 9316309
Agent [candidate] (1.153 s) : 0, 1153196
Total [candidate] (9.347 s) : 0, 9346602
section iast_TELEMETRY_OFF
Agent [baseline] (1.146 s) : 0, 1146129
Total [baseline] (9.329 s) : 0, 9328911
Agent [candidate] (1.154 s) : 0, 1154387
Total [candidate] (9.348 s) : 0, 9347558
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.042 s -
Agent iast 1.153 s 111.011 ms (10.7%)
Agent iast_TELEMETRY_OFF 1.146 s 104.074 ms (10.0%)
Total tracing 8.787 s -
Total iast 9.316 s 529.699 ms (6.0%)
Total iast_TELEMETRY_OFF 9.329 s 542.301 ms (6.2%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.037 s -
Agent iast 1.153 s 116.435 ms (11.2%)
Agent iast_TELEMETRY_OFF 1.154 s 117.626 ms (11.3%)
Total tracing 8.81 s -
Total iast 9.347 s 536.105 ms (6.1%)
Total iast_TELEMETRY_OFF 9.348 s 537.061 ms (6.1%)
gantt
    title insecure-bank - break down per module: candidate=1.23.0-SNAPSHOT~970ce813a7, baseline=1.23.0-SNAPSHOT~f90d0e58a0

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (650.143 ms) : 0, 650143
BytebuddyAgent [candidate] (646.525 ms) : 0, 646525
GlobalTracer [baseline] (295.752 ms) : 0, 295752
GlobalTracer [candidate] (294.828 ms) : 0, 294828
AppSec [baseline] (49.297 ms) : 0, 49297
AppSec [candidate] (48.821 ms) : 0, 48821
Remote Config [baseline] (714.296 µs) : 0, 714
Remote Config [candidate] (699.934 µs) : 0, 700
Telemetry [baseline] (11.484 ms) : 0, 11484
Telemetry [candidate] (11.343 ms) : 0, 11343
section iast
BytebuddyAgent [baseline] (766.117 ms) : 0, 766117
BytebuddyAgent [candidate] (766.744 ms) : 0, 766744
GlobalTracer [baseline] (274.255 ms) : 0, 274255
GlobalTracer [candidate] (274.477 ms) : 0, 274477
AppSec [baseline] (47.063 ms) : 0, 47063
AppSec [candidate] (46.652 ms) : 0, 46652
IAST [baseline] (16.779 ms) : 0, 16779
IAST [candidate] (17.646 ms) : 0, 17646
Remote Config [baseline] (573.465 µs) : 0, 573
Remote Config [candidate] (576.128 µs) : 0, 576
Telemetry [baseline] (13.882 ms) : 0, 13882
Telemetry [candidate] (12.612 ms) : 0, 12612
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (759.615 ms) : 0, 759615
BytebuddyAgent [candidate] (766.272 ms) : 0, 766272
GlobalTracer [baseline] (274.736 ms) : 0, 274736
GlobalTracer [candidate] (276.685 ms) : 0, 276685
AppSec [baseline] (46.789 ms) : 0, 46789
AppSec [candidate] (47.274 ms) : 0, 47274
IAST [baseline] (18.176 ms) : 0, 18176
IAST [candidate] (16.989 ms) : 0, 16989
Remote Config [baseline] (566.959 µs) : 0, 567
Remote Config [candidate] (567.903 µs) : 0, 568
Telemetry [baseline] (11.756 ms) : 0, 11756
Telemetry [candidate] (11.848 ms) : 0, 11848
Loading

Load

Parameters

Baseline Candidate
commit 1.23.0-SNAPSHOT~f90d0e58a0 1.23.0-SNAPSHOT~970ce813a7
config baseline candidate
end_time 2023-11-09T09:24:08 2023-11-09T09:40:39
start_time 2023-11-09T09:23:55 2023-11-09T09:40:27
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 22 cases.

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.23.0-SNAPSHOT~970ce813a7, baseline=1.23.0-SNAPSHOT~f90d0e58a0
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.359 ms) : 1340, 1378
.   : milestone, 1359,
appsec (1.739 ms) : 1715, 1764
.   : milestone, 1739,
iast (1.501 ms) : 1477, 1525
.   : milestone, 1501,
profiling (1.472 ms) : 1447, 1497
.   : milestone, 1472,
tracing (1.459 ms) : 1434, 1484
.   : milestone, 1459,
section candidate
no_agent (1.355 ms) : 1336, 1374
.   : milestone, 1355,
appsec (1.692 ms) : 1667, 1717
.   : milestone, 1692,
iast (1.487 ms) : 1463, 1511
.   : milestone, 1487,
profiling (1.484 ms) : 1457, 1511
.   : milestone, 1484,
tracing (1.445 ms) : 1419, 1470
.   : milestone, 1445,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.359 ms [1.34 ms, 1.378 ms] -
appsec 1.739 ms [1.715 ms, 1.764 ms] 380.447 µs (28.0%)
iast 1.501 ms [1.477 ms, 1.525 ms] 141.673 µs (10.4%)
profiling 1.472 ms [1.447 ms, 1.497 ms] 113.002 µs (8.3%)
tracing 1.459 ms [1.434 ms, 1.484 ms] 100.179 µs (7.4%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.355 ms [1.336 ms, 1.374 ms] -
appsec 1.692 ms [1.667 ms, 1.717 ms] 337.069 µs (24.9%)
iast 1.487 ms [1.463 ms, 1.511 ms] 131.896 µs (9.7%)
profiling 1.484 ms [1.457 ms, 1.511 ms] 129.332 µs (9.5%)
tracing 1.445 ms [1.419 ms, 1.47 ms] 89.624 µs (6.6%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.23.0-SNAPSHOT~970ce813a7, baseline=1.23.0-SNAPSHOT~f90d0e58a0
    dateFormat X
    axisFormat %s
section baseline
no_agent (361.04 µs) : 341, 381
.   : milestone, 361,
iast (469.494 µs) : 448, 490
.   : milestone, 469,
iast_FULL (521.265 µs) : 501, 542
.   : milestone, 521,
iast_INACTIVE (436.582 µs) : 415, 458
.   : milestone, 437,
iast_TELEMETRY_OFF (464.594 µs) : 444, 486
.   : milestone, 465,
tracing (440.281 µs) : 420, 461
.   : milestone, 440,
section candidate
no_agent (368.312 µs) : 348, 389
.   : milestone, 368,
iast (462.286 µs) : 442, 483
.   : milestone, 462,
iast_FULL (530.77 µs) : 510, 551
.   : milestone, 531,
iast_INACTIVE (439.083 µs) : 418, 460
.   : milestone, 439,
iast_TELEMETRY_OFF (460.22 µs) : 440, 481
.   : milestone, 460,
tracing (438.935 µs) : 418, 460
.   : milestone, 439,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 361.04 µs [340.938 µs, 381.142 µs] -
iast 469.494 µs [448.491 µs, 490.496 µs] 108.454 µs (30.0%)
iast_FULL 521.265 µs [500.715 µs, 541.815 µs] 160.225 µs (44.4%)
iast_INACTIVE 436.582 µs [415.293 µs, 457.87 µs] 75.542 µs (20.9%)
iast_TELEMETRY_OFF 464.594 µs [443.619 µs, 485.569 µs] 103.554 µs (28.7%)
tracing 440.281 µs [419.564 µs, 460.998 µs] 79.241 µs (21.9%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 368.312 µs [347.549 µs, 389.074 µs] -
iast 462.286 µs [441.549 µs, 483.022 µs] 93.974 µs (25.5%)
iast_FULL 530.77 µs [510.235 µs, 551.306 µs] 162.459 µs (44.1%)
iast_INACTIVE 439.083 µs [418.332 µs, 459.833 µs] 70.771 µs (19.2%)
iast_TELEMETRY_OFF 460.22 µs [439.623 µs, 480.818 µs] 91.909 µs (25.0%)
tracing 438.935 µs [417.895 µs, 459.974 µs] 70.623 µs (19.2%)

@vandonr vandonr marked this pull request as ready for review November 8, 2023 12:20
@vandonr vandonr requested a review from a team as a code owner November 8, 2023 12:20
@vandonr vandonr requested review from dougqh, ygree and kr-igor and removed request for a team, dougqh and ygree November 8, 2023 12:20

MessageAttributeValue val = attr.getValue();
if (null != val.stringValue())
size += val.stringValue().getBytes(StandardCharsets.UTF_8).length;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty costly in terms of computation and allocation.
It would be better to move this out of the message processing thread and into background processing.

Did you measure the overhead? Is this enabled by default? Or only when DSM is enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it should be enabled only when DSM is enabled, but I don't really know where this flag is applied in the java tracer. It might very well be that this is computed every time, but the metric is only sent if DSM is enabled ? I mostly followed what had been done in the .net tracer, and in my previous PR where I did that for kafka.

Given the fact that it's computed for a metric, I think there is no rush producing that value and we could definitely compute it in the background, however I have no idea how this would take place :p
If you can point me to somewhere in the code where this is done that'd be very helpful !

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Um, my reading of that code is that computeHeadersSizeBytes(headers) will be called even when DSM is off:

        AgentTracer.get()
            .getDataStreamsMonitoring()
            .setCheckpoint(span, sortedTags, 0, computePayloadSizeBytes(message));

Regardless of what getDataStreamsMonitoring() returns the JVM will always evaluate any arguments before making the call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at the Kafka change you mentioned and it also applies even when DSM is turned off :(

IMHO that code as it stands would negatively affect Kafka users

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we can rework on that change to have it applied only to DSM users. Which in itself still isn't awesome since we're giving the worst experience to the users who pay the most, but at least it reduces the impact.
I can make the payload computation a lambda, this would at least solve that problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a commit to that effect, tell me what you think.

@vandonr
Copy link
Contributor Author

vandonr commented Nov 9, 2023

ok, @amarziali explained to me that lambdas were not a good idea, I'll try to find something better.

@vandonr vandonr marked this pull request as draft November 9, 2023 10:26
@vandonr vandonr closed this Jul 24, 2024
@vandonr vandonr deleted the vandonr/wip branch July 24, 2024 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants