[R&D Week][Profiler] Gleocadie/walltime profiler optimization execution hijack #6418

Draft: wants to merge 13 commits into master

gleocadie
Collaborator

Summary of changes

Reason for change

Implementation details

Test coverage

Other details


datadog-ddstaging bot commented Dec 10, 2024

Datadog Report

Branch report: gleocadie/walltime-profiler-optimization-execution-hijack
Commit report: a86a826
Test service: dd-trace-dotnet

✅ 0 Failed, 461163 Passed, 2789 Skipped, 17h 54m 21.64s Total Time

github-actions bot added the area:profiler label (Issues related to the continuous-profiler) on Dec 10, 2024
Member

andrewlock commented Dec 10, 2024

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the total time it takes to execute a program and are intended to capture one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test at a 5% significance level
  • Only results indicating a difference greater than 5% and 5 ms are considered.
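
The statistical half of this rule can be sketched as follows. This is a minimal illustration and not the report's actual implementation: the function name `welch_t` and the sample timings are hypothetical, and only the t statistic and the Welch–Satterthwaite degrees of freedom are computed. Turning them into a p-value for the 5% test needs a Student's t CDF, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
import statistics

def welch_t(base, diff):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    n1, n2 = len(base), len(diff)
    # Variance of each sample mean (sample variance divided by sample size).
    v1 = statistics.variance(base) / n1
    v2 = statistics.variance(diff) / n2
    t = (statistics.mean(diff) - statistics.mean(base)) / math.sqrt(v1 + v2)
    # Welch–Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical execution times in ms (not taken from this report).
master_ms = [100, 102, 98, 101, 99]
pr_ms = [110, 108, 112, 109, 111]
t, df = welch_t(master_ms, pr_ms)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A positive `t` means the second sample (the PR) is slower on average; the report additionally requires the difference to exceed both 5% and 5 ms before flagging it.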

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).
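
One way to read "p99 interval based on the mean and StdDev" is the central 99% range of a normal distribution with those parameters. The sketch below assumes that reading; the report does not state its exact formula, and the function name `p99_interval` and the 1 ms standard deviation are hypothetical.

```python
from statistics import NormalDist

def p99_interval(mean, std_dev):
    """Central 99% interval of a normal distribution N(mean, std_dev)."""
    dist = NormalDist(mu=mean, sigma=std_dev)
    return dist.inv_cdf(0.005), dist.inv_cdf(0.995)

# Hypothetical run: mean of 68 ms with a standard deviation of 1 ms.
low, high = p99_interval(68.0, 1.0)
print(f"{low:.1f} ms .. {high:.1f} ms")
```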

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6418) - mean (68ms)  : 65, 71
     .   : milestone, 68,
    master - mean (68ms)  : 65, 71
     .   : milestone, 68,

    section CallTarget+Inlining+NGEN
    This PR (6418) - mean (977ms)  : 957, 997
     .   : milestone, 977,
    master - mean (977ms)  : 953, 1000
     .   : milestone, 977,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6418) - mean (107ms)  : 105, 110
     .   : milestone, 107,
    master - mean (107ms)  : 105, 109
     .   : milestone, 107,

    section CallTarget+Inlining+NGEN
    This PR (6418) - mean (675ms)  : 661, 689
     .   : milestone, 675,
    master - mean (677ms)  : 660, 694
     .   : milestone, 677,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6418) - mean (91ms)  : 88, 93
     .   : milestone, 91,
    master - mean (91ms)  : 89, 93
     .   : milestone, 91,

    section CallTarget+Inlining+NGEN
    This PR (6418) - mean (633ms)  : 616, 649
     .   : milestone, 633,
    master - mean (633ms)  : 615, 650
     .   : milestone, 633,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6418) - mean (193ms)  : 188, 199
     .   : milestone, 193,
    master - mean (194ms)  : 190, 198
     .   : milestone, 194,

    section CallTarget+Inlining+NGEN
    This PR (6418) - mean (1,098ms)  : 1075, 1121
     .   : milestone, 1098,
    master - mean (1,098ms)  : 1071, 1125
     .   : milestone, 1098,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6418) - mean (277ms)  : 273, 282
     .   : milestone, 277,
    master - mean (277ms)  : 271, 282
     .   : milestone, 277,

    section CallTarget+Inlining+NGEN
    This PR (6418) - mean (868ms)  : 842, 895
     .   : milestone, 868,
    master - mean (873ms)  : 838, 908
     .   : milestone, 873,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6418) - mean (267ms)  : 261, 272
     .   : milestone, 267,
    master - mean (266ms)  : 262, 270
     .   : milestone, 266,

    section CallTarget+Inlining+NGEN
    This PR (6418) - mean (852ms)  : 818, 887
     .   : milestone, 852,
    master - mean (853ms)  : 820, 885
     .   : milestone, 853,


Member

andrewlock commented Dec 11, 2024

Benchmarks Report for tracer 🐌

Benchmarks for #6418 compared to master:

  • 2 benchmarks are slower, with a geometric mean diff/base ratio of 1.133
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test at a 5% significance level
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.
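
The practical-significance half of this rule can be sketched as a simple filter. This is an illustration under assumptions: the function name `is_slower` is hypothetical, "greater than 10%" is taken to mean a diff/base ratio above 1.10, and the Mann–Whitney U test itself is omitted.

```python
def is_slower(base_ns, diff_ns, min_ratio=1.10, min_delta_ns=0.3):
    """Flag a benchmark as slower only if it exceeds both thresholds."""
    return diff_ns / base_ns > min_ratio and diff_ns - base_ns > min_delta_ns

# Medians (ns) from the two benchmarks this report flags as slower.
print(is_slower(398.33, 452.66))  # SpanBenchmark.StartFinishSpan, net6.0
print(is_slower(643.29, 726.35))  # TraceAnnotationsBenchmark.RunOnMethodBegin, net6.0
# Means (ns) for DbCommandBenchmark.ExecuteNonQuery, net6.0: about 9% slower,
# which stays under the 10% bar and is reported as "Same speed".
print(is_slower(1230.0, 1340.0))
```

Requiring both a relative and an absolute threshold keeps very fast benchmarks from being flagged for sub-nanosecond noise.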

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartStopWithChild net6.0 7.97μs 42.2ns 235ns 0.0163 0.00816 0 5.6 KB
master StartStopWithChild netcoreapp3.1 10.3μs 55.2ns 307ns 0.0204 0.00511 0 5.8 KB
master StartStopWithChild net472 16.3μs 36.1ns 130ns 1.04 0.302 0.0897 6.21 KB
#6418 StartStopWithChild net6.0 7.94μs 36.3ns 195ns 0.0193 0.00772 0 5.6 KB
#6418 StartStopWithChild netcoreapp3.1 10.1μs 55.3ns 313ns 0.0193 0.00483 0 5.81 KB
#6418 StartStopWithChild net472 16.6μs 66.1ns 256ns 1.06 0.34 0.105 6.2 KB
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 494μs 144ns 538ns 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 643μs 249ns 932ns 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces net472 858μs 356ns 1.38μs 0.428 0 0 3.3 KB
#6418 WriteAndFlushEnrichedTraces net6.0 491μs 195ns 703ns 0 0 0 2.7 KB
#6418 WriteAndFlushEnrichedTraces netcoreapp3.1 667μs 462ns 1.79μs 0 0 0 2.7 KB
#6418 WriteAndFlushEnrichedTraces net472 852μs 711ns 2.76μs 0.425 0 0 3.3 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 152μs 868ns 8.46μs 0.148 0 0 14.47 KB
master SendRequest netcoreapp3.1 167μs 961ns 8.1μs 0.157 0 0 17.27 KB
master SendRequest net472 0.00118ns 0.000644ns 0.0025ns 0 0 0 0 b
#6418 SendRequest net6.0 151μs 871ns 7.49μs 0.14 0 0 14.47 KB
#6418 SendRequest netcoreapp3.1 169μs 921ns 7.07μs 0.177 0 0 17.27 KB
#6418 SendRequest net472 0.000601ns 0.000371ns 0.00144ns 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 565μs 2.71μs 11.2μs 0.568 0 0 41.59 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 661μs 3.57μs 20.2μs 0.353 0 0 41.72 KB
master WriteAndFlushEnrichedTraces net472 821μs 4.02μs 16.6μs 8.45 2.53 0.422 53.29 KB
#6418 WriteAndFlushEnrichedTraces net6.0 555μs 2.54μs 10.2μs 0.546 0 0 41.72 KB
#6418 WriteAndFlushEnrichedTraces netcoreapp3.1 666μs 3.71μs 23.5μs 0.326 0 0 41.72 KB
#6418 WriteAndFlushEnrichedTraces net472 846μs 4.15μs 17.6μs 8.17 2.45 0.408 53.3 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 1.23μs 0.62ns 2.32ns 0.0142 0 0 1.02 KB
master ExecuteNonQuery netcoreapp3.1 1.77μs 0.915ns 3.54ns 0.0138 0 0 1.02 KB
master ExecuteNonQuery net472 2.01μs 2.19ns 8.19ns 0.156 0.001 0 987 B
#6418 ExecuteNonQuery net6.0 1.34μs 1.43ns 4.95ns 0.0147 0 0 1.02 KB
#6418 ExecuteNonQuery netcoreapp3.1 1.78μs 1.24ns 4.79ns 0.0134 0 0 1.02 KB
#6418 ExecuteNonQuery net472 2.04μs 3.68ns 14.2ns 0.157 0.00102 0 987 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.26μs 1.08ns 4.03ns 0.0133 0 0 976 B
master CallElasticsearch netcoreapp3.1 1.63μs 0.751ns 2.91ns 0.0134 0 0 976 B
master CallElasticsearch net472 2.49μs 3.3ns 12.8ns 0.158 0 0 995 B
master CallElasticsearchAsync net6.0 1.33μs 1.22ns 4.74ns 0.0131 0 0 952 B
master CallElasticsearchAsync netcoreapp3.1 1.75μs 0.987ns 3.69ns 0.014 0 0 1.02 KB
master CallElasticsearchAsync net472 2.66μs 2.13ns 8.25ns 0.167 0 0 1.05 KB
#6418 CallElasticsearch net6.0 1.14μs 0.518ns 1.94ns 0.0137 0 0 976 B
#6418 CallElasticsearch netcoreapp3.1 1.64μs 2.2ns 8.24ns 0.0131 0 0 976 B
#6418 CallElasticsearch net472 2.42μs 2.45ns 9.18ns 0.158 0 0 995 B
#6418 CallElasticsearchAsync net6.0 1.38μs 0.416ns 1.56ns 0.0131 0 0 952 B
#6418 CallElasticsearchAsync netcoreapp3.1 1.71μs 0.791ns 2.96ns 0.014 0 0 1.02 KB
#6418 CallElasticsearchAsync net472 2.54μs 1.69ns 6.33ns 0.166 0 0 1.05 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.27μs 0.466ns 1.8ns 0.0135 0 0 952 B
master ExecuteAsync netcoreapp3.1 1.7μs 0.474ns 1.77ns 0.0128 0 0 952 B
master ExecuteAsync net472 1.81μs 0.506ns 1.89ns 0.145 0 0 915 B
#6418 ExecuteAsync net6.0 1.28μs 0.552ns 2.07ns 0.0134 0 0 952 B
#6418 ExecuteAsync netcoreapp3.1 1.62μs 0.774ns 3ns 0.0124 0 0 952 B
#6418 ExecuteAsync net472 1.79μs 0.634ns 2.46ns 0.145 0 0 915 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 4.44μs 2.39ns 9.26ns 0.0311 0 0 2.31 KB
master SendAsync netcoreapp3.1 5.31μs 2.93ns 11.4ns 0.0372 0 0 2.85 KB
master SendAsync net472 7.35μs 2.39ns 9.27ns 0.496 0 0 3.12 KB
#6418 SendAsync net6.0 4.29μs 1.47ns 5.49ns 0.0321 0 0 2.31 KB
#6418 SendAsync netcoreapp3.1 5.21μs 2.51ns 9.04ns 0.0367 0 0 2.85 KB
#6418 SendAsync net472 7.32μs 1.47ns 5.49ns 0.494 0 0 3.12 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 1.52μs 1.04ns 3.9ns 0.0232 0 0 1.64 KB
master EnrichedLog netcoreapp3.1 2.24μs 0.978ns 3.66ns 0.0223 0 0 1.64 KB
master EnrichedLog net472 2.68μs 1.83ns 7.09ns 0.25 0 0 1.57 KB
#6418 EnrichedLog net6.0 1.48μs 0.754ns 2.82ns 0.023 0 0 1.64 KB
#6418 EnrichedLog netcoreapp3.1 2.22μs 1.13ns 4.37ns 0.0226 0 0 1.64 KB
#6418 EnrichedLog net472 2.63μs 0.697ns 2.41ns 0.249 0 0 1.57 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 119μs 291ns 1.13μs 0.0591 0 0 4.28 KB
master EnrichedLog netcoreapp3.1 124μs 117ns 439ns 0.063 0 0 4.28 KB
master EnrichedLog net472 153μs 216ns 838ns 0.693 0.231 0 4.46 KB
#6418 EnrichedLog net6.0 121μs 223ns 863ns 0.0606 0 0 4.28 KB
#6418 EnrichedLog netcoreapp3.1 125μs 279ns 1.08μs 0.0619 0 0 4.28 KB
#6418 EnrichedLog net472 154μs 184ns 714ns 0.692 0.231 0 4.46 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 3.14μs 0.916ns 3.43ns 0.0315 0 0 2.2 KB
master EnrichedLog netcoreapp3.1 4.19μs 1.97ns 7.65ns 0.0293 0 0 2.2 KB
master EnrichedLog net472 4.98μs 1.32ns 5.12ns 0.319 0 0 2.02 KB
#6418 EnrichedLog net6.0 3.06μs 0.854ns 3.31ns 0.0308 0 0 2.2 KB
#6418 EnrichedLog netcoreapp3.1 4.29μs 3.41ns 13.2ns 0.0301 0 0 2.2 KB
#6418 EnrichedLog net472 4.81μs 1.07ns 3.99ns 0.321 0 0 2.02 KB
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net6.0 1.41μs 0.586ns 2.27ns 0.0162 0 0 1.14 KB
master SendReceive netcoreapp3.1 1.8μs 1.25ns 4.83ns 0.0153 0 0 1.14 KB
master SendReceive net472 2.17μs 0.992ns 3.58ns 0.184 0 0 1.16 KB
#6418 SendReceive net6.0 1.36μs 0.629ns 2.27ns 0.0162 0 0 1.14 KB
#6418 SendReceive netcoreapp3.1 1.72μs 1.48ns 5.73ns 0.0157 0 0 1.14 KB
#6418 SendReceive net472 2.1μs 0.986ns 3.82ns 0.184 0 0 1.16 KB
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.66μs 0.712ns 2.66ns 0.0229 0 0 1.6 KB
master EnrichedLog netcoreapp3.1 3.79μs 3.19ns 12.3ns 0.0208 0 0 1.65 KB
master EnrichedLog net472 4.36μs 2.11ns 7.88ns 0.322 0 0 2.04 KB
#6418 EnrichedLog net6.0 2.83μs 1.11ns 4.3ns 0.0227 0 0 1.6 KB
#6418 EnrichedLog netcoreapp3.1 4μs 1.03ns 3.86ns 0.0221 0 0 1.65 KB
#6418 EnrichedLog net472 4.47μs 4.16ns 15.6ns 0.323 0 0 2.04 KB
Benchmarks.Trace.SpanBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #6418

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0 1.136 398.33 452.66

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net6.0 398ns 0.273ns 1.06ns 0.00802 0 0 576 B
master StartFinishSpan netcoreapp3.1 618ns 0.659ns 2.55ns 0.0077 0 0 576 B
master StartFinishSpan net472 674ns 2.01ns 7.78ns 0.0916 0 0 578 B
master StartFinishScope net6.0 492ns 0.46ns 1.78ns 0.00965 0 0 696 B
master StartFinishScope netcoreapp3.1 751ns 0.711ns 2.75ns 0.00947 0 0 696 B
master StartFinishScope net472 823ns 3.16ns 11.8ns 0.105 0 0 658 B
#6418 StartFinishSpan net6.0 452ns 0.283ns 1.1ns 0.00796 0 0 576 B
#6418 StartFinishSpan netcoreapp3.1 620ns 0.54ns 1.95ns 0.00772 0 0 576 B
#6418 StartFinishSpan net472 708ns 0.502ns 1.94ns 0.0915 0 0 578 B
#6418 StartFinishScope net6.0 494ns 0.447ns 1.73ns 0.00968 0 0 696 B
#6418 StartFinishScope netcoreapp3.1 701ns 0.452ns 1.69ns 0.00959 0 0 696 B
#6418 StartFinishScope net472 827ns 0.705ns 2.73ns 0.105 0 0 658 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #6418

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin‑net6.0 1.129 643.29 726.35

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net6.0 643ns 0.442ns 1.65ns 0.00981 0 0 696 B
master RunOnMethodBegin netcoreapp3.1 923ns 0.497ns 1.86ns 0.00924 0 0 696 B
master RunOnMethodBegin net472 1.14μs 0.524ns 2.03ns 0.104 0 0 658 B
#6418 RunOnMethodBegin net6.0 726ns 0.965ns 3.74ns 0.00979 0 0 696 B
#6418 RunOnMethodBegin netcoreapp3.1 908ns 0.504ns 1.89ns 0.0091 0 0 696 B
#6418 RunOnMethodBegin net472 1.1μs 0.456ns 1.77ns 0.105 0 0 658 B
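
The geometric mean of 1.133 quoted in the summary can be reproduced from the two median pairs listed under "Slower". A small sketch, using only the numbers in this report (the dictionary layout and variable names are illustrative):

```python
from statistics import geometric_mean

# (base median, diff median) in ns for the two benchmarks flagged as slower.
slower = {
    "SpanBenchmark.StartFinishSpan-net6.0": (398.33, 452.66),
    "TraceAnnotationsBenchmark.RunOnMethodBegin-net6.0": (643.29, 726.35),
}
ratios = [diff / base for base, diff in slower.values()]
summary = geometric_mean(ratios)
print([round(r, 3) for r in ratios])  # [1.136, 1.129]
print(round(summary, 3))              # 1.133
```

A geometric mean is the usual choice for averaging ratios because it treats a 2x slowdown and a 2x speedup as cancelling out.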

gleocadie force-pushed the gleocadie/walltime-profiler-optimization-execution-hijack branch from 8169d91 to d8b6435 on December 11, 2024 17:23
Member

andrewlock commented Dec 11, 2024

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where the PR's throughput is worse than latest master by 5% or more are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6418) (11.198M)   : 0, 11197739
    master (11.207M)   : 0, 11207401
    benchmarks/2.9.0 (11.033M)   : 0, 11032866

    section Automatic
    This PR (6418) (7.235M)   : 0, 7235059
    master (7.306M)   : 0, 7305881
    benchmarks/2.9.0 (7.786M)   : 0, 7785853

    section Trace stats
    master (7.547M)   : 0, 7547108

    section Manual
    master (10.807M)   : 0, 10807085

    section Manual + Automatic
    This PR (6418) (6.696M)   : 0, 6696010
    master (6.654M)   : 0, 6654300

    section DD_TRACE_ENABLED=0
    master (10.240M)   : 0, 10239959

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6418) (9.600M)   : 0, 9600043
    master (9.559M)   : 0, 9558927
    benchmarks/2.9.0 (9.495M)   : 0, 9494821

    section Automatic
    This PR (6418) (6.342M)   : 0, 6342363
    master (6.276M)   : 0, 6276496

    section Trace stats
    master (6.702M)   : 0, 6702308

    section Manual
    master (9.582M)   : 0, 9582063

    section Manual + Automatic
    This PR (6418) (5.843M)   : 0, 5842500
    master (5.938M)   : 0, 5937696

    section DD_TRACE_ENABLED=0
    master (9.073M)   : 0, 9072831

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6418) (9.695M)   : 0, 9695128
    master (10.036M)   : 0, 10035894
    benchmarks/2.9.0 (10.020M)   : 0, 10019592

    section Automatic
    This PR (6418) (6.172M)   : crit ,0, 6171576
    master (6.535M)   : 0, 6535216
    benchmarks/2.9.0 (7.255M)   : 0, 7255257

    section Trace stats
    master (7.175M)   : 0, 7175343

    section Manual
    master (9.942M)   : 0, 9942221

    section Manual + Automatic
    This PR (6418) (5.675M)   : crit ,0, 5675221
    master (6.147M)   : 0, 6147492

    section DD_TRACE_ENABLED=0
    master (9.295M)   : 0, 9295444

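
The 5% rule can be checked against the totals in the charts above. The sketch below is illustrative (the function name `is_regression` is hypothetical, and the drop is assumed to be measured relative to master); it flags the same two Windows x64 series that are marked `crit`.

```python
def is_regression(master_requests, pr_requests, threshold=0.05):
    """True when the PR handled at least `threshold` fewer requests than master."""
    drop = (master_requests - pr_requests) / master_requests
    return drop >= threshold

# Total requests as (master, this PR), copied from the charts above.
runs = {
    "Windows x64 / Baseline": (10035894, 9695128),
    "Windows x64 / Automatic": (6535216, 6171576),
    "Windows x64 / Manual + Automatic": (6147492, 5675221),
    "Linux x64 / Automatic": (7305881, 7235059),
}
flagged = [name for name, (master, pr) in runs.items() if is_regression(master, pr)]
print(flagged)
```

The Windows x64 baseline drops by about 3.4%, so it stays below the threshold even though it is the largest baseline difference of the three platforms.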

gleocadie force-pushed the gleocadie/walltime-profiler-optimization-execution-hijack branch 2 times, most recently from d2b53d8 to fc8f0a6, on December 17, 2024 10:56
gleocadie force-pushed the gleocadie/walltime-profiler-optimization-execution-hijack branch from fc8f0a6 to a86a826 on December 17, 2024 11:00