@travisdowns
Contributor

@travisdowns travisdowns commented Oct 2, 2025

This is a follow-up to an earlier series where we made the columns
more generic. Now we add the ability to choose exactly which columns
are output, and also show the "variance" (in the form of MAD / median)
for any numeric column.
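
As a rough illustration of the "MAD / median" figure mentioned above, here is a minimal sketch of how a relative MAD percentage can be computed from a set of samples. The function names (`median`, `mad`, `relative_mad_pct`) and the choice to report 0 when the median is 0 are assumptions made for this example, not the code in this series.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Median of a non-empty sample (takes a copy because nth_element reorders).
double median(std::vector<double> v) {
    assert(!v.empty());
    const auto mid = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    const double hi = v[mid];
    if (v.size() % 2 == 1) {
        return hi;
    }
    // Even count: average the two middle elements.
    const double lo = *std::max_element(v.begin(), v.begin() + mid);
    return (lo + hi) / 2.0;
}

// Median absolute deviation: median of |x - median(x)|.
double mad(const std::vector<double>& v) {
    const double m = median(v);
    std::vector<double> dev;
    dev.reserve(v.size());
    for (double x : v) {
        dev.push_back(std::fabs(x - m));
    }
    return median(dev);
}

// Relative MAD as a percentage of the median.
// Assumption for this sketch: report 0 when the median itself is 0.
double relative_mad_pct(const std::vector<double>& v) {
    const double m = median(v);
    return m == 0.0 ? 0.0 : 100.0 * mad(v) / std::fabs(m);
}
```

For the samples 10, 12, 11, 13, 10 the median is 11 and the MAD is 1, so the relative MAD is about 9.09%.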

Configurable MAD display, selectable columns, tighter numeric packing

  • Add --mad-columns option allowing 'all' or comma-separated list so any
    stats column (runtime, allocs, tasks, inst, cycles, etc.) can show
    relative MAD (±pct of median) instead of being fixed
  • Add --columns option to restrict which columns appear in text/markdown
    output (or 'all' to keep previous behavior)
  • Improve width fitting: adaptive formatting packs large values and
    scaled durations (ns/µs/ms/s) more tightly while preserving
    significant digits
  • Derive header widths from representative sample values for more
    consistent alignment and reduced horizontal space
  • Unified duration scaling path reused for both raw durations and
    aggregated statistics
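
The duration scaling described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: `format_duration_ns` is a hypothetical helper, and the fixed two decimals are chosen to match the sample output in this PR rather than taken from the actual adaptive formatting code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: pick ns/µs/ms/s so that the printed number stays
// below 1000 where possible, then print it with two decimals.
std::string format_duration_ns(double ns) {
    assert(ns >= 0.0);
    static const char* units[] = {"ns", "µs", "ms", "s"};
    int i = 0;
    while (ns >= 1000.0 && i < 3) {
        ns /= 1000.0;
        ++i;
    }
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.2f%s", ns, units[i]);
    return buf;
}
```

With this sketch, 126720 ns is rendered as `126.72µs` and 582970000 ns as `582.97ms`.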

This shouldn't affect json output at all.
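
Both `--columns` and `--mad-columns` accept either 'all' or a comma-separated list. A minimal sketch of how such a value could be parsed is shown below; `column_filter` and `parse_columns` are hypothetical names for this example and are not taken from the patch.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical filter: either every column, or an explicit set of names.
struct column_filter {
    bool all = false;
    std::set<std::string> names;

    bool contains(const std::string& column) const {
        assert(!column.empty());
        return all || names.count(column) > 0;
    }
};

// Parse "all" or a comma-separated list such as "runtime,inst".
// Empty items (e.g. from a trailing comma) are ignored.
column_filter parse_columns(const std::string& spec) {
    column_filter f;
    if (spec == "all") {
        f.all = true;
        return f;
    }
    std::istringstream in(spec);
    std::string item;
    while (std::getline(in, item, ',')) {
        if (!item.empty()) {
            f.names.insert(item);
        }
    }
    return f;
}
```

With `parse_columns("runtime,inst")`, `contains("runtime")` is true and `contains("cycles")` is false, which mirrors the `--mad-columns=runtime,inst` run shown below.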

Also in this series we add a few benchmarks which stress the textual output with large/small numbers
of iterations, allocs and runtimes. They are useful for checking that the table formatting works. The output from these
benchmarks is shown below (the new output is run with --mad-columns=runtime,inst):


text output

before

test                                  iterations      median         mad         min         max      allocs       tasks        inst      cycles
output_check.high_iters              396696730000     0.000ns     0.000ns     0.000ns     0.000ns       0.000       0.000         0.0         0.0
output_check.no_runtime                 71041741     0.141ns     0.000ns     0.141ns     0.142ns       0.000       0.000         4.0         0.6
output_check.low_runtime                  382523    26.094ns     0.005ns    26.049ns    26.099ns       0.000       0.000       307.0       103.5
output_check.high_runtime                     79   126.607us    97.873ns   126.509us   126.865us       0.000       0.000   3000012.1    502851.1
output_check.high_runtime_allocs               1   582.727ms   108.715us   582.619ms   587.716ms 100000000.000       0.000 9600000987.3 2325084710.7
output_check.highly_variable_runtime           1    20.089ms    14.525ms     1.404us    34.614ms 3130404.333       0.000 300519388.3  72540181.0

after

test                                      iters            runtime     allocs      tasks               inst     cycles
output_check.high_iters                   4e+11     0.00ns ± 0.05%      0.000      0.000       0.00 ± 0.00%        0.0
output_check.no_runtime                70937442     0.14ns ± 0.44%      0.000      0.000       4.00 ± 0.00%        0.6
output_check.low_runtime                 382914    26.07ns ± 0.03%      0.000      0.000     307.00 ± 0.00%      103.4
output_check.high_runtime                    79   126.72µs ± 0.18%      0.000      0.000  3000012.1 ± 0.00%   502621.3
output_check.high_runtime_allocs              1   582.97ms ± 0.06%  100000000      0.000   9.60e+09 ± 0.00%    2.3e+09
output_check.highly_variable_runtime          3    18.43ms ± 4.35%  3155514.0      0.000  302929658 ± 4.51%   73260844

md output

before

| test | iterations | median | mad | min | max | allocs | tasks | inst | cycles |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| output_check.high_iters | 396696730000 | 0.000ns | 0.000ns | 0.000ns | 0.000ns | 0.000 | 0.000 | 0.0 | 0.0 |
| output_check.no_runtime | 71041741 | 0.141ns | 0.000ns | 0.141ns | 0.142ns | 0.000 | 0.000 | 4.0 | 0.6 |
| output_check.low_runtime | 382523 | 26.094ns | 0.005ns | 26.049ns | 26.099ns | 0.000 | 0.000 | 307.0 | 103.5 |
| output_check.high_runtime | 79 | 126.607us | 97.873ns | 126.509us | 126.865us | 0.000 | 0.000 | 3000012.1 | 502851.1 |
| output_check.high_runtime_allocs | 1 | 582.727ms | 108.715us | 582.619ms | 587.716ms | 100000000.000 | 0.000 | 9600000987.3 | 2325084710.7 |
| output_check.highly_variable_runtime | 1 | 20.089ms | 14.525ms | 1.404us | 34.614ms | 3130404.333 | 0.000 | 300519388.3 | 72540181.0 |

after

| test | iters | runtime | allocs | tasks | inst | cycles |
| --- | --- | --- | --- | --- | --- | --- |
| output_check.high_iters | 4e+11 | 0.00ns ± 0.05% | 0.000 | 0.000 | 0.00 ± 0.00% | 0.0 |
| output_check.no_runtime | 70937442 | 0.14ns ± 0.44% | 0.000 | 0.000 | 4.00 ± 0.00% | 0.6 |
| output_check.low_runtime | 382914 | 26.07ns ± 0.03% | 0.000 | 0.000 | 307.00 ± 0.00% | 103.4 |
| output_check.high_runtime | 79 | 126.72µs ± 0.18% | 0.000 | 0.000 | 3000012.1 ± 0.00% | 502621.3 |
| output_check.high_runtime_allocs | 1 | 582.97ms ± 0.06% | 100000000 | 0.000 | 9.60e+09 ± 0.00% | 2.3e+09 |
| output_check.highly_variable_runtime | 3 | 18.43ms ± 4.35% | 3155514.0 | 0.000 | 302929658 ± 4.51% | 73260844 |

@travisdowns travisdowns changed the title from "Td perf tests output upstream" to "Update perf_test text output, make columns selectable" Oct 2, 2025
@travisdowns
Contributor Author

The CI failure looks spurious to me; I don't think this even changes code that is linked into the binary that failed.

@avikivity
Member

"2.3e+09" carries too little information. I'm not a fan of 352352984723945871259.23154123 precision-without-accuracy, but this is too little. Often changes in the 0.5% range are measurable, and this hides it.

@travisdowns
Contributor Author

> "2.3e+09" carries too little information. I'm not a fan of 352352984723945871259.23154123 precision-without-accuracy, but this is too little. Often changes in the 0.5% range are measurable, and this hides it.

Fair. In part this is to discourage you from writing tests which fail to normalize the result down to something reasonable which doesn't need this in the first place, where you'd get 6 sig figs. Right now the width is set to 7, which means that with scientific notation there is only room for 2 sig figs, since ".e+00" takes the other 5 characters. I can set this to 8 or 9, which gives 3 or 4 sig figs. 3 is usually enough to measure a 0.5% change, but not a 0.1% one. WDYT?

Other option is to use a suffix like K, M, G to represent magnitudes, but this has the disadvantage of jumping by 3 digits every time, which eats into the 4 chars you save by not using scientific notation.
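
To make the width arithmetic above concrete, here is a small sketch using plain `printf`-style scientific notation. `sci_sig_figs` and `format_sci` are hypothetical helpers, and the sketch assumes non-negative values with a two-digit exponent; it is not the formatting code from this series.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Significant figures that fit in `width` characters of d.dde+XX notation:
// the decimal point plus "e+XX" use up 5 characters.
int sci_sig_figs(int width) {
    assert(width >= 7);
    return width - 5;
}

// Format `value` in scientific notation with as many significant figures
// as fit in `width` (assumes value >= 0 and a two-digit exponent).
std::string format_sci(double value, int width) {
    const int digits_after_point = sci_sig_figs(width) - 1;
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.*e", digits_after_point, value);
    return buf;
}
```

At width 7, 2325084710.7 and a value 0.5% larger (2336710134.25) both print as `2.3e+09`; at width 9 they print as `2.325e+09` and `2.337e+09`, so the difference becomes visible.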

@avikivity
Member

> > "2.3e+09" carries too little information. I'm not a fan of 352352984723945871259.23154123 precision-without-accuracy, but this is too little. Often changes in the 0.5% range are measurable, and this hides it.

> Fair. In part this is to discourage you from writing tests which fail to normalize the result down to something reasonable which doesn't need this in the first place, where you'd get 6 sig figs. Right now the width is set to 7, which means that with scientific notation there is only room for 2 sig figs, since ".e+00" takes the other 5 characters. I can set this to 8 or 9, which gives 3 or 4 sig figs. 3 is usually enough to measure a 0.5% change, but not a 0.1% one. WDYT?

Let's set it to 9.

> Other option is to use a suffix like K, M, G to represent magnitudes, but this has the disadvantage of jumping by 3 digits every time, which eats into the 4 chars you save by not using scientific notation.

These benchmarks stress the textual output with large/small amounts
of iterations, allocs, runtime. Useful to check that the table
formatting works.

This is a follow-up to an earlier series where we made the columns
more generic. Now we add the ability to choose exactly which columns
are output, and also show the "variance" (in the form of MAD / median)
for any numeric column.

Configurable MAD display, selectable columns, tighter numeric packing

- Add --mad-columns option allowing 'all' or comma-separated list so any
  stats column (runtime, allocs, tasks, inst, cycles, etc.) can show
  relative MAD (±pct of median) instead of being fixed
- Add --columns option to restrict which columns appear in text/markdown
  output (or 'all' to keep previous behavior)
- Improve width fitting: adaptive formatting packs large values and
  scaled durations (ns/µs/ms/s) more tightly while preserving
  significant digits
- Derive header widths from representative sample values for more
  consistent alignment and reduced horizontal space
- Unified duration scaling path reused for both raw durations and
  aggregated statistics
- Internal printing path simplified (less branching, clearer per-column
  MAD decision) without changing JSON output schema
- Default behavior unchanged when new options are not specified

This shouldn't affect json output at all.
@travisdowns travisdowns force-pushed the td-perf-tests-output-upstream branch from 3db285d to db5eda0 Compare October 10, 2025 13:14
@travisdowns
Contributor Author

@avikivity wrote:

> Let's set it to 9.

Done.

@avikivity avikivity closed this in c7dae27 Oct 16, 2025
@avikivity avikivity merged commit c7dae27 into scylladb:master Oct 16, 2025
16 checks passed