What
The test-all-packages workflow is about 20% slower under Node 22 than under Node 20.
From recent runs of the Test all Packages workflow (.github/workflows/test-all-packages.yml), Codex pulled 44 successful runs (out of the latest 100 runs), extracted all paired node-old/node-new job timings, and fit a mixed-effects model with job slot as a random effect.
Dataset
- 44 runs
- 9 job slots per run (test-boot and test-swingset shards)
- 792 job rows total
- 396 paired comparisons
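The pairing behind these counts can be sketched as follows. This is a minimal illustration, not the script Codex ran: the field names (run_id, slot, node, dur_s) and the sample durations are hypothetical, and real slot names would presumably carry shard identifiers.

```python
from collections import defaultdict

def pair_jobs(rows):
    """Group job rows into (node-old, node-new) duration pairs keyed by (run, slot)."""
    by_key = defaultdict(dict)
    for row in rows:
        by_key[(row["run_id"], row["slot"])][row["node"]] = row["dur_s"]
    # Keep only cells where both Node variants have a timing.
    return {
        key: (cell["node-old"], cell["node-new"])
        for key, cell in by_key.items()
        if "node-old" in cell and "node-new" in cell
    }

# Hypothetical job rows (durations are made up); the last one has no partner.
rows = [
    {"run_id": r, "slot": s, "node": n, "dur_s": d}
    for r, s, n, d in [
        (1, "test-boot", "node-old", 500.0),
        (1, "test-boot", "node-new", 610.0),
        (1, "test-swingset", "node-old", 700.0),
        (1, "test-swingset", "node-new", 830.0),
        (2, "test-boot", "node-old", 510.0),
        (2, "test-boot", "node-new", 620.0),
        (2, "test-swingset", "node-old", 690.0),
    ]
]
pairs = pair_jobs(rows)

# Shape check against the counts reported above.
runs, slots, variants = 44, 9, 2
total_rows = runs * slots * variants   # 792 job rows
total_pairs = runs * slots             # 396 paired comparisons
```

Keying on (run, slot) is what makes the later comparisons "within run+slot": each Node 22 timing is only ever compared with the Node 20 timing from the same run and the same shard.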
Raw means
- Node 20 mean job duration: 562.77s
- Node 22 mean job duration: 679.31s
- Difference: +116.55s (about +1m56s, +20.7%)
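As a quick arithmetic check on the figures above, the relative slowdown and the minutes-and-seconds form can be recomputed from the reported numbers. The variable names are illustrative only. Note that the two rounded means differ by 116.54s, so the reported +116.55s presumably comes from unrounded values.

```python
old_mean_s = 562.77  # Node 20 mean job duration, as reported above
diff_s = 116.55      # reported mean difference

pct = diff_s / old_mean_s * 100
minutes, seconds = divmod(int(diff_s), 60)  # drops the fractional second

summary = f"+{diff_s:.2f}s (about +{minutes}m{seconds:02d}s, +{pct:.1f}%)"
print(summary)
```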
Paired test (within run+slot)
- Mean diff: +116.55s
- 95% CI: [112.11s, 120.98s]
- p-value: 1.14e-177
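A paired comparison of this kind can be sketched with the standard library alone. This is an assumption-laden illustration rather than the analysis that produced the numbers above: it uses a normal approximation in place of the t distribution (with 396 pairs the two critical values are nearly identical), and the sample durations are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def paired_summary(old, new, level=0.95):
    """Mean paired difference, normal-approximation CI, and two-sided p-value."""
    diffs = [n - o for o, n in zip(old, new)]
    m = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))        # standard error of the mean difference
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    p = 2 * (1 - NormalDist().cdf(abs(m / se)))
    return m, (m - z * se, m + z * se), p

# Hypothetical matched durations in seconds (same run and slot, Node 20 vs Node 22).
old = [500.0, 520.0, 610.0, 580.0]
new = [610.0, 640.0, 720.0, 700.0]
mean_diff, (ci_lo, ci_hi), p_value = paired_summary(old, new)
```

Because the test works on per-pair differences, run-to-run and slot-to-slot variation cancels out, which is why the interval is so much tighter than the spread of raw job durations would suggest. With a very large test statistic the floating-point p-value can round to 0, which is the same underflow effect noted for the mixed-effects model.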
Mixed-effects model (job random effect)
- Model: dur_s ~ node_new + (1 | slot) with run control
- With slot random intercept + run fixed effects:
  - Effect (node_new): +116.55s
  - 95% CI: [112.66s, 120.43s]
  - p-value: effectively 0 (underflow; extremely significant)
- Cross-check model with a run variance component gave a nearly identical estimate:
  - Effect: +116.55s
  - 95% CI: [112.13s, 120.97s]
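One way to see why every model returns the same +116.55s point estimate: in a balanced design, where each (run, slot) cell has exactly one Node 20 and one Node 22 job, node_new is orthogonal to both the run and the slot effects, so adjusting for them cannot move the coefficient. The sketch below is a fixed-effects analogue using two-way demeaning, not the mixed-model fit itself; the field names and the synthetic durations are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def node_effect_two_way(rows):
    """OLS coefficient on node_new after absorbing run and slot effects.

    Assumes a balanced design: every (run, slot) cell holds exactly one
    node_new=0 row and one node_new=1 row.
    """
    grand = mean(r["dur_s"] for r in rows)
    by_run, by_slot = defaultdict(list), defaultdict(list)
    for r in rows:
        by_run[r["run"]].append(r["dur_s"])
        by_slot[r["slot"]].append(r["dur_s"])
    run_mean = {k: mean(v) for k, v in by_run.items()}
    slot_mean = {k: mean(v) for k, v in by_slot.items()}

    num = den = 0.0
    for r in rows:
        # Residualize the duration on run and slot (valid for balanced data).
        y = r["dur_s"] - run_mean[r["run"]] - slot_mean[r["slot"]] + grand
        # node_new averages 0.5 within every run and every slot.
        x = r["node_new"] - 0.5
        num += x * y
        den += x * x
    return num / den

# Synthetic balanced data: 3 runs x 2 slots x 2 Node variants (all values made up).
slot_base = {"test-boot": 400.0, "test-swingset": 600.0}
run_shift = {1: 0.0, 2: 15.0, 3: -10.0}
jitter = iter([3.0, -2.0, 5.0, 1.0, -4.0, 2.0, 0.0, 6.0, -1.0, 4.0, -3.0, 2.0])
rows = [
    {"run": run, "slot": slot, "node_new": node_new,
     "dur_s": base + shift + 100.0 * node_new + next(jitter)}
    for run, shift in run_shift.items()
    for slot, base in slot_base.items()
    for node_new in (0, 1)
]

effect = node_effect_two_way(rows)
paired_mean = mean(new["dur_s"] - old["dur_s"] for old, new in zip(rows[0::2], rows[1::2]))
```

Here effect and paired_mean agree up to floating-point error. What differs between the models is how the residual variance is estimated, which is consistent with the three confidence intervals above being slightly different while the point estimate stays the same.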
So the estimate is very stable: Node 22 is about 1 minute 56 seconds slower per matched job than Node 20 in this workflow sample.
Why
We should understand how the move from Node 20 to Node 22 affects production, and what impact Node 24 will have.
How
TBD