Enable computed gotos #539

zanieb · 2025-02-25T02:38:41Z

These do not appear to be on by default and provide a consistent speed-up. I noticed this while comparing the conda-forge build flags to ours.

❯ /Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/bin/python3.13 -m sysconfig | grep GOTO
    HAVE_COMPUTED_GOTOS = "1"
    USE_COMPUTED_GOTOS = "0"

It's not clear why they're not enabled. The upstream documentation says they are enabled by default on supported compilers, and the configure code is quite straightforward. Furthermore, we have had problems with computed gotos in PGO / BOLT, so they must be available in some capacity? Discussion with upstream suggests this may be a sysconfig bug, which would mean the performance improvement below isn't due to this change. It's possible there's some other difference I need to dig into.
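For anyone who wants to check their own interpreter without grepping the `python -m sysconfig` output, here is a minimal sketch that reads the same two variables through the standard `sysconfig` module. The helper name `computed_goto_flags` is made up for illustration. Given the suspected sysconfig bug, the values it prints may not reflect how the interpreter was actually compiled.

```python
import sysconfig


def computed_goto_flags():
    """Return the computed-goto build variables as recorded by sysconfig.

    Hypothetical helper for illustration; it only reports what sysconfig
    stores, which may differ from how the interpreter was really built.
    """
    names = ("HAVE_COMPUTED_GOTOS", "USE_COMPUTED_GOTOS")
    # get_config_var returns None when a variable is not recorded at all
    # (for example on platforms that do not generate these variables).
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    for name, value in computed_goto_flags().items():
        print(f"{name} = {value!r}")
```

Running this with each interpreter under comparison gives the same information as the grep above, in a form that is easier to script.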

Using a benchmark derived from #535 on macOS aarch64:

❯ hyperfine "/Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/bin/python3.13 bench.py" "./python/install/bin/python bench.py" --min-runs 100
Benchmark 1: /Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/bin/python3.13 bench.py
  Time (mean ± σ):      1.416 s ±  0.025 s    [User: 1.407 s, System: 0.007 s]
  Range (min … max):    1.359 s …  1.473 s    100 runs
 
Benchmark 2: ./python/install/bin/python bench.py
  Time (mean ± σ):      1.309 s ±  0.019 s    [User: 1.301 s, System: 0.007 s]
  Range (min … max):    1.270 s …  1.359 s    100 runs
 
Summary
  ./python/install/bin/python bench.py ran
    1.08 ± 0.02 times faster than /Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/bin/python3.13 bench.py
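As a sanity check on the summary line, the ratio and its uncertainty can be recomputed from the reported means and standard deviations. This is a rough sketch that assumes first-order error propagation for a ratio of two independent measurements; that this matches how hyperfine derives its summary is an assumption, and the function name `speedup` is only illustrative.

```python
import math


def speedup(mean_slow, sd_slow, mean_fast, sd_fast):
    """Return (ratio, uncertainty) for mean_slow / mean_fast.

    Assumes the two measurements are independent and combines their
    relative standard deviations in quadrature.
    """
    ratio = mean_slow / mean_fast
    relative_error = math.sqrt(
        (sd_slow / mean_slow) ** 2 + (sd_fast / mean_fast) ** 2
    )
    return ratio, ratio * relative_error


# Means and standard deviations copied from the hyperfine output above.
ratio, err = speedup(1.416, 0.025, 1.309, 0.019)
print(f"{ratio:.2f} ± {err:.2f}")  # prints "1.08 ± 0.02"
```

The run-to-run spread is about 2% of the mean for each binary, so a ratio of 1.08 is well outside the noise, while a ratio near 1.00 would not be.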

I plan to test an artifact with the pyperformance suite on Linux as well.

zanieb · 2025-02-25T03:48:35Z

Comparing to a build from main instead of to the uv-installed version, there's not a clear difference here:

def short_calcul(n):
    result = 0
    for i in range(1, n+1):
        result += i
    return result


def long_calcul(num):
    result = 0
    for i in range(num):
        result += short_calcul(i) - short_calcul(i)
    return result


number = 1000
number_long_calcul = 100

for _ in range(number_long_calcul):
    long_calcul(number)

❯ hyperfine "./baseline/python/install/bin/python bench.py" "./branch/python/install/bin/python bench.py" --min-runs 50
Benchmark 1: ./baseline/python/install/bin/python bench.py
  Time (mean ± σ):      1.805 s ±  0.027 s    [User: 1.795 s, System: 0.008 s]
  Range (min … max):    1.757 s …  1.910 s    50 runs
 
Benchmark 2: ./branch/python/install/bin/python bench.py
  Time (mean ± σ):      1.809 s ±  0.027 s    [User: 1.799 s, System: 0.008 s]
  Range (min … max):    1.759 s …  1.905 s    50 runs
 
Summary
  ./baseline/python/install/bin/python bench.py ran
    1.00 ± 0.02 times faster than ./branch/python/install/bin/python bench.py

The only obvious difference is that I was building without LTO, presuming that it'd be a faster build and that performance would be worse (so an improvement would only be under-measured). So, apparently LTO is making things slower here?

I downloaded a binary from CI and am running the pyperformance suite on an x86-64 Linux machine for further confirmation.

indygreg · 2025-02-25T03:59:43Z

Computed gotos were correctly detected and enabled by configure at some point: I definitely spot-verified this years ago.

zanieb · 2025-02-25T04:02:40Z

Thanks for confirming!

It's a sysconfig display bug (https://github.com/python/cpython/blob/main/Lib/sysconfig/__init__.py#L452) — unfortunately just a rabbit hole for me to go down :) Going to explore #529 next.

Enable computed gotos

24ecbff

These do not appear to be on by default and provide a consistent speed-up

zanieb mentioned this pull request Feb 25, 2025

Enable full LTO on Python 3.12 and 3.13 #529

Closed

zanieb closed this Feb 25, 2025
