Enable computed gotos #539

Closed
wants to merge 1 commit into from

Member

@zanieb zanieb commented Feb 25, 2025

Computed gotos do not appear to be on by default, and enabling them provides a consistent speed-up. I noticed this while comparing the conda-forge build flags to ours.

❯ /Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/bin/python3.13 -m sysconfig | grep GOTO
    HAVE_COMPUTED_GOTOS = "1"
    USE_COMPUTED_GOTOS = "0"

It's not clear why they're not enabled. The upstream documentation says they are enabled by default on supported compilers, and the configure code is quite straightforward. Furthermore, we have problems with computed gotos in PGO / BOLT, so they must be available in some capacity? Discussion with upstream suggests this may be a sysconfig bug, which would mean the following performance improvement isn't real. It's possible there's some other difference I need to dig into.
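For reference, the same two values can be read from Python instead of grepping the `-m sysconfig` output. This is only a minimal sketch: the helper name `computed_goto_flags` is made up for illustration, and, given the suspected sysconfig bug, the reported `USE_COMPUTED_GOTOS` value may not reflect how the interpreter was actually built.

```python
import sysconfig


def computed_goto_flags():
    """Return the computed-goto build variables as sysconfig reports them.

    Hypothetical helper for illustration. A variable that is not defined
    for this build (for example on Windows) comes back as None.
    """
    names = ("HAVE_COMPUTED_GOTOS", "USE_COMPUTED_GOTOS")
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    for name, value in computed_goto_flags().items():
        print(f"{name} = {value!r}")
```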

Using a benchmark derived from #535 on macOS aarch64:

❯ hyperfine "/Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/bin/python3.13 bench.py" "./python/install/bin/python bench.py" --min-runs 100
Benchmark 1: /Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/bin/python3.13 bench.py
  Time (mean ± σ):      1.416 s ±  0.025 s    [User: 1.407 s, System: 0.007 s]
  Range (min … max):    1.359 s …  1.473 s    100 runs
 
Benchmark 2: ./python/install/bin/python bench.py
  Time (mean ± σ):      1.309 s ±  0.019 s    [User: 1.301 s, System: 0.007 s]
  Range (min … max):    1.270 s …  1.359 s    100 runs
 
Summary
  ./python/install/bin/python bench.py ran
    1.08 ± 0.02 times faster than /Users/zb/.local/share/uv/python/cpython-3.13.2-macos-aarch64-none/bin/python3.13 bench.py
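
As a sanity check on the summary line, the `1.08 ± 0.02` figure can be reproduced from the two means and standard deviations above. The sketch below assumes standard first-order error propagation for a ratio of two independent measurements; the function name `relative_speed` is illustrative and not part of hyperfine.

```python
import math


def relative_speed(slow_mean, slow_sd, fast_mean, fast_sd):
    """Ratio of two mean runtimes with a propagated standard deviation.

    Assumes the two measurements are independent and uses first-order
    error propagation: sd(ratio) = ratio * sqrt(rel_slow^2 + rel_fast^2).
    """
    ratio = slow_mean / fast_mean
    relative_sd = math.sqrt((slow_sd / slow_mean) ** 2 + (fast_sd / fast_mean) ** 2)
    return ratio, ratio * relative_sd


if __name__ == "__main__":
    # Means and standard deviations taken from the hyperfine output above.
    ratio, sd = relative_speed(1.416, 0.025, 1.309, 0.019)
    print(f"{ratio:.2f} ± {sd:.2f}")  # prints "1.08 ± 0.02"
```

The uncertainty is about a quarter of the measured 8% difference, which is why the gap looks meaningful on this run in isolation.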

I plan to test an artifact with the pyperformance suite on Linux as well.

@zanieb
Member Author

zanieb commented Feb 25, 2025

Comparing to a build from main instead of to the uv-installed version, there's not a clear difference here:

def short_calcul(n):
    result = 0
    for i in range(1, n+1):
        result += i
    return result


def long_calcul(num):
    result = 0
    for i in range(num):
        result += short_calcul(i) - short_calcul(i)
    return result


number = 1000
number_long_calcul = 100

for _ in range(number_long_calcul):
    long_calcul(number)

❯ hyperfine "./baseline/python/install/bin/python bench.py" "./branch/python/install/bin/python bench.py" --min-runs 50
Benchmark 1: ./baseline/python/install/bin/python bench.py
  Time (mean ± σ):      1.805 s ±  0.027 s    [User: 1.795 s, System: 0.008 s]
  Range (min … max):    1.757 s …  1.910 s    50 runs
 
Benchmark 2: ./branch/python/install/bin/python bench.py
  Time (mean ± σ):      1.809 s ±  0.027 s    [User: 1.799 s, System: 0.008 s]
  Range (min … max):    1.759 s …  1.905 s    50 runs
 
Summary
  ./baseline/python/install/bin/python bench.py ran
    1.00 ± 0.02 times faster than ./branch/python/install/bin/python bench.py

The only obvious difference is that I was building without LTO, presuming that it would be a faster build and that performance would be worse (so an improvement would only be under-measured). So, apparently LTO is making things slower here?
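
One way to confirm which builds used LTO is to inspect the configure arguments each interpreter records. This is a rough sketch that assumes the build populates the `CONFIG_ARGS` sysconfig variable with its configure flags and that LTO and PGO were requested via `--with-lto` and `--enable-optimizations`; the helper name `build_optimization_hints` is hypothetical, and a build that sets these options some other way would not be detected.

```python
import sysconfig


def build_optimization_hints():
    """Guess whether this interpreter was configured with LTO and/or PGO.

    Assumption: CONFIG_ARGS holds the flags passed to ./configure.
    It may be missing (None) on some platforms, in which case both
    hints are reported as False.
    """
    config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    return {
        "lto": "--with-lto" in config_args,
        "pgo": "--enable-optimizations" in config_args,
    }


if __name__ == "__main__":
    print(build_optimization_hints())
```

Running this with each interpreter being compared shows whether the LTO setting actually differs between them.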

I downloaded a binary from CI and am running the pyperformance suite on an x86-64 Linux machine for further confirmation.

@indygreg
Collaborator

Computed gotos were correctly detected and enabled by configure at some point: I definitely spot-verified this years ago.

@zanieb
Member Author

zanieb commented Feb 25, 2025

Thanks for confirming!

It's a sysconfig display bug (https://github.com/python/cpython/blob/main/Lib/sysconfig/__init__.py#L452); unfortunately, just a rabbit hole for me to go down :) Going to explore #529 next.

@zanieb zanieb closed this Feb 25, 2025