Freethreading on Python >= 3.13 #935

Open
H-Plus-Time opened this issue Dec 11, 2024 · 1 comment

Comments

@H-Plus-Time
Contributor

TL;DR: I (barely) managed to get a 3.13t build working, verified only by a very basic check (read_parquet -> no segfault 🎉). The blockers boiled down to rust-numpy, arro3-core, and pyproj.

This more or less amounted to three changes: swapping the numpy crate for a git dependency (pointing at the source of PyO3/rust-numpy#471), marking each #[pymodule] as #[pymodule(gil_used = false)] (which is apparently fine for with_gil calls, though I haven't tried forcing thread-unsafe behaviour), and bumping pyproj's build-deps to cython==3.1.0a1 (freethreading_compatible is the directive to flip on).

The main sticking point is pyproj (a pixi environment with proj and python-freethreading does the trick, locally), so this isn't likely to be actionable for a while.
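
For anyone reproducing this, a quick way to confirm that a 3.13t interpreter is really running without the GIL after the extension is imported is sketched below. It uses only the standard library; the helper name free_threading_status is made up for illustration and is not part of any project mentioned here.

```python
import sys
import sysconfig


def free_threading_status() -> dict:
    """Report whether this is a free-threaded build and whether the GIL is active.

    Hypothetical helper for illustration only.
    """
    # Py_GIL_DISABLED is 1 on free-threaded (3.13t) builds, 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists on Python 3.13+; older versions always have a GIL.
    is_gil_enabled = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = True if is_gil_enabled is None else bool(is_gil_enabled())

    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    # Run this after importing the extension module under test.
    print(free_threading_status())
```

On a free-threaded build, importing an extension module that does not declare free-threading support makes CPython re-enable the GIL and emit a RuntimeWarning, so gil_enabled flipping to True after an import is a hint that some module in the dependency chain is still missing its opt-in.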

So a few questions I have are:

  1. Aside from file object interactions, there's not a whole lot that's likely to break unexpectedly in the absence of the GIL, right?
  2. The general idea going forward is to reduce, not increase, the number of unmanaged Rust -> Python callouts, right? By that I mean leaning heavily on buffer, array, dataframe and pycapsules.
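
To make the file object concern in question 1 concrete, here is a minimal Python sketch, assuming a single file-like object shared between threads. A seek followed by a read is two separate calls, so nothing serializes the pair unless the caller adds a lock. LockedReader is a hypothetical wrapper written for this example, not something the library provides.

```python
import io
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedReader:
    """Hypothetical wrapper that makes seek+read on a shared file object atomic."""

    def __init__(self, fileobj):
        self._f = fileobj
        self._lock = threading.Lock()

    def read_at(self, offset: int, size: int) -> bytes:
        # Hold the lock across both calls so no other thread can move
        # the file position between the seek and the read.
        with self._lock:
            self._f.seek(offset)
            return self._f.read(size)


# Known byte pattern so each chunk can be checked against its offset.
data = bytes(range(256)) * 64
reader = LockedReader(io.BytesIO(data))
offsets = list(range(0, len(data), 128))

# Read every 128-byte range concurrently through the shared reader.
with ThreadPoolExecutor(max_workers=8) as pool:
    chunks = list(pool.map(lambda off: reader.read_at(off, 128), offsets))
```

Joining chunks in offset order should reproduce data exactly. Without the lock, two threads can interleave their seek and read calls and return bytes from the wrong range.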
@kylebarron
Member

kylebarron commented Dec 11, 2024

That's awesome!

> 1. Aside from file object interactions, there's not a whole lot that's likely to break unexpectedly in the absence of the GIL, right?

I don't really know.

I think async readers hold the py: Python token as well. I'm not sure how the async story in pyo3 will change without the GIL, especially since the async support may be changing in the medium term (see https://github.com/wyfo/pyo3-async and PyO3/pyo3#1632).

> 2. The general idea going forward is to reduce, not increase, the number of unmanaged Rust -> Python callouts, right? By that I mean leaning heavily on buffer, array, dataframe and pycapsules.

What do you mean by unmanaged? The main place where we're adding cases of Rust calling into Python is CRS transformations. I don't want to deal with building PROJ from source, so the Python bindings call pyproj to convert WKT to PROJJSON when writing GeoParquet and PROJJSON to WKT when writing FlatGeobuf.
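
For reference, the two conversions look roughly like this when written directly in Python with pyproj's CRS class. This is only a sketch of the conversion itself, not the code the bindings run from Rust. The function names are made up for the example, and pyproj is imported lazily so the module loads even where pyproj is not installed.

```python
def wkt_to_projjson(wkt: str) -> dict:
    """Convert a WKT CRS string to a PROJJSON dict (illustrative helper)."""
    from pyproj import CRS  # lazy import: pyproj is a third-party dependency

    return CRS.from_wkt(wkt).to_json_dict()


def projjson_to_wkt(projjson: dict) -> str:
    """Convert a PROJJSON dict back to a WKT string (illustrative helper)."""
    from pyproj import CRS  # lazy import: pyproj is a third-party dependency

    return CRS.from_json_dict(projjson).to_wkt()
```

Each call goes through the Python interpreter, which is why these callouts are the part of the bindings that depends on pyproj supporting free-threaded builds.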

Otherwise, we should be able to stay in Rust-land as much as possible.

If you have diffs to those projects I'd happily accept PRs! Especially if you get the wheels to build too.
