API surface to 'build' pyo3 and pyo3-ffi #3437

Closed
nandita-v opened this issue Sep 6, 2023 · 8 comments

Comments

@nandita-v

For my use case, I would like to statically embed the Python interpreter at build time of my crate, which uses PyO3. I am creating an InterpreterConfig within my crate's build.rs and I'm looking for a programmatic way to supply this as input to pyo3 and pyo3-ffi. This is not possible right now because both rely solely on pyo3-build-config's build setup. Per the documentation, the only way to currently override the interpreter config is to set the PYO3_CONFIG_FILE environment variable.

Some issues I'm having with environment variables being the only way to do this:

  • If my crate generates all resources needed for embedding the Python interpreter, there is currently no way for me to use them in PyO3 (unless I take a complex multi-step approach wherein I first build a crate that generates the Python embedding resources, dump out a PyO3 config file, set the PYO3_CONFIG_FILE environment variable, and then trigger the build of my crate that uses PyO3).
  • Requiring end users to set environment variables is quite cumbersome, especially when some of these need to be set to values generated in another crate.

One way I can think of to work around this is to provide a no-build feature that no-ops the current build of pyo3 and pyo3-ffi and provides an API surface (meant for use within another crate's build.rs) to manually 'build' the crate with a supplied InterpreterConfig argument.

Either way, I think it would be useful to have a programmatic way to supply the InterpreterConfig when building pyo3, since the config file is just the contents of that struct in a text file.
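
To illustrate the point that the config file is just the struct's contents written out as text, here is a minimal sketch of a helper a build.rs could use to render such a file. The `EmbeddedPython` struct and `render_config` function are hypothetical names invented for this example, and the key names are an assumption based on the `key=value` format read by pyo3-build-config; check them against the PyO3 version in use before relying on them.

```rust
use std::fmt::Write as _;

/// Hypothetical description of an interpreter bundled by the embedding crate.
/// This is NOT pyo3-build-config's `InterpreterConfig`, only a stand-in.
pub struct EmbeddedPython {
    pub version: (u8, u8),
    pub lib_name: String,
    pub lib_dir: String,
    pub shared: bool,
}

/// Render the settings as `key=value` lines, one per line.
/// ASSUMPTION: the key names mirror the PyO3 config file format;
/// verify them against the PyO3 documentation for your version.
pub fn render_config(py: &EmbeddedPython) -> String {
    let mut out = String::new();
    // Writing to a String cannot fail, so the results are ignored.
    let _ = writeln!(out, "implementation=CPython");
    let _ = writeln!(out, "version={}.{}", py.version.0, py.version.1);
    let _ = writeln!(out, "shared={}", py.shared);
    let _ = writeln!(out, "abi3=false");
    let _ = writeln!(out, "lib_name={}", py.lib_name);
    let _ = writeln!(out, "lib_dir={}", py.lib_dir);
    out
}
```

A build.rs in the resource-generating crate could write the returned string to a file with `std::fs::write`, and a later build of the PyO3-using crate could then point PYO3_CONFIG_FILE at that file. This still leaves the multi-step problem described above; it only shows how small the gap between the struct and the file is.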

@davidhewitt
Member

A no-build feature sounds extremely complex. To be comfortable merging such a thing I'd probably need convincing that your use case is common enough, and the simplification it brings warrants the additional maintenance burden. Sure the multi-step approach is a touch cumbersome, but at least it's using a relatively simple mechanism (the config file) which is useful in a lot of cases downstream.

Have you investigated the way PyOxidizer works? I suspect it has a similar solution.

@davidhewitt
Member

I wonder - rather than making this a PyO3-specific thing, maybe there's a generic tool which could add any crate into another crate via a build.rs? That's at least got more flexibility than baking a no-build feature into the target crate.

@nandita-v
Author

I see this problem being applicable to anyone interested in creating standalone, distributable binaries that have some usage of PyO3 (there is also some related discussion in Issue #416). For our specific use case, we’re trying to create a standalone application that uses the bumble crate. Currently the crate requires you to have Python and the Python bumble library installed. Instead, we’d like to generate all the resources needed and embed this ‘custom’ Python interpreter during the build time of our crate.

I did take a look at how PyOxidizer does things and don’t know if that approach is applicable to every use-case that requires embedding of the python interpreter. For more context, PyOxidizer uses pyo3_ffi in a fairly complex manner to initialize the python interpreter through a MainPythonInterpreter struct - any subsequent python operation needs to go through this MainPythonInterpreter object. It looks like we are re-doing all the work that PyO3 already does on build + need to now be careful to use this new object (and not pyo3::Python), for any python usage. If there was a way to trigger the build of pyo3 with a programmatically supplied InterpreterConfig, any use case that needs to embed the python interpreter wouldn't need to repeat the above steps!

There are ways around this, such as a multi-step build that sets environment variables (passing environment variables between dependencies poses its own set of challenges, assuming we can work around them), but it would be great to have native support for building with a supplied InterpreterConfig! If you think that there is value in this, I could create a draft of a no-build feature.
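
For reference, the last step of the multi-step build mentioned above could be sketched roughly as follows: an outer driver prepares a `cargo build` invocation with PYO3_CONFIG_FILE set for the child process only. The function name `cargo_build_with_config` is hypothetical; it uses only `std::process::Command`, and it assumes the config file has already been generated by an earlier step and is passed as an absolute path.

```rust
use std::path::Path;
use std::process::Command;

/// Prepare (but do not run) a `cargo build` for the PyO3-using crate,
/// with PYO3_CONFIG_FILE set only in the child's environment.
/// ASSUMPTION: `config_file` is an absolute path to a file produced earlier.
pub fn cargo_build_with_config(config_file: &Path, manifest_path: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("build")
        .arg("--manifest-path")
        .arg(manifest_path)
        // Scoped to this child process; the caller's environment is untouched.
        .env("PYO3_CONFIG_FILE", config_file);
    cmd
}
```

The caller would run the returned command with `.status()` and check the exit code. Keeping the variable scoped to the child process avoids asking end users to export it themselves, though it still requires an outer driver rather than a plain `cargo build`.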

@davidhewitt
Member

I still fear that there is a lot of complexity in such a feature, and it circumvents cargo's normal model of how dependencies work. If you can demonstrate that this is a relatively self-contained diff which is easy to maintain, I can possibly be persuaded to house it in pyo3. If it's very complex, my reservation that the maintenance/value tradeoff is not worth it may hold.

I still wonder if a generic tool which can apply this transformation to any crate is a better way to build such a feature, if it's possible.

@davidhewitt
Member

Regarding PyOxidizer being unsuitable, I see this comment in the docs:

If you use pyo3 APIs like Python::with_gil() directly, you may inadvertently attempt to operate on a finalized interpreter. Therefore it is recommended to always go through a method on an MainPythonInterpreter instance in order to interact with the Python interpreter.

Cc @indygreg, was this a common problem? Maybe PyO3 should change Python::with_gil to check the interpreter is initialized (and not in the process of finalizing). Xref #3386 #3412

@indygreg
Contributor

There may be room to improve PyO3's APIs. But I think my note on operating on finalized interpreter has more to do with bad API design on my part.

There are also some quirks where CPython relies too heavily on static variables and interpreter finalization leaves bad state, preventing spawning new interpreters in the same process. I spent way too much time fighting CPython and gave up, eventually running all interpreter tests in separate processes. I want to say these problems are getting cleaned up in CPython as part of making the GIL optional.

And having spent several days worth of time figuring out how to reliably embed Python in a [Rust] process, I suspect the amount of effort involved to support this in a turnkey way in PyO3 is non-trivial. There are also limitations in Cargo that stand in your way. There's a lot of hackiness in pyembed's build.rs and the way pyoxidizer invokes Cargo to get the binaries to build just right. Most of the magic is in https://github.com/indygreg/PyOxidizer/blob/main/pyoxidizer/src/project_building.rs and https://github.com/indygreg/PyOxidizer/blob/main/pyoxidizer/src/py_packaging/embedding.rs. I've long thought about extracting this functionality to a standalone crate because empowering people to more easily embed Python in Rust binaries without encumbering them with PyOxidizer would likely be valuable.

@davidhewitt
Member

Thanks!

@luketpeterson @nandita-v perhaps a good way forward is to collaborate with @indygreg to extract into a standalone crate as suggested?

@davidhewitt
Member

Given this issue has been inactive for over a year and I don't think there's a clear action here in PyO3, will close this for now.

@davidhewitt closed this as not planned on Oct 11, 2024