-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
relax multiple-import check to only prevent subinterpreters #3446
Conversation
f5d6b6d
to
3beadbd
Compare
src/impl_/pymodule.rs
Outdated
return Err(PyImportError::new_err( | ||
"PyO3 modules may only be initialized once per interpreter process", | ||
"PyO3 modules do not yet support subinterpreters, see https://github.com/PyO3/pyo3/issues/576", | ||
)); | ||
} | ||
(self.initializer.0)(py, module.as_ref(py))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to call the initializer multiple times or should the second import become a no-op?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the spirit of PEP 489 (which we don't support yet, but hopefully are heading towards) is that modules can be torn down and re-imported, so I think it's correct to call the initializer multiple times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which we don't support yet, but hopefully are heading towards
I think this is the main issue here, isn't it? Does this change mean we opt into allowing tear-down and reinitialization of PyO3-based modules? What invariants would this break? Do we test this somewhere?
(I am not trying to say that it doesn't work. Just that I myself do not understand the consequences of this change and have little confidence about its effects on existing code.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In pydantic/pydantic#6584 (comment), you mention opt-in. Might this be advisable even for this limited relaxation? What if the module is torn down but static data still contains pointers into the old module (e.g. capsules) which are now invalid? For example, is the code in https://github.com/PyO3/rust-numpy/blob/c16fbb1e630b0538fdf17cbc53043bf0467459f9/src/npyffi/mod.rs#L30-L32 valid after this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I found the Cython implementation here:
That code:
- Has a similar check that the interpreter hasn't changed ID
- Caches a module instance in a C static and returns the single module instance always.
So I guess let's implement this strategy too and defer the question of PEP 489 compatibility to a future PR. My goal for this PR is primarily to fix the use cases in the OP, which I believe the Cython strategy would cover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's clearly awkward to get this right. I think on balance it's desirable if we match Cython so that more of the ecosystem fails in the same way and it seems better that we only ever run the module initializer once for now. We can make progress towards per-module state to eventually remove this worry.
I agree with that. Do you know the upstream position on this, i.e. would they consider fixing this or does it work as intended?
(I made with_embedded_python_interpreter unsafe for this reason.)
I am somewhat confused as to what the consequences of this discovery are. Especially with the Cython module immediately showing the failure mode I was concerned about. Can we still relax this? Is the unsafe
on with_embedded_python_interpreter
sufficient? (I guess another other way to re-initialize Python in the current process would call unsafe Rust code or code in inherently unsafe languages like C, i.e. is trusted?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just posted python/cpython#109785 - let's see what the outcome of that discussion is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No movement yet on the upstream issue. @adamreichold are you happy if I proceed to merge this? As above at least we'll then be consistent with Cython and I think the generally accepted position is that things are expected to be wonky after finalizing and re-initializing the interpreter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think the language on with_embedded_python_interpreter
is strong enough. This is the only API we provide to finalize and then re-initialize the interpreter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes (apart from the raw FFI bindings). auto-initialize
never bothers finalizing the interpreter.
3beadbd
to
ca7fa9f
Compare
Ok, I've pushed a new commit to this PR which:
The use of |
ca7fa9f
to
9061ffa
Compare
(I'm not a huge fan of the divergence of behaviour between Python 3.7/3.8 and 3.9+, but I think in this case it is justified as the 3.9 behaviour is now a relaxation of the existing constraint to improve ergonomics where we can safely do so.) |
9061ffa
to
de2faba
Compare
de2faba
to
f17e703
Compare
bc0222e
to
1a349c2
Compare
Continues to be an issue in Dec 2024. Users of python crypto modules are affected, including Ceph, which is trying to create X509 certificates, are hit by this bug. I've collected links to some of the affected projects here. pyca/cryptography#12080 It appears that PyO3 is always downstream of all of these issues. Is there some way of refocusing attention to get this fixed? |
Some issues observed in the wild from the current implementation of the multiple-import check:
The motivation for the check is from #2346 (comment) - it's to prevent use of PyO3 modules in sub-interpreters, which we are definitely not ready for yet.
Given the issues above don't actually use sub-interpreters, it seemed worth trying to fix them. This PR does so by storing the current interpreter ID when initializing a
#[pymodule]
, and checking that does not change on subsequent imports.I've also updated the error message to point at #576 so that discussion on the community can at least be centralized here.