PyO3 Sub Interpreter Broken since 0.22.0 #4570
Comments
Furthermore, in the upcoming |
Your code was always broken; this is just when you noticed it. It contains several instances of UB, such as releasing the GIL when you aren't holding it and calling the Python API without the GIL held. You are also not handling the existing thread state correctly, which is probably why the linked PR stopped your code from "working". Mixing the PyO3 API and subinterpreters is also not supported. My suggestion is to use pyo3-ffi exclusively. The documentation for that is at https://docs.python.org/3/c-api/init.html |
cc @wangrunji0408 do you have any ideas? |
Hi, @mejrs, thank you for your prompt response. Our use case is quite unique, as we will run users' UDFs exclusively in this sub-interpreter.
It seems this approach is incorrect; I didn't run that code in the sub-interpreter. |
What you are doing is inherently |
Hi, @davidhewitt, could you point me in a direction for finding a solution? I'm willing to help implement it in PyO3 or fix our usage. |
I would suggest looking at the problem like this:
A safe solution in PyO3 is extremely nontrivial; see #576. What we should start with is to consider: what state are you passing into the subinterpreter? If we can isolate that to a very well-defined data payload, we can probably make a very limited subinterpreter call sound. |
@davidhewitt Thank you for your suggestion. After our investigation, we found that the root cause of the issue is in the third point you mentioned—the internal GIL counter. In our code, after switching to a subinterpreter, a GILPool is created, which increments the internal counter by 1. However, after this commit, the counter remains unchanged at 0. This results in Python objects being registered in the internal queue for deferred drop instead of being dropped immediately when exiting the Python scope. When re-entering, a different subinterpreter might be used. This leads to exceptions. Unfortunately, we haven’t found any public interfaces in the latest version of PyO3 that allow us to manipulate the internal counter directly. As a result, this issue cannot be resolved for now. We are looking forward to native support for subinterpreters in PyO3 soon. |
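The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyRuntime`, `ToyObject`, and `run_scenario` are hypothetical names invented for illustration, and the code does not use or reproduce PyO3's internals. It assumes a single "GIL held" counter and a single deferred-drop queue shared by every interpreter the thread enters, and shows how an object dropped while the counter reads 0 can end up being freed under a different interpreter.

```rust
use std::cell::{Cell, RefCell};

/// Toy stand-in for a Python object that belongs to exactly one interpreter.
struct ToyObject {
    name: &'static str,
    owner: u32,
}

/// Hypothetical runtime (not PyO3): one "GIL held" counter and one
/// deferred-drop queue shared by every interpreter the thread enters.
struct ToyRuntime {
    gil_count: Cell<u32>,
    current: Cell<u32>,
    pending: RefCell<Vec<ToyObject>>,
    log: RefCell<Vec<String>>,
}

impl ToyRuntime {
    fn new() -> Self {
        ToyRuntime {
            gil_count: Cell::new(0),
            current: Cell::new(0),
            pending: RefCell::new(Vec::new()),
            log: RefCell::new(Vec::new()),
        }
    }

    /// Make `interpreter` the active one (stands in for a thread-state swap).
    fn switch_to(&self, interpreter: u32) {
        self.current.set(interpreter);
    }

    /// Enter a Python scope: bump the counter, then free everything queued.
    fn enter(&self) {
        self.gil_count.set(self.gil_count.get() + 1);
        let queued: Vec<ToyObject> = self.pending.borrow_mut().drain(..).collect();
        for obj in queued {
            self.free(obj);
        }
    }

    /// Leave a Python scope. Must be paired with a prior `enter`.
    fn exit(&self) {
        self.gil_count.set(self.gil_count.get() - 1);
    }

    /// Free immediately if the counter says we are in a scope, else defer.
    fn drop_object(&self, obj: ToyObject) {
        if self.gil_count.get() > 0 {
            self.free(obj);
        } else {
            self.pending.borrow_mut().push(obj);
        }
    }

    /// Record which interpreter was active when the object was freed.
    fn free(&self, obj: ToyObject) {
        let verdict = if obj.owner == self.current.get() { "ok" } else { "MISMATCH" };
        self.log.borrow_mut().push(format!(
            "{} (owner {}) freed in interpreter {}: {}",
            obj.name, obj.owner, self.current.get(), verdict
        ));
    }
}

/// Drop an object owned by interpreter 1, then re-enter under interpreter 2.
/// `counter_tracks_scope` models whether the counter is incremented after
/// switching to the subinterpreter.
fn run_scenario(counter_tracks_scope: bool) -> Vec<String> {
    let rt = ToyRuntime::new();
    rt.switch_to(1);
    if counter_tracks_scope {
        rt.enter();
    }
    rt.drop_object(ToyObject { name: "udf_result", owner: 1 });
    if counter_tracks_scope {
        rt.exit();
    }
    rt.switch_to(2);
    rt.enter(); // re-entry flushes the deferred queue
    rt.exit();
    rt.log.into_inner()
}

fn main() {
    for tracks in [true, false] {
        println!("counter tracks scope = {}: {:?}", tracks, run_scenario(tracks));
    }
}
```

In this model, the object is freed under its owning interpreter only when the counter is incremented inside the scope; otherwise the free is deferred and happens after the switch to interpreter 2.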
From what I understand, PyO3 is a long way from officially supporting subinterpreters. I hope we can find a way to use subinterpreters unsafely for now and work together to build the final solution. Is it possible for us to modify |
What makes you think GILGuard is the problem? You should fix the bugs in your code first. |
You should use |
I would hasten to note that the internal counter is really an implementation detail / optimization, and you should expect that what you're doing might break again with future PyO3 upgrades. |
I'm going to close this as really this is part of #576, and the exact (unsafe) stuff going on here can be discussed without needing to track an issue. |
Bug Description
Arrow UDF is a User-Defined Functions Framework that allows users to easily create and run user-defined functions (UDF) on Apache Arrow.
We use pyo3 to create a sub-interpreter and run users' UDFs within it to avoid the global GIL. It used to work fine, but we discovered it broke after the 0.22 release, specifically after this commit in #4188.
Steps to Reproduce
I've prepared a repository: https://github.com/Xuanwo/pyo3-sub-interpreter-broken
To reproduce:
Backtrace