feat: support UUIDs to pyarrow on more backends #8901
NickCrews wants to merge 1 commit into ibis-project:main
cpcloud left a comment:
We need to avoid mixing pyarrow and pandas conversion paths.
|
OK, I think this brings up a larger philosophical question: Do we want to totally separate the pandas and pyarrow codepaths, or can they rely on each other? Currently, to get pyarrow results from a backend:
I think the coupling between pandas and pyarrow for this conversion isn't inherently bad (we don't need to implement the db -> pyarrow path!), but I agree that it should be isolated, so that it is very clear where we are mixing these two ecosystems. That way, for the backends that don't need it, you can just have pyarrow installed without needing pandas. So I see two options:
I think I would lean towards 2. I want to remove reliance on pandas as much as possible. Possibly this implementation won't be that hard for these other backends. |
|
I think we'd like to eventually be able to offer Ibis without requiring […]. There's also the potential of using something that doesn't depend on either of those for the core (like printing tables), so I think we'd like to keep things as isolated from one another as possible. Even more important is the fact that sending anything through pandas is likely to result in some kind of type or value alteration that doesn't happen with pyarrow. Especially with NULLs, pandas is likely to do something completely different from, and incompatible with, what pyarrow would do. |
|
Ok, when I get back to this I'll try the db -> arrow method! |
|
Is this PR still viable? |
|
Viable, I just stopped needing it personally, so its urgency dropped a lot compared to the 5 million other PRs I have open haha. Feel free to close if you want, and re-open once someone actually finds time to work on it. |
Partially fixes #8902.
Implements UUID execution to pyarrow on some backends, and adds notimpl tests for the rest.