(rust) How to use opendal to load data into DuckDB or Polars LazyFrame? #5972
Replies: 2 comments 6 replies
Hi, we have https://github.com/apache/opendal/blob/main/bindings/python/examples/polars.ipynb and https://github.com/apache/opendal/blob/main/bindings/python/examples/pandas.ipynb as basic examples. Would these be helpful to you?
@Xuanwo I had one idea to make this work for all three: pyarrow, polars and duckdb. Instead of doing things on the Rust/C side in each project, I noticed that the Python bindings of all three projects support fsspec-based file systems. So if the opendal team sees it as a good fit, we could provide an opendal-backed implementation of fsspec.AsyncFileSystem, and maybe pyarrow.fs.FSSpecHandler, directly in the Python bindings for end users (maybe behind an extra/group?). Beyond polars and duckdb, this would also let users reach opendal backends through these interfaces. Someone is already partially maintaining this in a separate repo, so we might as well do it officially. wdyt?
I’m trying to load data from a storage backend (e.g., S3, local FS) using opendal, and I want to pass that data into either DuckDB or a Polars LazyFrame.
My current idea is to read the data into memory using op.read() and then wrap it with Cursor, but I’m unsure if this is the best or most idiomatic way to integrate with these libraries, especially for streaming large datasets.
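To make the idea concrete, here is a minimal sketch of the read-then-wrap pattern using only the standard library. `fetch_object` is a hypothetical stand-in for the bytes that `op.read()` would return (I'm assuming they can be turned into a `Vec<u8>`), and `count_data_rows` is a placeholder for whatever consumer accepts a `Read + Seek` source; neither name comes from opendal, DuckDB or Polars.

```rust
use std::io::{BufRead, BufReader, Cursor, Read, Seek, SeekFrom};

// Hypothetical stand-in for `op.read(path)`: the whole object as one byte buffer.
// With opendal this would be an async call against a configured Operator.
fn fetch_object() -> Vec<u8> {
    b"id,name\n1,alice\n2,bob\n".to_vec()
}

// Placeholder consumer: takes any `Read + Seek` source and counts the
// non-empty lines after the header row.
fn count_data_rows<R: Read + Seek>(mut source: R) -> std::io::Result<usize> {
    // Rewind explicitly to show that the source is seekable.
    source.seek(SeekFrom::Start(0))?;
    let reader = BufReader::new(source);
    let mut rows = 0;
    for line in reader.lines().skip(1) {
        if !line?.trim().is_empty() {
            rows += 1;
        }
    }
    Ok(rows)
}

fn main() -> std::io::Result<()> {
    // The entire object sits in memory before the consumer sees any of it.
    let bytes = fetch_object();
    // Cursor turns the owned buffer into a `Read + Seek` source.
    let cursor = Cursor::new(bytes);
    let rows = count_data_rows(cursor)?;
    println!("data rows: {rows}");
    Ok(())
}
```

This works for objects that fit in memory, but nothing is streamed: the full buffer is materialized before the cursor is created, which is exactly my concern for large datasets.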
Has anyone successfully done this? Are there preferred patterns or best practices for interoperability between opendal and DuckDB or Polars?
Thanks!