Replies: 1 comment
I'd like to second this request :) This would be a great improvement for those of us who want smaller bundles, more control, and more efficient, up-to-date Arrow features.

Technically this should not be too challenging. Locally I experimented with a build that strips out the Arrow dependency, but there are some complications: DuckDB-WASM uses the Arrow library for more than just handling result buffers. It also uses the exported Arrow types as part of the signatures for JavaScript UDFs. I did not try to update these; my local version simply does not support such UDFs. So one open question is how difficult it would (or would not) be to port these signatures to a vendor-independent typing system.

All that being said, the biggest hurdle here may be the breaking changes. If the existing Arrow library were fully removed, there could be some migration pains from having to add explicit Arrow decoding (or new UDF types). Nevertheless, in most cases the change would involve one additional import and one additional line of code.

Overall, I think the change would be a beneficial one for all the reasons @TomBor enumerates.
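To make the "vendor-independent typing system" idea a bit more concrete, here is a minimal sketch of a UDF signature described with plain SQL type names instead of Arrow type objects. Everything in it (`SqlTypeName`, `ScalarUdfSignature`, `describeUdf`) is a hypothetical name made up for illustration; none of it is part of the DuckDB-WASM API, and a real port would need to cover many more types.

```typescript
// Hypothetical sketch: describe UDF argument and return types with plain
// SQL type names rather than Arrow DataType instances. Only a few scalar
// types are listed; this is not an exhaustive or official mapping.
type SqlTypeName = "BOOLEAN" | "INTEGER" | "DOUBLE" | "VARCHAR";

// Hypothetical signature record; it carries no dependency on any Arrow library.
interface ScalarUdfSignature {
  name: string;
  argTypes: SqlTypeName[];
  returnType: SqlTypeName;
}

// Render a signature as readable text, e.g. for logging or error messages.
function describeUdf(sig: ScalarUdfSignature): string {
  return `${sig.name}(${sig.argTypes.join(", ")}) -> ${sig.returnType}`;
}

const addOne: ScalarUdfSignature = {
  name: "add_one",
  argTypes: ["INTEGER"],
  returnType: "INTEGER",
};

console.log(describeUdf(addOne));
```

The point of the design is only that type information travels as plain data, so the UDF registration path would not need to import types from a specific Arrow implementation.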
Proposal
Provide an alternative DuckDB WASM build that doesn't bundle apache-arrow as a dependency, allowing users to choose their own Arrow IPC parser (apache-arrow, Flechette, or others).

DuckDB WASM currently has a hard dependency on apache-arrow@^17.0.0. This creates several issues:

Version conflicts: Projects using libraries like @geoarrow/deck.gl-layers (requires apache-arrow >=15) or other Arrow-based tools often need newer Arrow versions. This leads to duplicate Arrow bundles or version incompatibilities (see "Failure in tableToIPC when using a v21.0.0 apache-arrow version than the dependency", #2097).

Bundle size: apache-arrow adds ~200KB+ to the bundle. For projects already using lighter alternatives like Flechette (~50KB), this is unnecessary bloat.

No choice: Users cannot opt for alternative Arrow implementations that may better suit their needs (performance, bundle size, specific features).

In #2041, @jheer demonstrated that DuckDB WASM can work without the apache-arrow dependency by using the unsafe API to retrieve raw IPC buffers.

I have absolutely no idea what this would imply in terms of difficulty or even feasibility. But I'm quite sure I won't be the only one who would benefit from this feature.
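As a rough illustration of what "choose your own Arrow IPC parser" could mean in practice, the sketch below separates fetching raw IPC bytes from decoding them. The names `RawIpcSource`, `IpcDecoder`, and `queryWith` are hypothetical and exist only for this example; the stub source stands in for whatever mechanism an Arrow-free build would use to return raw buffers, and it does not reproduce the actual API used in #2041.

```typescript
// Hypothetical stand-in for a connection that returns raw Arrow IPC bytes
// instead of an already-decoded Arrow table.
interface RawIpcSource {
  queryRaw(sql: string): Promise<Uint8Array>;
}

// A decoder is any function that turns IPC bytes into some table-like value.
// The caller decides which Arrow implementation (if any) provides it.
type IpcDecoder<T> = (ipc: Uint8Array) => T;

// Run a query and decode the result with the caller-supplied decoder.
async function queryWith<T>(
  source: RawIpcSource,
  decode: IpcDecoder<T>,
  sql: string,
): Promise<T> {
  const ipc = await source.queryRaw(sql);
  return decode(ipc);
}

// Stub source for demonstration only: returns four placeholder bytes,
// not a valid Arrow IPC stream.
const stubSource: RawIpcSource = {
  async queryRaw(_sql: string): Promise<Uint8Array> {
    return new Uint8Array([0xff, 0xff, 0xff, 0xff]);
  },
};

// Trivial decoder for demonstration only: reports the payload size.
const countBytes: IpcDecoder<number> = (ipc) => ipc.byteLength;

queryWith(stubSource, countBytes, "SELECT 1").then((n) => {
  console.log(`received ${n} bytes`);
});
```

With a real source, the decoder argument would presumably be something like the `tableFromIPC` function exported by apache-arrow or by Flechette, so the Arrow implementation and its version become the application's choice rather than a bundled dependency.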