Add Arrow Flight frontend #477

Closed
gruuya opened this issue Dec 1, 2023 · 0 comments · Fixed by #478
Comments

@gruuya
Contributor

gruuya commented Dec 1, 2023

Currently Seafowl provides two interfaces: the HTTP frontend and the PG endpoint. Both are fine for serving summarized (i.e. aggregated) data to end users, either directly or via a web app, but they fall short when it comes to transferring large result sets.

Putting aside the fact that the existing frontends don't support all Arrow types (e.g. see #393), returning large data sets will most likely be highly inefficient for non-trivial row counts, due to the overhead of converting the internal columnar representation into a row-based JSON response (for the HTTP frontend). The particular scenario here involves an external DB system that uses Seafowl for analytical workloads but can't push down the entire query in some cases, so it must fetch the underlying data and perform the original query itself.
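To make the conversion overhead concrete, here is a minimal, standard-library-only sketch. It is not Seafowl's code: the column names, the `array`-based buffers standing in for Arrow columns, and the one-JSON-object-per-row layout are all assumptions chosen for illustration. It contrasts pivoting columns into row-based JSON with handing over each column's buffer unchanged.

```python
import json
from array import array

# Hypothetical columnar result set: one contiguous typed buffer per column
# (a stand-in for Arrow arrays, not the real Arrow memory layout).
columns = {
    "id": array("q", range(5)),                        # 64-bit signed integers
    "value": array("d", [0.5 * i for i in range(5)]),  # 64-bit floats
}


def to_row_json(cols):
    """Pivot columns into one JSON object per row (assumed row-based response shape)."""
    names = list(cols)
    # Every value is visited individually and re-encoded as text.
    rows = [dict(zip(names, values)) for values in zip(*cols.values())]
    return "\n".join(json.dumps(row) for row in rows)


def to_column_buffers(cols):
    """Export each column's bytes as-is, with no per-row work."""
    return {name: col.tobytes() for name, col in cols.items()}


row_payload = to_row_json(columns)
col_payload = to_column_buffers(columns)
```

The row-based path touches every cell and repeats each column name in every row, so its cost grows with both row count and column count. The columnar path only copies whole buffers, which is closer to what a protocol designed for Arrow data can do.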

Introducing an Arrow Flight frontend (Arrow Flight SQL in particular [1]) would solve this problem: it provides a protocol for sending Arrow data and would thus avoid unnecessary serialization.

[1] https://voltrondata.com/resources/apache-arrow-flight-sql-arrow-for-every-database-developer
