Currently Seafowl provides two interfaces: the HTTP frontend and the PG endpoint. While they're fine for serving summarized (i.e. aggregated) data to end users, either directly or via a web app, they fall short when it comes to transferring large result sets.
Putting aside the fact that not all Arrow types are supported by the existing frontends (e.g. see #393), returning large result sets will most likely be highly inefficient, because the internal columnar representation has to be converted into a row-based JSON response (for the HTTP frontend), which becomes expensive for non-trivial row counts. The particular scenario here involves an external DB system that uses Seafowl for analytical workloads but can't push down the entire query in some cases, so it must fetch the underlying data and perform the original query itself.
Introducing an Arrow Flight frontend (Arrow Flight SQL in particular [1]) would solve this problem, since it provides a protocol for sending Arrow data and would thus avoid unnecessary serialization.
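
As a rough illustration of the conversion overhead described above, here is a minimal sketch of what pivoting columnar data into row-based JSON involves. It is not Seafowl's actual code: the `columns` dict is a hypothetical stand-in for an Arrow record batch, `columns_to_json_rows` is an illustrative helper, and the newline-delimited output format is an assumption.

```python
import json

# Hypothetical columnar batch: one list per column, loosely analogous
# to an Arrow record batch (not Seafowl's real data structures).
columns = {
    "id": [1, 2, 3],
    "value": [0.5, 1.5, 2.5],
}


def columns_to_json_rows(columns):
    """Pivot column lists into one JSON object per row, newline-delimited."""
    names = list(columns)
    lines = []
    # Every row is rebuilt as its own dict, and the column names are
    # repeated in every serialized object.
    for row in zip(*columns.values()):
        lines.append(json.dumps(dict(zip(names, row))))
    return "\n".join(lines)


json_rows = columns_to_json_rows(columns)
print(json_rows)
```

The per-row work and the repeated column names grow linearly with the row count, which is the cost a protocol that ships the columnar buffers as-is would avoid.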
[1] https://voltrondata.com/resources/apache-arrow-flight-sql-arrow-for-every-database-developer