SNOW-873466 Allow reading Arrow record batch streams from result set #1422
Labels
backend changes needed
Change must be implemented on the Snowflake service, and not in the client driver.
feature
status-blocked
Progress cannot be made to this issue due to an outside blocking factor.
status-triage_done
Initial triage done, will be further handled by the driver team
I would like to read the result set of a query as streams of Arrow record batches. QueryResultFormat.ARROW serializes the data in Arrow stream format, but I don't see a way to read those streams directly. Can this be exposed for direct reading, similar to the ArrowStreamLoader in gosnowflake? See https://github.com/snowflakedb/gosnowflake/blob/master/connection.go#L577
What is the current behavior?
The Arrow format is used to serialize data from a result set, but the serialized streams don't seem to be exposed for direct reading.
What is the desired behavior?
Read the serialized Arrow streams directly.
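To make the request concrete, here is a rough sketch of the shape such an API could take, loosely modelled on the ArrowStreamLoader idea in gosnowflake. Everything in it is hypothetical: ArrowStreamLoader, ArrowStreamBatch, getBatches, openStream and inMemoryLoader are illustrative names, not existing snowflake-jdbc types, and the loader is backed by in-memory byte arrays instead of downloaded chunks. In a real consumer each stream would presumably be handed to an Arrow IPC stream reader (for example ArrowStreamReader from the Arrow Java library) rather than just drained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ArrowStreamSketch {

    /** Hypothetical: one result chunk whose bytes form an Arrow IPC stream. */
    interface ArrowStreamBatch {
        InputStream openStream() throws IOException;
    }

    /** Hypothetical: not an existing snowflake-jdbc type. */
    interface ArrowStreamLoader {
        List<ArrowStreamBatch> getBatches();
    }

    /** Stand-in loader backed by in-memory byte arrays (assumption for the demo). */
    static ArrowStreamLoader inMemoryLoader(List<byte[]> chunks) {
        List<ArrowStreamBatch> batches = new ArrayList<>();
        for (byte[] chunk : chunks) {
            batches.add(() -> new ByteArrayInputStream(chunk));
        }
        return () -> batches;
    }

    /** Consumer side: reads every chunk stream to the end and returns the total byte count. */
    static long drainAll(ArrowStreamLoader loader) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        for (ArrowStreamBatch batch : loader.getBatches()) {
            try (InputStream in = batch.openStream()) {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    total += n;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Two placeholder chunks; real chunks would contain Arrow IPC stream bytes.
        ArrowStreamLoader loader = inMemoryLoader(List.of(new byte[16], new byte[32]));
        System.out.println(drainAll(loader)); // prints 48
    }
}
```

The point of the sketch is only that the caller gets one raw stream per result chunk and decides how to decode it, instead of going through per-value ResultSet getters.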
How would this improve snowflake-jdbc?
Consuming Arrow data directly lets a client read entire batches at a time instead of one scalar value at a time, which should improve performance.
References, Other Background
A similar interface is provided in gosnowflake: https://github.com/snowflakedb/gosnowflake/blob/master/connection.go#L577
The Arrow ADBC driver currently uses it and shows good performance gains: https://github.com/apache/arrow-adbc/blob/main/go/adbc/driver/snowflake/record_reader.go#L242
In snowflake-jdbc, the Arrow stream is read here: https://github.com/snowflakedb/snowflake-jdbc/blob/master/src/main/java/net/snowflake/client/jdbc/SnowflakeChunkDownloader.java#L892
What is your Snowflake account identifier, if any?