Not long ago, I attempted to update the connector to use the gosnowflake driver's recently added support for PUTs to Snowflake stages. I ran into a bug that has since been fixed, and we should try again.
While we're at it, we should switch to streaming PUTs to the internal stage as we consume from the Store iterator (instead of staging to a local temporary file, and then starting to upload only after the Store iterator is consumed). In my own profiling, it seems like this would materially reduce data stalls as we execute transactions, as these PUTs typically take seconds to complete for larger files.
As further context, our philosophy on connector errors and retries has shifted, and we're planning to implement a watchdog in the control plane which looks for failed shards and restarts them with a backoff policy. That means the connector doesn't need to worry about spurious errors and retries while executing the PUT to Snowflake -- it can implement the more efficient, direct strategy of simply streaming to the stage and hoping for the best.
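A minimal sketch of what the streaming approach could look like, assuming gosnowflake's WithFileStream context option; the rows channel, stage name, and staged file name here are hypothetical stand-ins for the Store iterator and our real stage layout:

```go
package main

import (
	"compress/gzip"
	"context"
	"database/sql"
	"io"

	sf "github.com/snowflakedb/gosnowflake"
)

// streamPut uploads rows to an internal stage as they're consumed,
// rather than staging to a local temporary file first. The rows
// channel is a placeholder for consuming from the Store iterator.
func streamPut(ctx context.Context, db *sql.DB, rows <-chan []byte) error {
	pr, pw := io.Pipe()

	// Producer: drain the iterator into the pipe, gzipping as we go,
	// so the PUT can begin as soon as the first rows arrive.
	go func() {
		gz := gzip.NewWriter(pw)
		for row := range rows {
			if _, err := gz.Write(row); err != nil {
				pw.CloseWithError(err)
				return
			}
		}
		if err := gz.Close(); err != nil {
			pw.CloseWithError(err)
			return
		}
		pw.Close()
	}()

	// Consumer: the PUT reads from the pipe via WithFileStream. The
	// file path in the SQL is only a placeholder naming the staged
	// file; @my_stage is a hypothetical stage. Per our retry policy,
	// any error simply propagates and the watchdog restarts the shard.
	_, err := db.ExecContext(
		sf.WithFileStream(ctx, pr),
		"PUT 'file:///tmp/rows.csv.gz' @my_stage AUTO_COMPRESS=FALSE SOURCE_COMPRESSION=GZIP",
	)
	return err
}
```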
I attempted to implement streaming PUTs, and while they do work now, the memory usage is not practical: the gosnowflake driver reads the entire stream into memory (see snowflakedb/gosnowflake#536). Once that issue is resolved, it should be straightforward to switch to streaming PUTs.