Is your feature request related to a problem? Please describe.
We need to scale our data ingestion, and it would be nice to have Spark support when writing the outputs from DataExporter. When using pyarrow, we also noticed that the number of Parquet files keeps growing, so it might be good to allow controlling this with a parameter as well.
Describe the solution you'd like
We would like to be able to optionally pass a Spark session instance to the FocusConverter class, which would then pass it down to DataExporter.
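To make the request concrete, here is a minimal sketch of how the session could be threaded through. The classes below are hypothetical stand-ins, not the project's real `FocusConverter` or `DataExporter` signatures; the `spark_session` parameter, the `export_path` argument, and the `backend()` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DataExporter:
    """Hypothetical stand-in for the exporter; the real class may differ."""

    export_path: str
    spark_session: Optional[Any] = None  # assumed new optional parameter

    def backend(self) -> str:
        # Use Spark only when a session was supplied; otherwise keep the
        # existing pyarrow path, so current callers are unaffected.
        return "spark" if self.spark_session is not None else "pyarrow"


class FocusConverter:
    """Hypothetical stand-in showing only how the session is passed down."""

    def __init__(self, export_path: str, spark_session: Optional[Any] = None):
        # The converter does not use the session itself; it just forwards it.
        self.exporter = DataExporter(export_path, spark_session)
```

Keeping the parameter optional with a default of `None` would mean existing code keeps the pyarrow behavior, and only callers that pass a session opt in to Spark.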
Describe alternatives you've considered
We're considering forking the repository, but this solution may be needed by others too. We didn't find any other options to speed up the conversion. Any ideas are welcome!