Add option to specify output format #305
Comments
Hi @sxdjt, this is one of the planned features, and we will get it resolved soon. One issue to think about here is whether we publish one consolidated output file or split the output into multiple files.
Hi @varunmittal91 - that is a consideration, especially when dealing with multi-GB CSV files. Ideally, users would be presented with options on how they want the output generated, something like:
If the user selects option 3, they would be prompted to enter the number of rows they want per CSV. These could also be passed as arguments, e.g.
Are there tools that could be adapted to reduce the coding effort? xsv: https://github.com/BurntSushi/xsv
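The "rows per CSV" idea above can be sketched with nothing but the Python standard library. This is only an illustration, not how the tool does or would do it: the function name `split_csv`, its parameters, and the numbered file-naming scheme are all assumptions made for the example. It reads the source row by row, so it should not need to hold a multi-GB file in memory.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_file):
    """Split `src` into numbered CSV files holding at most `rows_per_file`
    data rows each. The header row is repeated in every output file.

    Hypothetical helper for illustration; not part of any existing tool.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")

    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []  # paths of the files created, in order
    with src.open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:  # empty input: nothing to split
            return written

        out = None
        writer = None
        count = 0
        try:
            for row in reader:
                # Start a new output file on the first row and whenever
                # the current file has reached the row limit.
                if writer is None or count == rows_per_file:
                    if out is not None:
                        out.close()
                    path = out_dir / f"{src.stem}_{len(written) + 1:04d}.csv"
                    out = path.open("w", newline="")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    written.append(path)
                    count = 0
                writer.writerow(row)
                count += 1
        finally:
            if out is not None:
                out.close()
    return written
```

Calling `split_csv("data.csv", "out", 1_000_000)` would write `out/data_0001.csv`, `out/data_0002.csv`, and so on. A dedicated tool such as xsv may well be faster for very large files; the sketch only shows that the splitting logic itself is small.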
Sorry, I added #348 before I read this... Multiple output options would be beneficial.
Hey folks, as a workaround I'm looking at using https://github.com/clemensv/avrotize as a second conversion step to get to AVRO format. Fingers crossed...
Is your feature request related to a problem? Please describe.
No, not related to a problem.
Describe the solution you'd like
The default output of the conversion is parquet. While parquet is efficient, having the option to output CSV would be beneficial. It would support quick checks of the conversion, as well as further analysis/manipulation without having to deal with a parquet file.
What I would like: I would like to be able to specify the output format, e.g.
-output-format csv
to override the default format.
Describe alternatives you've considered
Converting parquet to CSV manually is certainly workable, but the tool should provide this option directly.
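To make the request concrete, here is a minimal sketch of how such a flag could be parsed, assuming a Python command-line entry point built on `argparse`. The program name `converter` and the `build_parser` function are hypothetical, and nothing here reflects the project's actual CLI; only the `-output-format csv` spelling and the parquet default come from the request above.

```python
import argparse


def build_parser():
    """Build a parser with the requested output-format option.

    Hypothetical sketch: the real tool's argument handling may differ.
    """
    parser = argparse.ArgumentParser(prog="converter")  # assumed program name
    parser.add_argument(
        "-output-format",
        choices=["parquet", "csv"],
        default="parquet",  # parquet remains the default, as it is today
        help="format of the converted output (default: parquet)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # argparse derives the attribute name `output_format` from the flag.
    print(f"Writing output as {args.output_format}")
```

Restricting the value with `choices` means an unsupported format is rejected at parse time with a usage message, and further formats could be added to the list later without changing the flag itself.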