Correctly handle CSV files with a single separator throughout #3186

Merged
keith-hall merged 2 commits into master from csv_1977
Feb 9, 2025

Conversation

keith-hall
Collaborator

@keith-hall keith-hall commented Jan 24, 2025

fixes #3127 and fixes #2078

better auto-detection of CSV delimiter

  • files with a tsv extension are automatically detected as tab-delimited
  • other files parsed as CSV go through the following steps, in order:
    • if the first line contains at least 3 occurrences of the same separator, that separator is used as the delimiter
    • if the first line contains only one supported separator character, that separator is used as the delimiter
    • otherwise, parsing falls back to treating every supported delimiter as a delimiter

supported delimiters, in precedence order:

  • comma ,
  • semi-colon ;
  • tab \t
  • pipe |
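
For illustration, the detection steps above can be sketched in Python. This is not the code from this PR: the function name `detect_delimiters` and the constant `SUPPORTED_DELIMITERS` are hypothetical, the "only one supported separator character" step is read here as "only one kind of supported separator appears on the first line", and separators inside quoted fields are counted naively.

```python
# Supported delimiters in precedence order, as listed in the description.
SUPPORTED_DELIMITERS = (",", ";", "\t", "|")


def detect_delimiters(filename: str, first_line: str) -> tuple:
    """Hypothetical helper: return the delimiter(s) to use for a file."""
    # A .tsv extension is always treated as tab-delimited.
    # (Exact, case-sensitive match is an assumption of this sketch.)
    if filename.endswith(".tsv"):
        return ("\t",)

    # Naive count: does not skip separators inside quoted fields.
    counts = {d: first_line.count(d) for d in SUPPORTED_DELIMITERS}

    # Step 1: the first delimiter, in precedence order, that occurs
    # at least 3 times on the first line.
    for d in SUPPORTED_DELIMITERS:
        if counts[d] >= 3:
            return (d,)

    # Step 2 (assumed reading): exactly one kind of supported
    # delimiter appears on the first line.
    present = [d for d in SUPPORTED_DELIMITERS if counts[d] > 0]
    if len(present) == 1:
        return (present[0],)

    # Step 3: fall back to treating every supported delimiter
    # as a delimiter.
    return SUPPORTED_DELIMITERS
```

Under these assumptions, a first line such as `a,b;c;d;e` resolves to the semi-colon (three occurrences), `a|b` resolves to the pipe (the only separator present), and `a,b;c` falls through to the full set of delimiters.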

@keith-hall keith-hall force-pushed the csv_1977 branch 2 times, most recently from ad9dba5 to 9abba46 Compare January 25, 2025 19:33
keith-hall and others added 2 commits February 9, 2025 20:37
@keith-hall keith-hall enabled auto-merge February 9, 2025 18:38
@keith-hall keith-hall merged commit 547bc38 into master Feb 9, 2025
24 checks passed
@keith-hall keith-hall deleted the csv_1977 branch February 9, 2025 18:49
Development

Successfully merging this pull request may close these issues.

  • Determine CSV delimiter based on header
  • TSV highlighting doesn't work