Using the space separator with NA can reorder columns in non-obvious ways #4945

Open
apoliakov opened this issue Mar 26, 2021 · 2 comments

fread does emit a warning, but it also silently shifts values into the wrong columns: the 3 written under c is read back under b, and c becomes NA. This can lead to subtle, hard-to-spot wrong results:

> foo <- data.frame(a = 1, b = NA, c = 3)
> foo
  a  b c
1 1 NA 3
> data.table::fwrite(foo, file = 'foo.sdv', sep = ' ')
> data.table::fread('foo.sdv')
   a b  c
1: 1 3 NA
Warning message:
In data.table::fread("foo.sdv") :
  Detected 3 column names but the data has 2 columns. Filling rows automatically. Set fill=TRUE explicitly to avoid this warning.
> packageDescription('data.table')
Package: data.table
Version: 1.13.4

myoung3 commented Mar 26, 2021

In the meantime, a workaround is to pass a non-empty na string to fwrite (the default is ""). Note that readLines on foo.sdv shows two spaces between 1 and 3, so fwrite is working as intended: the NA is written as an empty field. The issue, if there is one, is therefore with fread, which evidently does not treat those two spaces as two separators with a missing value between them. Whether it is reasonable to expect fread to parse them that way is a question for someone who understands all the complexities of fread.

foo <- data.frame(a = 1, b = NA, c = 3)
data.table::fwrite(foo, file = 'foo.sdv', sep = ' ')
data.table::fread('foo.sdv')
#> Warning in data.table::fread("foo.sdv"): Detected 3 column names but the data
#> has 2 columns. Filling rows automatically. Set fill=TRUE explicitly to avoid
#> this warning.
#>    a b  c
#> 1: 1 3 NA
readLines("foo.sdv")
#> [1] "a b c" "1  3"
data.table::fwrite(foo, file = 'foo2.sdv', sep = ' ', na = "NA")
readLines("foo2.sdv")
#> [1] "\"a\" \"b\" \"c\"" "1 NA 3"
data.table::fread('foo2.sdv')
#>    a  b c
#> 1: 1 NA 3

Created on 2021-03-26 by the reprex package (v1.0.0)
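
To make the ambiguity concrete, here is a minimal base R sketch of the two ways the data line "1  3" could be split. It is only an illustration of the two interpretations, not a description of how fread is implemented; the variable names (strict, collapsed, padded) are made up for this example.

```r
# The two lines that fwrite produced in foo.sdv (taken from the readLines output above).
header_line <- "a b c"
data_line   <- "1  3"   # two spaces: the NA was written as an empty field

header <- strsplit(header_line, " ", fixed = TRUE)[[1]]

# Interpretation 1: every single space is a separator,
# so the empty string between the two spaces is a missing value.
strict <- strsplit(data_line, " ", fixed = TRUE)[[1]]
strict_row <- strict
strict_row[strict_row == ""] <- NA
names(strict_row) <- header

# Interpretation 2: a run of spaces counts as one separator,
# so only two fields are found and the row is padded with NA on the right.
collapsed <- strsplit(data_line, " +")[[1]]
padded <- collapsed
length(padded) <- length(header)   # extends the vector with NA
names(padded) <- header

print(strict_row)   # a = "1", b = NA,  c = "3"
print(padded)       # a = "1", b = "3", c = NA
```

The second result has the same shape as the fread output reported above (3 under b, NA under c), which is consistent with the run of spaces being read as a single separator. Writing a non-empty na string, as in the workaround, removes the ambiguity because no field is empty.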


dvg-p4 commented Feb 12, 2025

This is a consequence/duplicate of #3658
