read_table for a file with missing first column name issues parsing warning #1549

Open
Sankarinam10san opened this issue Jul 31, 2024 · 1 comment

Comments

@Sankarinam10san

I am using read_table on a fairly large file whose first column has no name in the header. After reading, the column names are shifted one position to the left and I get parsing failures, because the number of column names (1026) does not match the number of fields in each row (1027). The result in R is missing the last column. Is there any way to read this file without losing the last column or the names? I also tried read, but it gave me only one variable instead of 1027 variables, even though I tried different sep options.

# insert reprex here
recentered_logCounts <- read_table("recentered_logCount_values.txt")

── Column specification ───────────────────────────────────────────────────────────────────────────────────
cols(
  .default = col_double(),
  TCGA.3C.AAAU_tumor = col_character()
)
ℹ Use `spec()` for the full column specifications.

|============================================================================================| 100% 3350 MB
Warning: 302951 parsing failures.
row col     expected       actual                             file
  1  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
  2  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
  3  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
  4  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
  5  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
... ... ............ ............ ................................
See problems(...) for more details.
@joranE

joranE commented Jul 31, 2024

Assuming your file is whitespace-delimited (since you're using read_table), you might try read_delim instead. It reads the first column despite the missing column name and assigns it a generic placeholder name (with a bunch of warnings):

> read_delim(I(" x\n1 2\n3 4"))
New names:                                                                                        
• `` -> `...1`
Rows: 2 Columns: 2
── Column specification ──────────────────────────────────────────────────────────────────────────
Delimiter: " "
dbl (2): ...1, x

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 2 × 2
   ...1     x
  <dbl> <dbl>
1     1     2
2     3     4
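
If the header line simply has one fewer field than the data rows (rather than starting with a delimiter, as in the toy example above), another option might be to read the header separately and pass corrected names to read_table. The following is only a minimal sketch under that assumption: the miniature file contents and the name `row_id` are made up for illustration, and the real file may need a different split pattern.

```r
library(readr)

# Hypothetical miniature of the file: the header has one fewer
# field than each data row (assumption, not the real data).
path <- tempfile(fileext = ".txt")
writeLines(c("s1 s2 s3",
             "geneA 1.5 2.5 3.5",
             "geneB 4.5 5.5 6.5"), path)

# Read only the header line and split it on runs of whitespace.
header <- strsplit(trimws(readLines(path, n = 1)), "\\s+")[[1]]

# Prepend a placeholder name ("row_id" is made up) for the unnamed first column.
col_names <- c("row_id", header)

# Skip the original header line and supply the corrected names instead.
dat <- read_table(path, skip = 1, col_names = col_names)

names(dat)
```

For the real file, `path` would point at recentered_logCount_values.txt. Base R's read.table(path, header = TRUE) also handles this layout on its own: when the header has one fewer field than the data, it uses the first column as row names, though it may be slower on a file of this size.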
