read_table for a file with missing first column name issues parsing warning #1549

Sankarinam10san · 2024-07-31T22:25:58Z

I am using read_table on a fairly large file whose first column has no name in the header. read_table shifts the column names one position to the left, so the names no longer line up with the data, and it reports parsing failures because the header has 1026 fields while each row has 1027. The resulting data frame in R is missing the last column. Is there any way to read this file without losing the last column or the names? I also tried read, which gave me a single variable instead of 1027, even with different sep options.

# insert reprex here
recentered_logCounts <- read_table("recentered_logCount_values.txt")

── Column specification ───────────────────────────────────────────────────────────────────────────────────
cols(
  .default = col_double(),
  TCGA.3C.AAAU_tumor = col_character()
)
ℹ Use `spec()` for the full column specifications.

|============================================================================================| 100% 3350 MB
Warning: 302951 parsing failures.
row col     expected       actual                             file
  1  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
  2  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
  3  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
  4  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
  5  -- 1026 columns 1027 columns 'recentered_logCount_values.txt'
... ... ............ ............ ................................
See problems(...) for more details.

joranE · 2024-07-31T23:27:57Z

Assuming your file is whitespace delimited (since you're using read_table), you might try read_delim instead. It reads the first column despite the missing name and assigns it a generic placeholder name (printing a few messages along the way):

> read_delim(I(" x\n1 2\n3 4"))
New names:                                                                                        
• `` -> `...1`
Rows: 2 Columns: 2
── Column specification ──────────────────────────────────────────────────────────────────────────
Delimiter: " "
dbl (2): ...1, x

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 2 × 2
   ...1     x
  <dbl> <dbl>
1     1     2
2     3     4
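
If you would rather stay with read_table, another option is to read the header line on its own, prepend a placeholder name for the unnamed first column, and then read the data rows with explicit column names. This is only a sketch under some assumptions: the file is whitespace delimited, the header is on the first line, and the tiny temporary file and the name "row_id" are made up for illustration.

```r
library(readr)

# Hypothetical stand-in for the real file: the header has one fewer
# field than the data rows because the first column is unnamed.
path <- tempfile(fileext = ".txt")
writeLines(c(" s1 s2", "geneA 1 2", "geneB 3 4"), path)

# Read only the header line and split it on runs of whitespace.
header <- strsplit(trimws(read_lines(path, n_max = 1)), "\\s+")[[1]]

# Prepend a placeholder for the unnamed first column ("row_id" is made up).
col_names <- c("row_id", header)

# Skip the original header line and supply the repaired names instead.
dat <- read_table(path, col_names = col_names, skip = 1)
print(dat)
```

On the real file the same idea would apply with path pointing at recentered_logCount_values.txt. Since the names are supplied up front, the number of names matches the number of fields in each row, so the last column should no longer be dropped.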
