`read_fwf` doesn't support using the first row as column names #1393

rgzn · 2022-03-25T17:58:19Z

With most of the readr rectangular data reading functions, the col_names argument can be set to TRUE to interpret the first row as column names. This feature does not exist in read_fwf().

I am not sure if this is a deliberate design decision, but it would be very nice to have the option of using the first row as column names.

Here is an example of reading the same data in csv or fwf format, and the column names being interpreted differently:

> read_csv(I(c("col1,col2,col3", "first,middle,last")))
Rows: 1 Columns: 3                                                                                                                                                  
-- Column specification ---------------------------------------------------------------------------------------------------------
Delimiter: ","
chr (3): col1, col2, col3

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 1 x 3
  col1  col2   col3 
  <chr> <chr>  <chr>
1 first middle last

Using defaults for read_fwf:

> read_fwf(I(c(" col1   col2 col3", "first middle last")))
Rows: 2 Columns: 3                                                                                                                                                  
-- Column specification ---------------------------------------------------------------------------------------------------------

chr (3): X1, X2, X3

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 2 x 3
  X1    X2     X3   
  <chr> <chr>  <chr>
1 col1  col2   col3 
2 first middle last
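
When the column widths are known in advance, one workaround is to supply the names explicitly and skip the header row. This is a minimal sketch, not a readr feature: the widths 5, 7 and 5 are an assumption, counted by hand from the two example lines above, and the names are typed in rather than read from the file.

```r
library(readr)

# Same two lines as in the example above
lines <- c(" col1   col2 col3", "first middle last")

# Assumed widths (counted by hand): "first" = 5, " middle" = 7, " last" = 5
positions <- fwf_widths(c(5, 7, 5), col_names = c("col1", "col2", "col3"))

# skip = 1 drops the header line so it is not read as data
dat <- read_fwf(I(lines), col_positions = positions, skip = 1,
                show_col_types = FALSE)
dat
```

This only helps when the layout is fixed and known; it does not read the names from the file itself, which is what the request is about.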


ilikegitlab · 2022-06-11T20:07:46Z

Adding my vote, as I spent 15 min looking for this since all the other readr functions have it.

rgzn · 2022-06-12T00:45:50Z

Yeah I spent a lot longer than that! Unfortunately after looking at the code, I think it would take a while for me to figure this out enough to submit a pull request.

It's very unexpected behavior though and I don't really understand why it's not there.

hadley · 2023-07-31T22:52:34Z

That looks more like a white-space delimited file than what you normally see in a FWF. And read_table() does work:

readr::read_table(I(c(" col1   col2 col3", "first middle last")))
#> # A tibble: 1 × 3
#>   col1  col2   col3 
#>   <chr> <chr>  <chr>
#> 1 first middle last

Created on 2023-07-31 with reprex v2.0.2

rgzn · 2023-08-01T22:08:59Z

> That looks more like a white-space delimited file than what you normally see in a FWF. And read_table() does work:

Well, I just used that example to illustrate the issue. The original data where I encountered this problem (animal tracking collar) was fixed width, not white space (there was white space in the data). I still don't understand the reasoning for not including the col_names argument.

hadley · 2023-08-01T22:19:54Z

Because it's very rare for fwf columns to have reasonable column names, since column widths are typically very small.

ilikegitlab · 2023-08-02T06:59:13Z

That is not true.

I have data from commercial dataloggers that produce tons of the stuff.

hugomflavio · 2024-10-12T19:50:29Z

hm... this shows closed as completed, but was this functionality added? I have a similar "datalogger produces a weird format output" situation, but couldn't find an argument to get read_fwf to use the first row as column headers. read_table() does not like the file because one of the columns has no data. read_fwf can parse it correctly though (except for the column names).

read_fwf example: Gets the column names (and therefore data types) wrong, but the data right.

> readr::read_fwf(I(c(" col1   col2 col3", "first        last")))
Rows: 2 Columns: 3                                                                                                 
── Column specification ───────────────────────────────────────────────────────────────────────────

chr (3): X1, X2, X3

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 2 × 3
  X1    X2    X3   
  <chr> <chr> <chr>
1 col1  col2  col3 
2 first NA    last

read_table example: Gets the columns right, but the data wrong.

> readr::read_table(I(c(" col1   col2 col3", "first        last")))
Warning: 1 parsing failure.
row col  expected    actual         file
  1  -- 3 columns 2 columns literal data

# A tibble: 1 × 3
  col1  col2  col3 
  <chr> <chr> <chr>
1 first last  NA
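
A possible workaround for files like this one, offered as a sketch rather than an official readr feature: guess the column positions once with `fwf_empty()`, read the header row as character data to recover the names, then read the remaining rows with those names attached. The variable names (`lines`, `pos`, `header`, `dat`) are illustrative. The approach assumes the header cells line up with the data columns and that none of them is blank.

```r
library(readr)

# Same two lines as in the examples above
lines <- c(" col1   col2 col3", "first        last")

# Guess the column boundaries from the whole input (header included)
pos <- fwf_empty(I(lines))

# Read only the header row, forcing character columns, to recover the names
header <- read_fwf(I(lines), col_positions = pos, n_max = 1,
                   col_types = cols(.default = col_character()),
                   show_col_types = FALSE)
pos$col_names <- unlist(header[1, ], use.names = FALSE)

# Read the data rows with the recovered names, skipping the header line
dat <- read_fwf(I(lines), col_positions = pos, skip = 1,
                show_col_types = FALSE)
dat
```

Because the header is skipped in the second read, column types are guessed from the data rows only, which avoids the type problem mentioned above. For a file on disk, the same calls should work with the file path in place of `I(lines)`.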

rgzn changed the title ~~read_fwf doesn't support using the first row as column names, why?~~ read_fwf doesn't support using the first row as column names Mar 25, 2022

sbearrows added the "feature" label (a feature request or enhancement) Apr 7, 2022

jennybc added the read 📖 label Sep 1, 2022

hadley closed this as completed Jul 31, 2023
