Confusing error for non-ascii input #1521
In terms of what's happening, A complicating factor is that the input
Then we're also getting lots of base R warnings from a failed file existence check, since the file path is not valid UTF-8. I'm pretty surprised this code ever worked or that it worked in recent memory.

I'll have a think on whether we can improve on the error. But the error and warnings above do actually explain what's wrong, albeit in a rather cryptic way.

If you want to update the code, here are some ideas. Key changes are to explicitly convert from latin1 to UTF-8 and to use

If you want to keep using the

``` r
library(readr)
x1 <- "text\nEl Ni\xf1o was particularly bad this year"
read_csv(I(iconv(x1, "latin1", "utf-8")), show_col_types = FALSE)$text
#> [1] "El Niño was particularly bad this year"
```

``` r
library(vroom)
x1 <- "text\nEl Ni\xf1o was particularly bad this year"
vroom(I(iconv(x1, "latin1", "utf-8")), delim = ",", show_col_types = FALSE)$text
#> [1] "El Niño was particularly bad this year"
```

But if this is just about using an accented character in literal input, then use a

``` r
library(readr)
x1 <- "text\nEl Ni\u00F1o was particularly bad this year"
read_csv(I(x1), show_col_types = FALSE)$text
#> [1] "El Niño was particularly bad this year"
```

Created on 2023-11-09 with reprex v2.0.2.9000
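As a side note, one way to confirm that the literal input really is the problem is to check it with base R before handing it to readr or vroom. The sketch below is not part of the original reply. It assumes the same `x1` string as in the examples above and uses only base R's `validUTF8()` and `iconv()`. The name `x1_utf8` is made up for illustration.

``` r
# The same literal input as above: "\xf1" is the latin1 byte for "ñ",
# which is not a valid UTF-8 sequence on its own.
x1 <- "text\nEl Ni\xf1o was particularly bad this year"

# validUTF8() reports whether each element of a character vector is valid UTF-8.
validUTF8(x1)
#> [1] FALSE

# Re-encode explicitly from latin1 to UTF-8, then check again.
# (x1_utf8 is a hypothetical name used only in this sketch.)
x1_utf8 <- iconv(x1, from = "latin1", to = "UTF-8")
validUTF8(x1_utf8)
#> [1] TRUE
```

If the first check returns `FALSE`, the string needs the explicit `iconv()` step (or a `\u` escape in the literal) before it is treated as UTF-8 text.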
The noise/errors around the original example (which originates in R4DS) have probably gotten worse over time due to changes in base R. Some relevant items from NEWS:
This is from R4DS.
Created on 2023-11-09 with reprex v2.0.2