I have a file containing a very large number of lines (>150 million). For my purposes, I have tried to read the file one piece at a time (e.g. 10 million lines) using read_lines(), extracting the needed information before moving on to the next piece, but I'm getting incorrect results. To illustrate, I have created a small dummy file (attached) with about 30 lines. When I read this file 10 lines at a time, three times, and piece the results together, the result is very different from reading all 30 lines at once.
library(readr)

conn <- file("readr_error.txt", open = "rb")
r1 <- read_lines(conn, n_max = 10) # attempting to read lines 1-10
r2 <- read_lines(conn, n_max = 10) # attempting to read lines 11-20
r3 <- read_lines(conn, n_max = 10) # attempting to read lines 21-30
r4 <- c(r1, r2, r3)
close(conn)

conn <- file("readr_error.txt", open = "rb")
r5 <- read_lines(conn, n_max = 30) # reading lines 1-30
close(conn)

# all lines from 11 to 30 are not equal
r4 == r5
On the other hand, if I use readLines from the base R package, it works as intended:
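A minimal sketch of the base R version, for comparison. It assumes a temporary 30-line file as a hypothetical stand-in for the attached readr_error.txt, and opens the connection in text mode ("r"). readLines() continues from the connection's current position on each call, so the three chunks line up with a single 30-line read:

```r
# Hypothetical stand-in for readr_error.txt: 30 short lines in a temp file.
path <- tempfile(fileext = ".txt")
writeLines(sprintf("line %02d", 1:30), path)

# Read the file in three chunks of 10 lines from one open connection.
conn <- file(path, open = "r")
b1 <- readLines(conn, n = 10) # lines 1-10
b2 <- readLines(conn, n = 10) # lines 11-20
b3 <- readLines(conn, n = 10) # lines 21-30
close(conn)
chunked <- c(b1, b2, b3)

# Read all 30 lines in a single call for comparison.
whole <- readLines(path, n = 30)

# Check whether the pieced-together chunks match the single read.
identical(chunked, whole)
```

With readLines() the chunked and single reads agree, which is the behavior I expected from read_lines() as well.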
Can you please explain this?
readr_error.txt