Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fwrite() should (possibly invisibly) return the file path #5706

Open
eliocamp opened this issue Oct 18, 2023 · 12 comments · May be fixed by #6181
Open

fwrite() should (possibly invisibly) return the file path #5706

eliocamp opened this issue Oct 18, 2023 · 12 comments · May be fixed by #6181
Assignees
Labels

Comments

@eliocamp
Copy link
Contributor

Now fwrite() returns invisible(), but I think it might be more useful if it returned the path to the created file. It's a trivial change and I don't think it would create issues with backward compatibility, since it's probably not likely that any code relies on fwrite() not returning anything (although a similar change to ggsave() did break a few scripts 🤣).

@MichaelChirico
Copy link
Member

why the file path as opposed to x?

the latter behavior would be more like print() where a function used for side effects returns it's main argument (https://design.tidyverse.org/out-invisible.html)

@eliocamp
Copy link
Contributor Author

I guess because then you can work with the results of the write in the pipeline. It would be analogous to fs::file_create() in that page. Also, you already have easy access to x in your environment, while the path to the file might not be available since it might be created on based on data. For instance:

DT[, fwrite(.SD, group), by = group]

Will create a bunch of files, but you don't have access to them (the result is an empty data.table). If fwrite() returned the created file you could do

files <- DT[, fwrite(.SD, group), by = group]

and then use files in your code easily.

@MichaelChirico
Copy link
Member

It would be analogous to fs::file_create()

Not really, because the "main argument" of file_create() is the file path. The main argument for fwrite() is x.

you already have easy access to x in your environment

Not necessarily, e.g. fwrite(DT[...], file) where now the output can be any table.

Your code can be done currently like:

files <- DT[, fwrite(.SD, .BY$group), by = group]$group

More generally, I think it's just a difference between code like:

DT[, by = group, {
  ...
  fwrite(.SD, file, ...)
  .SD
}]

and

DT[, by = group, {
  ...
  fwrite(.SD, file, ...)
  file
}]

So I'm still not sure why to prioritize one over the other.

One way to approach this empirically would be to find out which object is used more:

https://github.com/search?q=lang%3AR+%22fwrite%28%22+%22.SD%22+%22%7B%22&type=code

@eliocamp
Copy link
Contributor Author

Oh, sorry, I meant fs:file_copy(from, to), which returns to instead of from.

One issue with your examples is that you only grab the file name and not the full path.

For fwrite(DT[...], file), you could always do fwrite(saved_data <- DT[...], file) . I'd argue that if you're computing something inside the call to fwrite() then you probably don't care about the output in the rest of your code, otherwise you would save it to a variable and then save it. A more realistic example could be to save intermediate steps in a pipeline like:

DT2 <- DT |> 
   _[...] |> 
   fwrite("file.csv") |> 
   _[...]

But the %T>% pipe might be more useful for that, as you are not relying on the return value of the intermediate function.

@jangorecki jangorecki changed the title fwrite() should (possibly invidibly) return the file path fwrite() should (possibly invisibly) return the file path Nov 3, 2023
@AngelFelizR
Copy link

@eliocamp I think that adding a write step between several modification steps would affect code readability.

It will take more time to a new user to understand when we are saving the data in the script

@eliocamp
Copy link
Contributor Author

eliocamp commented Nov 4, 2023

I agree. That's why I think that fwrite() should return the path to the created file. So the next step after writing would be to do stuff with the file, which is relatively clear and straightforward.

@AngelFelizR
Copy link

Got it.

It will prevent me from writing a for loop to achieve that task. I wouldn't have written the next code as it is really clever.

files <- DT[, fwrite(.SD, .BY$group), by = group]$group

It can become much harder if we want to save the files in a folder, so returning the file's path seems to be a good a idea.

files <- DT[, fwrite(.SD,  paste0("data/", .BY$group, ".csv")), by = group]

@Nj221102
Copy link
Contributor

Hi i want to work on this issue , @MichaelChirico I wanna know what you think should be returned x or file path, and can you give me some idea about stuff need to be done for this issue :)

@Nj221102 Nj221102 self-assigned this Mar 20, 2024
@ben-schwen
Copy link
Member

AFAIU there is no decision made yet what should be returned by fwrite. First step could be to scan popular packages and see what they do or what base does, e.g. check out the writing to disk methods for utils::write.table, vroom, fst, etc.

@jangorecki
Copy link
Member

I would say if they return NULL then it is not very useful follow them. TRUE is already better result then NULL...

@Nj221102
Copy link
Contributor

I would say if they return NULL then it is not very useful follow them. TRUE is already better result then NULL...

what do you think should be returned?

@jangorecki
Copy link
Member

File name or path, depends what was provided

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants