Recommend file(..., raw = TRUE) for checksums #85

Open
nielsaka opened this issue Dec 18, 2020 · 0 comments

When computing file checksums with sha1() or similar, I would recommend setting raw = TRUE on the file connection. Could that be added to the documentation?

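Use case: comparing files across machines. If the file is an RDS file (or perhaps any binary or compressed file) and raw = FALSE (the default), file() does something that changes the hash, so it no longer matches the output of sha1sum. Going by the documentation of file() quoted below, the likely cause is its check for compressed files: RDS files are gzip-compressed by default, so the hash may be computed over the decompressed content rather than over the bytes on disk. Using raw = TRUE is also quicker.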

Example:

> system("sha1sum data/article_all.Rds")
8192a2610e8e67e559ba80760f198bf810096f7a  data/article_all.Rds
> openssl::sha1(file("data/article_all.Rds"))
sha1 9c:11:ac:17:5a:86:7c:67:a4:77:ad:87:35:67:62:09:64:1e:88:36 
> openssl::sha1(file("data/article_all.Rds", raw = TRUE))
sha1 81:92:a2:61:0e:8e:67:e5:59:ba:80:76:0f:19:8b:f8:10:09:6f:7a 
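
A reproducible variant of the comparison above, as a minimal sketch. It assumes the openssl package is installed and uses a temporary file written with saveRDS() as a stand-in for data/article_all.Rds. The names path, on_disk, bytes and default_con are illustrative only.

```r
library(openssl)

# Hypothetical stand-in for data/article_all.Rds
path <- tempfile(fileext = ".rds")
saveRDS(mtcars, path)  # saveRDS() compresses with gzip by default

# Hash via a raw connection: no check for a compressed file
on_disk <- sha1(file(path, raw = TRUE))

# Hash of the exact bytes stored on disk, read without a connection wrapper
bytes <- readBin(path, what = "raw", n = file.size(path))
from_bytes <- sha1(bytes)

# Hash via the default connection (raw = FALSE)
default_con <- sha1(file(path))

print(on_disk)
print(from_bytes)
print(default_con)
```

If the first two hashes agree and the third differs, that would be consistent with the default connection not hashing the file's bytes as stored.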

 
From the documentation of file

raw
logical. If true, a ‘raw’ interface is used which will be more suitable for arguments which are not regular files, e.g. character devices. This suppresses the check for a compressed file when opening for text-mode reading, and asserts that the ‘file’ may not be seekable.
