Inconsistent file ordering when generating checksums #393
Maybe it's because of this: #394
Do this: sort the checksum files as you diff them, for example as in the sketch below.
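A minimal sketch of that suggestion, assuming a Bash shell and the 1.md5/2.md5 file names used later in this issue:

```bash
# Sort both checksum listings on the fly so line order no longer matters for the comparison.
diff <(sort 1.md5) <(sort 2.md5)
```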
The checksum output of md5deep/hashdeep is in the order in which the hashing threads complete their work. From the man page:

> By default the program will create one producer thread to scan the file system and one hashing thread per CPU core. Multi-threading causes output filenames to be in non-deterministic order, as files that take longer to hash will be delayed while they are hashed. If a deterministic order is required, specify -j0 to disable multi-threading.

So, as well as the suggestion about sorting the checksums as you diff them, you have a few options to produce nicely sorted output.

(1) If you don't care about how long the job takes to run, use the -j0 flag, for example:
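A minimal sketch (the output file name is a placeholder; -r and the folder name come from the issue):

```bash
# -j0 disables multi-threading, so files are hashed and printed in a deterministic order,
# at the cost of a slower run.
md5deep -j0 -r folder/ > 1.md5
```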
(2) Sort the output as the job runs, for example:
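A minimal sketch, assuming md5deep's default "hash  filename" output so that sorting from the second field onward orders the lines by filename:

```bash
# Keep multi-threaded hashing for speed, but sort the finished lines by filename.
md5deep -r folder/ | sort -k 2 > 1.md5
```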
I prefer to use hashdeep, which has more complex output. I'll just note my method here for sorting (using Bash shell):
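A sketch of that approach (not the exact commands from this comment), assuming hashdeep's default five-line header, its default size,md5,sha256,filename field order, and the -e flag printing per-file progress estimates to stderr; the temporary and output file names are placeholders:

```bash
# Hash recursively; -e prints a per-file progress estimate to stderr,
# while the checksum records go to a temporary file.
tmp="$(mktemp)"
hashdeep -e -r folder/ > "$tmp"

# Keep the 5-line hashdeep header intact and sort the remaining
# comma-separated records by their filename field (field 4).
{ head -n 5 "$tmp"; tail -n +6 "$tmp" | sort -t, -k4; } > sorted.hashdeep
rm -f "$tmp"
```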
This gives output on stderr showing the hashing progress for each file, writes the checksums to a temporary file, and then sorts that temporary file.
I ran
md5deep -r folder/
several consecutive times; to my surprise, the file order may differ each time, even though the contents are exactly the same. For example, first checksum generation:
Second checksum generation:
I can't simply run
diff 1.md5 2.md5
because the order differs; I have to sort the files first. Can this be fixed, or is the file list generated in random order before the checksums are computed?