This is a Python3 tool that can mass-download files from Zenodo records.
The code is hosted on GitHub; the former GitLab hosting is discontinued.
From PyPI:
pip3 install zenodo_get
Or from GitHub:
pip3 install git+https://github.com/dvolgyes/zenodo_get
Afterwards, you can query the command line options:
zenodo_get -h
but the default settings should work for most use cases:
zenodo_get RECORD_ID_OR_DOI
The tool itself is simple, and the help message is reasonable:
zenodo_get -h
but if you need more, open a GitHub ticket and explain what is missing.
Basic usage:
zenodo_get RECORD_ID_OR_DOI
Special parameters:
-m
: generate md5sums.txt for verification. Beware: if md5sums.txt is already present in the dataset, the dataset's file will overwrite the generated one. Verification example: md5sum -c md5sums.txt
-g GLOB
: a glob expression to select a subset of the record files.
-w FILE
: instead of downloading the record files, generate a FILE which contains direct links to the Zenodo site. These links can be downloaded with any download manager, e.g. with wget: wget -i urls.txt
-e
: continue on error. Files with errors are skipped, but the tool still tries to download the rest of the files.
-k
: keep files: files with an invalid md5 checksum are kept. The main purpose is debugging.
-R N
: retry on error N times.
-p N
: waiting time in seconds before a retry attempt. Default: 0.5 sec.
-n
: do not continue. The default behaviour is to download only the files which are not yet downloaded or whose checksum does not match the file. This flag disables this feature: it forces the download of existing files and assigns a new name to them (e.g. file(1).ext).
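If md5sum is not available on your platform, the verification step for the -m option can be approximated in Python. This is only a minimal sketch, not part of zenodo_get: it assumes md5sums.txt uses the usual md5sum layout (hex digest, whitespace, file name) and that the listed files sit next to it. The function names file_md5 and verify_md5sums are made up for this illustration.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5sums(listing="md5sums.txt"):
    """Return the names of files that are missing or fail their checksum.

    Assumption: each non-empty line looks like "<hex digest>  <file name>",
    which is the layout that `md5sum -c` accepts.
    """
    listing = Path(listing)
    failed = []
    for line in listing.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum marks binary mode with a leading '*'
        target = listing.parent / name
        if not target.is_file() or file_md5(target) != expected.lower():
            failed.append(name)
    return failed
```

An empty list returned by verify_md5sums() means every listed file was found and matched its checksum.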
Remark for batch processing: the program exits with a non-zero exit code if any error has occurred, for instance a checksum mismatch, download error, or time-out. Only fully successful downloads end with exit code 0.
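The exit code makes it straightforward to script several downloads and collect the ones that failed. The following is a rough sketch rather than an official interface: it assumes zenodo_get is on the PATH, uses only the basic invocation shown above, and the helper name download_records and its run parameter are invented for this example.

```python
import subprocess


def download_records(record_ids, run=subprocess.run):
    """Call zenodo_get once per record and return the records that failed.

    `run` defaults to subprocess.run; it is a parameter only so the loop
    can be tried out without network access (hypothetical helper, not
    part of zenodo_get).
    """
    failed = []
    for record in record_ids:
        # Basic documented form: zenodo_get RECORD_ID_OR_DOI
        result = run(["zenodo_get", str(record)])
        # Any non-zero exit code means at least one error occurred.
        if result.returncode != 0:
            failed.append(record)
    return failed
```

Calling download_records() with a list of record IDs or DOIs returns the subset that needs another attempt; an empty list means every download finished with exit code 0.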
You don't really need to cite this software, unless you use it for an academic publication. E.g. if you download something from Zenodo with zenodo-get, there is no need to cite anything. If you download a lot from Zenodo, you publish about Zenodo, and my tool is an integral part of the methodology, then you could cite it. You can always ask the tool to print the most up-to-date reference, in both plain text and BibTeX format:
zenodo_get --cite