Tudocomp Datasets

A collection of scripts and sources for the generation and gathering of a comprehensive text corpus.

Standalone usage

TODO: Running tests

Using as external dependency

TODO: use case of copying, or using as a submodule

Dependencies

The CMake build process first looks for the external dependencies on the system. Any dependency that cannot be found is automatically downloaded and built from its official repository, so installing the dependencies beforehand is not required.
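The find-or-download behaviour described above can be sketched in CMake roughly as follows. This is a minimal illustration, not the build scripts of this repository: GoogleTest is used purely as a stand-in dependency, and `FetchContent` is only one of several mechanisms (another being `ExternalProject`) that could implement the fallback.

```cmake
# Sketch only: GoogleTest is an illustrative stand-in, not a confirmed
# dependency of this repository.
cmake_minimum_required(VERSION 3.14)
project(dependency_fallback_sketch LANGUAGES CXX)

include(FetchContent)

# Step 1: look for an installed copy on the system.
# QUIET suppresses the warning when nothing is found.
find_package(GTest QUIET)

if(NOT GTest_FOUND)
  # Step 2: not installed, so download the sources from the official
  # repository and build them as part of this build.
  FetchContent_Declare(
    googletest
    GIT_REPOSITORY https://github.com/google/googletest.git
    GIT_TAG v1.14.0
  )
  FetchContent_MakeAvailable(googletest)
endif()
```

With a layout like this, a system-wide installation is picked up when present, and a clean machine still configures without any manual setup, at the cost of a download on the first run.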

Said external dependencies are the following:

License

The code in this repository is published under the Apache License 2.0.