This release (v1.1) builds on the initial release (v1.0) of September 22, 2019, and incorporates several improvements. The underlying dataset is unchanged, except that it is now also provided in Hugging Face format.
What's Changed
- Fix path in a data preparation notebook by @c4n in #6
- Add Hugging Face format by @cstorm125 in #7
- Revised the copyright and disclaimer statements and added a LICENSE and SPDX file for legal clarity.
- Updated contributor recognition to acknowledge those who contributed to the tokenization annotation dataset.
- Added metadata to the citation information so that the release is automatically recognized as a dataset (it was previously recognized as "software").
New Contributors
- @c4n made their first contribution in #6
- @cstorm125 made their first contribution in #7
Full Changelog: v1.0...v1.1