This repository contains the data for AFRIDOC-MT, a document-level multi-parallel translation dataset covering English and five African languages: Amharic, Hausa, Swahili, Yorùbá, and Zulu. The dataset comprises 334 health and 271 information technology news documents, all human-translated from English into each of the five languages.
The GitHub page is still under construction. Please refer to the Hugging Face page for the data.
The project was generously funded by Lacuna Fund.