Reproduction of Models

This directory contains the code we used to reproduce existing German text simplification models. We reproduced the following models:

| model name | reference | code | comment |
|---|---|---|---|
| hda_LS | Siegel et al. (2019) | https://github.com/rstodden/easy-to-understand_language | We have slightly updated the original code. |
| Sockeye-APA-LHA | Spring et al. (2021) | https://github.com/ZurichNLP/RANLP2021-German-ATS | We have not changed the original code; please follow the instructions of the original authors. |
| trimmed_mbart_sents_apa | Stodden et al. (2023) | reproduction-based-on-checkpoints.ipynb | Model loaded from a Hugging Face checkpoint. |
| trimmed_mbart_sents_apa_web | Stodden et al. (2023) | reproduction-based-on-checkpoints.ipynb | Model loaded from a Hugging Face checkpoint. |
| BLOOM zero-shot | Ryan et al. (2023) | reproduction_bloom_by_ryan-eta-al-2023.ipynb | We have slightly updated the original code. |
| BLOOM 10-random-shot | Ryan et al. (2023) | reproduction_bloom_by_ryan-eta-al-2023.ipynb | We have slightly updated the original code. |
| BLOOM 10-similarity-shot | Ryan et al. (2023) | reproduction_bloom_by_ryan-eta-al-2023.ipynb | We have slightly updated the original code. |
| customer-decoder-ats | Anschütz et al. (2023) | reproduction-based-on-checkpoints.ipynb | Model loaded from a Hugging Face checkpoint. |
| mT5 models | Stodden et al. (2024) | mt5-models-loop.py | mT5 models fine-tuned using the provided code. |

Additionally, we have fine-tuned mT5 on the simple-german-corpus and on DEplain-APA. The corresponding code is also linked in this repository.
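
The three BLOOM settings in the table differ only in how in-context examples are chosen: none (zero-shot), ten drawn at random, or the ten most similar to the input sentence. The sketch below illustrates that difference. It is not the code from the notebook: the function names, the `Complex:`/`Simple:` prompt template, and the token-overlap (Jaccard) similarity are all assumptions made for illustration, and the actual reproduction may use a different similarity measure and prompt.

```python
import random


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (assumed stand-in for the real similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_shots(source, pool, k=10, strategy="random", seed=0):
    """Pick up to k (complex, simple) pairs from pool as in-context examples."""
    if strategy == "zero" or k == 0:
        return []
    k = min(k, len(pool))
    if strategy == "random":
        # Seeded so that repeated runs pick the same examples.
        return random.Random(seed).sample(pool, k)
    if strategy == "similarity":
        # Rank training pairs by similarity of their complex side to the input.
        ranked = sorted(pool, key=lambda pair: jaccard(source, pair[0]), reverse=True)
        return ranked[:k]
    raise ValueError(f"unknown strategy: {strategy!r}")


def build_prompt(source, shots):
    """Concatenate the examples and the input (hypothetical prompt template)."""
    parts = [f"Complex: {c}\nSimple: {s}" for c, s in shots]
    parts.append(f"Complex: {source}\nSimple:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    pool = [
        ("Der Hund rennt schnell durch den Park .", "Der Hund rennt ."),
        ("Die Katze liegt auf dem Sofa .", "Die Katze liegt da ."),
        ("Das Wetter ist heute gut .", "Es ist gut ."),
    ]
    source = "Der Hund rennt durch den Wald ."
    shots = select_shots(source, pool, k=1, strategy="similarity")
    print(build_prompt(source, shots))
```

The resulting prompt string would then be passed to the language model for generation; swapping `strategy` between `"zero"`, `"random"`, and `"similarity"` corresponds to the three table rows.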