This is the dataset described in the paper "Hard is the Task, the Samples are Few: A German Chiasmus Dataset". It is based on the experiments in the paper "Data-Driven Detection of General Chiasmi Using Lexical and Semantic Features", presented at the SIGHUM workshop at EMNLP 2021.
The data is in the data subfolder, stored as JSON files.
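As a starting point for exploring the files, the following is a minimal sketch that parses every JSON file in the data subfolder. The function name `load_json_files` is hypothetical, and the sketch assumes the files carry a `.json` extension and are UTF-8 encoded. It makes no assumption about the schema and only reports the top-level type of each file.

```python
import json
from pathlib import Path


def load_json_files(data_dir="data"):
    """Parse every *.json file directly inside data_dir.

    Returns a dict mapping file name to the parsed JSON content.
    The ".json" extension and UTF-8 encoding are assumptions.
    """
    contents = {}
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            contents[path.name] = json.load(handle)
    return contents


if __name__ == "__main__":
    for name, content in load_json_files().items():
        # Only report the top-level type; the schema is not assumed here.
        print(name, type(content).__name__)
```

Inspecting the top-level type first (list or dict) is a cheap way to learn how the samples are organized before writing any field-specific code.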
Example, with the crossing terms in brackets:
Der Reiche setzt eher sein [Leben] für seinen [Reichtum] als seinen [Reichtum] für sein [Leben] aufs Spiel.
English translation: The rich one would rather risk his [life] for his [riches] than his [riches] for his [life].
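The bracketed words in the example above follow the A B B A order that characterizes a chiasmus. The sketch below makes that pattern explicit by extracting the bracketed terms and checking their order. The helper names `bracketed_terms` and `is_abba` are hypothetical. The bracket notation is taken from this example only and is not assumed to match the layout of the JSON files. Exact string comparison is a simplification for illustration, not the detection method from the papers.

```python
import re


def bracketed_terms(sentence):
    """Return the words enclosed in square brackets, in order of appearance."""
    return re.findall(r"\[([^\]]+)\]", sentence)


def is_abba(terms):
    """Check whether four terms form the crossed pattern A B B A (exact match)."""
    return (
        len(terms) == 4
        and terms[0] == terms[3]
        and terms[1] == terms[2]
        and terms[0] != terms[1]
    )


german = (
    "Der Reiche setzt eher sein [Leben] für seinen [Reichtum] "
    "als seinen [Reichtum] für sein [Leben] aufs Spiel."
)
terms = bracketed_terms(german)
print(terms)           # ['Leben', 'Reichtum', 'Reichtum', 'Leben']
print(is_abba(terms))  # True
```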
@inproceedings{schneider2023hard,
title = {Hard is the Task, the Samples are Few: A German Chiasmus Dataset},
author = {Felix Schneider and Sven Sickert and Phillip Brandes and Sophie Marshall and Joachim Denzler},
booktitle = {Human Language Technologies as a Challenge for Computer Science and Linguistics},
year = {2023},
address = {Poznan, Poland},
pages = {255--260},
doi = {10.14746/amup.9788323241775},
}