These are the Bi-Context datasets used in the AAAI 2017 paper "Geometry of Compositionality". Files: (1) bicontext_English.txt: contains 104 English phrases and 208 sentences. (2) bicontext_Chinese.txt: contains 64 Chinese phrases and 128 sentences.
Data Format: Each line holds a target phrase and a sentence containing it, separated by a tab, i.e., "phrase<TAB>sentence". Every phrase in the datasets is polysemous and appears with two sentences: in one sentence the phrase has its literal (compositional) meaning, and in the other it has its idiomatic (non-compositional) meaning.
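The format above can be read with a few lines of Python. This is a minimal sketch, not code from the paper: the function names `parse_bicontext` and `load_bicontext` are hypothetical, UTF-8 encoding is an assumption, and the sketch does not try to tell which of the two sentences is the literal one, since the format description does not say how that is marked.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def parse_bicontext(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group sentences by target phrase from tab-separated lines.

    Hypothetical helper: each line is assumed to be "phrase<TAB>sentence".
    """
    sentences_by_phrase: Dict[str, List[str]] = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only, in case a sentence contains a tab.
        phrase, sep, sentence = line.partition("\t")
        if not sep:
            raise ValueError(f"expected a tab-separated line, got: {line!r}")
        sentences_by_phrase[phrase.strip()].append(sentence.strip())
    return dict(sentences_by_phrase)


def load_bicontext(path: str) -> Dict[str, List[str]]:
    """Read one dataset file and return a phrase -> sentences mapping."""
    # UTF-8 is an assumption; it matters mainly for the Chinese file.
    with open(path, encoding="utf-8") as handle:
        return parse_bicontext(handle)
```

Given the counts stated above, `load_bicontext("bicontext_English.txt")` would be expected to return 104 phrases with two sentences each, which makes a simple sanity check after loading.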
If you use our data or code, please cite our work: Hongyu Gong, Suma Bhat, Pramod Viswanath. Geometry of Compositionality. In Thirty-First AAAI Conference on Artificial Intelligence, 2017 Feb 12.
@inproceedings{gong2017geometry,
  title={Geometry of compositionality},
  author={Gong, Hongyu and Bhat, Suma and Viswanath, Pramod},
  booktitle={Thirty-First AAAI Conference on Artificial Intelligence},
  year={2017}
}