This repository contains the code and dataset for our paper: PDFTriage: Question Answering over Long, Structured Documents (EMNLP 2024 Industry Track).
If you use this work in your research, please cite the paper as follows:
@article{saad2023pdftriage,
title={{PDFTriage}: Question Answering over Long, Structured Documents},
author={Saad-Falcon, Jon and Barrow, Joe and Siu, Alexa and Nenkova, Ani and Rossi, Ryan A and Dernoncourt, Franck},
journal={arXiv preprint arXiv:2309.08872},
year={2023}
}
- The code and model are licensed under the Adobe Research License, which allows non-commercial research use and prohibits commercial use.
- The dataset is licensed under the CC BY-NC-SA 4.0 license.