We are excited to announce the release of the CNNSum dataset!
As outlined in Section 3.1 and Appendix E of our paper, we performed a final round of manual cleaning to catch any remaining omissions. This cleaning affects only a small number of samples, so the length statistics reported in our paper remain virtually unchanged.
The dataset consists of two primary fields:
- context: The novel excerpts
- summary: The manually annotated summaries
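
As a rough illustration of the two-field layout, the sketch below parses records and checks that each one carries both fields as strings. The JSON Lines storage format, the `iter_samples` helper, and the sample text are assumptions made for this example only; they are not taken from the release, so adapt the loading step to the actual file format.

```python
import json
from typing import Iterable, Iterator

# The two fields listed above.
REQUIRED_FIELDS = ("context", "summary")


def iter_samples(lines: Iterable[str]) -> Iterator[dict]:
    """Yield records from JSON Lines text, keeping only the two dataset fields.

    Hypothetical helper: assumes one JSON object per line.
    """
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for field in REQUIRED_FIELDS:
            if not isinstance(record.get(field), str):
                raise ValueError(
                    f"line {line_no}: missing or non-string field {field!r}"
                )
        yield {field: record[field] for field in REQUIRED_FIELDS}


# Made-up record for demonstration; not an actual CNNSum sample.
demo_lines = [
    json.dumps(
        {
            "context": "It was a quiet night in the village...",
            "summary": "A quiet night.",
        }
    ),
]

samples = list(iter_samples(demo_lines))
for sample in samples:
    print(len(sample["context"]), len(sample["summary"]))
```

Validating the fields up front means a malformed record fails at load time rather than surfacing later during evaluation.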