CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels

[2025.5] - Accepted to Findings of ACL 2025

[2025.1] - Add inference script

[2024.12] - CNNSum Dataset Release

We are excited to announce the release of the CNNSum dataset!

As outlined in Section 3.1 and Appendix E of our paper, we conducted a final round of manual cleaning to catch anything missed in earlier passes. This affected only a small number of samples, so the length statistics reported in the paper remain virtually unchanged.

Dataset Details

The dataset consists of two primary fields:

context: The novel excerpts
summary: The manually annotated summaries
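As a minimal sketch of reading these two fields, the snippet below assumes each sample is stored as one JSON object per line (JSON Lines). That storage format is an assumption, not something stated above, and the helper name `iter_samples` and the path `data/example.jsonl` are hypothetical; adjust both to match the files actually shipped in the repository.

```python
import json
from pathlib import Path
from typing import Iterator

# The two fields listed under "Dataset Details".
REQUIRED_FIELDS = {"context", "summary"}


def iter_samples(path: str) -> Iterator[dict]:
    """Yield one sample per non-empty line of a JSON Lines file.

    Assumption: one JSON object per line. The real on-disk format
    of the dataset may differ.
    """
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            missing = REQUIRED_FIELDS - record.keys()
            if missing:
                raise KeyError(f"sample is missing fields: {sorted(missing)}")
            yield record


if __name__ == "__main__":
    # Hypothetical path, used only for illustration.
    for sample in iter_samples("data/example.jsonl"):
        print(len(sample["context"]), "characters of novel text")
        print(sample["summary"])
        break
```

Reading line by line keeps memory use flat, which matters because each `context` can be a very long excerpt.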


