System for extractive summarization of research text using Deep Learning
We consider summarization as a sequence-to-sequence mapping task where:
Source Sequence: Research text
Target Sequence: Extractive summary of the text
Hence, we use LSTM (Long Short-Term Memory) based RNNs (Recurrent Neural Networks) with an attention mechanism, since research text has many long-term dependencies among the various sections of a paper. The attention mechanism models focusing on some portions of the input text more than others at different time steps. During training, the abstract of the input research paper is used as the target output.
Scientific articles are too long to be processed on current GPUs using LSTMs. Hence, before applying the sequence-to-sequence model, we need a compressed representation of the paper that retains the important information it conveys.
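A minimal sketch of the kind of LSTM encoder-decoder with attention described above, written with Keras; the vocabulary size, layer sizes, and the use of additive (Bahdanau-style) attention are illustrative assumptions, not the exact configuration used in this project.

```python
# Sketch of an LSTM encoder-decoder with attention (assumed hyperparameters).
import tensorflow as tf
from tensorflow.keras import layers, Model

vocab_size = 50000   # assumed vocabulary size
embed_dim = 128      # assumed embedding size
hidden_dim = 256     # assumed LSTM size

# Encoder: reads the (compressed) research text.
enc_inputs = layers.Input(shape=(None,), name="source_tokens")
enc_emb = layers.Embedding(vocab_size, embed_dim)(enc_inputs)
enc_outputs, state_h, state_c = layers.LSTM(
    hidden_dim, return_sequences=True, return_state=True)(enc_emb)

# Decoder: generates the summary, initialised with the encoder's final state.
dec_inputs = layers.Input(shape=(None,), name="target_tokens")
dec_emb = layers.Embedding(vocab_size, embed_dim)(dec_inputs)
dec_outputs, _, _ = layers.LSTM(
    hidden_dim, return_sequences=True, return_state=True)(
    dec_emb, initial_state=[state_h, state_c])

# Additive attention over the encoder outputs at every decoder step.
context = layers.AdditiveAttention()([dec_outputs, enc_outputs])
dec_combined = layers.Concatenate()([dec_outputs, context])
logits = layers.Dense(vocab_size, activation="softmax")(dec_combined)

model = Model([enc_inputs, dec_inputs], logits)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```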
Dataset: LaTeX sources of articles from arxiv.org.
- Using Extractive Summarization: We first generate a 2000-word summary of the input research text using the LexRank algorithm (among others), then pass this extractive summary to a sequence-to-sequence model to further summarize it (see the first sketch after this list).
- Using Paragraph Embeddings: We first use the Para2Vec algorithm to generate embeddings for the individual sections of the input research paper, then pass these as inputs to the sequence-to-sequence model (see the second sketch after this list).
- Using GRUs (Gated Recurrent Units): We first use LexRank to generate a single-sentence summary for each paragraph of the input research text, then concatenate these representative sentences and pass them to a GRU-based model (see the third sketch after this list).
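For the first approach, a minimal sketch of the LexRank extraction step using the sumy library; the library choice and the way the 2000-word budget is enforced are assumptions, and the original pipeline may compute LexRank differently.

```python
# Sketch of the LexRank extraction step using the sumy library (assumed tooling).
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer

def lexrank_extract(text, max_words=2000):
    """Return an extractive summary of roughly max_words words."""
    parser = PlaintextParser.from_string(text, Tokenizer("english"))
    summarizer = LexRankSummarizer()
    # Ask for generously many sentences, then cut off at the word budget.
    sentences = summarizer(parser.document, 200)
    out, count = [], 0
    for sentence in sentences:
        words = str(sentence).split()
        if count + len(words) > max_words:
            break
        out.append(str(sentence))
        count += len(words)
    return " ".join(out)
```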
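For the second approach, a minimal sketch of generating per-section embeddings with gensim's Doc2Vec implementation of the Paragraph Vector model; the vector size, training epochs, and per-section split are illustrative assumptions.

```python
# Sketch of per-section Paragraph Vector (Doc2Vec) embeddings using gensim
# (assumed library; hyperparameters are illustrative).
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

def section_embeddings(sections):
    """sections: list of raw-text sections of one paper -> list of vectors."""
    docs = [TaggedDocument(words=sec.lower().split(), tags=[i])
            for i, sec in enumerate(sections)]
    model = Doc2Vec(documents=docs, vector_size=100, window=5,
                    min_count=2, epochs=40)
    # One fixed-size vector per section, to be fed to the seq2seq model.
    return [model.dv[i] for i in range(len(sections))]
```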
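For the third approach, a minimal sketch of a GRU-based encoder-decoder over the concatenated representative sentences; the layer sizes are assumed, and the multiple-timescale GRU of Kim et al. is not reproduced here.

```python
# Sketch of a plain GRU encoder-decoder (assumed sizes).
import tensorflow as tf
from tensorflow.keras import layers, Model

vocab_size, embed_dim, hidden_dim = 50000, 128, 256  # assumed

# Encoder over the concatenated representative sentences.
enc_in = layers.Input(shape=(None,), name="source_tokens")
enc_emb = layers.Embedding(vocab_size, embed_dim)(enc_in)
_, enc_state = layers.GRU(hidden_dim, return_state=True)(enc_emb)

# Decoder generating the summary tokens.
dec_in = layers.Input(shape=(None,), name="target_tokens")
dec_emb = layers.Embedding(vocab_size, embed_dim)(dec_in)
dec_out = layers.GRU(hidden_dim, return_sequences=True)(
    dec_emb, initial_state=enc_state)
logits = layers.Dense(vocab_size, activation="softmax")(dec_out)

model = Model([enc_in, dec_in], logits)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```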
To evaluate the quality of the summaries generated by our system, we use the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics.
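A minimal sketch of computing ROUGE scores with the rouge-score package; this tooling choice is an assumption, and the project may instead use the original ROUGE toolkit described in the reference below.

```python
# Sketch of ROUGE evaluation using the rouge-score package (assumed tooling).
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)

reference = "the paper's abstract, used as the reference summary"
generated = "the summary produced by our system"

scores = scorer.score(reference, generated)
for name, result in scores.items():
    print(f"{name}: precision={result.precision:.3f} "
          f"recall={result.recall:.3f} f1={result.fmeasure:.3f}")
```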
The results and examples generated using the different approaches can be found in the presentation in the root directory of the project.
- Erkan, Günes, and Dragomir R. Radev. "LexRank: Graph-based lexical centrality as salience in text summarization." Journal of Artificial Intelligence Research 22 (2004): 457-479.
- Liu, Peter, and Xin Pan. "Text Summarization with TensorFlow" (generation of news headlines).
- Le, Quoc V., and Tomas Mikolov. "Distributed Representations of Sentences and Documents." ICML. Vol. 14. 2014.
- Kim, Minsoo, Moirangthem Dennis Singh, and Minho Lee. "Towards Abstraction from Extraction: Multiple Timescale Gated Recurrent Unit for Summarization." arXiv preprint arXiv:1607.00718 (2016).
- Lin, Chin-Yew. "ROUGE: A Package for Automatic Evaluation of Summaries." Text Summarization Branches Out: Proceedings of the ACL-04 Workshop. Vol. 8. 2004.