Starter repository for the Manning liveProject: What's the news: Summarize news articles with NLP, Deep Learning, and Python.

Manning-LP-What-s-The-News/Starter-Repository

Starter repository for the Manning liveProject [Summarize News Articles with NLP and TensorFlow](https://www.manning.com/liveproject/summarize-news-articles-with-nlp-and-tensorflow). The code lpsumnlp20 is always good for a 35% discount on the liveProject. This repository contains intermediate files that learners of this liveProject may find helpful. Those files reside here.

By: Souradip Chakraborty & Sayak Paul

About this liveProject

In this liveProject, you will step into the shoes of an NLP Engineer building an automatic text summarizer for your colleagues at a news media firm. This hypothetical firm uses flashcards of broad news articles to design the front page of its blog, which is read by more than a million readers across the globe. Currently, the news editors write the content for these flashcards by manually summarizing the prospective news articles, which is very time-consuming. The editors will use the text summarizer to generate these summaries automatically, giving them fairly good starting points.

The text summarizer will play a crucial role in reducing the editors' turnaround time for developing flashcard content. Your first assignment as a newly hired NLP Engineer is to develop a PoC (proof of concept) text summarizer so that the stakeholders can properly plan the next steps.

In brief, the steps are:

  • Converting an abstractive text summarization dataset into an extractive one, using ROUGE scores.
  • Visualizing the newly prepared dataset and extracting meaningful summary statistics from it, for example the maximum news article length, the maximum summary length, and so on.
  • Preprocessing the dataset with basic NLP techniques such as tokenization and padding.
  • Building deep learning models with an attention mechanism that can produce meaningful summary candidates from a news article.
  • Preparing an overall report for the stakeholders on the best-performing deep learning models from your experiments.
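
To make the first step above more concrete, here is a minimal sketch of one common way to turn an abstractive dataset into extractive labels: greedily pick the article sentences that best raise a ROUGE-style overlap score against the human-written summary. This is an illustration under stated assumptions, not the liveProject's reference solution. The function names `rouge1_f1` and `greedy_extractive_labels`, the `max_sentences` parameter, the whitespace tokenization, and the simplified unigram-overlap F1 standing in for a full ROUGE implementation are all hypothetical choices made for this example.

```python
from collections import Counter


def rouge1_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1 between two token lists (a simplified ROUGE-1)."""
    if not candidate_tokens or not reference_tokens:
        return 0.0
    # Clipped unigram matches: each token counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def greedy_extractive_labels(article_sentences, abstractive_summary, max_sentences=3):
    """Return one 0/1 label per article sentence.

    Assumption for this sketch: sentences are added one at a time, always
    choosing the one that most improves the score of the selected set, and
    selection stops as soon as no remaining sentence improves it.
    """
    reference = abstractive_summary.lower().split()
    tokenized = [sentence.lower().split() for sentence in article_sentences]
    selected, best_score = [], 0.0
    while len(selected) < max_sentences:
        best_idx = None
        for idx in range(len(tokenized)):
            if idx in selected:
                continue
            # Score the already selected sentences plus this candidate.
            candidate = [tok for i in selected + [idx] for tok in tokenized[i]]
            score = rouge1_f1(candidate, reference)
            if score > best_score:
                best_score, best_idx = score, idx
        if best_idx is None:
            break  # no remaining sentence improves the score
        selected.append(best_idx)
    return [1 if i in selected else 0 for i in range(len(article_sentences))]


sentences = [
    "the cat sat on the mat",
    "stocks fell sharply on monday",
    "the weather was sunny",
]
labels = greedy_extractive_labels(sentences, "stocks fell on monday")
print(labels)  # [0, 1, 0]
```

In a real experiment you would likely swap the simplified scorer for an established ROUGE implementation and use a proper sentence and word tokenizer; the greedy loop itself would stay the same.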