sor8sh/Semisupervised-Sentiment-Analysis

Twitter Semisupervised Sentiment Analysis

This repository was created for an NLP course project (April 2018).

Dependencies: NLTK, scikit-learn, Matplotlib

Dataset: AFINN

Semi-supervised sentiment analysis of the tweets of a Twitter account over time.

Steps:

  • Collect all tweets of an account in a JSON file, with each tweet in the following format:
{
  "source": "Twitter for iPhone",
  "text": "Some text",
  "created_at": "Sun Jul 08 21:58:52 +0000 2018",
  "retweet_count": 64399,
  "favorite_count": 183994,
  "is_retweet": false,
  "id_str": "1016079192604139520"
} 
  • Use NLTK for tokenization and lemmatization.
  • Score each word with the AFINN dataset, which rates words from -5 (very negative) to +5 (very positive).
  • Use scikit-learn to calculate precision, recall, and F1-score.
  • Use Matplotlib to plot a histogram of the sentiment scores over time.
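
The scoring step above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `TOY_LEXICON` is a hypothetical handful of AFINN-style entries, the regex tokenizer is a simple stand-in for NLTK tokenization and lemmatization, and the names `tokenize`, `score_text`, and `label` are assumptions made for this example.

```python
import re

# Hypothetical AFINN-style excerpt (word -> integer score in [-5, 5]).
# The real AFINN list is much larger; these entries are illustrative only.
TOY_LEXICON = {"good": 3, "great": 3, "bad": -3, "terrible": -3, "win": 4}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens.

    A simple stand-in for NLTK tokenization; no lemmatization is done here.
    """
    return re.findall(r"[a-z']+", text.lower())


def score_text(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of all tokens; unknown words count as 0."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# A tweet in the JSON format described above (only the fields used here).
tweet = {"text": "Some text", "created_at": "Sun Jul 08 21:58:52 +0000 2018"}
print(label(score_text(tweet["text"])))
```

Summing word scores is only one possible way to aggregate; averaging over the number of matched words would make long and short tweets more comparable.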

sentiment analysis histogram
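
To plot sentiment over time, the per-tweet scores first need to be grouped by date. The sketch below assumes the `created_at` format shown in the JSON sample and groups by calendar month; the monthly granularity, the helper names, and the sample scores are all assumptions for illustration, not taken from the repository.

```python
from collections import defaultdict
from datetime import datetime

# Matches timestamps such as "Sun Jul 08 21:58:52 +0000 2018".
# Day and month names are parsed according to the current locale (English assumed).
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def month_key(created_at):
    """Convert a created_at string into a sortable "YYYY-MM" key."""
    return datetime.strptime(created_at, TWITTER_TIME_FORMAT).strftime("%Y-%m")


def mean_score_by_month(scored_tweets):
    """Average the sentiment scores of (created_at, score) pairs per month."""
    buckets = defaultdict(list)
    for created_at, score in scored_tweets:
        buckets[month_key(created_at)].append(score)
    return {month: sum(scores) / len(scores)
            for month, scores in sorted(buckets.items())}


# Hypothetical scored tweets; the scores are made up for this example.
sample = [
    ("Sun Jul 08 21:58:52 +0000 2018", 4),
    ("Mon Jul 09 10:00:00 +0000 2018", -2),
    ("Wed Aug 01 08:30:00 +0000 2018", 3),
]
print(mean_score_by_month(sample))
```

The resulting dictionary can be passed to Matplotlib, for example `plt.bar(list(result.keys()), list(result.values()))`, to draw one bar per month.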