Predict the emotion behind the tweets with recurrent neural network

pclightyear/Twitter_Emotion_Recognition

 
 

Emotion Recognition on Twitter

The goal of this project is to predict the emotion behind tweets. It was a competition in the data mining course offered at National Tsing Hua University.

We preprocessed the tweets and trained an LSTM model on 1.46 million of them. Our method reaches a mean F1-score of 0.477 on the test data and placed 9th out of 40 teams in the class. The complete report is provided in this notebook.
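For readers unfamiliar with the metric, the sketch below shows one way a mean F1-score over several classes can be computed. It assumes the competition used the macro average (the unweighted mean of the per-class F1 scores); the README does not state the averaging scheme. The toy labels are made up for illustration.

```python
from collections import Counter

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro average, an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the truth but was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2*TP / (2*TP + FP + FN)
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example, not project data.
y_true = ["joy", "joy", "anger", "fear"]
y_pred = ["joy", "anger", "anger", "fear"]
score = macro_f1(y_true, y_pred)
```

With scikit-learn, `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` computes the same quantity.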

Preprocessing

  • Replace dataset-specific tokens such as "@user" and hashtags with tokens that exist in the vocabulary of the pre-trained embeddings.
  • Replace common emojis with their corresponding adjectives.
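The two preprocessing steps above could look roughly like the following. This is a minimal sketch, not the project's actual code: the `<user>` and `<hashtag>` target tokens, the emoji-to-adjective table, and the `preprocess` function are all assumptions made for illustration, and plain whitespace splitting stands in for a real tokenizer.

```python
import re

# Hypothetical mapping: the README does not list the actual replacements.
EMOJI_TO_ADJECTIVE = {
    "😂": "funny",
    "😭": "sad",
    "😡": "angry",
}

MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")

def preprocess(tweet):
    # Map mentions to a single token assumed to be in the embedding vocabulary.
    text = MENTION_RE.sub("<user>", tweet)
    # Keep the hashtag word but mark it with an assumed "<hashtag>" token.
    text = HASHTAG_RE.sub(r"<hashtag> \1", text)
    # Swap each known emoji for an adjective, padded so it becomes its own word.
    for emoji, adjective in EMOJI_TO_ADJECTIVE.items():
        text = text.replace(emoji, f" {adjective} ")
    # Lowercase and collapse repeated whitespace.
    return " ".join(text.lower().split())

example = preprocess("@user I can't stop laughing 😂 #Monday")
```

Here `example` is a lowercased string in which the mention, the emoji, and the hashtag marker have each been replaced by an ordinary word or token.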

Training

  • Convert the tweets into GloVe embeddings.
  • Train an LSTM model to predict the emotion behind the tweets.
  • A validation set is used to guard against overfitting.
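To illustrate the embedding step, the sketch below parses vectors in the GloVe text format (a word followed by its components, separated by spaces), builds an embedding matrix, and turns a tokenized tweet into a fixed-length sequence of indices. The helper names, the `<pad>` and `<unk>` entries, and the tiny 3-dimensional vectors are assumptions for illustration; the README does not describe how the project did this.

```python
import io

def load_glove(lines):
    """Parse GloVe text format: a word followed by its vector components."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def build_vocab(vectors):
    # Index 0 is reserved for padding, index 1 for out-of-vocabulary tokens.
    word_index = {"<pad>": 0, "<unk>": 1}
    dim = len(next(iter(vectors.values())))
    matrix = [[0.0] * dim, [0.0] * dim]
    for word, vec in vectors.items():
        word_index[word] = len(matrix)
        matrix.append(vec)
    return word_index, matrix

def encode(tokens, word_index, max_len):
    # Unknown tokens map to index 1; the result is truncated or zero-padded.
    ids = [word_index.get(tok, 1) for tok in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))

# Tiny made-up vectors standing in for a real GloVe file.
fake_glove = io.StringIO(
    "<user> 0.1 0.2 0.3\nhappy 0.5 0.1 0.0\nday 0.0 0.4 0.9\n"
)
word_index, embedding_matrix = build_vocab(load_glove(fake_glove))
ids = encode(["<user>", "happy", "birthday"], word_index, max_len=5)
```

In a Keras model, a matrix like `embedding_matrix` would typically initialize an `Embedding` layer whose output feeds an `LSTM` layer and an 8-way softmax classifier. The actual architecture is documented in the project's notebook, not here.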

Data

  • Training set - 1.46m tweets
  • Testing set - 412k tweets

There are 8 different emotion labels in the dataset: anger, anticipation, disgust, fear, sadness, surprise, trust, and joy.
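A classifier needs these labels as numbers. The sketch below shows one common encoding, assuming each tweet carries exactly one label; the index order is arbitrary and the helper names are hypothetical.

```python
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "sadness", "surprise", "trust", "joy"]
LABEL_TO_ID = {label: i for i, label in enumerate(EMOTIONS)}

def one_hot(label):
    """Encode a label as an 8-dimensional one-hot target vector."""
    vec = [0] * len(EMOTIONS)
    vec[LABEL_TO_ID[label]] = 1
    return vec

def decode(scores):
    """Return the label whose score is highest."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return EMOTIONS[best]

target = one_hot("fear")
# Made-up model output, one score per emotion.
predicted = decode([0.05, 0.1, 0.05, 0.6, 0.05, 0.05, 0.05, 0.05])
```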

Follow the link to download the dataset.

Tools in use

  • keras
  • nltk
  • scikit-learn


Languages

  • Jupyter Notebook 98.5%
  • Python 1.5%