Twitter Sentiment Analysis:

Usecase:

This is an entity-level sentiment analysis dataset of Twitter messages. Given a message and an entity, the task is to judge the sentiment of the message towards that entity. There are four classes in this dataset: Positive, Negative, Neutral and Irrelevant.

Tech Stack:

Category: NLP, Multiclass Classification problem
Tech Stack: Python, Regular expression, Word cloud, NLTK, TF-IDF, Bag of Words, Pandas, Matplotlib, Sklearn

Medium Blog:

https://parisrohan.medium.com/twitter-sentiment-analysis-and-classification-7060d4444a27

Files:

EDA_TextCleaning.ipynb - EDA and text cleaning code
Model_building.ipynb - Model building code

Workflow:

1. Data Collection:

The Twitter entity sentiment analysis dataset from Kaggle is used to build a multiclass classification model. The dataset is available at: https://www.kaggle.com/datasets/jp797498e/twitter-entity-sentiment-analysis
The dataset contains 74682 rows and 4 columns.
Distribution of target feature is as below

2. EDA:

The dataset columns have been renamed to {0:'Tweet_ID',1:'Topic',2:'Sentiment',3:'Tweet'} to get a better sense of the data.
0.9% of the data has been dropped because it contains null values.
On average, each tweet contains 23 tokens, and a few tweets are extreme outliers in length.
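
The EDA steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the inline rows are made up, and it assumes the CSV has no header row (which is why the columns start out named 0 to 3). The `token_count` column name is hypothetical, and tokens are counted by a plain whitespace split.

```python
import io

import pandas as pd

# Made-up rows standing in for twitter_training.csv (assumed to have no header row).
sample_csv = io.StringIO(
    '1,GameA,Positive,"this new map is so much fun to play with friends"\n'
    '2,BrandB,Neutral,"driver update is out"\n'
    "3,ShopC,Negative,\n"
)

# With header=None pandas names the columns 0..3, which are then renamed.
df = pd.read_csv(sample_csv, header=None)
df = df.rename(columns={0: "Tweet_ID", 1: "Topic", 2: "Sentiment", 3: "Tweet"})

# Drop rows with null values and record what share of the data was lost.
rows_before = len(df)
df = df.dropna().reset_index(drop=True)
dropped_share = (rows_before - len(df)) / rows_before

# Whitespace token count per tweet (hypothetical helper column).
df["token_count"] = df["Tweet"].str.split().str.len()

# Class distribution of the target feature.
sentiment_share = df["Sentiment"].value_counts(normalize=True)
```

On the real data, `df["token_count"].describe()` would show the mean token count and make the long-tweet outliers visible.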

3. Data preprocessing:

The following actions are performed on the 'Tweet' feature to extract the important information:
Remove user mentions
Remove hashtags
Expand contractions (for example, "don't" becomes "do not")
Remove urls
Remove special characters
Convert tweets into lowercase
Remove stopwords
Normalize text by converting words into lemma
Generate word clouds for each sentiment on the cleaned tweets
Perform one-hot encoding on the 'Topic' feature
Drop features like 'Tweet_ID','Tweet','Topic' as they are no longer required
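
A minimal sketch of the text-cleaning steps above, using only regular expressions from the standard library. The `clean_tweet` function, the `CONTRACTIONS` map and the `STOPWORDS` set are hypothetical and deliberately tiny; the project lists NLTK in its tech stack, so its stopword list and lemmatizer would presumably take their place. Contractions are expanded here, which is one reading of the contraction step, and lemmatization is left out because it needs NLTK corpus downloads.

```python
import re

# Hypothetical, deliberately small lookup tables for illustration only.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is", "i'm": "i am"}
STOPWORDS = {"a", "an", "the", "is", "it", "to", "and", "of", "i", "am", "this"}


def clean_tweet(text: str) -> str:
    """Apply the cleaning steps to one tweet and return the cleaned string."""
    text = text.lower()                                   # lowercase
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # remove URLs
    text = re.sub(r"@\w+", " ", text)                     # remove user mentions
    text = re.sub(r"#\w+", " ", text)                     # remove hashtags
    for short, full in CONTRACTIONS.items():              # expand contractions
        text = text.replace(short, full)
    text = re.sub(r"[^a-z\s]", " ", text)                 # remove special characters
    tokens = [t for t in text.split() if t not in STOPWORDS]  # remove stopwords
    return " ".join(tokens)


cleaned = clean_tweet("@user I don't like this game!! https://t.co/abc #fail")
```

URLs are removed before special characters so that the pattern still sees the `://` and dots it relies on.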

4. Model Building:

A TF-IDF vectorizer is used to convert the cleaned tweets into a weighted bag-of-words feature matrix.
Results of Multinomial Naive Bayes model:
Results of Logistic Regression model:
Results of Decision Tree Classifier model:
Results of Random Forest Classifier model:
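
The model-building step can be sketched as below, assuming scikit-learn and the four classifiers named above. The tweets and labels are made up, the hyperparameters are arbitrary choices rather than the notebook's settings, and the models are fitted on the full toy set only to show the API. A real run would evaluate on a held-out split, and no results are reproduced here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

# Made-up cleaned tweets and labels, standing in for the cleaned dataset.
texts = [
    "love game great fun", "awesome update love", "great graphics fun play",
    "hate game crash bug", "terrible update bug", "awful lag crash hate",
    "patch release today", "server maintenance schedule", "update available download",
    "weather nice today", "lunch pizza friend", "watch movie tonight",
]
labels = ["Positive"] * 3 + ["Negative"] * 3 + ["Neutral"] * 3 + ["Irrelevant"] * 3

# Turn the text into a sparse TF-IDF feature matrix.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Hyperparameters here are illustrative assumptions.
models = {
    "Multinomial Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree Classifier": DecisionTreeClassifier(random_state=0),
    "Random Forest Classifier": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Fit each model and predict the class of one unseen cleaned tweet.
new_tweet = vectorizer.transform(["love great fun game"])
predictions = {}
for name, model in models.items():
    model.fit(X, labels)
    predictions[name] = model.predict(new_tweet)[0]
```

The same fitted `vectorizer` must be reused for new text so that its columns line up with the features the models were trained on.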

