This model analyzes tweets to classify them as Positive, Neutral, or Negative. It cleans the text, converts it to numerical features, trains a Logistic Regression model, and evaluates its accuracy.

sgupta1703/Sentiment-Analysis

Sentiment Analysis on Tweets

Overview

This project performs sentiment analysis on tweets, classifying each tweet as positive, neutral, or negative. It combines natural language processing (NLP) techniques with a machine learning classifier, and can be applied to tasks such as gauging public opinion, monitoring brand sentiment, or analyzing customer feedback.

Key Features

  • Text Preprocessing: Cleans the raw text data by removing special characters, converting text to lowercase, and normalizing whitespace.
  • TF-IDF Vectorization: Converts text into numerical features, weighting each word by how often it appears in a tweet relative to how common it is across the whole dataset.
  • Machine Learning Model: Utilizes a Logistic Regression classifier for sentiment prediction.
  • Evaluation Metrics: Provides detailed performance evaluation, including accuracy, precision, recall, and F1-score.
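
The preprocessing feature above can be sketched as a small helper. This is a minimal illustration that assumes "special characters" means anything other than letters, digits, and whitespace; the function name clean_text and the exact regular expressions are assumptions, not necessarily what the repository's script uses.

```python
import re


def clean_text(text: str) -> str:
    """Lowercase the text, drop special characters, and normalize whitespace."""
    text = text.lower()
    # Replace every character that is not a lowercase letter, digit,
    # or whitespace with a space (assumed definition of "special").
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse runs of whitespace into one space and trim both ends.
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("  Loving the NEW update!!!   #awesome :) "))
# prints: loving the new update awesome
```

Replacing special characters with a space rather than deleting them keeps words that were joined only by punctuation from merging into one token.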

How It Works

  1. Data Loading: Reads a labeled dataset of tweets with their corresponding sentiment.
  2. Data Cleaning: Prepares the text for analysis by removing noise and standardizing the format.
  3. Label Encoding: Maps sentiment labels (Positive, Neutral, Negative) to numerical values.
  4. Training: Splits the data 80/20 into training and test sets and fits the model on the training portion.
  5. Prediction: Predicts sentiment for test data using the trained model.
  6. Evaluation: Reports accuracy and provides a detailed classification report.
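
The steps above can be sketched end to end with scikit-learn. This is a rough illustration under stated assumptions, not the repository's actual script: the tiny in-memory dataset stands in for twitter_training.csv (whose column layout is not described here), and the column names, the label_map values, random_state, and max_iter are all hypothetical choices.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Toy stand-in for the labeled tweet dataset (hypothetical columns).
positive = ["love this game", "great update today", "really happy with it",
            "awesome new feature", "best release so far"]
neutral = ["the patch is out", "update released today", "servers restart at noon",
           "new version available", "patch notes posted"]
negative = ["hate this bug", "terrible update today", "really angry about it",
            "awful new feature", "worst release so far"]
df = pd.DataFrame({
    "text": positive + neutral + negative,
    "sentiment": ["Positive"] * 5 + ["Neutral"] * 5 + ["Negative"] * 5,
})

# Label encoding: map sentiment names to integers (assumed mapping).
label_map = {"Negative": 0, "Neutral": 1, "Positive": 2}
df["label"] = df["sentiment"].map(label_map)

# 80/20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42
)

# Fit TF-IDF on the training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# Train the Logistic Regression classifier and predict on the held-out tweets.
model = LogisticRegression(max_iter=1000)
model.fit(X_train_vec, y_train)
y_pred = model.predict(X_test_vec)

# Evaluation: accuracy plus per-class precision, recall, and F1-score.
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
print(classification_report(
    y_test, y_pred,
    labels=[0, 1, 2],
    target_names=["Negative", "Neutral", "Positive"],
    zero_division=0,
))
```

Fitting the vectorizer on the training split alone keeps vocabulary and word weights from the test tweets out of training, so the reported scores reflect unseen data. With only fifteen toy examples the numbers themselves are not meaningful.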

Requirements

  • Python 3.x
  • Libraries:
    • pandas
    • numpy
    • scikit-learn
    • re (part of the Python standard library; no installation needed)

How to Run

  1. Clone the repository:
    git clone https://github.com/sgupta1703/Sentiment-Analysis.git
    cd Sentiment-Analysis
  2. Install dependencies:
    pip install -r requirements.txt
  3. Add the dataset (twitter_training.csv) to the project directory.
  4. Run the script:
    python sentiment_analysis.py

Future Enhancements

  • Incorporate additional preprocessing like removing stop words or stemming.
  • Try other machine learning models (e.g., SVM, Random Forest) or deep learning models (e.g., LSTMs, Transformers).
  • Expand the dataset to improve model accuracy and generalizability.
