This project implements a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model to classify text into six distinct emotion categories: Sadness, Joy, Love, Anger, Fear, and Surprise. It provides an intuitive web interface powered by Flask, enabling users to input text and obtain the predicted emotion in real-time.
The model achieved 95% accuracy on the test set. Average inference latency was ~5000 ms regardless of batch size, so single and large-batch predictions take about the same time.
- Overview
- Features
- Project Architecture
- Requirements
- Installation and Setup
- Usage
- Model Training and Customization
- Web Application
- Logging and Debugging
- Contributing
- License
This project combines state-of-the-art NLP techniques with a fine-tuned BERT model to classify emotions in text. It offers:
- Real-time Predictions: Through a Flask-based web app.
- Custom Pipelines: For ingestion, transformation, and training.
- Pretrained BERT Backbone: Leveraging bert_base_en_uncased for contextualized text embeddings.
- Sentiment Classification:
  - Classifies text into six emotion categories.
  - Uses a pre-trained BERT backbone for superior accuracy.
- Interactive Web Interface:
  - A Flask-powered interface for user interaction.
- Pipeline Structure:
  - Modular pipeline for easy customization and scalability.
- Error Handling and Logging:
  - Centralized logging and custom exception handling.
The project is structured as follows:
BERTMODEL/
├── artifact/ # Stores model artifacts (e.g., weights, metadata)
├── logs/ # Logs generated during model training and runtime
├── notebooks/ # Jupyter notebooks for exploration and experimentation
├── src/ # Core source code
│ ├── components/ # Pipeline components: data ingestion, transformation, model trainer
│ ├── pipeline/ # Train and predict pipelines
│ ├── exception.py # Custom exception class for debugging
│ ├── logger.py # Logging setup and utilities
│ ├── utils.py # Helper functions
├── templates/ # HTML templates for Flask web app
│ ├── index.html # Landing page template
│ ├── home.html # Prediction results page
├── app.py # Flask application entry point
├── README.md # Project documentation
├── requirements.txt # Python dependencies
├── setup.py # Setup script for project packaging
The following dependencies are required to run the project:
- Python: Version >= 3.7
- TensorFlow: Version >= 2.10.1
- TensorFlow_text: Version >= 2.10
- Keras NLP: Version >= 0.6.0
- Additional libraries:
  - Pandas
  - NumPy
  - Scikit-learn
All dependencies, with their working versions, are listed in the requirements.txt file.
Follow these steps to set up the project:
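- Clone the repository and move into it:
    git clone https://github.com/mohdamaj/Sentiment-Analysis-using-BERT.git
    cd Sentiment-Analysis-using-BERT
- Create a virtual environment:
    python -m venv venv
- Activate the virtual environment:
  - On Unix or macOS:
      source venv/bin/activate
  - On Windows:
      venv\Scripts\activate
- Install the dependencies:
    pip install -r requirements.txt
- Train the model: refer to Model Training and Customization.
- Start the application:
    python app.py
The app will run locally, and you can access it at http://127.0.0.1:5000.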
Once the Flask app is running:
- Open the web application in your browser.
- Enter a text snippet in the input field.
- Click the Submit button to see the predicted emotion.
Model Architecture
The model uses the following components:
- BERT Backbone: Pre-trained bert_base_en_uncased for contextual embeddings.
- Preprocessing: Tokenization and input preparation using BertPreprocessor.
- Classification Head:
  - Dropout layer for regularization.
  - Dense layer with softmax activation for emotion classification.
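The classification head described above can be illustrated without any deep learning framework. The following is a minimal pure-Python sketch, not the project's actual Keras code: the function names, the dropout rate of 0.1, and the label order (taken from the order in which the categories are listed in this README) are all assumptions made for illustration.

```python
import math
import random

# Assumed label order: the order in which this README lists the categories.
EMOTIONS = ["Sadness", "Joy", "Love", "Anger", "Fear", "Surprise"]


def dropout(vector, rate, training=False):
    """Inverted dropout: while training, zero each unit with probability
    `rate` and scale the survivors by 1 / (1 - rate). At inference
    (training=False) the input is returned unchanged."""
    if not training or rate == 0.0:
        return list(vector)
    keep = 1.0 - rate
    return [v / keep if random.random() < keep else 0.0 for v in vector]


def dense(vector, weights, bias):
    """Dense layer: logits[j] = sum_i vector[i] * weights[i][j] + bias[j]."""
    return [
        sum(x * row[j] for x, row in zip(vector, weights)) + bias[j]
        for j in range(len(bias))
    ]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(pooled, weights, bias, rate=0.1):
    """Map a pooled sentence embedding to (label, probabilities).
    The rate of 0.1 is a hypothetical value; dropout is inactive here
    because this function models inference only."""
    probs = softmax(dense(dropout(pooled, rate), weights, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs
```

In the real model the pooled vector would come from the BERT backbone and the weights would be learned during fine-tuning. Here they are plain lists, so the arithmetic of the head can be checked by hand on toy inputs.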
Training is triggered from data_ingestion.py.
Train the model by running:
    python src/pipeline/data_ingestion.py
The trained model will be saved in the artifact/ directory.
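To show how one script can start the whole training run, here is a minimal sketch of three stages (ingestion, transformation, training) chained together. Every name in it is hypothetical and the training step is a stand-in that only reports dataset sizes. It illustrates the modular structure, not the code in src/components.

```python
from dataclasses import dataclass


@dataclass
class IngestionResult:
    """Hypothetical container for the output of the ingestion stage."""
    train_rows: list
    test_rows: list


def ingest(rows, test_fraction=0.2):
    """Split (text, label) rows into train and test parts, preserving order."""
    n_test = round(len(rows) * test_fraction)
    split = len(rows) - n_test
    return IngestionResult(rows[:split], rows[split:])


def transform(rows):
    """Normalize whitespace and case; the label is passed through unchanged."""
    return [(text.strip().lower(), label) for text, label in rows]


def train(train_rows, test_rows):
    """Stand-in for fine-tuning: returns a summary instead of fitting BERT."""
    return {"train_size": len(train_rows), "test_size": len(test_rows)}


def run_training_pipeline(rows):
    """Entry point: ingestion feeds transformation, which feeds training."""
    ingested = ingest(rows)
    return train(transform(ingested.train_rows), transform(ingested.test_rows))
```

Because each stage is a separate function with plain inputs and outputs, any one of them can be replaced (for example, a different tokenization step) without touching the others.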
Features
- Input Form: Allows users to enter text for classification.
- Prediction Display: Shows the predicted emotion on the results page.
- Error Handling: Provides user-friendly error messages if input is missing or invalid.
Templates
- index.html: Landing page of the web app.
- home.html: Displays prediction results.
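The error-handling behavior listed above can be sketched as a small validation helper that a Flask view might call before running the prediction pipeline. The function name, the messages, and the 1000-character limit are assumptions for illustration. The actual checks in app.py may differ.

```python
MAX_CHARS = 1000  # hypothetical upper bound on input length


def validate_text(raw):
    """Return (clean_text, error_message); exactly one of the two is None."""
    if raw is None or not raw.strip():
        return None, "Please enter some text to classify."
    text = raw.strip()
    if len(text) > MAX_CHARS:
        return None, f"Text is too long (maximum {MAX_CHARS} characters)."
    return text, None
```

A view could re-render the input form with the error message when one is returned, and pass the cleaned text to the predict pipeline otherwise.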
- Logs are stored in the logs/ directory.
- Centralized logging is implemented using Python’s logging module.
- Errors are handled via the CustomException class in exception.py.
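The setup described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of logger.py or exception.py: the logger name, the timestamped file name, the log format, and the error message layout are all hypothetical.

```python
import logging
import os
import sys
from datetime import datetime


def setup_logger(log_dir="logs"):
    """Create log_dir if needed and return (logger, path of the log file)."""
    os.makedirs(log_dir, exist_ok=True)
    file_name = datetime.now().strftime("%Y_%m_%d_%H_%M_%S") + ".log"
    log_path = os.path.join(log_dir, file_name)

    logger = logging.getLogger("bert_emotion")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("[%(asctime)s] %(name)s - %(levelname)s - %(message)s")
    )
    logger.addHandler(handler)
    return logger, log_path


class CustomException(Exception):
    """Wraps an error with the file and line where it was caught.
    Intended to be raised from inside an `except` block, so that
    sys.exc_info() still refers to the original error."""

    def __init__(self, error):
        _, _, tb = sys.exc_info()
        if tb is not None:
            location = f"{tb.tb_frame.f_code.co_filename} at line {tb.tb_lineno}"
        else:
            location = "unknown location"
        super().__init__(f"Error in {location}: {error}")
```

A typical pattern would be to catch an error inside a pipeline component, re-raise it as CustomException, and log the resulting message at the top level so that the file and line number end up in the logs/ directory.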
This project is licensed under the MIT License. See the LICENSE file for details.