
This project is an SMS spam classifier that detects whether an SMS message is spam or ham, using the multinomial Naive Bayes algorithm together with BOW/TF-IDF features from NLP.


SINGHxTUSHAR/BOW-TFIDF-spamBuster

SMS Classifier (BOW/TF-IDF) 🌟


Problem Statement:

The rapid rise in SMS spam calls for robust filtering. This project aims to build a machine learning classifier that labels incoming SMS messages as spam or legitimate (ham). The classifier uses Natural Language Processing (NLP) techniques: Bag-of-Words (BOW) to represent each message as word counts, and Term Frequency-Inverse Document Frequency (TF-IDF) to weight those counts. By analyzing word frequency patterns and identifying the terms that distinguish spam from legitimate messages, the classifier learns to classify new SMS messages, reducing spam and improving the user experience.
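The idea described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from this repository's notebook: the function names, the idf variant (`ln(N / df) + 1`), the add-one smoothing, and the toy messages are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter, defaultdict


def bag_of_words(text):
    """BOW: count how often each lowercase word token occurs in a message."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def fit_idf(bows):
    """Learn idf = ln(N / df) + 1 per term (one common variant, assumed here)."""
    doc_freq = Counter(term for bow in bows for term in bow)
    return {t: math.log(len(bows) / df) + 1.0 for t, df in doc_freq.items()}


def tfidf(bow, idf):
    """Weight raw counts by idf; terms unseen during fitting are dropped."""
    return {t: n * idf[t] for t, n in bow.items() if t in idf}


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, features, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for feats in features for t in feats}
        # Accumulate the total feature weight of every term per class.
        self.weights = {c: defaultdict(float) for c in self.classes}
        for feats, label in zip(features, labels):
            for term, w in feats.items():
                self.weights[label][term] += w
        self.totals = {c: sum(self.weights[c].values()) for c in self.classes}
        self.log_prior = {
            c: math.log(labels.count(c) / len(labels)) for c in self.classes
        }
        return self

    def predict(self, feats):
        def log_score(c):
            # log P(c) + sum over terms of weight * log P(term | c)
            denom = self.totals[c] + len(self.vocab)
            return self.log_prior[c] + sum(
                w * math.log((self.weights[c].get(t, 0.0) + 1.0) / denom)
                for t, w in feats.items()
                if t in self.vocab
            )

        return max(self.classes, key=log_score)


# Made-up toy messages, for illustration only (not taken from the dataset).
TRAIN = [
    ("spam", "WIN a free prize now, claim your free cash"),
    ("spam", "Free entry: text WIN to claim your prize"),
    ("ham", "Are we still meeting for lunch today?"),
    ("ham", "I will call you when I get home"),
]

bows = [bag_of_words(text) for _, text in TRAIN]
idf = fit_idf(bows)
model = MultinomialNB().fit(
    [tfidf(b, idf) for b in bows], [label for label, _ in TRAIN]
)


def classify(text):
    """Vectorize a new message with the fitted idf and predict its label."""
    return model.predict(tfidf(bag_of_words(text), idf))


print(classify("claim your free prize"))
```

Scores are computed in log space so that multiplying many small probabilities does not underflow. In practice a library implementation would usually replace these hand-written pieces.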

Data Dictionary 📄✏ :

The dataset is taken from the UCI Machine Learning Repository, where a description of the dataset is also available.

DataSet Credit:

The corpus was collected by Tiago Agostinho de Almeida (http://www.dt.fee.unicamp.br/~tiago) and José María Gómez Hidalgo (http://www.esp.uem.es/jmgomez).
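As a rough sketch of how the corpus might be read, the snippet below assumes each line has the form `label<TAB>message`, with the label being `ham` or `spam`. The helper name `read_sms_corpus`, the sample lines, and the file name in the closing comment are assumptions for illustration; check the dataset description for the actual layout.

```python
import io


def read_sms_corpus(handle):
    """Yield (label, text) pairs from lines of the form 'label<TAB>message'."""
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Split on the first tab only, so tabs inside the message survive.
        label, _, text = line.partition("\t")
        yield label, text


# Made-up sample in the assumed format, standing in for the real file.
sample = io.StringIO("ham\tSee you at 5\nspam\tFree prize, call now\n")
rows = list(read_sms_corpus(sample))
print(rows)

# With the real data, pass an open file instead (path is an assumption):
#     with open("SMSSpamCollection", encoding="utf-8") as fh:
#         rows = list(read_sms_corpus(fh))
```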

Requirements :

Ensure you have the following dependencies installed:

  • Python (version 3.9)
  • Jupyter Notebook
  • Other dependencies (refer to the requirements.txt)

You can install the required Python packages using:

pip install -r requirements.txt

Setup 💿:

  • Clone the repository:
    git clone https://github.com/SINGHxTUSHAR/BOW-TFIDF-spamBuster.git
    cd BOW-TFIDF-spamBuster
  • Create a virtual environment (optional but recommended):
    python -m venv venv
  • Activate the virtual environment:
    • On Windows:
      venv\Scripts\activate
    • On macOS/Linux:
      source venv/bin/activate
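After activating the environment, a quick check like the one below can confirm that the interpreter is the one you expect. It is only a sketch: the helper names are hypothetical, and the 3.9 default simply mirrors the version listed under Requirements (whether other versions also work is not stated here).

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def python_matches(required=(3, 9)):
    """Return True when the running major.minor version equals `required`."""
    return tuple(sys.version_info[:2]) == tuple(required)


print("virtual environment active:", in_virtualenv())
print("Python 3.9 interpreter:", python_matches())
```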

Contributing :

If you'd like to contribute to this project, please follow the standard GitHub fork and pull request process. Contributions, issues, and feature requests are welcome!

Suggestion:

If you have any suggestions for me related to this project, feel free to contact me at [email protected] or LinkedIn.

License :

This project is licensed under the MIT License - see the LICENSE file for details.
