Classify whether a message is spam or not using Multinomial Naive Bayes.

Udrasht/Multinomial-Naive-Bayes-from-Scratch

SMAI-Mini-Project-1

Introduction

This question has you work and experiment with the Multinomial Naive Bayes classifier. First, transform the data in the given CSV file into a count matrix and calculate the class priors. Then compute the likelihoods according to Multinomial Naive Bayes, combine them with the priors, and classify the test data. Please note that sklearn implementations may be used only for the final question of the assignment.
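The steps above (priors, likelihoods, classification) can be sketched roughly as follows. This is a minimal illustration and not the assignment's reference solution: the function names `fit_multinomial_nb` and `predict_multinomial_nb` are hypothetical, the add-`alpha` (Laplace) smoothing is an assumption that the text does not mention, and the count matrix `X` is assumed to be already built with one row per message and one column per vocabulary word.

```python
import numpy as np

def fit_multinomial_nb(X, y, alpha=1.0):
    """Estimate log priors and log likelihoods from a count matrix.

    X: (n_samples, n_words) array of word counts.
    y: (n_samples,) array of class labels, e.g. "spam" / "ham".
    alpha: smoothing constant (an assumption; avoids log(0) for unseen words).
    """
    X = np.asarray(X)
    y = np.asarray(y)
    classes = np.unique(y)  # sorted unique labels

    # Prior P(c): fraction of training messages belonging to class c.
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))

    # Likelihood P(w | c): smoothed share of word w among all words in class c.
    counts = np.array([X[y == c].sum(axis=0) for c in classes]) + alpha
    log_likelihood = np.log(counts / counts.sum(axis=1, keepdims=True))
    return classes, log_prior, log_likelihood

def predict_multinomial_nb(X, classes, log_prior, log_likelihood):
    """Pick the class maximizing log P(c) + sum_w count(w) * log P(w | c)."""
    scores = np.asarray(X) @ log_likelihood.T + log_prior
    return classes[np.argmax(scores, axis=1)]

# Toy data over a hypothetical 4-word vocabulary: free, win, hello, meeting.
X_train = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
y_train = ["spam", "spam", "ham", "ham"]
model = fit_multinomial_nb(X_train, y_train)
predictions = predict_multinomial_nb([[1, 1, 0, 0], [0, 0, 1, 1]], *model)
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once the vocabulary grows to the thousands of words typical of SMS data.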

The dataset consists of spam SMS messages. Each sample has one attribute, the message text, and a class label that is either spam or ham. The data is in spam.csv, which contains roughly 5,000 to 6,000 samples. For your convenience the data is already pre-processed and loaded, but I suggest you take a look at that code for your own knowledge. The vectorization part is left up to you; it can be done easily with the help of the given example code.
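For the vectorization step, one possible from-scratch approach is sketched below. It is not the example code the assignment provides: the helpers `build_vocabulary` and `to_count_matrix` are hypothetical names, and tokenizing by lowercasing and splitting on whitespace is a simplifying assumption that presumes the messages are already cleaned.

```python
import numpy as np

def build_vocabulary(messages):
    """Map each distinct token to a column index, in order of first appearance."""
    vocab = {}
    for msg in messages:
        for token in msg.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def to_count_matrix(messages, vocab):
    """Return an (n_messages, n_words) matrix of per-message token counts."""
    X = np.zeros((len(messages), len(vocab)), dtype=np.int64)
    for i, msg in enumerate(messages):
        for token in msg.lower().split():
            j = vocab.get(token)
            if j is not None:  # tokens not seen when building vocab are ignored
                X[i, j] += 1
    return X

# Toy messages standing in for rows of the dataset.
train_messages = ["win free prize", "free free call"]
vocab = build_vocabulary(train_messages)
X_train = to_count_matrix(train_messages, vocab)
```

The vocabulary should be built from the training messages only and then reused for the test messages, so that both count matrices share the same columns.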
