Toxic Comment Classification using LSTM and Bi-LSTM

In this project, we propose a multi-label classification model using LSTM and Bi-LSTM networks to classify toxic comments into six classes: toxic, severe toxic, obscene, threat, insult, and identity hate.
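Because a single comment can carry several of these labels at once, each comment is paired with a six-dimensional binary target vector. A minimal sketch of loading such targets, assuming the standard Kaggle train.csv layout from the Jigsaw challenge (file name and column names are assumptions, not taken from this repository's code):

```python
import pandas as pd

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

train = pd.read_csv("train.csv")                   # Kaggle training file (assumed path)
texts = train["comment_text"].fillna("").values    # raw comment strings
y = train[LABELS].values                           # (n_samples, 6) binary label matrix
```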

About the Dataset

The dataset for this project is taken from Kaggle and is provided by the Conversation AI team (a research initiative co-founded by Jigsaw and Google). Pre-trained word embeddings are used, following insights from previous research work. In this project, the GloVe.6B.300d embeddings are used; they are trained on a corpus of 6 billion tokens, and each token is represented by a 300-dimensional vector.

Download the dataset from here.

Download the word embeddings from here.
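A minimal sketch of turning the comments into padded index sequences and building an embedding matrix from glove.6B.300d.txt (it continues from the snippet above; the vocabulary size, sequence length, and file paths are illustrative assumptions):

```python
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

MAX_WORDS, MAX_LEN, EMBED_DIM = 50_000, 150, 300   # illustrative hyperparameters

# Tokenize the comments and pad them to a fixed length
tokenizer = Tokenizer(num_words=MAX_WORDS)
tokenizer.fit_on_texts(texts)
X = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=MAX_LEN)

# Read glove.6B.300d.txt into a word -> vector dictionary
embeddings = {}
with open("glove.6B.300d.txt", encoding="utf-8") as f:
    for line in f:
        word, *vec = line.split()
        embeddings[word] = np.asarray(vec, dtype="float32")

# Rows of the matrix line up with the tokenizer's word indices
embedding_matrix = np.zeros((MAX_WORDS, EMBED_DIM))
for word, i in tokenizer.word_index.items():
    if i < MAX_WORDS and word in embeddings:
        embedding_matrix[i] = embeddings[word]
```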

The following figure shows the distribution of comments across the six labels by comment length.

Proposed Model

The following figure shows the architecture of the proposed model.
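For illustration, a minimal Bi-LSTM classifier in Keras that follows the same idea (frozen GloVe embeddings, a bidirectional LSTM, and six sigmoid outputs). The layer sizes, dropout rate, and training settings are assumptions for the sketch, not the repository's exact configuration:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Bidirectional, LSTM,
                                     GlobalMaxPooling1D, Dense, Dropout)

model = Sequential([
    # Frozen GloVe embeddings built in the previous snippet
    Embedding(MAX_WORDS, EMBED_DIM, weights=[embedding_matrix],
              input_length=MAX_LEN, trainable=False),
    Bidirectional(LSTM(64, return_sequences=True)),    # Bi-LSTM over the token sequence
    GlobalMaxPooling1D(),                              # pool timesteps into one vector
    Dense(64, activation="relu"),
    Dropout(0.3),
    Dense(len(LABELS), activation="sigmoid"),          # one sigmoid per label (multi-label)
])

# Binary cross-entropy treats each of the six labels as an independent decision
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X, y, batch_size=128, epochs=2, validation_split=0.1)
```

Swapping `Bidirectional(LSTM(...))` for a plain `LSTM(...)` layer gives the unidirectional LSTM variant for comparison.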

Found this project useful? ❤️

If you found this project useful, please consider giving it a ⭐ on GitHub and sharing it with your friends via social media.

Project Created & Maintained By

1. Simrann Arora

Machine Learning Enthusiast #MachineLearning #DeepLearning #ToxicCommentsClassification #LSTM #Bi-LSTM #Python

2. Akash Gupta

Machine Learning Enthusiast #MachineLearning #DeepLearning #ToxicCommentsClassification #LSTM #Bi-LSTM #Python