# twitter-hate-speech-detection
Automated Content Moderation - Hate Speech Detection on Twitter - Sentiment Analysis
Lalit and Akshat
Artificial Intelligence Project
Submitted to Prof. Punam Bedi
https://github.com/khulalit/twitter-hate-speech-detection
## Introduction of the Project

With the rapid growth of social networks and microblogging websites, communication between people from different cultural and psychological backgrounds has become more direct, resulting in more and more “cyber” conflicts between these people. Consequently, hate speech is used more and more, to the point where it has become a serious problem invading these open spaces. Hate speech refers to the use of aggressive, violent, or offensive language targeting a specific group of people sharing a common property, whether this property is their gender (i.e., sexism), their ethnic group or race (i.e., racism), or their beliefs and religion. While most online social networks and microblogging websites forbid the use of hate speech, the size of these networks and websites makes it almost impossible to control all of their content. Therefore, the necessity arises to detect such speech automatically and to filter any content that presents hateful language or language inciting hatred.

Our project aims to develop an automatic content moderation system using AI and ML techniques. Our model detects hate/offensive content in text.
## Steps and Approaches

1. Collect the dataset
2. Clean the dataset
3. Preprocess the dataset
4. Apply NLP to the dataset (see the preprocessing sketch after this list)
   - Tokenization
   - Stemming
   - Lemmatization
   - Removing stop words
5. Vectorization
   - Count Vectorization
   - TF-IDF
6. Create ML models (supervised classification algorithms)
   - Naive Bayes
   - Support Vector Machine
   - Logistic Regression
7. Front end using the Streamlit framework
8. Deployment on Heroku or share.streamlit.io
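A minimal sketch of the cleaning and NLP preprocessing steps above, using NLTK. The regexes, function name, and example tweet are illustrative assumptions rather than the exact code in this repository.

```python
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time resource downloads for stop words and the lemmatizer.
nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

STOP_WORDS = set(stopwords.words("english"))
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()


def preprocess(tweet: str) -> str:
    """Clean a raw tweet, tokenize it, drop stop words, then stem and lemmatize."""
    text = tweet.lower()
    text = re.sub(r"http\S+|@\w+|#", " ", text)  # strip URLs, mentions, hashtag symbols
    text = re.sub(r"[^a-z\s]", " ", text)        # keep letters only
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    tokens = [lemmatizer.lemmatize(stemmer.stem(t)) for t in tokens]
    return " ".join(tokens)


print(preprocess("I can't believe they said that!!! http://t.co/xyz"))
```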
## Data Sourcing

The dataset for this capstone project was sourced from a study called *Automated Hate Speech Detection and the Problem of Offensive Language*, conducted by Thomas Davidson and a team at Cornell University in 2017. The GitHub repository can be found here.

- The dataset is a .csv file with 24,802 text posts from Twitter, where 6% of the tweets were labeled as hate speech.
- The labels on this dataset were voted on by crowdsourced annotators and determined by majority rule.
- To prepare the data for binary classification, labels were manually replaced by changing the existing 1 and 2 values to 0, and changing 0 to 1 to indicate hate speech (see the sketch below).
Cleaned Data Source
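A minimal sketch of the binary relabeling described above, using pandas. It assumes the Davidson dataset's `class` column encodes 0 = hate speech, 1 = offensive language, 2 = neither; the file name is an assumption based on the source repository.

```python
import pandas as pd

df = pd.read_csv("labeled_data.csv")

# Map the original three-way labels to a binary target:
# 1 = hate speech, 0 = everything else.
df["label"] = df["class"].map({0: 1, 1: 0, 2: 0})

print(df["label"].value_counts(normalize=True))  # roughly 6% hate speech
```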
## Vectorization and ML Models

We used two different vectorization techniques, Count Vectorization and TF-IDF. Each vectorization technique was paired with three different ML models, and the results of all six combinations were compared to select the best-suited one (see the comparison sketch below the list).

- Logistic Regression
  - Count Vectorization
  - TF-IDF
- Naive Bayes
  - TF-IDF
  - Count Vectorization
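A sketch of how the vectorizer/model combinations can be compared with scikit-learn pipelines. It assumes the `df` and `preprocess` from the earlier sketches; the `tweet` column name and the hyperparameters are assumptions, not the project's exact settings.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Build the cleaned text column (column name "tweet" is an assumption).
df["clean_text"] = df["tweet"].apply(preprocess)

X_train, X_test, y_train, y_test = train_test_split(
    df["clean_text"], df["label"], test_size=0.2, stratify=df["label"], random_state=42
)

vectorizers = {"count": CountVectorizer(), "tfidf": TfidfVectorizer()}
models = {
    "naive_bayes": MultinomialNB(),
    "svm": LinearSVC(),
    "logreg": LogisticRegression(max_iter=1000),
}

# Fit every vectorizer/model pair and report F1 on the held-out test split.
for v_name, vec in vectorizers.items():
    for m_name, model in models.items():
        pipe = make_pipeline(vec, model)
        pipe.fit(X_train, y_train)
        print(f"{v_name} + {m_name}: F1 = {f1_score(y_test, pipe.predict(X_test)):.3f}")
```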
## Overview

This project aims to automate content moderation to identify hate speech using machine learning binary classification algorithms. Baseline models included Naive Bayes and Logistic Regression. The final model was a Logistic Regression model that used Count Vectorization for feature engineering; it produced an accuracy of 94%. This high accuracy can largely be attributed to the massive class imbalance, and it masks the model's inability to "understand" the nuances of English slang and slurs. Ultimately, automating hate speech detection is an extremely difficult task, and although this project was able to get that process started, there is more work to be done in order to keep this content off of public-facing forums such as Twitter.
## Final Model Performance

F1 score was used as the main metric for this project, while Precision and Recall were also considered.

- Overall, we want as much hate speech as possible to be flagged so that it can be efficiently removed. This means optimizing the True Positive Rate, a.k.a. Recall.
- As expected, the final model has a True Negative Rate of 91% and a True Positive Rate of 62%, which is consistent with the final model's evaluation metrics.
- We ideally want as many True Positives as possible, because those are the cases where hate speech is identified correctly. This is where the model could be improved.
- However, the model has a very low False Positive Rate, which means regular tweets won't often be misclassified as hate speech, so users won't complain about over-censorship.
- Overall, the Recall of this model needs to be improved further, as does the F1 score of 0.3958.

A sketch of how these metrics can be computed follows.
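This sketch refits the final model (Count Vectorization + Logistic Regression) on the train/test split from the previous sketch and prints the classification report plus the confusion-matrix-based rates discussed above. The split and hyperparameters are assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.pipeline import make_pipeline

final_model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
final_model.fit(X_train, y_train)
y_pred = final_model.predict(X_test)

# Precision, Recall, and F1 per class.
print(classification_report(y_test, y_pred, target_names=["not hate", "hate"]))

# True Positive / True Negative / False Positive rates from the confusion matrix.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print("True Positive Rate (Recall):", tp / (tp + fn))
print("True Negative Rate:", tn / (tn + fp))
print("False Positive Rate:", fp / (fp + tn))
```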
## Front End Development

For developing the front end, we used the Streamlit framework. It is an open-source library for building user interfaces for data applications, and it supports charts, graphs, and the other usual data science visualization tools.
For more details, check out https://streamlit.io.
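A minimal Streamlit front-end sketch. The pickle file name and the idea of loading a saved vectorizer+model pipeline are assumptions; the repository's actual app lives in main.py.

```python
import pickle

import streamlit as st

st.title("Twitter Hate Speech Detection")

# Assumes a fitted vectorizer+model pipeline was pickled during training
# (the file name "model.pkl" is an assumption).
with open("model.pkl", "rb") as f:
    pipe = pickle.load(f)

tweet = st.text_area("Enter a tweet to classify")
if st.button("Classify"):
    label = pipe.predict([tweet])[0]
    st.write("Hate/offensive content detected" if label == 1 else "Looks clean")
```

Run locally with `streamlit run main.py` (or whatever the app file is named) to get the interactive page.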
## Screenshot of Final Product
https://share.streamlit.io/khulalit/twitter-hate-speech-detection/main.py
## Next Steps After This Project

To further develop this project, here are some immediate next steps that anyone could execute:

- Collect more potential "hate speech" data to be labeled by the CrowdFlower voting system
- Improve the final model with different preprocessing techniques, such as removing offensive language as stop words
- Evaluate the model on new tweets or other online forum data to see whether it generalizes well
- LDA topic modeling with Gensim (a minimal starting sketch follows)
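A possible starting point for the LDA topic modeling next step, using Gensim. The toy token lists and the number of topics are placeholders; on real data, `docs` would be the preprocessed, tokenized tweets.

```python
from gensim import corpora
from gensim.models import LdaModel

# Placeholder tokenized documents; replace with preprocessed tweets.
docs = [
    ["hate", "speech", "twitter", "moderation"],
    ["nice", "day", "friends", "coffee"],
]

dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# num_topics=2 keeps the toy example sensible; real data would use more.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10, random_state=42)
for topic_id, words in lda.print_topics():
    print(topic_id, words)
```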