# twitter-hate-speech-detection
Automated Content Moderation - Hate Speech Detection on Twitter - Sentiment Analysis
Lalit and Akshat
Artificial Intelligence Project
Submitted to Prof. Punam Bedi
https://github.com/khulalit/twitter-hate-speech-detection
## Introduction of the Project

With the rapid growth of social networks and microblogging websites, communication between people from different cultural and psychological backgrounds has become more direct, resulting in more and more “cyber” conflicts between these people. Consequently, hate speech is used more and more, to the point where it has become a serious problem invading these open spaces. Hate speech refers to the use of aggressive, violent, or offensive language targeting a specific group of people sharing a common property, whether this property is their gender (i.e., sexism), their ethnic group or race (i.e., racism), or their beliefs and religion. While most online social networks and microblogging websites forbid the use of hate speech, the size of these networks and websites makes it almost impossible to control all of their content. Therefore, the necessity arises to detect such speech automatically and to filter any content that presents hateful language or language inciting hatred.

Our project aims to develop an automatic content moderation system using AI and ML techniques. Our model detects hate/offensive content in text.
## Steps and Approaches

1. Collect the dataset
2. Clean the dataset
3. Preprocess the dataset
4. Apply NLP to the dataset (see the preprocessing sketch after this list)
   - Tokenization
   - Stemming
   - Lemmatization
   - Removing stop words
5. Vectorization
   - Count Vectorization
   - TF-IDF
6. Create ML models (supervised classification algorithms)
   - Naive Bayes
   - Support Vector Machine
   - Logistic Regression
7. Front end using the Streamlit framework
8. Deployment on Heroku or share.streamlit.io
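A minimal sketch of the cleaning and NLP preprocessing steps above, using NLTK. The regexes, function name, and example tweet are illustrative assumptions rather than the exact code in this repository.

```python
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time resource downloads for stop words and the lemmatizer.
nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

STOP_WORDS = set(stopwords.words("english"))
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()


def preprocess(tweet: str) -> str:
    """Clean a raw tweet, tokenize it, drop stop words, then stem and lemmatize."""
    text = tweet.lower()
    text = re.sub(r"http\S+|@\w+|#", " ", text)  # strip URLs, mentions, hashtag symbols
    text = re.sub(r"[^a-z\s]", " ", text)        # keep letters only
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    tokens = [lemmatizer.lemmatize(stemmer.stem(t)) for t in tokens]
    return " ".join(tokens)


print(preprocess("I can't believe they said that!!! http://t.co/xyz"))
```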
## Data Sourcing

The dataset for this capstone project was sourced from a study called *Automated Hate Speech Detection and the Problem of Offensive Language*, conducted by Thomas Davidson and a team at Cornell University in 2017. The GitHub repository can be found here.

- The dataset is a .csv file with 24,802 text posts from Twitter, where 6% of the tweets were labeled as hate speech.
- The labels on this dataset were voted on by crowdsourced annotators and determined by majority rule.
- To prepare the data for binary classification, labels were manually replaced by changing the existing 1 and 2 values to 0, and changing 0 to 1 to indicate hate speech (see the sketch below).
Cleaned Data Source
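A minimal sketch of the binary relabeling described above, using pandas. It assumes the Davidson dataset's `class` column encodes 0 = hate speech, 1 = offensive language, 2 = neither; the file name is an assumption based on the source repository.

```python
import pandas as pd

df = pd.read_csv("labeled_data.csv")

# Map the original three-way labels to a binary target:
# 1 = hate speech, 0 = everything else.
df["label"] = df["class"].map({0: 1, 1: 0, 2: 0})

print(df["label"].value_counts(normalize=True))  # roughly 6% hate speech
```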
## Vectorization and ML Models

We used two different vectorization techniques, Count Vectorization and TF-IDF. Each vectorization technique was paired with three different ML models, and the results of all six combinations were compared to select the best-suited one (see the comparison sketch below the list).

- Logistic Regression
  - Count Vectorization
  - TF-IDF
- Naive Bayes
  - TF-IDF
  - Count Vectorization
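A sketch of how the vectorizer/model combinations can be compared with scikit-learn pipelines. It assumes the `df` and `preprocess` from the earlier sketches; the `tweet` column name and the hyperparameters are assumptions, not the project's exact settings.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Build the cleaned text column (column name "tweet" is an assumption).
df["clean_text"] = df["tweet"].apply(preprocess)

X_train, X_test, y_train, y_test = train_test_split(
    df["clean_text"], df["label"], test_size=0.2, stratify=df["label"], random_state=42
)

vectorizers = {"count": CountVectorizer(), "tfidf": TfidfVectorizer()}
models = {
    "naive_bayes": MultinomialNB(),
    "svm": LinearSVC(),
    "logreg": LogisticRegression(max_iter=1000),
}

# Fit every vectorizer/model pair and report F1 on the held-out test split.
for v_name, vec in vectorizers.items():
    for m_name, model in models.items():
        pipe = make_pipeline(vec, model)
        pipe.fit(X_train, y_train)
        print(f"{v_name} + {m_name}: F1 = {f1_score(y_test, pipe.predict(X_test)):.3f}")
```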
## Overview

This project aims to automate content moderation to identify hate speech using machine learning binary classification algorithms. Baseline models included Naive Bayes and Logistic Regression. The final model was a Logistic Regression model that used Count Vectorization for feature engineering; it produced an accuracy of 94%. This high accuracy can largely be attributed to the massive class imbalance, and it masks the model's inability to "understand" the nuances of English slang and slurs. Ultimately, automating hate speech detection is an extremely difficult task, and although this project was able to get that process started, there is more work to be done in order to keep this content off of public-facing forums such as Twitter.
## Final Model Performance

F1 score was used as the main metric for this project, while Precision and Recall were also considered.

- Overall, we want as much hate speech as possible to be flagged so that it can be efficiently removed. This means optimizing the True Positive Rate, a.k.a. Recall.
- As expected, the final model has a True Negative Rate of 91% and a True Positive Rate of 62%, which is consistent with the final model's evaluation metrics.
- We ideally want as many True Positives as possible, because those are the cases where hate speech is identified correctly. This is where the model could be improved.
- However, the model has a very low False Positive Rate, which means regular tweets won't often be misclassified as hate speech, so users won't complain about over-censorship.
- Overall, the Recall of this model needs to be improved further, as does the F1 score of 0.3958.

A sketch of how these metrics can be computed follows.
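This sketch refits the final model (Count Vectorization + Logistic Regression) on the train/test split from the previous sketch and prints the classification report plus the confusion-matrix-based rates discussed above. The split and hyperparameters are assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.pipeline import make_pipeline

final_model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
final_model.fit(X_train, y_train)
y_pred = final_model.predict(X_test)

# Precision, Recall, and F1 per class.
print(classification_report(y_test, y_pred, target_names=["not hate", "hate"]))

# True Positive / True Negative / False Positive rates from the confusion matrix.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print("True Positive Rate (Recall):", tp / (tp + fn))
print("True Negative Rate:", tn / (tn + fp))
print("False Positive Rate:", fp / (fp + tn))
```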
## Front End Development

For developing the front end, we used the Streamlit framework. It is an open-source library for building user interfaces for data applications, and it supports charts, graphs, and the other usual data science visualization tools.
For more details, check out https://streamlit.io.
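A minimal Streamlit front-end sketch. The pickle file name and the idea of loading a saved vectorizer+model pipeline are assumptions; the repository's actual app lives in main.py.

```python
import pickle

import streamlit as st

st.title("Twitter Hate Speech Detection")

# Assumes a fitted vectorizer+model pipeline was pickled during training
# (the file name "model.pkl" is an assumption).
with open("model.pkl", "rb") as f:
    pipe = pickle.load(f)

tweet = st.text_area("Enter a tweet to classify")
if st.button("Classify"):
    label = pipe.predict([tweet])[0]
    st.write("Hate/offensive content detected" if label == 1 else "Looks clean")
```

Run locally with `streamlit run main.py` (or whatever the app file is named) to get the interactive page.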
## Screenshot of Final Product
https://share.streamlit.io/khulalit/twitter-hate-speech-detection/main.py
## Next Steps After This Project

To further develop this project, here are some immediate next steps that anyone could execute:

- Collect more potential "hate speech" data to be labeled by the CrowdFlower voting system
- Improve the final model with different preprocessing techniques, such as removing offensive language as stop words
- Evaluate the model on new tweets or other online forum data to see whether it generalizes well
- LDA topic modeling with Gensim (a minimal starting sketch follows)
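A possible starting point for the LDA topic modeling next step, using Gensim. The toy token lists and the number of topics are placeholders; on real data, `docs` would be the preprocessed, tokenized tweets.

```python
from gensim import corpora
from gensim.models import LdaModel

# Placeholder tokenized documents; replace with preprocessed tweets.
docs = [
    ["hate", "speech", "twitter", "moderation"],
    ["nice", "day", "friends", "coffee"],
]

dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# num_topics=2 keeps the toy example sensible; real data would use more.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10, random_state=42)
for topic_id, words in lda.print_topics():
    print(topic_id, words)
```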