
Danish Hatespeech Detection using BERT

This tool was made as part of an exam in Cultural Data Science at Aarhus University.

Contribution

This project was created in collaboration between Gustav Aarup Lauridsen and Johan Kresten Horsmans. Both contributed equally to every stage of the project, from initial conception and implementation through the production of the final output and the structuring of the repository (50/50%).

Project description

Danish hate speech detection

For our self-assigned project, we wish to see if we can improve the Danish hate-speech detection algorithm that we designed for assignment 5. As stated in assignment 5, we find this task very interesting due to the amount of media coverage of Danish hate speech on social media in recent months. We believe that a robust hate-speech classifier could be a very valuable tool for moderating the tone and rhetoric of the public debate to make it more constructive.

In assignment 5, we achieved a macro F1-score of 0.71. The current state-of-the-art, as described in OffensEval2020, achieves a macro F1-score of 0.81. Our goal with this project is to build a competing state-of-the-art model with similar performance and make it openly available by uploading it, as the first Danish hate speech detection model, to Hugging Face. We wish to do this using the Nordic BERT architecture by BotXO.

Following this, we will build a .py script that can be employed for hate speech classification on one's own dataset. Furthermore, we will create a Jupyter notebook acting as a tutorial to help users easily deploy the model from Hugging Face on their own data. Using our Hugging Face model is advantageous, since training a BERT model for classification tasks takes a long time. Our pretrained model makes it easier and much less time-consuming to implement hate speech moderation for media sites and companies that wish to combat Danish hate speech on their online platforms. To improve usability, we will make the model compatible with both the TensorFlow and the PyTorch framework.

In summary, the project is comprised of the following steps:

  1. Train and test a Nordic BERT model on the official OffensEval2020 dataset.
  2. Upload the trained model to huggingface.co.
  3. Create a Jupyter notebook and a .py script designed to help users deploy the model on their own data.

Methods

NOTE: Some parts of the following section are repeated from assignment 5.

For model training and testing, we use the OffensEval2020 dataset, which contains 3000+ Danish comments from Ekstra Bladet and Reddit labeled with a binary coding scheme indicating offensiveness (link: https://figshare.com/articles/dataset/Danish_Hate_Speech_Abusive_Language_data/12220805).
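To get a feel for the data, the bundled test split can be inspected with pandas. A quick sketch: the text column name "tweet" comes from the example command further below, while the label column name is not stated here and should be checked in the file itself.

# Quick inspection of the bundled test split (see the data/ folder).
# The text column is called "tweet" in the example command below; the
# label column name is an assumption and should be checked in the CSV.
import pandas as pd
df = pd.read_csv("data/Test_Hate.csv")
print(df.shape)
print(df.columns.tolist())   # inspect the actual column names
print(df["tweet"].head())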

OffensEval2020 was a competition where researchers and data scientists from all over the world competed to create the best classification models for various languages (including Danish).

The best team in the Danish task achieved a macro F1-score of 0.8119, and the worst team achieved a score of 0.4913. For the full paper, see OffensEval2020.

To make our model's performance comparable to the current state-of-the-art presented in OffensEval2020, we utilized the macro F1-score as our evaluation metric:

The F1-score is a metric devised to fuse model precision and recall into a unified score, defined as the harmonic mean of the two. The reason for using the harmonic mean, rather than the arithmetic mean, is that the harmonic mean of a recall of 0 and a precision of 100 is 0, rather than 50. This is advantageous, since it means that a model cannot achieve a high F1-score through high recall or high precision alone. The macro F1-score is then obtained by taking the arithmetic mean of the per-class F1-scores.
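In formulas, with P denoting precision and R denoting recall:

F1 = 2 · P · R / (P + R)

macro F1 = (F1_NOT + F1_OFF) / 2

where F1_NOT and F1_OFF are the F1-scores for the non-offensive and offensive classes, respectively.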

For our modeling, we have chosen to use the Nordic BERT architecture. The reason for this choice is that it has been deployed with great results in the literature for a large range of similar classification tasks. Furthermore, the winning team in the OffensEval competition for the Danish task also used a Nordic BERT framework.

We trained the BERT model for 10 epochs with the following hyperparameters:

  • Learning rate: 1e-5
  • Batch size: 16
  • Max sequence length: 128

We developed and ran the code on Google Colaboratory. For our model-training notebook, please see: "dk_hate_training.ipynb"
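The notebook holds the authoritative training code; purely as an illustration, the hyperparameters above map onto the Hugging Face Trainer API roughly as follows (the model name, output path, and dataset objects are placeholders, not the project's actual values):

# Illustrative sketch only -- the actual training code is in dk_hate_training.ipynb.
# train_ds and eval_ds stand for pre-tokenized datasets (max sequence length 128).
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased",  # placeholder; the project uses BotXO's Nordic BERT
    num_labels=2,                    # binary scheme: NOT vs. OFF
)
args = TrainingArguments(
    output_dir="./dkbert-hate",          # assumed output path
    num_train_epochs=10,                 # 10 epochs, as listed above
    learning_rate=1e-5,                  # learning rate from the list above
    per_device_train_batch_size=16,      # batch size 16
)
Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds).train()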

Our uploaded model can be found here, on huggingface.co: https://huggingface.co/Guscode/DKbert-hatespeech-detection
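From there, the model can be loaded directly via the transformers library. A minimal PyTorch sketch (the example string is arbitrary; check the model card for the exact label mapping):

# Minimal inference sketch for the uploaded model (PyTorch backend).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("Guscode/DKbert-hatespeech-detection")
model = AutoModelForSequenceClassification.from_pretrained("Guscode/DKbert-hatespeech-detection")
inputs = tokenizer("En eksempelkommentar", return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
label_id = int(logits.argmax(dim=-1))
print(model.config.id2label[label_id])  # expected to map to "NOT" or "OFF"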

How to run

To run the code, please clone this repository and activate the virtual environment langvenv:

git clone https://github.com/Guscode/DKbert-hatespeech-detection.git
cd DKbert-hatespeech-detection
bash ./create_lang_venv.sh
source ./langvenv/bin/activate

To evaluate the model, please refer to the dk_hate_detect.py script, since this is the main tool. The dk_hate_detect.ipynb notebook is mainly designed as a tutorial for non-expert users. Both use the same model.

To run the script, use a command like the following (NOTE: you must specify either the --text argument or both the --data and --column arguments):

python3 dk_hate_detect.py --data "data/Test_Hate.csv" --column "tweet"
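To classify a single string instead, use the --text argument described below (the example string is arbitrary):

python3 dk_hate_detect.py --text "En eksempelkommentar"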

You can specify the following arguments from the terminal:

Data path:

"--data"
required = False
default = None
help = "Path to a dataset in csv format"

Column:

"--column"
required = False
default = None
help = "Name of the column containing the text for hate speech detection"

Single string classification:

"--text"
required = False
default = None
type = str
help = "String for single-string hate speech detection"

Output:

"--output"
required = False
default = "./"
type = str
help = "Output path for the dataset with the added hate speech column"

You can also type python3 dk_hate_detect.py -h for a detailed guide on how to specify the script arguments.
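For reference, the fragments above correspond to ordinary argparse declarations. A minimal sketch of such a parser (illustrative, not necessarily the repository's exact code):

# Illustrative argparse setup matching the arguments listed above.
import argparse
parser = argparse.ArgumentParser(description="Danish hate speech detection")
parser.add_argument("--data", required=False, default=None,
                    help="Path to a dataset in csv format")
parser.add_argument("--column", required=False, default=None,
                    help="Name of the column containing the text for hate speech detection")
parser.add_argument("--text", required=False, default=None, type=str,
                    help="String for single-string hate speech detection")
parser.add_argument("--output", required=False, default="./", type=str,
                    help="Output path for the dataset with the added hate speech column")
args = parser.parse_args()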

Go through the following steps to run the notebook:

Navigate to the "self_assigned" folder and open the "dk_hate_detect.ipynb" file. Make sure the kernel is set to langvenv; you can do this by pressing "Kernel" -> "Change kernel" -> "langvenv".

Repository structure and contents

This repository contains the following folder:

Folder   Description
data/    Folder containing a testing and a training dataset, consisting of over 3,000 social media comments labeled for offensiveness (i.e. NOT and OFF).

Furthermore, it holds the following files:

File                     Description
dk_hate_detect.py        The Python script for the assignment.
dk_hate_detect.ipynb     The Jupyter notebook for the assignment.
dk_hate_training.ipynb   The Jupyter notebook we created when training the model.
README.md                The README file that you are currently reading.

Discussion of results

Our model achieved a macro F1-score of 0.78. As stated in assignment 5, it is important to note that the dataset is heavily skewed towards non-offensive comments. This skew is reflected in our model's predictions: the F1-score was much higher for non-offensive comments than for offensive ones (0.95 vs. 0.60; macro-averaging these gives (0.95 + 0.60) / 2 ≈ 0.78). We believe this bias towards non-offensive comments very likely reflects the imbalanced nature of the dataset.

As stated earlier, the currently best-performing Danish hate speech model achieved a macro F1-score of 0.81 on the same dataset (as described in OffensEval2020). As such, we have not quite built the new gold-standard model for hate speech detection in Danish. Nonetheless, we have come very close: our model would have finished 4th (out of 38 contenders) in the OffensEval2020 competition. Furthermore, we have created the best publicly available Danish hate speech model, along with a ready-to-use .py script and a thorough Jupyter notebook tutorial on how to use it. Therefore, we argue that we have greatly improved the possibilities for a real-life implementation of such an algorithm.
