Vietnamese Targeted Hate Speech Detection on Social Media Texts.
Contact information: Mr. Son T. Luu
Email: [email protected] (Alternative: [email protected])
The dataset contains 10,000 comments. Each comment is annotated for five targets, each with three levels of hatefulness.
https://arxiv.org/abs/2404.19252
(Please cite this paper when using the dataset)
Citation:
Vo, C.N., Huynh, K.B., Luu, S.T. et al. ViTHSD: exploiting hatred by targets for hate speech detection on Vietnamese social media texts. J Comput Soc Sc 8, 30 (2025). https://doi.org/10.1007/s42001-024-00348-6
@article{vo2025vithsd,
title={ViTHSD: exploiting hatred by targets for hate speech detection on Vietnamese social media texts},
author={Vo, Cuong Nhat and Huynh, Khanh Bao and Luu, Son T and Do, Trong-Hop},
journal={Journal of Computational Social Science},
volume={8},
number={2},
pages={30},
year={2025},
publisher={Springer}
}
Updating
- Apache Kafka
- Apache Spark Structured Streaming
- QuestDB (used as the sink)
Step 1: Start the ZooKeeper server and the Kafka server
- Start the ZooKeeper server
bin/zookeeper-server-start.sh config/zookeeper.properties
- Start the Kafka server
bin/kafka-server-start.sh config/server.properties
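Before moving on, it can help to confirm that both servers are accepting connections. The following is a minimal sketch, not part of the repository: the helper name `is_port_open` is hypothetical, port 9092 comes from the `--bootstrap-server` address used in this README, and port 2181 is assumed to be ZooKeeper's `clientPort` from the stock `config/zookeeper.properties`.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# 2181 is an assumed default for ZooKeeper; 9092 is the Kafka address
# used by the commands in this README.
for name, port in (("ZooKeeper", 2181), ("Kafka", 9092)):
    state = "reachable" if is_port_open("localhost", port) else "not reachable"
    print(f"{name} on localhost:{port}: {state}")
```

An open port only shows that something is listening; it does not prove that the broker has finished starting up.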
Step 2: Create topic
- Create a topic named "youtube"
bin/kafka-topics.sh --create --topic youtube --bootstrap-server localhost:9092
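If topic creation needs to be scripted, the same command can be assembled from Python. This is a sketch under assumptions: the function name `create_topic_command` is hypothetical, and the optional `--partitions` and `--replication-factor` flags are not used in this README (when omitted, the broker defaults apply).

```python
from typing import List, Optional


def create_topic_command(
    topic: str,
    bootstrap_server: str = "localhost:9092",
    partitions: Optional[int] = None,
    replication_factor: Optional[int] = None,
) -> List[str]:
    """Build the kafka-topics.sh argument list for creating a topic."""
    cmd = [
        "bin/kafka-topics.sh", "--create",
        "--topic", topic,
        "--bootstrap-server", bootstrap_server,
    ]
    # Optional flags; left out unless the caller asks for them.
    if partitions is not None:
        cmd += ["--partitions", str(partitions)]
    if replication_factor is not None:
        cmd += ["--replication-factor", str(replication_factor)]
    return cmd


# Print the command rather than run it, so the sketch has no side effects.
print(" ".join(create_topic_command("youtube")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` from the Kafka installation directory would execute it.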
Step 3: Start QuestDB and connect it to the Kafka topic
- Start QuestDB
sudo questdb start
- Connect the QuestDB connector to the Kafka topic
bin/connect-standalone.sh config/connect-standalone.properties config/questdb-connector.properties
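The contents of `config/questdb-connector.properties` are not shown in this README. As a rough illustration only, the sketch below renders a plausible set of properties for the QuestDB Kafka sink connector. Everything in the dictionary is an assumption: the connector name, the table name `youtube_comments`, the QuestDB address `localhost:9009`, and the choice of `youtube` as the topic the connector reads. The keys follow the QuestDB Kafka connector documentation as I understand it and should be verified against the installed connector version.

```python
from typing import Dict


def render_properties(settings: Dict[str, str]) -> str:
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Illustrative values only; the repository's actual file may differ.
questdb_sink = {
    "name": "questdb-sink",
    "connector.class": "io.questdb.kafka.QuestDBSinkConnector",
    "host": "localhost:9009",
    "topics": "youtube",
    "table": "youtube_comments",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
}

print(render_properties(questdb_sink), end="")
```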
Step 4: Submit the Spark job that connects to the Kafka topic
spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.2 sparkStreaming.py
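The source of `sparkStreaming.py` is not reproduced here. As a hedged sketch of what a Structured Streaming job reading from this topic might look like, the example below subscribes to `youtube` on `localhost:9092` and decodes each Kafka value as text. The function names, the app name, and the console sink are assumptions for illustration; the real job presumably applies the detection model and writes its results elsewhere.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Options for Spark's built-in Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",
    }


def read_comments(spark, options: Dict[str, str]):
    """Return a streaming DataFrame with the Kafka value decoded as a string."""
    raw = spark.readStream.format("kafka").options(**options).load()
    # Kafka delivers key and value as binary columns; cast value to text.
    return raw.selectExpr("CAST(value AS STRING) AS comment")


def main() -> None:
    # Imported here so the helpers above can be loaded without pyspark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("vithsd-streaming-sketch").getOrCreate()
    comments = read_comments(spark, kafka_source_options("localhost:9092", "youtube"))
    # Console sink for inspection only (an assumption, not the project's sink).
    query = comments.writeStream.format("console").outputMode("append").start()
    query.awaitTermination()
```

Calling `main()` at the bottom of the file would start the stream when the script is launched with `spark-submit` and the Kafka package shown above.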
Step 5: Start producer and consumer
- Producer
python3 youtubeLiveData.py
- Consumer
python3 consumer.py
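The two scripts themselves are not listed in this README. The sketch below shows one common way a producer and consumer could exchange comments as UTF-8 JSON, which keeps Vietnamese text intact. It assumes the third-party `kafka-python` package; the function names and the record fields (`author`, `text`) are hypothetical and may not match `youtubeLiveData.py` or `consumer.py`.

```python
import json
from typing import Any, Dict, Iterator


def encode_message(record: Dict[str, Any]) -> bytes:
    """Serialize a record to UTF-8 JSON without escaping non-ASCII text."""
    return json.dumps(record, ensure_ascii=False).encode("utf-8")


def decode_message(payload: bytes) -> Dict[str, Any]:
    """Inverse of encode_message."""
    return json.loads(payload.decode("utf-8"))


def send_comment(record: Dict[str, Any], topic: str = "youtube") -> None:
    # Requires kafka-python and a running broker; imported lazily.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=encode_message,
    )
    producer.send(topic, record)
    producer.flush()
    producer.close()


def consume_comments(topic: str = "youtube") -> Iterator[Dict[str, Any]]:
    # Requires kafka-python and a running broker; imported lazily.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers="localhost:9092",
        value_deserializer=decode_message,
        auto_offset_reset="earliest",
    )
    for message in consumer:
        yield message.value


# Round trip without a broker, using a made-up record.
sample = {"author": "user_1", "text": "Xin chào"}
print(decode_message(encode_message(sample)) == sample)
```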
You can now view the data in QuestDB.
Further updates will be posted here.