TwitterBotBusters

Introduction

Bots are a prevalent problem on Twitter. At best, they create inauthentic interactions and artificially inflate an account’s social influence; at worst, they spread dangerous content such as scams or fake news. At Twitter’s scale, the platform can no longer rely on human annotators to distinguish bots from humans and has to adopt some form of automatic detection. This project aims to distinguish humans from bots using their user descriptions and tweets, modeled with several deep learning approaches: a multilayer perceptron (MLP) and several types of graph neural network (GNN), namely the graph convolutional network (GCN), graph isomorphism network (GIN), and graph attention network (GAT). We also experimented with different model architectures for extracting the embedding that summarizes a user's tweets. We found that the best model is an architecture that combines an MLP and a GAT, giving an accuracy score of X on the Y dataset.

Dataset Format

The Cresci-15 dataset contains node.json, label.csv, split.csv, and edge.csv (the last is present for datasets with graph structure).
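Before preprocessing, it can help to confirm that the expected files are in place. The following is a minimal sketch, not part of the repository: the function name `missing_dataset_files` and the `has_graph` flag are hypothetical, and only the file names come from the description above.

```python
from pathlib import Path

# File names taken from the dataset description above.
REQUIRED_FILES = ("node.json", "label.csv", "split.csv")
GRAPH_FILES = ("edge.csv",)  # only expected for datasets with graph structure


def missing_dataset_files(dataset_dir, has_graph=True):
    """Return the expected file names that are absent from dataset_dir, sorted."""
    expected = REQUIRED_FILES + (GRAPH_FILES if has_graph else ())
    root = Path(dataset_dir)
    return sorted(name for name in expected if not (root / name).is_file())


if __name__ == "__main__":
    # Assumed location, following the copy step described in this README.
    missing = missing_dataset_files("src/BotRGCN/datasets/cresci-2015")
    print("missing files:", missing or "none")
```

An empty result means every expected file was found; the check looks only at file presence, not at file contents.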

How to download Cresci-15 dataset

Cresci-15 is available on Google Drive.

Download Other-Dataset-TwiBot22-Format.zip and unzip.
Copy cresci-2015 to src/BotRGCN/datasets/.
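The unzip-and-copy steps above could be scripted roughly as follows. This is a sketch under assumptions: the helper name `install_cresci15` and its `workdir` argument are hypothetical, and the archive's internal layout is not documented here, so the code searches for the cresci-2015 folder instead of assuming where it sits.

```python
import shutil
import zipfile
from pathlib import Path


def install_cresci15(zip_path, datasets_dir, workdir):
    """Unzip the archive into workdir and copy its cresci-2015 folder into datasets_dir."""
    workdir = Path(workdir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(workdir)
    # Assumption: the folder may be at the top level or inside a wrapper
    # directory, so look for it anywhere under workdir.
    matches = [p for p in workdir.rglob("cresci-2015") if p.is_dir()]
    if not matches:
        raise FileNotFoundError("no cresci-2015 folder found in the archive")
    target = Path(datasets_dir) / "cresci-2015"
    # dirs_exist_ok requires Python 3.8 or newer.
    shutil.copytree(matches[0], target, dirs_exist_ok=True)
    return target
```

For example, `install_cresci15("Other-Dataset-TwiBot22-Format.zip", "src/BotRGCN/datasets", "unzipped")` would be run from the repository root once the archive has been downloaded there; the `unzipped` scratch directory is an arbitrary choice.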

Requirements

To set up the environment and install the requirements, run bash commands_local.sh. You might need to adjust the CUDA version in the script to match the CUDA version that you use.

How to run baselines

Clone this repo by running git clone https://github.com/travistangvh/TwitterBotBusters
Change directory to src/BotRGCN/datasets, download the dataset, and place it in a new folder ./cresci-2015.
Create the preprocessed data by changing directory to src/BotRGCN/cresci_15 and running python3 ./preprocess_combined.py. This writes the preprocessed data to src/BotRGCN/cresci_15/processed.
Change directory to src/GCN_GAT.
Run experiments by executing python train.py --config gat-mlp-1.yaml. You can explore other models by changing the config file.
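The preprocessing and training steps above can also be chained from a small driver script. The sketch below is illustrative only: `baseline_steps` and `run_baseline` are hypothetical helpers, and the only details taken from this README are the directories, the script names, and the `--config` flag. It defaults to a dry run that prints the commands without executing them.

```python
import subprocess
from pathlib import Path


def baseline_steps(repo_root, config="gat-mlp-1.yaml"):
    """Return (working_directory, command) pairs for preprocessing and training."""
    root = Path(repo_root)
    return [
        (root / "src" / "BotRGCN" / "cresci_15", ["python3", "./preprocess_combined.py"]),
        (root / "src" / "GCN_GAT", ["python", "train.py", "--config", config]),
    ]


def run_baseline(repo_root, config="gat-mlp-1.yaml", dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, cmd in baseline_steps(repo_root, config):
        print(f"(cd {cwd} && {' '.join(cmd)})")
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_baseline("TwitterBotBusters")  # dry run: prints the commands only
```

Passing `dry_run=False` executes the steps in order, which assumes the dataset is already in place and the environment from the Requirements section is active.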
