The story starts after I discovered the CVE-2023-39141, and realized the vulnerable code is spreaded EVERYWHERE. As it was technically infeasible for me to check & report manually, we decided to team up and implement a pipeline to Find, Verify SAST, DAST(0 false positives via auto running), Assess CVSS scores, Fix(GPT-4), Patch and send Pull Requests, fully automatically.
You can access the paper Eradicating the Unseen: Detecting, Exploiting, and Remediating a Path Traversal Vulnerability across GitHub
Jafar Akhoundali, Hamidreza Hamidi, Kristian Rietveld, Olga Gadyatskaya - Published in Asia CCS 2025
Please cite the paper as:
@inproceedings{akhoundali2025eradicating,
title={Eradicating the Unseen: Detecting, Exploiting, and Remediating a Path Traversal Vulnerability across GitHub},
author={Akhoundali, Jafar and Hamidi, Hamidreza and Rietveld, Kristian and Gadyatskaya, Olga},
booktitle={Proceedings of the 20th ACM Asia Conference on Computer and Communications Security},
pages={542--558},
year={2025}
}
This project is provided strictly for research and educational purposes only.
- Do not attempt to send issues, vulnerability reports, or patches to repositories that have already received a report from us.
This is partially automated in the code, so please make sure you understand each module before execution. - The authors take no responsibility for any misuse of this project.
- Vulnerable code is executed inside Docker containers. It is expected to be safe, but there is no absolute guarantee.
- Although the source code of this pipeline is free to use, parts related to SAST (SemGrep and CodeQL) may have different licenses depending on your usage.
It is the user’s responsibility to verify and comply with all applicable license agreements.
This program was developed and tested on GNU/Linux, and partially tested on macOS.
Windows/WSL are not tested nor recommended. Running the project inside a docker container(DIND) is not recommended as this way the nested containers can bypass the sandbox.
Please read the instructions carefully before running the program, as incorrect usage may result in spamming reports.
Requirements:
- Docker and Docker Compose installed and available in your CLI.
- Pipeline must run either as root (recommended for Docker commands), or with Docker accessible to the current user (
⚠️ this may reduce security). - CodeQL and Semgrep must be installed.
# Clone repo
git clone https://github.com/JafarAkhondali/DotDotDefender.git
cd DotDotDefender
# Configure environment variables
cp env.example .env
nano .env
# Run database image
sudo docker-compose up -d
# Set up Python environment
python3 -m venv venv
# Below 3 steps must be done for each time you want to run the code
source ./venv/bin/activate
export PYTHONPATH=$(pwd)
pip3 install -r requirements.txt
# Install CodeQL - https://github.com/github/codeql-action/releases
# Install SemGrep CLI - https://semgrep.dev/docs/cli-reference
This program consists of several components. Each one is designed to run independently, contributing to the overall pipeline:
-
Scraper
Searches for specific patterns of vulnerable code using GitHub Code Search feature.
File:scrapper/recursive-scrapper.py
-
Static Analysis
Validates the vulnerability by running SemGrep with specific payloads.
File:sast/grep.py
-
PoC Checker
Executes the program with the payload.- Must be run as root for Docker commands.
- There are 3 different PoC checkers — all must be run to complete this step:
run-poc-network.py
run-poc-local.py
run-poc-dos.py
-
Reporter (Scoring & Patching)
- Calculates CVSS Score →
calculate_cvss_scores.py
- Prepares a fix using GPT-4 →
patcher.py
- Calculates CVSS Score →
-
Reporter (Verification & Patch Application)
- Verifies the vulnerability still exists →
patcher.py
- Applies a verified patch using an LLM.
- Verifies the vulnerability still exists →
-
Pull Requester (Commit History)
Retrieves the time when the first vulnerable commit appeared.
File:add_first_appeared.py
-
Pull Requester (PR Submission)
⚠️ Warning: If you are testing, DO NOT RUNrun.py
- it will re-verify the patch and send an actual pull request.

This figure illustrates how training data from vulnerable repositories can contaminate LLM outputs, causing the reproduction of insecure code patterns(more details in paper).
We welcome contributions to improve DotDot Defender! Morover you are more than welcome to maintain your own fork and research.
If you’d like to add features, fix bugs, or improve documentation, please follow these steps:
- Fork it!
- Create your feature branch:
git checkout -b my-new-feature
- Commit your changes:
git commit -am 'Add some feature'
- Push to the branch:
git push origin my-new-feature
- Submit a pull request
Although the source code of this pipeline is free to use, parts related to SAST (SemGrep and CodeQL) may have different licenses depending on your usage. It is the user’s responsibility to verify and comply with all applicable license agreements.