The AI Generated Audio Detection project uses machine learning to differentiate between human and AI-generated audio, employing a convolutional neural network (CNN) to analyze and classify audio samples for authenticity.

Audio Deep Fake Detection

(Demo video: Audio.DeepFake.Detection.using.Machine.Learning.mp4)

System Architecture

(Architecture diagram)

Setup Environment

# Set up Python virtual environment
python3 -m venv venv && source venv/bin/activate

# Make sure pip is up to date
pip install -U pip wheel setuptools

# Install required dependencies
pip install -r requirements.txt
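
Optionally, you can verify the environment before training. The snippet below is a minimal sketch, not part of the repository; it assumes the project depends on PyTorch, which the --device cuda option of train.py suggests.

# check_env.py -- illustrative only; assumes PyTorch is installed via requirements.txt
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))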

Setup Datasets

You may download the datasets used in the project from the following sources:

  • (Real) Human Voice Dataset: LJ Speech (v1.1)
    • This dataset consists of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books.
  • (Fake) Synthetic Voice Dataset: WaveFake (v1.20)
    • The dataset consists of 104,885 generated audio clips (16-bit PCM wav).

After downloading the datasets, extract them under data/real and data/fake respectively. The data directory should then look like this (a small sanity-check script follows the tree below):

data
├── real
│   └── wavs
└── fake
    ├── common_voices_prompts_from_conformer_fastspeech2_pwg_ljspeech
    ├── jsut_multi_band_melgan
    ├── jsut_parallel_wavegan
    ├── ljspeech_full_band_melgan
    ├── ljspeech_hifiGAN
    ├── ljspeech_melgan
    ├── ljspeech_melgan_large
    ├── ljspeech_multi_band_melgan
    ├── ljspeech_parallel_wavegan
    └── ljspeech_waveglow
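
Before training, it can help to confirm that the layout matches. The following is a minimal sketch (not part of the repository) that counts the .wav files under data/real and data/fake; the expected totals come from the dataset descriptions above.

# sanity_check_data.py -- illustrative only
from pathlib import Path

DATA_ROOT = Path("data")  # adjust if you extracted the datasets elsewhere

def count_wavs(directory: Path) -> int:
    # Count .wav files anywhere under `directory`.
    return sum(1 for _ in directory.rglob("*.wav"))

# LJ Speech should contain 13,100 clips.
print("real/wavs:", count_wavs(DATA_ROOT / "real" / "wavs"), "files")

# WaveFake spreads its 104,885 clips across the generator-specific folders.
for fake_dir in sorted((DATA_ROOT / "fake").iterdir()):
    if fake_dir.is_dir():
        print(f"fake/{fake_dir.name}:", count_wavs(fake_dir), "files")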

Model Checkpoints

You may download the model checkpoints from here: Google Drive. Unzip the files and replace the saved directory with the extracted files.

Link to the best model: Google Drive

Training

Use the train.py script to train the model.

usage: train.py [-h] [--real_dir REAL_DIR] [--fake_dir FAKE_DIR] [--batch_size BATCH_SIZE] [--epochs EPOCHS]
                [--seed SEED] [--feature_classname {wave,lfcc,mfcc}]
                [--model_classname {MLP,WaveRNN,WaveLSTM,SimpleLSTM,ShallowCNN,TSSD}]
                [--in_distribution {True,False}] [--device DEVICE] [--deterministic] [--restore] [--eval_only] [--debug] [--debug_all]

optional arguments:
  -h, --help            show this help message and exit
  --real_dir REAL_DIR, --real REAL_DIR
                        Directory containing real data. (default: 'data/real')
  --fake_dir FAKE_DIR, --fake FAKE_DIR
                        Directory containing fake data. (default: 'data/fake')
  --batch_size BATCH_SIZE
                        Batch size. (default: 256)
  --epochs EPOCHS       Number of maximum epochs to train. (default: 20)
  --seed SEED           Random seed. (default: 42)
  --feature_classname {wave,lfcc,mfcc}
                        Feature classname. (default: 'lfcc')
  --model_classname {MLP,WaveRNN,WaveLSTM,SimpleLSTM,ShallowCNN,TSSD}
                        Model classname. (default: 'ShallowCNN')
  --in_distribution {True,False}, --in_dist {True,False}
                        Whether to use in distribution experiment setup. (default: True)
  --device DEVICE       Device to use. (default: 'cuda' if possible)
  --deterministic       Whether to use deterministic training (reproducible results).
  --restore             Whether to restore from checkpoint.
  --eval_only           Whether to evaluate only.
  --debug               Whether to use debug mode.
  --debug_all           Whether to use debug mode for all models.

Example:

To verify that all models can run successfully on your device, you can run the following test command:

python train.py --debug_all

To train the ShallowCNN model with LFCC features in the in-distribution setting, you can run the following command:

python train.py --real data/real --fake data/fake --batch_size 128 --epochs 20 --seed 42 --feature_classname lfcc --model_classname ShallowCNN

Set the CUDA_VISIBLE_DEVICES environment variable inline to specify which GPU device(s) to use. For example:

CUDA_VISIBLE_DEVICES=0 python train.py

Evaluation

By default, the test set is used directly for validation during training, and the best model and best predictions are saved automatically to the saved directory during training/testing. Check the saved directory for the evaluation results.

To evaluate a trained model on the test set, you can run the following command:

python train.py --feature_classname lfcc --model_classname ShallowCNN --restore --eval_only

Run the following command to re-compute the evaluation results based on saved predictions and labels:

python metrics.py
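
The sketch below only illustrates the kind of re-computation this step performs; it is not the repository's metrics.py. The file names and label convention are hypothetical placeholders, and accuracy plus equal error rate (EER) are shown simply as metrics commonly reported for audio deepfake detection.

# recompute_metrics_sketch.py -- illustrative only; not the repository's metrics.py
import numpy as np
from sklearn.metrics import accuracy_score, roc_curve

# Hypothetical file names -- substitute whatever train.py actually writes to saved/.
labels = np.load("saved/labels.npy")       # assumed: 1 = fake, 0 = real
scores = np.load("saved/predictions.npy")  # assumed: model scores in [0, 1]

# Accuracy at a 0.5 decision threshold.
print("Accuracy:", accuracy_score(labels, (scores >= 0.5).astype(int)))

# Equal error rate: the operating point where false-positive and false-negative rates meet.
fpr, tpr, _ = roc_curve(labels, scores)
fnr = 1 - tpr
eer = fpr[np.argmin(np.abs(fpr - fnr))]
print("EER:", eer)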

Authors
