fmi-basel/binary-activation-maps-sqp

Speech Quality Prediction with Binary Activation Maps (BAMs)

Code used for the paper "Resource-Efficient Speech Quality Prediction through Quantization Aware Training and Binary Activation Maps".

Setup

  • Create the conda environment: conda env create -f env.yaml
  • Activate the environment: conda activate sqp_experiments
  • Install this project in editable mode: pip install -e .

Dataset

You can generate a training dataset with the following steps:

  1. Audio data: run the download and single-process data generation scripts from the Interspeech 2020 DNS Challenge. In noisyspeech_synthesizer.cfg, set total_hours: 50 to match the paper, and adjust the destination paths as needed.
  2. EGS files: generate the JSON file lists (EGS) using the denoiser/audio.py script from the DEMUCS Denoiser repo (see that repo for instructions).
  3. Labels: compute speech quality labels using the script included here, e.g.: python compute_labels.py /path/to/egs_dir /path/to/output (the default settings match the paper; run with --help for more information).
  4. Repeat step 3 to compute the test set labels.
  5. Edit config/dataset/*.yaml with the correct paths for audio data and labels
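Before computing labels, it can help to sanity-check the EGS files from step 2. The sketch below is not part of this repo. It assumes each EGS file is a JSON list of [path, num_samples] pairs (the layout written by denoiser/audio.py, as far as we know) and that the audio is sampled at 16 kHz; both the function names and the sample rate are assumptions to adjust for your data.

```python
import json
import os


def summarize_egs(json_path, sample_rate=16000):
    """Return (num_files, total_hours) for one EGS file list.

    Assumption: every entry is a [path, num_samples] pair and all files
    share the given sample rate.
    """
    with open(json_path) as f:
        entries = json.load(f)
    total_samples = sum(length for _path, length in entries)
    return len(entries), total_samples / sample_rate / 3600.0


def missing_files(json_path):
    """Return the listed audio paths that do not exist on disk."""
    with open(json_path) as f:
        entries = json.load(f)
    return [path for path, _length in entries if not os.path.isfile(path)]
```

For example, summarize_egs("/path/to/egs_dir/noisy.json") (a hypothetical file name) should report roughly 50 hours if total_hours: 50 was used, and missing_files on the same file should return an empty list once the destination paths are correct.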

Usage examples

Training

Train baseline model with default hyperparameters:
python train.py

Train BAM model with β = 5 for 50 epochs:
python train.py model=dnsmos_binary model.conv_activation_param=5 epochs=50

For a list of possible arguments and configurations, run:
python train.py --help
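The key=value arguments in the training commands above look like Hydra-style overrides, although this README does not name the configuration library. As a rough illustration of how such overrides could map onto a nested configuration, here is a minimal pure-Python sketch. The default values, the "name" field, and the contents of the dnsmos_binary group are hypothetical placeholders, not the repo's actual configuration.

```python
import copy


def parse_value(raw):
    """Interpret a raw override value as int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def apply_overrides(config, overrides, groups=None):
    """Return a copy of `config` with 'dotted.key=value' overrides applied.

    `groups` maps a top-level key to named sub-configs, so that an override
    such as 'model=dnsmos_binary' swaps in a whole sub-config.
    """
    groups = groups or {}
    result = copy.deepcopy(config)
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        if key in groups:
            # Group selection: replace the entire sub-config.
            result[key] = copy.deepcopy(groups[key][raw])
            continue
        *parents, leaf = key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return result


# Hypothetical defaults and model group, for illustration only.
defaults = {"model": {"name": "baseline"}, "epochs": 100}
groups = {
    "model": {
        "dnsmos_binary": {"name": "dnsmos_binary", "conv_activation_param": 1.0},
    }
}

cfg = apply_overrides(
    defaults,
    ["model=dnsmos_binary", "model.conv_activation_param=5", "epochs=50"],
    groups,
)
print(cfg)
```

Overrides are applied left to right, so the model group has to be selected before one of its fields is overridden, which matches the argument order in the BAM training command.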

Post-training quantization

Perform post-training quantization on the trained models and compare their performance on the validation or test set:
python quantize.py valid
python quantize.py test
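This README does not describe which quantization scheme quantize.py applies. For readers new to the idea, the sketch below shows one common form of post-training quantization, symmetric per-tensor rounding of weights to signed 8-bit integers. It is a generic illustration with made-up weights and hypothetical function names, not the method used in this repo.

```python
def quantize_symmetric(weights, num_bits=8):
    """Map floats to signed integers in [-qmax, qmax] using a single scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


# Made-up weights, for illustration only.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_symmetric(weights)
restored = dequantize(codes, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Comparing a model before and after this kind of rounding, on held-out data, is the usual way to judge how much accuracy the lower precision costs.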
