Added a script for finding the best section of tracks and reference with lowest AF error for generating mix #21

Open
wants to merge 11 commits into
base: main
from
6 changes: 3 additions & 3 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -7,13 +7,13 @@ data/*.ipynb

__pycache__
*.egg-info
dasp-pytorch/

mix_KE_adv/**
.vscode/
logs/**
checkpoints/
debug
dasp-pytorch
*.wav
*.png
data/FXencoder_ps.pt
data/FXencoder_ps.pt
outputs/**
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "dasp-pytorch"]
path = dasp-pytorch
url = https://github.com/csteinmetz1/dasp-pytorch
Binary file added Assets/diffmst-main_modified.jpg
Binary file removed Assets/mst_final.png
Binary file not shown.
Binary file removed Assets/mst_wbg.png
Binary file not shown.
89 changes: 35 additions & 54 deletions README.md
@@ -2,95 +2,76 @@
<div align="center">

# Differentiable Mixing Style Transfer
[Paper]() | [Website]()
[Paper](https://sai-soum.github.io/assets/pdf/diffmst.pdf) | [Website](https://sai-soum.github.io/projects/diffmst/)


<img src="./Assets/mst_wbg.png">
<img src="./Assets/diffmst-main_modified.jpg">

</div>

Mixing style transfer using reference mix.
<!-- Mixing style transfer using reference mix.
There are two mixing console configurations (in `modules.py`)
1. `BasicMixConsole`: Gain + Pan
2. `AdvancedMixConsole`: Gain + Pan + Diff EQ + Diff Compressor

Mixes for training can be created using either `naive_random_mix` (assigns random parameter values for mixing console to create a mix) or `knowledge_engineering_mix` (uses knowledge engineering to assign parameter values for mixing console to create a mix). Both of these modules can be found in `mixing.py`


-->
# Repository Structure
1. 'configs' - Contains configuration files for training and inference.
2. 'mst' - Contains the main codebase for the project.
- 'dataloaders' - Contains dataloaders for the project.
- 'modules' - Contains the modules for different components of the system.
- 'mixing' - Contains the mixing modules for creating mixes.
- 'loss' - Contains the loss functions for the project.
- 'panns' - Contains the most basic components, such as cnn14 and resnet.
- 'utils' - Contains utility functions for the project.
3. 'scripts' - Contains scripts for running inference.

# Usage

Clone the repository and install the `mst` package.
```
git clone https://github.com/sai-soum/mix_style_transfer.git
cd mix_style_transfer
git clone --recursive https://github.com/sai-soum/Diff-MST.git
cd Diff-MST
python -m venv env
source env/bin/activate
pip install -e .
```

[dasp-pytorch](https://github.com/csteinmetz1/dasp-pytorch) is required for differentiable audio effects.
Clone the repo into the top-level of the project directory.
[dasp-pytorch](https://github.com/csteinmetz1/dasp-pytorch) is required for differentiable audio effects.
Install the dependencies for dasp-pytorch.
```
git clone https://github.com/csteinmetz1/dasp-pytorch.git
cd dasp-pytorch
pip install -e .
```

Since `dasp` is currently under development you need to pull changes periodically.
To do so change to the directory and pull.
```
cd dasp-pytorch
git pull
```

## Inference

```
CUDA_VISIBLE_DEVICES=5 python scripts/run.py \
checkpoints/20230719/config.yaml \
checkpoints/20230719/epoch=132-step=83125.ckpt \
"/import/c4dm-02/acw639/DiffMST/song 2/Kat Wright_By My Side/" \
output/ref_mix.wav
```

## Train

First update the paths in the configuration file for both the logger and the dataset root directory.
We use [LightningCLI](https://lightning.ai/docs/pytorch/stable/) for training and [Wandb](https://wandb.ai/site) for logging.
First update the paths in the configuration file for the logger, the loss function, and the dataset root directory.
Then call the `main.py` script passing in the configuration file.

### Method 1: Training with random mixes of the same song as reference using MRSTFT loss.
```
# new model configuration with audio feature loss
CUDA_VISIBLE_DEVICES=0 python main.py fit \
-c configs/config_cjs.yaml \
-c configs/config.yaml \
-c configs/optimizer.yaml \
-c configs/data/medley+cambridge+jamendo-8.yaml \
-c configs/models/gain+eq+comp-feat.yaml
-c configs/data/medley+cambridge-8.yaml \
-c configs/models/naive.yaml
```
You can change the number of tracks, the size of the training data for an epoch, and the batch size in the data configuration files located in `configs/data/`.
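
Each `-c` flag in the command above passes one YAML file, and LightningCLI reads them in order, so a setting in a later file should take precedence over the same setting in an earlier one. The sketch below illustrates that layering with plain dictionaries. It is only an illustration: `deep_merge`, the `batch_size` key, and the example values are hypothetical and do not reproduce the parser's actual implementation.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` replace or extend `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the later file replace the earlier value.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical fragments standing in for a trainer config and a data config.
trainer_cfg = {"trainer": {"devices": 1, "max_epochs": 800}}
data_cfg = {"data": {"init_args": {"batch_size": 4}}, "trainer": {"devices": 2}}

layered = deep_merge(trainer_cfg, data_cfg)
print(layered)
```

In this sketch the later file changes only `trainer.devices`; `trainer.max_epochs` is carried over from the earlier file unchanged.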

# new model configuration with CLAP loss
### Method 2: Training with real unpaired songs as reference using AFloss.
```
CUDA_VISIBLE_DEVICES=0 python main.py fit \
-c configs/config_cjs.yaml \
-c configs/config.yaml \
-c configs/optimizer.yaml \
-c configs/data/medley+cambridge+jamendo-8.yaml \
-c configs/models/gain+eq+comp-clap.yaml
-c configs/models/naive+feat.yaml
```

## Inference
To evaluate the model on real-world data, run the `scripts/eval_all_combo.py` script.

# Stability (ignore)
```
source env/bin/activate
cd /scratch
mkdir medleydb
cd medleydb
aws s3 sync s3://stability-aws/MedleyDB ./
tar -xvf MedleyDB_v1.tar
tar -xvf MedleyDB_v2.tar
python main.py fit -c configs/config.yaml -c configs/optimizer.yaml -c configs/data/medleydb_cjs.yaml -c configs/models/naive_dmc_adv.yaml
CUDA_VISIBLE_DEVICES=7 python main.py fit -c configs/config_cjs.yaml -c configs/optimizer.yaml -c configs/data/medleydb_c4dm.yaml -c configs/models/ke_dmc_adv.yaml

CUDA_VISIBLE_DEVICES=7 python main.py fit -c configs/config.yaml -c configs/optimizer.yaml -c configs/data/medley+cambridge-4.yaml -c configs/models/naive+fx_encoder_loss.yaml

To run the paramloss code

CUDA_VISIBLE_DEVICES=2 python main.py fit -c configs/config.yaml -c configs/optimizer.yaml -c configs/data/medley+cambridge-4.yaml -c configs/models/naive+paramloss.yaml
Update the model checkpoints and the inference examples directory in the script.

```
`Python 3.10` was used for training.
37 changes: 17 additions & 20 deletions configs/config.yaml
@@ -6,34 +6,34 @@ trainer:
init_args:
project: DiffMST
save_dir: /import/c4dm-datasets-ext/diffmst_logs_soum

enable_checkpointing: true


callbacks:
- class_path: mst.callbacks.audio.LogAudioCallback
- class_path: pytorch_lightning.callbacks.ModelSummary
init_args:
max_depth: 2

- class_path: mst.callbacks.mix.LogReferenceMix
init_args:
root_dirs:
- /import/c4dm-datasets-ext/diffmst-examples/song1/BenFlowers_Ecstasy_Full/
- /import/c4dm-datasets-ext/diffmst-examples/song2/Kat Wright_By My Side/
- /import/c4dm-datasets-ext/diffmst-examples/song3/Titanium_HauntedAge_Full/
root_dirs:
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song1/Soren_ALittleLate_Full
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song1/Soren_ALittleLate_Full
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song2/MR0903_Moosmusic_Full
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song2/MR0903_Moosmusic_Full
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song3/SaturnSyndicate_CatchTheWave_Full
ref_mixes:
- /import/c4dm-datasets-ext/diffmst-examples/song1/ref/_Feel it all Around_ by Washed Out (Portlandia Theme).mp3
- /import/c4dm-datasets-ext/diffmst-examples/song2/ref/The Dip - Paddle To The Stars (Lyric Video).mp3
- /import/c4dm-datasets-ext/diffmst-examples/song3/ref/Architects - _Doomsday_.mp3
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song1/ref/Harry Styles - Late Night Talking (Official Video).wav
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song1/ref/Poom - Les Voiles (Official Audio).wav
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song2/ref/Justin Timberlake - Can't Stop The Feeling! [Lyrics].wav
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song2/ref/Taylor Swift - Shake It Off.wav
- /import/c4dm-datasets-ext/diffmst_validation/validation_set/song3/ref/Miley Cyrus - Wrecking Ball (Lyrics).wav
default_root_dir: null
gradient_clip_val: 10.0
devices: 3
detect_anomaly: True

devices: 1
check_val_every_n_epoch: 1
max_epochs: 10000
log_every_n_steps: 200

max_epochs: 800

log_every_n_steps: 50
accelerator: gpu
strategy: ddp_find_unused_parameters_true
sync_batchnorm: true
@@ -42,8 +42,5 @@ trainer:
num_sanity_val_steps: 2
benchmark: true
accumulate_grad_batches: 1
reload_dataloaders_every_n_epochs: 1

#reload_dataloaders_every_n_epochs: 1

# - /import/c4dm-datasets-ext/diffmst-examples/song1/BenFlowers_Ecstasy_Full/
# - /import/c4dm-datasets-ext/diffmst_validation/listening/diffmst-examples_wavref/Feel it all Around by Washed Out (Portlandia Theme).wav
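
The updated `LogReferenceMix` settings in `configs/config.yaml` list five `root_dirs` and five `ref_mixes`, with each song directory repeated so that it appears to line up with one reference per entry. A small check like the following could catch a length mismatch before training starts. It assumes the callback pairs the two lists by index, which the diff does not confirm, and `pair_tracks_with_refs` is a hypothetical helper rather than part of the `mst` package.

```python
def pair_tracks_with_refs(root_dirs, ref_mixes):
    """Pair each multitrack directory with the reference mix at the same index."""
    if len(root_dirs) != len(ref_mixes):
        raise ValueError(
            f"{len(root_dirs)} root_dirs but {len(ref_mixes)} ref_mixes; "
            "expected equal counts"
        )
    return list(zip(root_dirs, ref_mixes))


# Paths shortened to be relative to the validation_set directory for readability.
root_dirs = [
    "song1/Soren_ALittleLate_Full",
    "song1/Soren_ALittleLate_Full",
    "song2/MR0903_Moosmusic_Full",
    "song2/MR0903_Moosmusic_Full",
    "song3/SaturnSyndicate_CatchTheWave_Full",
]
ref_mixes = [
    "song1/ref/Harry Styles - Late Night Talking (Official Video).wav",
    "song1/ref/Poom - Les Voiles (Official Audio).wav",
    "song2/ref/Justin Timberlake - Can't Stop The Feeling! [Lyrics].wav",
    "song2/ref/Taylor Swift - Shake It Off.wav",
    "song3/ref/Miley Cyrus - Wrecking Ball (Lyrics).wav",
]

pairs = pair_tracks_with_refs(root_dirs, ref_mixes)
for tracks, ref in pairs:
    print(f"{tracks} -> {ref}")
```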
46 changes: 0 additions & 46 deletions configs/config_cjs.yaml

This file was deleted.

29 changes: 0 additions & 29 deletions configs/config_param.yaml

This file was deleted.

29 changes: 0 additions & 29 deletions configs/configs_hpc.yaml

This file was deleted.

63 changes: 0 additions & 63 deletions configs/models/naive+fx_encoder_loss.yaml

This file was deleted.
