sai-soum · sai-soum · Apr 1, 2024 · Apr 1, 2024 · Apr 5, 2024 · May 13, 2024
diff --git a/.gitignore b/.gitignore
@@ -7,13 +7,13 @@ data/*.ipynb
 
 __pycache__
 *.egg-info
-dasp-pytorch/
+
 mix_KE_adv/**
 .vscode/
 logs/**
 checkpoints/
 debug
-dasp-pytorch
 *.wav
 *.png
-data/FXencoder_ps.pt
+data/FXencoder_ps.pt
+outputs/**
diff --git a/.gitmodules b/.gitmodules
@@ -0,0 +1,3 @@
+[submodule "dasp-pytorch"]
+	path = dasp-pytorch
+	url = https://github.com/csteinmetz1/dasp-pytorch
diff --git a/Assets/diffmst-main_modified.jpg b/Assets/diffmst-main_modified.jpg
diff --git a/Assets/mst_final.png b/Assets/mst_final.png
diff --git a/Assets/mst_wbg.png b/Assets/mst_wbg.png
diff --git a/README.md b/README.md
@@ -2,95 +2,76 @@
 <div align="center">
 
 # Differentiable Mixing Style Transfer
-[Paper]() | [Website]()
+[Paper](https://sai-soum.github.io/assets/pdf/diffmst.pdf) | [Website](https://sai-soum.github.io/projects/diffmst/)
 
 
-<img src="./Assets/mst_wbg.png">
+<img src="./Assets/diffmst-main_modified.jpg">
 
 </div>
 
-Mixing style transfer using reference mix. 
+<!-- Mixing style transfer using reference mix. 
 There are two mixing console configurations (in `modules.py`)
 1. `BasicMixConsole`: Gain + Pan
 2. `AdvancedMixConsole`: Gain + Pan + Diff EQ + Diff Compressor
 
 Mixes for training can be created using either `naive_random_mix` (assigns random parameter values for mixing console to create a mix) or `knowledge_engineering_mix` (uses knowledge engineering to assign parameter values for mixing console to create a mix). Both of these modules can be found in `mixing.py`
 
-
+ -->
+# Repository Structure
+1. `configs` - Contains configuration files for training and inference.
+2. `mst` - Contains the main codebase for the project.
+    - `dataloaders` - Contains dataloaders for the project.
+    - `modules` - Contains the modules for different components of the system.
+    - `mixing` - Contains the mixing modules for creating mixes.
+    - `loss` - Contains the loss functions for the project.
+    - `panns` - Contains the most basic components, such as cnn14 and resnet.
+    - `utils` - Contains utility functions for the project.
+3. `scripts` - Contains scripts for running inference.
 
 # Usage
-
 Clone the repository and install the `mst` package.
 ```
-git clone https://github.com/sai-soum/mix_style_transfer.git
-cd mix_style_transfer
+git clone --recursive https://github.com/sai-soum/Diff-MST.git
+cd Diff-MST
 python -m venv env
 source env/bin/activate
 pip install -e .
 ```
 
-[dasp-pytorch](https://github.com/csteinmetz1/dasp-pytorch) is required for differentiable audio effects.
-Clone the repo into the top-level of the project directory.
+[dasp-pytorch](https://github.com/csteinmetz1/dasp-pytorch) is required for differentiable audio effects. 
+Install the dependencies for dasp-pytorch.
 ```
-git clone https://github.com/csteinmetz1/dasp-pytorch.git
 cd dasp-pytorch
 pip install -e .
 ```
 
-Since `dasp` is currently under development you need to pull changes periodically. 
-To do so change to the directory and pull.
-```
-cd dasp-pytorch
-git pull
-```
-
-## Inference
-
-```
-CUDA_VISIBLE_DEVICES=5 python scripts/run.py \
-checkpoints/20230719/config.yaml \
-checkpoints/20230719/epoch=132-step=83125.ckpt \
-"/import/c4dm-02/acw639/DiffMST/song 2/Kat Wright_By My Side/" \
-output/ref_mix.wav \
-```
-
 ## Train
-
-First update the paths in the configuration file for both the logger and the dataset root directory.
+We use [LightningCLI](https://lightning.ai/docs/pytorch/stable/) for training and [Wandb](https://wandb.ai/site) for logging.
+First update the paths in the configuration file for the logger, the loss function, and the dataset root directory.
 Then call the `main.py` script passing in the configuration file. 
+
+### Method 1: Training with random mixes of the same song as reference using MRSTFT loss.
 ```
-# new model configuration with audio feature loss
 CUDA_VISIBLE_DEVICES=0 python main.py fit \
--c configs/config_cjs.yaml \
+-c configs/config.yaml \
 -c configs/optimizer.yaml \
--c configs/data/medley+cambridge+jamendo-8.yaml \
--c configs/models/gain+eq+comp-feat.yaml
+-c configs/data/medley+cambridge-8.yaml \
+-c configs/models/naive.yaml
+```
+You can change the number of tracks, the amount of training data per epoch, and the batch size in the data configuration file located at `configs/data/`.
 
-# new model configuration with CLAP loss
+### Method 2: Training with real unpaired songs as reference using AFloss.
+```
 CUDA_VISIBLE_DEVICES=0 python main.py fit \
--c configs/config_cjs.yaml \
+-c configs/config.yaml \
 -c configs/optimizer.yaml \
 -c configs/data/medley+cambridge+jamendo-8.yaml \
--c configs/models/gain+eq+comp-clap.yaml
+-c configs/models/naive+feat.yaml
 ```
 
+## Inference
+To evaluate the model on real-world data, run the `scripts/eval_all_combo.py` script.
 
-# Stability (ignore)
-```
-source env/bin/activate
-cd /scratch
-mkdir medleydb
-cd medleydb
-aws s3 sync s3://stability-aws/MedleyDB ./
-tar -xvf MedleyDB_v1.tar
-tar -xvf MedleyDB_v2.tar
-python main.py fit -c configs/config.yaml -c configs/optimizer.yaml -c configs/data/medleydb_cjs.yaml -c configs/models/naive_dmc_adv.yaml
-CUDA_VISIBLE_DEVICES=7 python main.py fit -c configs/config_cjs.yaml -c configs/optimizer.yaml -c configs/data/medleydb_c4dm.yaml -c configs/models/ke_dmc_adv.yaml
-
-CUDA_VISIBLE_DEVICES=7 python main.py fit -c configs/config.yaml -c configs/optimizer.yaml -c configs/data/medley+cambridge-4.yaml -c configs/models/naive+fx_encoder_loss.yaml
-
-To run the paramloss code
-
-CUDA_VISIBLE_DEVICES=2 python main.py fit -c configs/config.yaml -c configs/optimizer.yaml -c configs/data/medley+cambridge-4.yaml -c configs/models/naive+paramloss.yaml
+Update the model checkpoint paths and the inference examples directory in the script.
 
-```
+`Python 3.10` was used for training. 
diff --git a/configs/config.yaml b/configs/config.yaml
@@ -6,34 +6,34 @@ trainer:
     init_args:
       project: DiffMST
       save_dir: /import/c4dm-datasets-ext/diffmst_logs_soum
-
   enable_checkpointing: true
-
-
   callbacks:
     - class_path: mst.callbacks.audio.LogAudioCallback
     - class_path: pytorch_lightning.callbacks.ModelSummary
       init_args: 
         max_depth: 2
-
     - class_path: mst.callbacks.mix.LogReferenceMix
       init_args:
-        root_dirs: 
-          - /import/c4dm-datasets-ext/diffmst-examples/song1/BenFlowers_Ecstasy_Full/
-          - /import/c4dm-datasets-ext/diffmst-examples/song2/Kat Wright_By My Side/
-          - /import/c4dm-datasets-ext/diffmst-examples/song3/Titanium_HauntedAge_Full/
+        root_dirs:
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song1/Soren_ALittleLate_Full
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song1/Soren_ALittleLate_Full
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song2/MR0903_Moosmusic_Full
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song2/MR0903_Moosmusic_Full
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song3/SaturnSyndicate_CatchTheWave_Full
         ref_mixes: 
-          - /import/c4dm-datasets-ext/diffmst-examples/song1/ref/_Feel it all Around_ by Washed Out (Portlandia Theme).mp3
-          - /import/c4dm-datasets-ext/diffmst-examples/song2/ref/The Dip - Paddle To The Stars (Lyric Video).mp3
-          - /import/c4dm-datasets-ext/diffmst-examples/song3/ref/Architects - _Doomsday_.mp3
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song1/ref/Harry Styles - Late Night Talking (Official Video).wav
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song1/ref/Poom - Les Voiles (Official Audio).wav
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song2/ref/Justin Timberlake - Can't Stop The Feeling! [Lyrics].wav
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song2/ref/Taylor Swift - Shake It Off.wav
+          - /import/c4dm-datasets-ext/diffmst_validation/validation_set/song3/ref/Miley Cyrus - Wrecking Ball (Lyrics).wav
   default_root_dir: null
   gradient_clip_val: 10.0
-  devices: 3
-  detect_anomaly: True
-
+  devices: 1
   check_val_every_n_epoch: 1
-  max_epochs: 10000
-  log_every_n_steps: 200
+
+  max_epochs: 800
+
+  log_every_n_steps: 50
   accelerator: gpu
   strategy: ddp_find_unused_parameters_true
   sync_batchnorm: true
@@ -42,8 +42,5 @@ trainer:
   num_sanity_val_steps: 2
   benchmark: true
   accumulate_grad_batches: 1
-  reload_dataloaders_every_n_epochs: 1
-
+  #reload_dataloaders_every_n_epochs: 1
 
-# - /import/c4dm-datasets-ext/diffmst-examples/song1/BenFlowers_Ecstasy_Full/
-# - /import/c4dm-datasets-ext/diffmst_validation/listening/diffmst-examples_wavref/Feel it all Around by Washed Out (Portlandia Theme).wav
diff --git a/configs/config_cjs.yaml b/configs/config_cjs.yaml
diff --git a/configs/config_param.yaml b/configs/config_param.yaml
diff --git a/configs/configs_hpc.yaml b/configs/configs_hpc.yaml
diff --git a/configs/models/naive+fx_encoder_loss.yaml b/configs/models/naive+fx_encoder_loss.yaml