- MDX Leaderboard A
- Submission ID: 216209
- Submitter: kim_min_seok
- Final rank: 3rd place
- Final scores:
  | SDR_song | SDR_bass | SDR_drums | SDR_other | SDR_vocals |
  |----------|----------|-----------|-----------|------------|
  | 6.513    | 6.707    | 6.712     | 4.816     | 7.816      |
- MDX Leaderboard B
- Submission ID: 216211
- Submitter: kim_min_seok
- Final rank: 1st place
- Final scores:
  | SDR_song | SDR_bass | SDR_drums | SDR_other | SDR_vocals |
  |----------|----------|-----------|-----------|------------|
  | 6.581    | 6.975    | 6.646     | 4.962     | 7.741      |
- Data
- All 203 tracks of the Moises dataset were used for training (no validation split)
- Augmentation
- Random chunking and mixing sources from different tracks ([1])
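A minimal sketch of this augmentation, assuming each training track is stored as a dict of per-source waveforms; the function names and data layout below are illustrative, not the submission's actual code:

```python
import random
import torch

SOURCES = ["bass", "drums", "other", "vocals"]

def random_chunk(wav: torch.Tensor, chunk_len: int) -> torch.Tensor:
    """Cut a random fixed-length chunk from a (channels, time) waveform."""
    start = random.randint(0, wav.shape[-1] - chunk_len)
    return wav[..., start:start + chunk_len]

def remix_example(tracks: list, chunk_len: int):
    """Build one training example by mixing source chunks drawn from independently chosen tracks.

    `tracks` is a list of {source_name: waveform} dicts, one dict per track.
    Returns (mixture, targets) with targets shaped (num_sources, channels, time).
    """
    chunks = [random_chunk(random.choice(tracks)[name], chunk_len) for name in SOURCES]
    targets = torch.stack(chunks)       # sources may come from different songs
    mixture = targets.sum(dim=0)        # the "mixture" is re-created by summing them
    return mixture, targets
```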
- Model
- A 'multi-source' version of TFC-TDF U-Net [2, 3] with some architectural improvements, including the Channel-wise Subband input [4] (sketched below)
- Final submission is an ensemble of 3 models with identical architecture and training procedure but with different random seeds
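As a rough illustration of the Channel-wise Subband input [4] (not the submission's actual implementation), the frequency axis of the input spectrogram can be split into equal subbands that are stacked along the channel axis:

```python
import torch

def to_subbands(spec: torch.Tensor, n_bands: int) -> torch.Tensor:
    """(batch, channels, freq, time) -> (batch, channels * n_bands, freq // n_bands, time)."""
    b, c, f, t = spec.shape
    assert f % n_bands == 0, "frequency bins must divide evenly into subbands"
    return spec.reshape(b, c, n_bands, f // n_bands, t).reshape(b, c * n_bands, f // n_bands, t)

def from_subbands(spec: torch.Tensor, n_bands: int) -> torch.Tensor:
    """Inverse of to_subbands: restore the full frequency axis."""
    b, ck, fk, t = spec.shape
    return spec.reshape(b, ck // n_bands, n_bands, fk, t).reshape(b, ck // n_bands, n_bands * fk, t)
```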
- Noise-robust Training
- Leaderboard A: Loss masking
- Intuitively, data with high training loss is likely to be audio chunks with label noise
- For each training batch, discard (i.e., do not use for the weight update) batch elements whose loss is above some quantile
- e.g., use only half of the training batch for each weight update
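A minimal sketch of this loss masking, assuming a simple per-element L1 loss (the actual loss function and quantile used in the submission may differ):

```python
import torch

def masked_loss(pred: torch.Tensor, target: torch.Tensor, q: float = 0.5) -> torch.Tensor:
    """Discard batch elements whose loss is above the q-quantile before averaging.

    pred/target: (batch, ...) tensors; q=0.5 keeps only the lower-loss half of the batch.
    """
    per_elem = (pred - target).abs().flatten(start_dim=1).mean(dim=1)  # loss per batch element
    threshold = torch.quantile(per_elem.detach(), q)
    keep = per_elem <= threshold        # high-loss elements are assumed to contain label noise
    return per_elem[keep].mean()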
- Leaderboard B: Loss masking (along temporal dimension)
- Compared to label noise, data with bleeding seemed to vary less in the amount of noise
- A more fine-grained masking method performed better (discarding temporal bins with high loss)
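A corresponding sketch of the temporal variant, assuming spectrogram-shaped predictions (again, the exact loss and quantile are illustrative):

```python
import torch

def temporally_masked_loss(pred: torch.Tensor, target: torch.Tensor, q: float = 0.5) -> torch.Tensor:
    """Discard individual temporal bins (rather than whole batch elements) with high loss.

    pred/target: (batch, channels, freq, time) tensors.
    """
    per_bin = (pred - target).abs().mean(dim=(1, 2))   # loss per (batch, time) bin
    threshold = torch.quantile(per_bin.detach(), q)
    keep = per_bin <= threshold                        # drop time bins likely affected by bleeding
    return per_bin[keep].mean()
```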
[1] S. Uhlich et al., "Improving music source separation based on deep neural networks through data augmentation and network blending", ICASSP 2017.
[2] W. Choi et al., "Investigating u-nets with various intermediate blocks for spectrogram-based singing voice separation", ISMIR 2020.
[3] M. Kim et al., "KUIELab-MDX-Net: A two-stream neural network for music demixing", MDX Workshop at ISMIR 2021.
[4] H. Liu et al., "Channel-wise Subband Input for Better Voice and Accompaniment Separation on High Resolution Music", INTERSPEECH 2020.
Download mdx_AB.zip, which contains
- pretrained model checkpoints
- config.yaml files (configurations for training and inference)

- Create a 'ckpts' folder under my_submission. Unzip the downloaded zip file to 'my_submission/ckpts'.
- Copy my_submission and requirements.txt to your SDX 2023 Music Demixing Track Starter Kit.
- Run submit.sh after configuring my_submission/user_config.py
  - for Leaderboard A, set `MySeparationModel = A`
  - for Leaderboard B, set `MySeparationModel = B`
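A hypothetical sketch of what the relevant part of my_submission/user_config.py could look like; `SeparatorA`, `SeparatorB`, and the import path are placeholder assumptions, not the names actually defined in my_submission/src:

```python
# user_config.py (sketch only; import path and class names are assumptions)
from my_submission.src.separator import SeparatorA, SeparatorB  # hypothetical

MySeparationModel = SeparatorA    # Leaderboard A submission
# MySeparationModel = SeparatorB  # switch to this for Leaderboard B
```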
- All code needed to reproduce training is in my_submission/src
- See HOW_TO_TRAIN.md