free-music-demixer

A free static website for client-side music demixing (aka music source separation) with the Demucs v4 hybrid transformer AI model. It runs on CloudFlare Pages.

The paid PRO site contains higher-quality custom ensemble models.

demucs.cpp: transliterated the original PyTorch model Python code to C++ with Eigen3, compiled to WebAssembly with Emscripten. No quantization: the weights of Demucs v4 htdemucs and htdemucs_6s are 81 MB and 53 MB respectively, stored as float16. Anything smaller affects the quality of the network, and compression only gets down to ~70 MB: not worth the extra loading time.

Dev instructions

The wav-file-encoder project has been vendored in; I manually compiled the Typescript file to Javascript with these commands:

npm install typescript
npx tsc --module es6 ../vendor/wav-file-encoder/src/WavFileEncoder.ts

Demucs v4

Fewer memory issues from segmented design (largest track tested is ~7 minutes, 'Georgia Wonder - Siren').

'Georgia Wonder - Siren' (takes ~41 minutes):

vocals          ==> SDR:   7.261  SIR:  13.550  ISR:  13.158  SAR:   6.763
drums           ==> SDR:  10.629  SIR:  17.819  ISR:  17.373  SAR:  10.829
bass            ==> SDR:  10.593  SIR:  19.696  ISR:  12.244  SAR:  10.007
other           ==> SDR:   6.324  SIR:   9.005  ISR:  13.223  SAR:   6.067

'Georgia Wonder - Siren', ~9 minutes with 8 workers:

vocals          ==> SDR:   7.181  SIR:  14.328  ISR:  12.764  SAR:   6.563
drums           ==> SDR:  10.695  SIR:  17.893  ISR:  17.335  SAR:  10.816
bass            ==> SDR:  10.588  SIR:  19.700  ISR:  12.098  SAR:   9.927
other           ==> SDR:   6.238  SIR:   8.757  ISR:  13.886  SAR:   6.144

'Zeno - Signs' (takes ~20 minutes):

vocals          ==> SDR:   8.326  SIR:  18.257  ISR:  15.927  SAR:   8.311
drums           ==> SDR:  10.041  SIR:  18.413  ISR:  17.054  SAR:  10.692
bass            ==> SDR:   3.893  SIR:  12.221  ISR:   7.076  SAR:   3.237
other           ==> SDR:   7.432  SIR:  11.422  ISR:  14.161  SAR:   8.201

'Zeno - Signs', ~5 minutes with 8 workers, 0.75s overlap + weighted sum (same as demucs segmenting):

vocals          ==> SDR:   8.297  SIR:  18.114  ISR:  15.731  SAR:   8.335
drums           ==> SDR:  10.007  SIR:  18.465  ISR:  16.983  SAR:  10.665
bass            ==> SDR:   4.054  SIR:  12.487  ISR:   6.728  SAR:   3.035
other           ==> SDR:   7.349  SIR:  11.118  ISR:  14.236  SAR:   8.159

(best sound to my ears/least segment boundaries, anecdotally)

Name		Name	Last commit message	Last commit date
Latest commit History 105 Commits
.github		.github
js		js
scripts		scripts
test		test
vendor		vendor
web		web
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
README.md		README.md
gulpfile.js		gulpfile.js
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

free-music-demixer

Dev instructions

Demucs v4

About

Releases 7

Packages

Languages

License

sevagh/freemusicdemixer.com

Folders and files

Latest commit

History

Repository files navigation

free-music-demixer

Dev instructions

Demucs v4

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 7

Packages 0

Languages

Packages