| language | pretty_name | tags | license | license_name | license_link |
|---|---|---|---|---|---|
| | MoisesDB | | other | cc-by-nc-sa-4.0 | |
Moises Dataset for Source Separation
- Homepage: MoisesDB homepage
- Repository: MoisesDB repository
- Paper: Moisesdb: A dataset for source separation beyond 4-stems
- Point of Contact: Igor Pereira
MoisesDB is a dataset for source separation. It provides a collection of tracks and their separated stems (vocals, bass, drums, etc.). The dataset is used to evaluate the performance of source separation algorithms.
Please download the dataset from our research website, extract it, and set the MOISESDB_PATH environment variable accordingly.
export MOISESDB_PATH=./moises-db-data
The directory structure should be:
moisesdb:
    moisesdb_v0.1
        track uuid 0
        track uuid 1
        .
        .
        .
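Before creating the dataset object, it can help to confirm that MOISESDB_PATH points at a folder laid out as above. The following is a minimal sketch using only the Python standard library. The helper name `list_track_dirs` is hypothetical and not part of the moisesdb package, and the sketch assumes the track folders sit directly under a version folder such as moisesdb_v0.1.

```python
import os
from pathlib import Path


def list_track_dirs(root=None, version="moisesdb_v0.1"):
    """Return the track folders found under <root>/<version>.

    Hypothetical helper, not part of the moisesdb package. `root`
    defaults to the MOISESDB_PATH environment variable.
    """
    root = root or os.environ.get("MOISESDB_PATH")
    if not root:
        raise RuntimeError("MOISESDB_PATH is not set and no root was given")
    version_dir = Path(root) / version
    if not version_dir.is_dir():
        raise FileNotFoundError(f"expected a folder at {version_dir}")
    # Assumption: each track lives in its own sub-folder named after its UUID.
    return sorted(p for p in version_dir.iterdir() if p.is_dir())
```

Calling `len(list_track_dirs())` after exporting MOISESDB_PATH gives a quick count of the track folders that were extracted.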
To verify the integrity of your downloaded dataset, you can check the following hashes:
MD5: 13cf74eda129c38b914a51ea79fb1778
SHA256: 4cde33ce416ac7c868cffcb60eb31f5c741ab7ae5601cbb9d99ed498b72c48c1
The following hashes are provided for reference only; they are not directly related to the tools in this repo.
SDXDB23_LabelNoise:
MD5: 629cfce51e4c8a36eae9c22aa5b710d3
SHA256: f6d2eac4ee1e21bf8237c0dcef2f3ebb9d04001ff8f999e7528107246eee08e2
SDXDB23_Bleeding:
MD5: be3ffafbdccb46b91507f73c44dabe4a
SHA256: b18a95da6b253bea986cf79990b6f2492d219871fdc17150ce599b45576d457e
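The comparison against the published hashes can also be scripted in a platform-independent way. The sketch below uses Python's standard `hashlib` module; the function names `file_digests` and `matches` are illustrative and not part of this repo.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Compute the MD5 and SHA256 hex digests of a file in one pass."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so a large archive never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def matches(path, expected_md5, expected_sha256):
    """Return True only if both digests equal the expected values."""
    md5, sha256 = file_digests(path)
    return md5 == expected_md5.lower() and sha256 == expected_sha256.lower()
```

For example, `matches("moisesdb.zip", "13cf74eda129c38b914a51ea79fb1778", "4cde33ce416ac7c868cffcb60eb31f5c741ab7ae5601cbb9d99ed498b72c48c1")` should return True for an intact download of the MoisesDB archive.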
You can verify the integrity on Linux/Mac using:
md5sum moisesdb.zip
sha256sum moisesdb.zip
Or on Windows using:
Get-FileHash -Algorithm MD5 moisesdb.zip
Get-FileHash -Algorithm SHA256 moisesdb.zip
You can install this package with
pip install git+https://github.com/moises-ai/moises-db.git
After downloading and configuring the path for the dataset, you can create an instance of MoisesDB to access the tracks. You can also provide the dataset path with the data_path argument.
from moisesdb.dataset import MoisesDB
db = MoisesDB(
    data_path='./moisesdb',
    sample_rate=44100
)
The MoisesDB object behaves like a sequence: it supports len() and indexing, which you can use to access all tracks in the dataset.
n_songs = len(db)
track = db[0] # Returns a MoisesDBTrack object
The MoisesDBTrack object holds information about a track in the dataset and performs on-the-fly mixing of stems and of multiple sources within a stem.
You can access all the stems and the mixture through the stems and audio properties. The stems property returns a dictionary with the available stems as keys and NumPy arrays as values. The audio property returns a NumPy array with the mixture.
track = db[0]
stems = track.stems # stems = {'vocals': ..., 'bass': ..., ...}
mixture = track.audio # mixture = np.ndarray
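Because the stems come back as a plain dictionary, custom submixes are straightforward to build. The sketch below sums every stem except the ones you exclude, for example to obtain an instrumental. `submix` is a hypothetical helper, not part of moisesdb, and it assumes all stem arrays share the same shape so that `+` adds them element-wise.

```python
from functools import reduce
from operator import add


def submix(stems, exclude=("vocals",)):
    """Sum all stems whose name is not in `exclude`.

    Hypothetical helper, not part of moisesdb. Assumes every value in
    `stems` has the same shape, so `+` is element-wise addition.
    """
    kept = [audio for name, audio in stems.items() if name not in exclude]
    if not kept:
        raise ValueError("no stems left after exclusion")
    return reduce(add, kept)
```

With a track loaded as above, `instrumental = submix(track.stems)` would give the mix without vocals.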
The MoisesDBTrack object also contains other non-audio information from the track such as:
- track.id
- track.provider
- track.artist
- track.name
- track.genre
- track.sources
- track.bleedings
- track.activity
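These attributes combine naturally with indexing into the dataset. As a minimal sketch, the hypothetical helper below counts tracks per genre using only `len(db)`, `db[i]`, and `track.genre`; whether indexing a track also loads its audio is not something this sketch verifies, so it may be slow on the full dataset.

```python
from collections import Counter


def genre_counts(db):
    """Count tracks per genre.

    Hypothetical helper, not part of moisesdb. Relies only on
    len(db), db[i] and the track's `genre` attribute.
    """
    return Counter(db[i].genre for i in range(len(db)))
```

`genre_counts(db).most_common()` then lists the genres from most to least frequent.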
The stems and mixture are computed on-the-fly. You can create a stems-only version of the dataset using the save_stems method of the MoisesDBTrack.
track = db[0]
path = './moises-db-stems/0'
track.save_stems(path)
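To build the stems-only version for the whole dataset, the same call can be repeated for every track. The sketch below mirrors the `./moises-db-stems/0` layout by using the track index as the folder name. `export_all_stems` is a hypothetical helper, and whether `save_stems` creates missing folders itself is an assumption not checked here.

```python
import os


def export_all_stems(db, out_root):
    """Call save_stems for every track, one folder per track index.

    Hypothetical helper, not part of moisesdb. Returns the list of
    paths that were passed to save_stems.
    """
    paths = []
    for i in range(len(db)):
        path = os.path.join(out_root, str(i))
        db[i].save_stems(path)
        paths.append(path)
    return paths
```

For example, `export_all_stems(db, './moises-db-stems')` would write track 0 to `./moises-db-stems/0`, track 1 to `./moises-db-stems/1`, and so on.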
We evaluated a few source separation algorithms, as well as oracle methods, on each track of MoisesDB. The results are stored as CSV files in the benchmark folder.
If you use the MoisesDB dataset in your research, please cite the following paper.
@misc{pereira2023moisesdb,
    title={Moisesdb: A dataset for source separation beyond 4-stems},
    author={Igor Pereira and Felipe Araújo and Filip Korzeniowski and Richard Vogl},
    year={2023},
    eprint={2307.15913},
    archivePrefix={arXiv},
    primaryClass={cs.SD}
}
MoisesDB is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
For the complete license terms, please visit: https://creativecommons.org/licenses/by-nc-sa/4.0/
See LICENSE file for details.