This is a Keras code repository accompanying the following paper:
Christof Weiß, Johannes Zeitler, Tim Zunner, Florian Schuberth, and Meinard Müller. Learning Pitch-Class Representations from Score-Audio Pairs of Classical Music.
Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2021
© Johannes Zeitler and Christof Weiß, 2020/21
This repository contains only example code and pre-trained models for most of the paper's experiments, along with a few individual examples. Some of the datasets used in the paper are publicly available, at least in part, for example:
- Schubert Winterreise Dataset (SWD)
- MusicNet
For details and references, please see the paper.
- Data: Example data folder with some examples from the Schubert Winterreise Dataset
- LibFMP: Copy of the AudioLabs LibFMP library (an updated version is available separately)
- Models: Pre-trained CNN models
- PrecomputedResults: Pre-computed evaluation measures, to be loaded for reproducing the figures in the paper
- TrainScripts: Python files for automated model training - just for information, not executable due to missing data!
- 01_preprocess_data_schubert_winterreise.ipynb: Dataset preprocessing, demonstrated on the Schubert Winterreise Dataset
- 02_evaluate_model_parameters.ipynb: Evaluate impact of basic model parameters (first part of Section 4 in the paper)
- 03_evaluate_datasets_and_models.ipynb: Training/testing on different datasets and with different networks (Figures 3 and 4 in the paper)
- 04_demo_estimate_pitchclasses.ipynb: Load an audio file and estimate pitch classes with a pre-trained CNN
- estimatePitchClasses.py: Command line tool to estimate a chromagram with a pre-trained CNN
- customModels: CNN model definitions
- FrameGenerators: TensorFlow generators for feeding data to the train, evaluate, and predict functions
- harmonicCQT: Efficient implementation of the harmonic constant-Q-transform (HCQT)
- utils: Collection of useful functions for preprocessing etc.
- utils_DL: Collection of useful functions for the deep-learning pipeline
- environment.yml: Conda environment file for installing the Python/Keras environment pitchclass_cnn
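The frame generators listed above feed data to the training, evaluation, and prediction functions. As a rough illustration of that idea only, and not the code in FrameGenerators, the following sketch yields batches of context windows centered on a target frame. The function name `frame_batches`, its parameters, and the array shapes are all assumptions made for this example.

```python
import numpy as np


def frame_batches(features, targets, context=5, batch_size=4):
    """Yield (inputs, labels) batches of context windows.

    Hypothetical helper for illustration; shapes are assumptions:
    features: array of shape (num_frames, num_bins)
    targets:  array of shape (num_frames, 12), one pitch-class vector per frame
    context:  odd number of frames per window, centered on the target frame
    """
    if context % 2 == 0:
        raise ValueError("context must be odd")
    half = context // 2
    # Only frames with a full window on both sides are used as centers.
    centers = np.arange(half, features.shape[0] - half)
    for start in range(0, len(centers), batch_size):
        idx = centers[start:start + batch_size]
        # Stack one (context, num_bins) window per center frame.
        x = np.stack([features[c - half:c + half + 1] for c in idx])
        y = targets[idx]
        yield x, y
```

Each yielded pair holds inputs of shape (batch, context, num_bins) and labels of shape (batch, 12). A generator like this keeps memory use low because windows are cut from the feature matrix only when a batch is requested.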
Run estimatePitchClasses.py from a command line with the conda environment activated:
conda activate pitchclass_cnn
python estimatePitchClasses.py -s <audio_file.wav> -t <target_file.npy> -r <output_feature_rate> -n
The -n flag L2-normalizes the feature sequence. To select a specific model, uncomment the desired one in lines 55-72 of estimatePitchClasses.py.
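Once the command above has written the target file, the result can be inspected with NumPy. The sketch below is a minimal example under stated assumptions: it assumes the .npy file holds a 2D array with one axis of length 12 (the pitch classes), which this README does not specify. The helpers `load_chroma` and `l2_normalize_frames` are hypothetical names introduced here; the second shows what frame-wise L2 normalization means in general, not necessarily how the script implements it.

```python
import numpy as np


def load_chroma(path):
    """Load a chromagram and return it as a (12, num_frames) array.

    Assumption: the file stores a 2D array with one axis of length 12.
    """
    chroma = np.load(path)
    if chroma.ndim != 2 or 12 not in chroma.shape:
        raise ValueError(f"unexpected chromagram shape {chroma.shape}")
    if chroma.shape[0] != 12:
        # Put the pitch-class axis first.
        chroma = chroma.T
    return chroma


def l2_normalize_frames(chroma, eps=1e-8):
    """Scale each frame (column) of a (12, num_frames) array to unit L2 norm."""
    norms = np.linalg.norm(chroma, axis=0, keepdims=True)
    # eps guards against division by zero; all-zero frames stay zero.
    return chroma / np.maximum(norms, eps)
```

For example, `l2_normalize_frames(load_chroma("target_file.npy"))` would normalize a result that was saved without the -n flag, where `target_file.npy` stands for whatever path was passed via -t.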