Before running, download the data from https://www.kaggle.com/c/freesound-audio-tagging/data and unzip it into a folder named ./input in the root directory.
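The unzip step can also be scripted. The following is a minimal sketch, assuming the archives were already downloaded by hand into some local folder; the helper name `unzip_all` and its `download_dir` argument are hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def unzip_all(download_dir, dest_dir="./input"):
    """Extract every .zip archive found in download_dir into dest_dir.

    Returns the list of member names that were extracted.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create ./input if it is missing
    extracted = []
    for archive in sorted(Path(download_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
            extracted.extend(zf.namelist())
    return extracted
```

For example, `unzip_all("path/to/downloads")` run from the root directory would place the extracted files under ./input.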
Feature vectors (spectrogram, MFCC) can be generated ahead of time instead of being computed while loading data into the neural networks. This is useful for visualizing the inputs, or simply for speeding up training by avoiding frequency-domain calculations on every run. Run extract_features.py, located in ./src/utils. It will generate a directory where the input images are stored. NOTE: this has not been built into main.py.
Dependencies:
- librosa
- PIL
- scipy
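To illustrate what precomputing features looks like, here is a rough sketch using librosa. It is not the code in extract_features.py: the function names, the output layout, the choice of a log-mel spectrogram, `n_mfcc=40`, and saving to .npy files rather than images are all assumptions made for the example.

```python
from pathlib import Path


def feature_path(wav_path, out_dir, kind):
    """Map e.g. input/audio_train/abc.wav to <out_dir>/<kind>/abc.npy."""
    return Path(out_dir) / kind / (Path(wav_path).stem + ".npy")


def extract_and_save(wav_path, out_dir, n_mfcc=40):
    """Compute a log-mel spectrogram and MFCCs once and store them on disk."""
    # Imported lazily so the path helper above works without these packages.
    import numpy as np
    import librosa

    y, sr = librosa.load(wav_path, sr=None)  # sr=None keeps the native rate
    mel = librosa.feature.melspectrogram(y=y, sr=sr)
    log_mel = librosa.power_to_db(mel, ref=np.max)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

    for kind, array in (("spectrogram", log_mel), ("mfcc", mfcc)):
        target = feature_path(wav_path, out_dir, kind)
        target.parent.mkdir(parents=True, exist_ok=True)
        np.save(target, array)
```

Storing the arrays once means later training runs can read them from disk instead of repeating the frequency-domain calculations.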
Run main.py in ./src. The main function has CUDA support and will attempt to run on the GPU.
Dependencies:
- everything listed above (librosa, PIL, scipy)
- PyTorch
- skimage
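A common way to attempt GPU execution in PyTorch is to check `torch.cuda.is_available()` and fall back to the CPU otherwise. The sketch below shows that pattern; how main.py actually selects its device is not described here, and the helper name `pick_device` is hypothetical.

```python
def pick_device(prefer_gpu=True):
    """Return "cuda" when a CUDA device is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to run on a GPU.
        return "cpu"
    if prefer_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
# Typical usage inside a training script (model and batch are placeholders):
#   model = model.to(device)
#   batch = batch.to(device)
```

The returned string can be passed directly to `.to(...)` on PyTorch modules and tensors.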
Run main_visualize.py in ./src/visualize. CUDA support is not built in yet.
The MATLAB code has been deprecated.