Dataset: ESC-50 (50 classes, 2000 examples).
Preprocessing: MFCCs, chromagram, data augmentation (7 times the initial sample size).
Evaluation metrics: accuracy, estimated memory usage.
Architectures: CNN, RNN-SEQ2D, RNN431, RNN60-small, RNN60-LSTM, RNN60-GRU.
Best-performing models: RNN60-LSTM for accuracy (89.50% at 261.8 MB); RNN60-small for accuracy under a low memory budget (83.86% at 9.8 MB).
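
The summary does not say how the estimated memory usage was computed. As a rough illustration only, one common approximation counts trainable parameters and multiplies by 4 bytes (float32). The sketch below applies that idea to a single standard LSTM or GRU layer followed by a 50-class output layer. The function names and the layer sizes (60 input features, 128 hidden units) are assumptions made for this example, not values from the source, and the result covers weights only, so it will not reproduce the figures reported above.

```python
# Assumption: memory is approximated as (number of parameters) x 4 bytes.
# This is a generic estimate, not the method used to obtain the reported figures.

BYTES_PER_FLOAT32 = 4


def lstm_params(input_dim: int, hidden: int) -> int:
    """Parameters of one standard LSTM layer.

    Four gates, each with an input matrix, a recurrent matrix and a bias.
    """
    return 4 * (hidden * (input_dim + hidden) + hidden)


def gru_params(input_dim: int, hidden: int) -> int:
    """Parameters of one classic GRU layer (three gates, one bias per gate)."""
    return 3 * (hidden * (input_dim + hidden) + hidden)


def dense_params(input_dim: int, units: int) -> int:
    """Parameters of a fully connected layer: weight matrix plus bias."""
    return input_dim * units + units


def estimated_mb(n_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Weight memory in megabytes (1 MB = 10**6 bytes)."""
    return n_params * bytes_per_param / 1e6


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only.
    n_features, n_hidden, n_classes = 60, 128, 50

    lstm_total = lstm_params(n_features, n_hidden) + dense_params(n_hidden, n_classes)
    gru_total = gru_params(n_features, n_hidden) + dense_params(n_hidden, n_classes)

    print(f"LSTM: {lstm_total} params, {estimated_mb(lstm_total):.3f} MB")
    print(f"GRU:  {gru_total} params, {estimated_mb(gru_total):.3f} MB")
```

With the same input and hidden sizes, a GRU layer has three quarters of the parameters of an LSTM layer, which is one reason the two variants can differ in memory footprint. Activations, optimizer state and framework overhead are ignored here and can dominate in practice.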