Basic audio data analysis and music genre classifier using Tensorflow

archity/music-processing

Music Processing

This repository contains work on audio processing and a music genre classifier. The documentation for each of the following sections can be found in the corresponding notebook.

  • 1. Input Data Visualization
  • 2. MFCC Extraction
  • 3. Genre Classifier using GTZAN Dataset
  • 4. Detecting Genre of Songs from a Spotify Playlist



0. Installation

  • From the root of this repository, install the top-level package src:

    pip install -e .
    
  • Install all the libraries in the requirements.txt file:

    pip install -r requirements.txt
    

1. Input Data Visualization

Notebook

Basic analyses of an audio file, such as waveform plotting and spectrum display, were performed to understand the nature of audio data:

  • Waveform plotting
  • Power Spectral Density (PSD) plot
  • Spectrogram
  • Mel Spectrogram
  • MFCC
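
To make the PSD item above more concrete, here is a minimal sketch of a raw periodogram, written in pure Python so it has no dependencies. It is not taken from the notebook, which presumably relies on an audio library; the function name `periodogram` and the synthetic 20 Hz test tone are assumptions made for illustration only.

```python
import cmath
import math


def periodogram(samples, sample_rate):
    """Return (frequencies, power) for the one-sided spectrum of a real signal."""
    n = len(samples)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        # Naive DFT bin k: sum of x[t] * exp(-2*pi*i*k*t/n)
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        freqs.append(k * sample_rate / n)
        power.append(abs(acc) ** 2 / n)  # raw, unscaled power estimate
    return freqs, power


# Hypothetical test signal: a 20 Hz sine sampled at 200 Hz for one second.
sample_rate = 200
signal = [math.sin(2 * math.pi * 20 * t / sample_rate)
          for t in range(sample_rate)]

freqs, power = periodogram(signal, sample_rate)
peak_freq = freqs[power.index(max(power))]
print(peak_freq)
```

The naive DFT is quadratic in the number of samples, so it is only suitable for short illustrative signals; real audio would use an FFT-based routine.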

2. MFCC Extraction

Notebook

A utility that reads all the .wav files, stored in separate folders according to their genre, extracts their MFCCs, and stores the values in JSON format.
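
The folder walk described above could look roughly like the sketch below. It is an illustration, not the notebook's code: the JSON keys (`mapping`, `labels`, `mfcc`), the function names, and the default of 13 coefficients at 22050 Hz are all assumptions. The extractor is passed in as a callable so the directory logic can be tried without any audio library; `librosa_mfcc` shows one possible extractor and needs the third-party librosa package.

```python
import json
from pathlib import Path


def build_mfcc_dataset(dataset_dir, extract_mfcc):
    """Walk genre sub-folders and collect MFCCs plus integer labels."""
    data = {"mapping": [], "labels": [], "mfcc": []}
    genre_dirs = sorted(p for p in Path(dataset_dir).iterdir() if p.is_dir())
    for label, genre_dir in enumerate(genre_dirs):
        data["mapping"].append(genre_dir.name)
        for wav_path in sorted(genre_dir.glob("*.wav")):
            # extract_mfcc must return nested lists so json can serialise them
            data["mfcc"].append(extract_mfcc(wav_path))
            data["labels"].append(label)
    return data


def librosa_mfcc(wav_path, n_mfcc=13):
    """One possible extractor (assumed parameters); requires librosa."""
    import librosa  # imported lazily so the rest of the sketch has no dependency

    signal, sample_rate = librosa.load(str(wav_path), sr=22050)
    mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
    return mfcc.T.tolist()  # shape (frames, n_mfcc)


def save_dataset(data, json_path):
    """Write the collected dataset to a JSON file."""
    with open(json_path, "w") as fp:
        json.dump(data, fp, indent=4)
```

A call such as `save_dataset(build_mfcc_dataset("genres", librosa_mfcc), "data.json")` would then produce one JSON file for the whole dataset, where `genres` is a hypothetical folder name.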


3. Genre Classifier using GTZAN Dataset

Notebook

  • Neural Network
  • Improved Neural Network
  • Convolutional Neural Network (CNN)
  • RNN - LSTM

The following 10 genres were used for training:

0: "blues",
1: "classical",
2: "country",
3: "disco",
4: "hiphop",
5: "jazz",
6: "metal",
7: "pop",
8: "reggae",
9: "rock"

4. Detecting Genre of Songs from a Spotify Playlist

Script

A script-based pipeline broadly involving the following steps:

  1. Download a ~30-second sample of every song in a public Spotify playlist
  2. Convert the songs from .mp3 to .wav
  3. Extract MFCCs from the playlist's tracks
  4. Extract MFCCs from the GTZAN dataset's tracks
  5. Train a Neural Network based model on GTZAN dataset
  6. Test and obtain results of the model on songs from Spotify playlist
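
As an illustration of step 2, the sketch below builds an ffmpeg command line for converting one preview clip. It assumes the ffmpeg CLI is installed and that mono 22050 Hz output is wanted; the repository's script may use a different tool or settings, and the file names are hypothetical.

```python
from pathlib import Path


def ffmpeg_convert_command(mp3_path, wav_dir, sample_rate=22050):
    """Build the ffmpeg argument list that converts one .mp3 into a mono .wav."""
    mp3_path = Path(mp3_path)
    wav_path = Path(wav_dir) / (mp3_path.stem + ".wav")
    return [
        "ffmpeg",
        "-y",                      # overwrite an existing output file
        "-i", str(mp3_path),       # input preview clip
        "-ac", "1",                # downmix to mono
        "-ar", str(sample_rate),   # resample to the target rate
        str(wav_path),
    ]


# Hypothetical input and output locations.
cmd = ffmpeg_convert_command("previews/fast_car.mp3", "wav")
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` to perform the conversion; building it separately keeps the sketch runnable on machines without ffmpeg.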

The genre of a song is quite subjective, and a song can span multiple genres. Instead of classifying a song into a single one of the 10 trained genres, the pipeline reports every genre predicted for the song. Each song is split into a fixed number of segments (10 was chosen), and the network predicts a genre for each segment. The unique genres among those 10 predictions are then reported (in practice usually $\le$ 5).
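
The per-segment aggregation can be sketched as follows. This is an assumption-laden illustration rather than the script's code: the helper names, the equal-length non-overlapping split, and the 22050 Hz sample rate are hypothetical, and the per-segment outputs are faked with one-hot rows instead of real network predictions.

```python
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def segment_bounds(num_samples, num_segments=10):
    """Split a track of num_samples into equal, non-overlapping segments."""
    seg_len = num_samples // num_segments
    return [(i * seg_len, (i + 1) * seg_len) for i in range(num_segments)]


def unique_genres(segment_probs):
    """Take the arg-max genre of each segment and return the distinct indices."""
    picks = [max(range(len(p)), key=p.__getitem__) for p in segment_probs]
    return sorted(set(picks))


def one_hot(index, size=len(GENRES)):
    """Stand-in for a network output that is certain about one genre."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# A 30-second clip at an assumed 22050 Hz sample rate.
bounds = segment_bounds(22050 * 30)

# Made-up predictions for the 10 segments of one song.
fake_outputs = [one_hot(i) for i in (3, 4, 7, 3, 3, 4, 7, 7, 3, 4)]
indices = unique_genres(fake_outputs)
print(indices, [GENRES[i] for i in indices])
```

Reporting the set of segment-level winners, rather than a single majority vote, is what allows one song to carry several genre labels.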

Results of a trained LSTM-based model on some of the songs from the playlist 20th Century:

Song Name            | Predicted Genre
Black or White       | 3, '4', '7'
Danger Zone          | 3, 4, 7, '9'
I Ran (So Far Away)  | 3, 4, '9'
Fast Car             | 2, 4, '7', 8
Take My Breath Away  | 2, '3', '9'
99 Luftballons       | 2, '3', 6, '9'

Indices in quotes indicate genres that are also reported on the respective song's Wikipedia page (including stylistic origins).
