How does the analysis process work?

For every song analyzed, libbliss returns a struct song which contains, among other things, four floats, each rating an aspect of the song:

  • The tempo rating follows this paper until part II. A), in order to obtain a downsampled envelope of the whole song. The song's BPM is then estimated by counting the number of peaks in that envelope and dividing by the length of the song.
    The period of each dominant beat can then be deduced from the frequencies, hinting at the song's tempo. Warning: the tempo is not equal to the force of the song. For example, a heavy metal track can have no steady beat at all, giving a very low tempo score while being very loud.

  • The amplitude rating represents the physical « force » of the song, that is, how much the speaker's membrane will move in order to create the sound.
    It is obtained by finding the right curvature pattern in the distribution of raw amplitudes.

  • The frequency rating is a ratio between high and low frequencies: a song with a lot of high-pitched sounds tends to wake humans up far more easily.
    This rating is obtained by performing a DFT over the sample array, and splitting the resulting array into 5 frequency bands: low, mid-low, mid, mid-high, and high. Using the value in dB for each band, the final formula is freq_result = (high + mid_high + mid) - (low + mid_low)

  • The attack rating is the sum of the intensities of all the attacks divided by the song's length.
    As you may have guessed, a song with a lot of attacks also tends to wake humans up very quickly.
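The peak-counting step of the tempo rating can be illustrated with a short Python sketch. This is not libbliss's actual code: the function names, the `threshold` parameter, and the local-maximum rule are assumptions made for illustration, and the envelope is taken as already computed and downsampled.

```python
def count_peaks(envelope, threshold):
    """Count local maxima of the envelope that rise above `threshold`.

    `threshold` is a hypothetical parameter; the original text does not
    say how peaks are selected.
    """
    peaks = 0
    for i in range(1, len(envelope) - 1):
        is_local_max = envelope[i - 1] < envelope[i] >= envelope[i + 1]
        if envelope[i] > threshold and is_local_max:
            peaks += 1
    return peaks


def estimate_bpm(envelope, envelope_rate_hz, threshold):
    """Estimate beats per minute as (number of peaks) / (duration in minutes)."""
    duration_min = len(envelope) / envelope_rate_hz / 60.0
    return count_peaks(envelope, threshold) / duration_min
```

For instance, a synthetic one-minute envelope sampled at 10 Hz with one spike every half second gives an estimate of 120 BPM with this sketch.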
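The frequency rating formula can likewise be sketched in Python. Several details here are assumptions, since the text does not specify them: the five bands are taken as equal-width slices of the spectrum, each band's dB value is computed from the mean magnitude of its bins, and a naive DFT stands in for whatever transform libbliss actually uses. All function names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes of the first n // 2 bins."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags


def band_levels_db(mags, n_bands=5):
    """Split the spectrum into equal-width bands (an assumption) and
    return each band's mean magnitude in dB."""
    if len(mags) < n_bands:
        raise ValueError("need at least one bin per band")
    size = len(mags) // n_bands
    levels = []
    for b in range(n_bands):
        start = b * size
        # The last band absorbs any leftover bins.
        end = len(mags) if b == n_bands - 1 else start + size
        chunk = mags[start:end]
        mean = sum(chunk) / len(chunk)
        # Small epsilon avoids log10(0) for silent bands.
        levels.append(20.0 * math.log10(mean + 1e-12))
    return levels


def frequency_rating(samples):
    """freq_result = (high + mid_high + mid) - (low + mid_low)."""
    low, mid_low, mid, mid_high, high = band_levels_db(dft_magnitudes(samples))
    return (high + mid_high + mid) - (low + mid_low)
```

With these assumptions, a pure high-pitched tone scores higher than a pure low-pitched one, which matches the intent described above.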

These ratings are supposed to be as disjoint as possible, to avoid any redundant feature. However, there still seems to be some correlation between the amplitude and attack ratings, as can be seen in this 2D plot for ~4000 songs:
Figure: scatter plot of every feature against each other.
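One way to quantify the correlation between two ratings, rather than reading it off a plot, is the Pearson correlation coefficient. The sketch below is a generic implementation, not part of libbliss, and the input lists stand for rating values collected from analyzed songs (hypothetical data that you would supply yourself).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

A value close to 1 or -1 for the amplitude and attack ratings would indicate that the two features are largely redundant, while a value close to 0 would indicate that they are nearly disjoint.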