When evaluating a folder containing `bass.wav`, `drums.wav`, `other.wav`, `voice.wav` and `accompaniment.wav`, the `voice` target is evaluated twice: once against the set of sources `[drums, bass, other, voice]` and once against `[voice, accompaniment]`.
The second evaluation overwrites the first one for the `voice` target in the json files.
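The overwrite described above can be pictured with a toy sketch. This is not museval code: `evaluate_folder` is a hypothetical stand-in that only records which set of sources each target was evaluated against, so no metric values are involved.

```python
def evaluate_folder(available):
    """Toy model of the two-pass evaluation (NOT museval's implementation).

    `available` is the set of stem names found in the estimates folder.
    Each result only records the source set the target was scored against.
    """
    results = {}

    # Pass 1: every individual stem is scored against all individual stems.
    stems = tuple(s for s in ("drums", "bass", "other", "voice") if s in available)
    for target in stems:
        results[target] = {"evaluated_against": stems}

    # Pass 2: runs only when an accompaniment estimate was exported.
    if "voice" in available and "accompaniment" in available:
        pair = ("voice", "accompaniment")
        for target in pair:
            # The "voice" entry from pass 1 is silently replaced here.
            results[target] = {"evaluated_against": pair}

    return results


with_acc = evaluate_folder({"bass", "drums", "other", "voice", "accompaniment"})
without_acc = evaluate_folder({"bass", "drums", "other", "voice"})
print(with_acc["voice"]["evaluated_against"])
print(without_acc["voice"]["evaluated_against"])
```

In the first call only `voice` ends up tied to the two-source set; `drums`, `bass` and `other` keep their four-source results.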
One would expect the SDR of a target not to depend on the other sources. However, this is not the case, as the filters are computed using the cross-correlations between all available sources (4 in the first case, 2 in the second). I observed that the second evaluation consistently yields a higher SDR for `voice`, by around 0.2 dB in my case.
This can lead to unfair comparisons between models, or to results that are hard to reproduce, depending on whether or not one exports the `accompaniment.wav` file. For instance, in the SiSEC evaluation campaign, the json for the MMDenseLSTM model (TAK2) contains an `accompaniment` entry, showing that its vocal metrics were overwritten.
On the other hand, the OpenUnmix model did not export this file and thus obtains a worse SDR for vocals.
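A quick way to check whether a given result file is affected is to look for an `accompaniment` target in it. The sketch below assumes the json has a top-level `targets` list whose items carry a `name` field; that layout is an assumption and should be checked against the actual files. `has_accompaniment_entry` is a hypothetical helper, and the inline json string is a stripped-down stand-in, not real results.

```python
import json


def target_names(result):
    """Return the target names in a parsed result dict.

    Assumes a layout like {"targets": [{"name": ..., "frames": [...]}, ...]}.
    """
    return [target["name"] for target in result.get("targets", [])]


def has_accompaniment_entry(result):
    """True if the two-source (voice, accompaniment) pass was run."""
    return "accompaniment" in target_names(result)


# Stand-in for json.load(open(path)); contains no real metrics.
example = json.loads(
    '{"targets": [{"name": "voice", "frames": []},'
    ' {"name": "accompaniment", "frames": []}]}'
)
print(has_accompaniment_entry(example))
```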
While the difference is not huge, I opened this issue to verify that it is normal that the SDR depends on the other sources and not just on the current source estimate, and also to see if this behavior should be documented.
As an example, one can use the wav files available in this Dropbox folder. I also included the json files. They were generated as
where the folder `without_accompaniment` did not contain an `accompaniment.wav` file and `with_accompaniment` contained one equal to the sum of `bass`, `other` and `drums`.
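For reference, an accompaniment equal to the sum of the three other stems can be built as in the sketch below. It uses random arrays as stand-ins for the decoded audio (shape `(n_samples, n_channels)`); `make_accompaniment` is a hypothetical helper, and reading or writing the actual wav files is left out.

```python
import numpy as np


def make_accompaniment(bass, drums, other):
    """Sum the non-vocal stems sample by sample.

    All inputs must share the same shape (n_samples, n_channels).
    """
    if not (bass.shape == drums.shape == other.shape):
        raise ValueError("stems must have identical shapes")
    return bass + drums + other


# Random stand-ins for the decoded stems (1 s of stereo audio at 44.1 kHz).
rng = np.random.default_rng(0)
shape = (44100, 2)
bass = 0.1 * rng.standard_normal(shape)
drums = 0.1 * rng.standard_normal(shape)
other = 0.1 * rng.standard_normal(shape)

accompaniment = make_accompaniment(bass, drums, other)
print(accompaniment.shape)
```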
When you call museval from the CLI, you are running the `eval_dir` function. What you want instead is to use the SiSEC MUS task-like functions `eval_mus_track` or `eval_mus_dir`, which contain specific treatment of the accompaniment.
@faroit this specific treatment is exactly the source of the confusion. It silently overwrites the vocals metrics from the 4-source scenario with the slightly better metrics from the 2-source scenario, leaving the other three targets untouched.
As far as I can see, museval from the CLI does not call `eval_dir`. The entry point in `setup.py` points to the `museval` function, which calls `eval_mus_dir`, which in turn calls `_load_track_estimates` and finally `eval_mus_track`, which has the behavior described above.