EBU R128 uses ITU-R BS.1770 as the underlying loudness measurement algorithm, so pyloudnorm should produce very similar results to ffmpeg.
Did you have a specific use case in mind? Currently pyloudnorm only measures integrated loudness, but EBU R128 also includes short-term and momentary loudness. Was that what you were referring to?
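For reference, here is a minimal sketch of measuring integrated loudness and normalizing to a target level with the existing pyloudnorm API (the file names are placeholders, and -23 LUFS is simply the EBU R128 reference target):

```python
import soundfile as sf
import pyloudnorm as pyln

# Load an utterance (placeholder path)
data, rate = sf.read("speech.wav")

# Measure integrated loudness per ITU-R BS.1770 (the basis of EBU R128)
meter = pyln.Meter(rate)
loudness = meter.integrated_loudness(data)  # in LUFS

# Apply gain so the signal hits the EBU R128 reference level of -23 LUFS
normalized = pyln.normalize.loudness(data, loudness, -23.0)
sf.write("speech_normalized.wav", normalized, rate)
```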
My use case is comparing relatively short Text-to-Speech (TTS) or Voice Conversion (VC) samples, converted from a source speaker and recording condition, against the clean target speaker's voice.
The samples are typically 2-14 s long, with lengths roughly normally distributed.
I noticed that RMS normalization is sensitive to background noise, e.g. for VC from noisy source conditions to a clean target condition.
Since I want to compare noisy and clean utterances side by side, I want them normalized to the same perceived loudness.
In general, I think peak normalization works best for this.
I asked about EBU R128 normalization because some other studies used it, and it also constrains the peak level.
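Roughly what I have in mind, as a sketch built only on pyloudnorm's current API (the file names and the -23 LUFS target are my own assumptions): bring each noisy/clean pair to the same integrated loudness before side-by-side listening.

```python
import soundfile as sf
import pyloudnorm as pyln

TARGET_LUFS = -23.0  # any common target works, as long as both utterances share it

def loudness_normalize(path, target=TARGET_LUFS):
    """Normalize one utterance to the target integrated loudness (BS.1770)."""
    data, rate = sf.read(path)
    meter = pyln.Meter(rate)
    loudness = meter.integrated_loudness(data)
    return pyln.normalize.loudness(data, loudness, target), rate

# Placeholder file names for a VC output from a noisy condition and a clean target recording
noisy_vc, sr1 = loudness_normalize("vc_from_noisy_condition.wav")
clean_ref, sr2 = loudness_normalize("clean_target_reference.wav")

# For comparison, plain peak normalization is also available:
# peaked = pyln.normalize.peak(data, -1.0)  # scale so the sample peak sits at -1 dBFS

sf.write("vc_from_noisy_condition_norm.wav", noisy_vc, sr1)
sf.write("clean_target_reference_norm.wav", clean_ref, sr2)
```

One caveat: for very short or quiet utterances, gaining up to a fixed LUFS target can push samples past full scale, so it is probably worth checking the peak afterwards.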
Hi!
Have you considered adding EBU R128 normalization?
E.g. something similar to the implementation below, which, however, needs ffmpeg as a dependency:
https://github.com/slhck/ffmpeg-normalize#ebu-r128-normalization