PEBKAC
Thanks, @Purfview, for that helpful and insightful comment. Would you be willing to enlighten me about what I'm doing wrong that makes 2 or 3 models fail at some tasks when identical code works for the other 15 (for English) or 9 (for non-English languages) models I tested? If I'm the problem, as you suggest, I'm happy to learn how to make things work better.
Introducing FWEval
I have written a small tool that I use to systematically evaluate processing speed and accuracy of different models within faster_whisper. You can check it out in my FWEval repository. I hope others will find it interesting and useful.
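For anyone curious about the methodology, the kind of loop it runs is roughly sketched below. This is a simplification: the model list, file names, and the jiwer scorer here are placeholders of my own choosing, not necessarily the tool's exact choices.

```python
# Simplified sketch of a speed/accuracy benchmark loop for faster_whisper.
# MODELS, AUDIO, REFERENCE, and the jiwer dependency are placeholders.
import time

from faster_whisper import WhisperModel
from jiwer import wer  # word error rate; pip install jiwer

MODELS = ["tiny", "base.en", "small", "distil-large-v2"]
AUDIO = "sample.wav"
REFERENCE = open("sample_reference.txt", encoding="utf-8").read()

for name in MODELS:
    model = WhisperModel(name, device="cpu", compute_type="int8")
    start = time.perf_counter()
    segments, _info = model.transcribe(AUDIO, beam_size=5)
    # transcribe() is lazy: decoding runs as the generator is consumed,
    # so the join has to happen inside the timed region.
    text = " ".join(seg.text.strip() for seg in segments)
    elapsed = time.perf_counter() - start
    print(f"{name:>16}: {elapsed:6.1f}s  WER={wer(REFERENCE, text):.3f}")
```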
Issues I have discovered
Overall, I've found Faster Whisper to be fantastic. I'm particularly pleased that, for each file I've tested, the transcripts within each model are identical regardless of what hardware and software (computer, OS, and device) I use. They differ from model to model, of course, but each model behaves consistently across very different environments.
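If you want to verify that consistency yourself, one quick way is to hash the transcript and compare digests across machines; identical digests mean byte-identical output. A minimal sketch ("sample.wav" is a placeholder):

```python
# Hash a model's transcript so digests can be compared across machines.
import hashlib

from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")
segments, _info = model.transcribe("sample.wav", beam_size=5)

# Joining the segments consumes the generator and runs the decode.
text = " ".join(seg.text.strip() for seg in segments)
print(hashlib.sha256(text.encode("utf-8")).hexdigest())
```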
However, I have found a few issues. Among them:

1. For the most part, the English-specific versions of the Faster Whisper models do not appear to be measurably better than the multilingual versions. Am I missing something here?
2. The Distil-Large-v2, Distil-Medium.en, and Distil-Small.en models are wildly inaccurate under most circumstances I've tested, even with high-quality English audio. The Tiny model likewise needs very high-quality audio to be acceptably accurate. I consistently find these models unreliable. The Large and Large-v3 models also appear to be more sensitive to audio quality than the other models. Are others finding this as well?
3. The Distil-Large-v2 and Distil-Large-v3 models only work for English data, despite claiming to be multilingual: they produce English transcripts even when given non-English data and I expect transcripts in the language of the source file.
4. And (based on using a different program) when you explicitly request an English translation of a non-English file, the Large-v3-Turbo and Turbo models do not perform a translation; instead they return a native-language transcript that differs from the native-language transcript you get when you don't request a translation. (See the sketch after this list for how such a request looks.)
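For reference, here is roughly how items 3 and 4 look when invoked through faster_whisper directly. The Turbo observation above came from a different front end, but it should correspond to a call like this; "interview_fr.wav" and the French language code are placeholders:

```python
# Explicitly requesting an English translation of non-English audio.
from faster_whisper import WhisperModel

model = WhisperModel("large-v3-turbo", device="cpu", compute_type="int8")

# task="translate" should produce English output from non-English audio.
# What I observe with the Turbo models is a French transcript instead,
# and one that differs from the task="transcribe" transcript of the file.
segments, info = model.transcribe("interview_fr.wav", task="translate",
                                  language="fr", beam_size=5)
print(info.language, info.language_probability)
for seg in segments:
    print(seg.text)
```

The Distil issue in item 3 is the same call pattern with `task="transcribe"` and an explicit `language=`, which still comes back in English.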
If anyone knows ways to resolve any of these issues, I'd love to hear about them. When it's just a few models out of the larger set of options that misbehave, I'm somewhat less inclined to assume it's my fault, but I'm certainly open to that possibility.