Use just released Whisper 3 Turbo for 5 times faster inference #88

Open
menelic opened this issue Oct 11, 2024 · 3 comments

Comments

@menelic

menelic commented Oct 11, 2024

OpenAI just released Whisper 3 Turbo: https://huggingface.co/deepdml/whisper-large-v3-turbo

According to several evaluations, it offers up to 5x faster inference, with only a slightly higher error rate in some cases: https://huggingface.co/openai/whisper-large-v3/discussions/160

Please integrate it into noScribe. Faster transcription would really expand and deepen the use cases for this great software.

@kaixxx
Owner

kaixxx commented Oct 11, 2024

Nice catch, I wasn't aware of the new model. It looks very promising. It may hallucinate slightly more on noisy audio, a problem already known from the regular large-v3 (which is why noScribe still uses large-v2). But the speed improvement might still justify a switch.
I have to update some libraries and test everything, so please be patient. I will definitely try it out and will update this issue once I have results.
Thanks!

@wwwebweber

For Swiss German, large-v2 remains the best. v3 and v3-turbo have forgotten Swiss German.

@kaixxx
Owner

kaixxx commented Oct 18, 2024

> For Swiss German, large-v2 remains the best. v3 and v3-turbo have forgotten Swiss German.

Ah, interesting observation. In general, there has been a lot of controversy about whether v3 is really an improvement or has made things worse. Benchmarks are not everything.
I think we should implement a way for users to download and install custom models into noScribe. This would also allow the use of models fine-tuned for specific languages, etc.
For instance, there is a model that transcribes Swiss German dialect verbatim rather than 'translating' it into German: https://huggingface.co/ss0ffii/whisper-small-german-swiss Unfortunately, it is based on the small Whisper model, which gives only mediocre quality. But it's interesting nonetheless.
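
To make the custom-model idea a bit more concrete, here is a minimal sketch of how user-installed models could be discovered. It is not noScribe's actual code. The function name `find_custom_models`, the idea of one subfolder per model, and the required file names (`model.bin` and `config.json`, as found in CTranslate2-converted Whisper models) are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: a usable model is a folder containing CTranslate2-style files.
# The exact set of required files is hypothetical and may need adjusting.
REQUIRED_FILES = ("model.bin", "config.json")


def find_custom_models(models_dir: Path) -> dict[str, Path]:
    """Map model name -> folder for each subfolder that looks like a model."""
    if not models_dir.is_dir():
        # No custom model folder yet: nothing to offer, but not an error.
        return {}
    found = {}
    for entry in sorted(models_dir.iterdir()):
        # Skip stray files and folders with incomplete downloads.
        if entry.is_dir() and all((entry / name).is_file() for name in REQUIRED_FILES):
            found[entry.name] = entry
    return found
```

The returned folder names could populate a model drop-down, and the selected path could then be handed to the transcription backend. If that backend is faster-whisper, `WhisperModel` accepts a local model directory in place of a model size name.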
