Nice catch, I wasn't aware of the new model. It looks very promising. It may introduce slightly more hallucinations on noisy audio, a problem already known from the regular large-v3 (which is why noScribe still uses large-v2). But the speed improvements might still justify a switch. I will test it.
I have to update some libraries and test everything, so please be patient. But I will try it out for sure. I'll update this issue if I have any results.
Thanks!
For Swiss German, large-v2 remains the best. v3 and v3-turbo have forgotten Swiss German.
Ah, interesting observation. Generally, there has been a lot of controversy over whether v3 is really an improvement or made things worse. Benchmarks are not everything.
I think we should implement a way for users to download and install custom models into noScribe. This would also allow using fine-tuned models for certain languages etc.
For instance, there is a model that transcribes Swiss German dialect verbatim instead of 'translating' it into Standard German: https://huggingface.co/ss0ffii/whisper-small-german-swiss Unfortunately, it's based on the small Whisper model, which gives only mediocre quality. But it's interesting nonetheless.
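A rough sketch of how custom model discovery could work, not how noScribe actually does it: scan a user folder and treat every subfolder that contains the expected files as an installable model. The function name `find_custom_models`, the folder layout, and the `REQUIRED_FILES` list are all assumptions for illustration (the file names assume models converted to the CTranslate2 format).

```python
from pathlib import Path

# Assumption: a converted model folder contains at least these two files.
REQUIRED_FILES = ("model.bin", "config.json")


def find_custom_models(models_dir):
    """Return {model_name: folder_path} for each subfolder that looks like a model.

    Hypothetical helper: the folder name is used as the model's display name.
    """
    root = Path(models_dir)
    if not root.is_dir():
        # No custom model folder yet, so nothing to offer.
        return {}

    found = {}
    for entry in sorted(root.iterdir()):
        # Skip stray files and folders that are missing a required file.
        if entry.is_dir() and all((entry / name).is_file() for name in REQUIRED_FILES):
            found[entry.name] = entry
    return found
```

The resulting names could then be listed in the model selection next to the built-in models, and the chosen folder path handed to whatever code loads the model.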
OpenAI just released Whisper large-v3-turbo: https://huggingface.co/deepdml/whisper-large-v3-turbo
According to several evaluations, it offers up to 5x faster inference with only a minimally higher error rate in some cases: https://huggingface.co/openai/whisper-large-v3/discussions/160
Please integrate it into noScribe. Faster transcription would really expand and deepen the use cases for this great software.