train new languages #105
Comments
Hello! Not directly from the tool, but it could be possible, although it requires some coding skills and a good dataset (audio + transcription pairs). What language are you trying to add? Cheers
Cape Verdean Creole (kea). Unfortunately, there is no dataset at all, as it is mainly a spoken language. It would be possible, however, to find a lot of audio recordings. I wonder whether AI can achieve that at all: learning to transcribe audio without having transcription examples? Probably not...
Languages share a lot of similarities with each other, and I wouldn't be surprised if AI were able to do that soon, tbh...
I'm not sure how large the dataset needs to be, but you definitely need audio+transcription pairs, so someone has to prepare the transcriptions manually for training. It also depends on how similar this Creole language is to others that the Whisper model might already recognize. Have you tried not selecting any language and simply running the large-v2 model on the audio you have to transcribe and translate? There's also a huge library of models on HuggingFace: https://huggingface.co/models?library=whisper If adding custom models to the tool would help, we could find a way to add it to our backlog...
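To make "audio+transcription pairs" concrete, here is a minimal sketch of how such a dataset could be collected into a manifest file. Everything in it is an assumption for illustration, not something this tool or Whisper requires: the folder layout (each audio file sitting next to a `.txt` transcript with the same name), the helper names `collect_pairs` and `write_manifest`, and the JSON Lines output format. The `kea` tag is the language code mentioned above.

```python
import json
from pathlib import Path

# Assumed set of audio extensions; adjust to whatever recordings you have.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a"}


def collect_pairs(data_dir, language="kea"):
    """Pair each audio file with a same-named .txt transcript.

    Returns (pairs, missing): `pairs` is a list of dicts ready for a
    manifest, `missing` lists audio files with no usable transcript.
    """
    pairs, missing = [], []
    for audio in sorted(Path(data_dir).iterdir()):
        if audio.suffix.lower() not in AUDIO_EXTENSIONS:
            continue  # skip transcripts, notes, and other non-audio files
        transcript = audio.with_suffix(".txt")
        text = ""
        if transcript.is_file():
            text = transcript.read_text(encoding="utf-8").strip()
        if text:
            pairs.append({"audio": audio.name, "text": text, "language": language})
        else:
            missing.append(audio.name)  # still needs a manual transcription
    return pairs, missing


def write_manifest(pairs, manifest_path):
    """Write one JSON object per line (hypothetical manifest format)."""
    with open(manifest_path, "w", encoding="utf-8") as f:
        for pair in pairs:
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```

Calling `collect_pairs("recordings/")` on a hypothetical `recordings/` folder would also return the list of clips that still lack a transcript, which gives a rough idea of how much manual transcription work remains.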
There is definitely no transcription data set available, probably because of the lack of written accounts/transcriptions. I checked a couple of sites.
Cape Verdean Creole is based on Portuguese. If I don't select a language, Portuguese gets selected, but the transcriptions are obviously quite erroneous.
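One way to put a number on "quite erroneous" would be to hand-transcribe a few short clips and compare them against the Portuguese-mode output using word error rate (WER). The sketch below is a plain word-level edit distance, assuming whitespace tokenization and case-insensitive matching; the function name `word_error_rate` is made up for this example and is not part of the tool or of Whisper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    `reference` is the hand-made transcript, `hypothesis` the model output.
    Tokenization is a naive whitespace split (an assumption of this sketch).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = cur[j - 1] + 1   # extra word in output
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

A value of 0.0 means the output matches the reference word for word; values can exceed 1.0 when the output contains many extra words. Tracking this on the same handful of clips would show whether any later fine-tuning actually helps.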
I think it would be worth trying to add a new language that could be identified by the model, so that the model would somehow (how?) be able to learn from the feedback of users, who could correct the transcript. It would be similar to how OCR can learn from user feedback to better recognize individual images/texts. But this of course goes far beyond your current approach.
Is there any possibility a user could train a new language?