The current interface uses the browser's built-in SpeechRecognition object through the react-speech-recognition library. While this is functional, it is not as accurate as Whisper, and it often misses the first few words of an utterance.
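For reference, the current approach looks roughly like the sketch below. It uses the documented react-speech-recognition hook, but the component name and button wiring are illustrative, not the actual code in this repo:

```tsx
import React from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';

// Minimal sketch of the current browser-based approach.
// It relies on the browser's built-in SpeechRecognition under the hood,
// so accuracy and start-of-utterance capture depend on the browser.
const VoiceInput = () => {
  const { transcript, listening, browserSupportsSpeechRecognition } = useSpeechRecognition();

  if (!browserSupportsSpeechRecognition) {
    return <span>This browser does not support speech recognition.</span>;
  }

  return (
    <div>
      <button onClick={() => SpeechRecognition.startListening({ continuous: true })}>
        {listening ? 'Listening…' : 'Start'}
      </button>
      <p>{transcript}</p>
    </div>
  );
};

export default VoiceInput;
```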
The ideal solution would be a client-side one to reduce the load on the server once this service is publicly accessible. However, I would also prefer not to send voice recordings to OpenAI, so a locally hosted instance of Whisper on the Flask server might have to be the approach. I attempted this by transmitting audio from the client via websockets, but it didn't quite work out: the whisper library is not really designed for real-time transcription, but rather for uploaded files that are transcribed over time.
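For context, the client half of that websocket attempt amounted to streaming MediaRecorder chunks to the server, roughly like this sketch. The `ws://.../transcribe` endpoint and the one-second chunk interval are assumptions for illustration, not the actual implementation:

```ts
// Sketch: stream microphone audio to the Flask server over a websocket.
// The endpoint URL and 1-second chunking are hypothetical.
async function streamAudioToServer() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
  const socket = new WebSocket('ws://localhost:5000/transcribe'); // hypothetical endpoint

  recorder.ondataavailable = (event) => {
    if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
      socket.send(event.data); // forward each audio chunk as it is recorded
    }
  };

  socket.onmessage = (event) => {
    console.log('partial transcript:', event.data); // the server would need to push these back
  };

  socket.onopen = () => recorder.start(1000); // emit a chunk roughly every second
}
```

The server side is where this falls apart, since Whisper's transcription API expects a complete recording rather than a live stream of chunks.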
The other alternative is to use something like use-whisper and send the voice recordings to OpenAI to transcribe. This would reduce the server load and also make the transcription more reliable. Offering a toggle between the two options for privacy reasons might be the go-to solution in the future. A rough sketch of the use-whisper route is below.
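This sketch assumes the useWhisper hook from the @chengsokdara/use-whisper package as described in its README; the API key wiring shown is the simplest possible and is not how it would be handled in practice:

```tsx
import React from 'react';
import { useWhisper } from '@chengsokdara/use-whisper';

// Sketch of the OpenAI-backed alternative. Exposing the API key to the
// client like this is only for illustration; a server-side proxy would be needed.
const WhisperInput = () => {
  const { recording, transcribing, transcript, startRecording, stopRecording } = useWhisper({
    apiKey: process.env.REACT_APP_OPENAI_API_KEY, // assumption: key injected at build time
  });

  return (
    <div>
      <button onClick={() => (recording ? stopRecording() : startRecording())}>
        {recording ? 'Stop' : 'Record'}
      </button>
      {transcribing ? <p>Transcribing…</p> : <p>{transcript.text}</p>}
    </div>
  );
};

export default WhisperInput;
```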