The current interface uses the browser's built-in SpeechRecognition object through the react-speech-recognition library. While this is functional, it is not as accurate as Whisper, and it often misses the first few words of an utterance.
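For reference, the current approach looks roughly like the sketch below. It uses the documented react-speech-recognition hook, but the component name and button wiring are illustrative, not the actual code in this repo:

```tsx
import React from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';

// Minimal sketch of the current browser-based approach.
// It relies on the browser's built-in SpeechRecognition under the hood,
// so accuracy and start-of-utterance capture depend on the browser.
const VoiceInput = () => {
  const { transcript, listening, browserSupportsSpeechRecognition } = useSpeechRecognition();

  if (!browserSupportsSpeechRecognition) {
    return <span>This browser does not support speech recognition.</span>;
  }

  return (
    <div>
      <button onClick={() => SpeechRecognition.startListening({ continuous: true })}>
        {listening ? 'Listening…' : 'Start'}
      </button>
      <p>{transcript}</p>
    </div>
  );
};

export default VoiceInput;
```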
The ideal solution would be a client-side one to reduce the load on the server once this service is publicly accessible. However, I would also prefer not to send voice recordings to OpenAI, so a locally hosted instance of Whisper on the Flask server might have to be the approach. I attempted this by transmitting audio from the client via websockets, but it didn't quite work out: the whisper library is not really designed for real-time transcription, but rather for uploaded files that are transcribed over time.
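For context, the client half of that websocket attempt amounted to streaming MediaRecorder chunks to the server, roughly like this sketch. The `ws://.../transcribe` endpoint and the one-second chunk interval are assumptions for illustration, not the actual implementation:

```ts
// Sketch: stream microphone audio to the Flask server over a websocket.
// The endpoint URL and 1-second chunking are hypothetical.
async function streamAudioToServer() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
  const socket = new WebSocket('ws://localhost:5000/transcribe'); // hypothetical endpoint

  recorder.ondataavailable = (event) => {
    if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
      socket.send(event.data); // forward each audio chunk as it is recorded
    }
  };

  socket.onmessage = (event) => {
    console.log('partial transcript:', event.data); // the server would need to push these back
  };

  socket.onopen = () => recorder.start(1000); // emit a chunk roughly every second
}
```

The server side is where this falls apart, since Whisper's transcription API expects a complete recording rather than a live stream of chunks.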
The other alternative is to use something like use-whisper and send the voice recordings to OpenAI to transcribe. This would reduce the server load and also make the transcription more reliable. Offering a toggle between the two options for privacy reasons might be the go-to solution in the future. A rough sketch of the use-whisper route is below.
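This sketch assumes the useWhisper hook from the @chengsokdara/use-whisper package as described in its README; the API key wiring shown is the simplest possible and is not how it would be handled in practice:

```tsx
import React from 'react';
import { useWhisper } from '@chengsokdara/use-whisper';

// Sketch of the OpenAI-backed alternative. Exposing the API key to the
// client like this is only for illustration; a server-side proxy would be needed.
const WhisperInput = () => {
  const { recording, transcribing, transcript, startRecording, stopRecording } = useWhisper({
    apiKey: process.env.REACT_APP_OPENAI_API_KEY, // assumption: key injected at build time
  });

  return (
    <div>
      <button onClick={() => (recording ? stopRecording() : startRecording())}>
        {recording ? 'Stop' : 'Record'}
      </button>
      {transcribing ? <p>Transcribing…</p> : <p>{transcript.text}</p>}
    </div>
  );
};

export default WhisperInput;
```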