This project is a complete speech recognition system that lets a user transcribe and summarize podcasts and meetings, and turn a recorded lecture into notes.
The system is built on Vosk, a speech recognition toolkit available as a Python library (vosk). Its main features:
- Supports 20+ languages and dialects.
- Works offline, even on lightweight devices.
- Installs with a simple pip3 install vosk.
- Portable per-language models are only 50 MB each, and much larger server models are also available.
- Allows quick reconfiguration of the vocabulary for best accuracy.
- Supports speaker identification in addition to plain speech recognition.
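
To make the offline workflow concrete, here is a minimal sketch of transcribing a WAV file with the vosk package. It is an illustration, not this project's actual code: the function names transcribe_wav and join_segments, the phrases parameter, and the file paths in the usage line are hypothetical. It assumes a Vosk model has already been downloaded and unpacked to model_path, and that the audio is 16-bit mono PCM. The optional phrase list shows one way to reconfigure the vocabulary at runtime, which may only work with models that support a dynamic vocabulary (typically the small ones).

```python
import json
import wave


def join_segments(result_jsons):
    """Join the "text" field of each Vosk JSON result into one transcript."""
    texts = (json.loads(r).get("text", "") for r in result_jsons)
    return " ".join(t for t in texts if t)


def transcribe_wav(wav_path, model_path, phrases=None):
    """Transcribe a 16-bit mono PCM WAV file offline.

    phrases is an optional list of phrases that restricts the vocabulary
    (hypothetical parameter name, passed to Vosk as a JSON grammar).
    """
    # Imported here so join_segments can be used without vosk installed.
    from vosk import KaldiRecognizer, Model

    model = Model(model_path)
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")

        if phrases:
            # "[unk]" lets the recognizer mark words outside the phrase list.
            grammar = json.dumps(list(phrases) + ["[unk]"])
            rec = KaldiRecognizer(model, wf.getframerate(), grammar)
        else:
            rec = KaldiRecognizer(model, wf.getframerate())

        results = []
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())
        # Flush whatever audio is still buffered in the recognizer.
        results.append(rec.FinalResult())

    return join_segments(results)
```

A call such as transcribe_wav("lecture.wav", "model") would return the transcript as a single string, which a later summarization step could take as input.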