This tutorial demonstrates speech-to-text recognition.
This tutorial uses the QuartzNet 15x5 model, which performs automatic speech recognition. Its design is based on the Jasper architecture, a convolutional model trained with Connectionist Temporal Classification (CTC) loss. The model is available from Open Model Zoo.
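Because the model is trained with CTC loss, its output is a score for every symbol at every audio frame, not finished text. The sketch below illustrates one common way to turn such output into a string, greedy CTC decoding: take the best symbol per frame, merge consecutive repeats, and drop the blank symbol. The alphabet, the position of the blank symbol, and the helper names `ctc_greedy_decode` and `one_hot` are assumptions made for illustration, not taken from the model's documentation.

```python
# Assumed alphabet: space, a-z, apostrophe. The CTC blank is assumed to
# occupy the index right after the last character.
ALPHABET = " abcdefghijklmnopqrstuvwxyz'"
BLANK_ID = len(ALPHABET)


def ctc_greedy_decode(frame_scores):
    """Greedy CTC decoding of a list of per-frame score lists."""
    # Pick the index of the highest-scoring symbol in every frame.
    best_ids = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    decoded = []
    previous = None
    for idx in best_ids:
        # Emit a character only when it differs from the previous frame's
        # symbol and is not the blank.
        if idx != previous and idx != BLANK_ID:
            decoded.append(ALPHABET[idx])
        previous = idx
    return "".join(decoded)


def one_hot(symbol_id, size=len(ALPHABET) + 1):
    """Hypothetical helper: build a fake score vector for one frame."""
    return [1.0 if i == symbol_id else 0.0 for i in range(size)]


# Toy input standing in for model output: "h h <blank> e l <blank> l o".
frame_ids = [ALPHABET.index(c) for c in "hh"] + [BLANK_ID]
frame_ids += [ALPHABET.index("e"), ALPHABET.index("l"), BLANK_ID]
frame_ids += [ALPHABET.index("l"), ALPHABET.index("o")]
frames = [one_hot(i) for i in frame_ids]

text = ctc_greedy_decode(frames)
print(text)
```

The blank between the two `l` frames is what lets the decoder keep a genuine double letter while still merging the repeated `h` frames into one character.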
If you have not installed all required dependencies, follow the Installation Guide.