Text-to-speech converts input text into human-like synthesized speech using Speech Synthesis Markup Language (SSML). Use neural voices, which are human-like voices powered by deep neural networks.
We specified some neural voices in the NeuralVoices.json file. You can add more neural voices. For a full list of neural voices, see supported languages.
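To illustrate the SSML input mentioned above, here is a minimal Python sketch that wraps plain text in an SSML document for one voice. It is not taken from the demo code: the function name `build_ssml` is hypothetical, and the voice name used in the usage note is only an example value that should be replaced with an entry from the voice list.

```python
from xml.sax.saxutils import escape, quoteattr

# W3C namespace used by SSML documents.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice_name: str, lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document for a single voice.

    Hypothetical helper for illustration; the demo may build SSML differently.
    """
    # quoteattr() adds the surrounding quotes and escapes attribute values;
    # escape() protects '&', '<' and '>' in the spoken text.
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice_name)}>{escape(text)}</voice>"
        "</speak>"
    )
```

For example, `build_ssml("Hello, world.", "en-US-JennyNeural")` returns a single-line SSML string that can be sent as the request body. Escaping the text matters because characters such as `&` or `<` would otherwise produce invalid XML.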
- NeuralTTS: Main page that drives the demo.
- NeuralTTSDataLoader: A class used to save and load cached audio results.
- NeuralVoices.json: A JSON file that contains a list of the available neural voices for this demo.
- Authentication: The text-to-speech REST API requires an Authorization header. This class has logic to get a service access token.
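The token flow behind the Authorization header can be sketched as follows. This is a Python illustration, not the demo's Authentication class, and it assumes the REST API in question is the Azure Speech service: the endpoint URLs, the `Ocp-Apim-Subscription-Key` header, and the output format value follow that service's REST conventions. The function names, the `User-Agent` value, and the region and key arguments are hypothetical.

```python
import urllib.request


def build_token_request(region: str, subscription_key: str) -> urllib.request.Request:
    """Build the POST request that exchanges a subscription key for an access token."""
    # Assumption: Azure Speech service token endpoint for the given region.
    url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    return urllib.request.Request(url, data=b"", headers=headers, method="POST")


def build_synthesis_request(region: str, token: str, ssml: str) -> urllib.request.Request:
    """Build the text-to-speech request that carries the token as a bearer credential."""
    # Assumption: Azure Speech service text-to-speech endpoint for the given region.
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        "User-Agent": "neural-tts-demo",  # hypothetical client name
    }
    return urllib.request.Request(url, data=ssml.encode("utf-8"), headers=headers, method="POST")


def fetch_token(region: str, subscription_key: str) -> str:
    """Send the token request and return the token text (requires network access)."""
    with urllib.request.urlopen(build_token_request(region, subscription_key)) as response:
        return response.read().decode("utf-8")
```

Keeping request construction separate from the network call makes the headers easy to inspect and test offline. Access tokens expire, so a client would typically cache the token and request a new one before it lapses.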