A GLaDOS TTS, using Forward Tacotron and HiFiGAN. Inference is fast and stable, even on the CPU. A low quality vocoder model is included for mobile use. Rudimentary TTS script included. Works perfectly on Linux, partially on Maybe someone smarter than me can make a GUI.

benediktkr/glados-tts

GLaDOS Text-to-speech (TTS) Voice Generator

License: MIT

Neural network based TTS Engine.

Notes about this fork

Forked by ben (GitHub: @benediktkr) from github:VRCWizard/glados-tts-voice-wizard, which in turn was a fork of github:R2D2FISH/glados-tts.

This fork modernizes and improves the Python code in the project and does a bunch of housekeeping.

  • [DONE]: Gets rid of the SciPy dependency, replaced with the more modern and lightweight pysoundfile (all SciPy was used for was writing a .wav file to disk).
  • [DONE]: Support modern stable Python 3 versions, and update dependencies.
  • [DONE]: Versioned packages with poetry and pyproject.toml
  • [DONE]: Configuration handling with click.
  • [DONE]: Better logging with loguru
  • [DONE]: Python coding style and code quality improvements (proper handling of file objects, improved logging, etc.)
  • [DONE]: Switch to using ASGI with uvicorn and fastapi instead of Flask and WSGI, and support production-capable deployments by default.
  • [DONE]: Docker support
  • [TODO]: Support Home Assistant through the notify integration
  • [TODO]: See if it's possible to avoid espeak-ng as a system package dependency (Python bindings, building the C library, etc.)
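The first item in the list above says the only audio I/O the project needs is writing a .wav file to disk. As a rough illustration of how small that job is, here is a sketch using Python's standard-library wave module. This is a stand-in to keep the example dependency-free, not this fork's code (which uses pysoundfile). The function names, the 22050 Hz sample rate and the generated sine tone are all assumptions for illustration.

```python
import math
import struct
import wave


def write_wav(path, samples, sample_rate=22050):
    """Write float samples in [-1.0, 1.0] to `path` as 16-bit mono PCM.

    Hypothetical helper for illustration; the sample rate is an assumption.
    """
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range, then scale to a signed 16-bit integer.
        clipped = max(-1.0, min(1.0, sample))
        frames += struct.pack("<h", int(clipped * 32767))

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes per sample (16-bit)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


def sine_tone(frequency=440.0, seconds=0.5, sample_rate=22050):
    """Generate a test tone standing in for audio produced by a TTS model."""
    count = int(seconds * sample_rate)
    return [
        math.sin(2 * math.pi * frequency * i / sample_rate)
        for i in range(count)
    ]
```

Calling `write_wav("tone.wav", sine_tone())` would write half a second of a 440 Hz tone. In a real pipeline the samples would come from the vocoder output instead.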

No work on the speech model itself is expected.

Description

The initial, regular Tacotron model was trained first on LJSpeech, and then on a heavily modified version of the Ellen McLain dataset (all non-Portal 2 voice lines removed, punctuation added).

  • The Forward Tacotron model was only trained on about 600 voice lines.
  • The HiFiGAN model was generated through transfer learning from the sample.
  • All models have been optimized and quantized.

Install

First you need to install the espeak-ng system packages.

# for debian/ubuntu:
sudo apt-get install espeak-ng

# for fedora/amazon:
sudo yum install espeak-ng

This can hopefully be improved in the future. There are Python bindings for espeak-ng (at a glance, py-espeak-ng looks like a candidate).
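Because espeak-ng is a system package rather than a Python dependency, poetry cannot tell you if it is missing. One way to catch that early is to look for the executable on PATH before doing anything else. This is a sketch using only the standard library; the helper names are hypothetical and not part of this project.

```python
import shutil


def find_espeak_ng(binary="espeak-ng"):
    """Return the full path of the espeak-ng executable, or None if absent."""
    return shutil.which(binary)


def require_espeak_ng(binary="espeak-ng"):
    """Return the executable path, or raise with an install hint."""
    path = find_espeak_ng(binary)
    if path is None:
        raise RuntimeError(
            f"{binary} was not found on PATH; install it first, e.g. "
            "'sudo apt-get install espeak-ng' on Debian/Ubuntu"
        )
    return path


if __name__ == "__main__":
    print(require_espeak_ng())
```

Failing up front with an install hint is easier to act on than an error raised later, once text is already being processed.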

Then install the poetry-managed virtualenv:

poetry install

Usage

If you just want to play around with the TTS, it works from the shell:

poetry run gladosctl

The TTS engine can also run as a web server:

poetry run gladosctl restapi

A public instance of the HTTP API is running at http://www.sudo.is/api/glados, where you can also read the API documentation.
