Unable to add Custom TTS model (i.e Coqui TTS) #123

Open
akshatrocky opened this issue Apr 17, 2024 · 3 comments

@akshatrocky

I was unable to add a custom TTS model (i.e. Coqui TTS). I tried adding the model information to model.json, but it doesn't seem to work; maybe I am doing it wrong. What is the procedure for adding a custom TTS model to the Speech Note application?
Thanks for making this great app for Linux :)

@mkiol
Owner

mkiol commented Apr 18, 2024

Hi. Thanks for the report.

As you probably know, you need to edit the ~/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/models.json file and add a new entry with the model configuration.

This entry should be similar to the one below.

        {
            "name": "New cool voice",
            "model_id": "en_coqui_new_cool_model",
            "engine": "tts_coqui",
            "lang_id": "en",
            "checksum": "8bc7e85b",
            "checksum_quick": "50984d2b",
            "comp": "dir",
            "urls": [
                "file:///path/to/model/config.json",
                "file:///path/to/model/model.pth"
            ],
            "size": "100827994"
        },
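
A minimal Python sketch of building such an entry for local files, in case scripting it is easier than editing by hand. The helper names build_local_entry and append_unique are hypothetical and not part of Speech Note. Both checksum fields are left empty so they can be filled in once generated, and size is omitted because this thread does not say how it should be set beforehand. The sketch works on a plain list of entries; where that list sits inside models.json is not shown here, so loading and saving the file is left out.

```python
import json
from pathlib import Path


def build_local_entry(name, model_id, lang_id, files, engine="tts_coqui"):
    """Build a model entry whose files live on the local drive.

    Hypothetical helper. Each path is made absolute and converted to a
    file:// URL. checksum and checksum_quick are left as empty strings.
    """
    urls = [Path(f).resolve().as_uri() for f in files]
    return {
        "name": name,
        "model_id": model_id,
        "engine": engine,
        "lang_id": lang_id,
        "checksum": "",
        "checksum_quick": "",
        "comp": "dir",
        "urls": urls,
    }


def append_unique(models, entry):
    """Append entry to a list of entries, refusing a duplicate model_id."""
    if any(m.get("model_id") == entry["model_id"] for m in models):
        raise ValueError("model_id already in use: " + entry["model_id"])
    models.append(entry)
    return models


if __name__ == "__main__":
    entry = build_local_entry(
        "New cool voice",
        "en_coqui_new_cool_model",
        "en",
        ["/path/to/model/config.json", "/path/to/model/model.pth"],
    )
    print(json.dumps(entry, indent=4))
```

The printed object can be pasted into models.json next to the existing entries.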

A few important remarks:

  • model_id has to be unique.
  • If the model files are located on your local drive, use the file:// URL type.
  • List a URL for every file the model needs (config.json and model.pth are just examples).
  • If your model uses a custom vocoder, you need to add it in the sups sub-object (see the es_coqui_tacotron_mai entry in models.json for an example).
  • To generate checksum and checksum_quick, use the --gen-checksums command line option: put empty strings in both checksum and checksum_quick, save the file, and run Speech Note with the --verbose --gen-checksums options:

        flatpak run net.mkiol.SpeechNote --verbose --gen-checksums

The model will be downloaded automatically and the checksum should appear on the terminal.

[D] 18:15:52.802230735.802 0x7709dea87d00 () - all checksums were generated
models checksums:

"model_id": "fr_coqui_css100_vits",
"checksum": "a7671b81",
"checksum_quick": "7d7531cf",
"size": "100821187",
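
Copying the generated values back into the entry can also be scripted. The sketch below assumes the terminal prints one block of quoted key/value lines exactly like the one above; the function names parse_checksum_block and apply_checksums are hypothetical, and the output format for several models at once is not covered.

```python
import json


def parse_checksum_block(text):
    """Turn the quoted key/value lines printed on the terminal into a dict.

    Assumes a single block in which every relevant line starts with a
    double quote and ends with a comma, as in the output shown above.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip().startswith('"')]
    body = "\n".join(lines).rstrip(",")  # drop the trailing comma
    return json.loads("{" + body + "}")


def apply_checksums(entry, generated):
    """Return a copy of entry with checksum, checksum_quick and size filled in."""
    if entry["model_id"] != generated["model_id"]:
        raise ValueError("output belongs to a different model_id")
    updated = dict(entry)
    for key in ("checksum", "checksum_quick", "size"):
        updated[key] = generated[key]
    return updated


if __name__ == "__main__":
    terminal_output = '''
"model_id": "fr_coqui_css100_vits",
"checksum": "a7671b81",
"checksum_quick": "7d7531cf",
"size": "100821187",
'''
    entry = {"model_id": "fr_coqui_css100_vits", "checksum": "", "checksum_quick": ""}
    print(json.dumps(apply_checksums(entry, parse_checksum_block(terminal_output)), indent=4))
```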

Let me know if any of this was helpful.

@akshatrocky
Author

Thanks, this worked. What about adding a custom multi-language model, i.e. a fine-tuned XTTS model? Do I have to add a separate model ID for each language the XTTS model supports?

@mkiol
Owner

mkiol commented Apr 24, 2024

XTTS? Nice :)

> custom multi-language model

For multilingual models you can use "model aliases". An alias is a copy of a model entry with some properties changed (the language, for instance). To create an alias, define a new model entry with the model_alias_of parameter. See the example below.

The model multilang_coqui_xtts203 is the base model. It is hidden from the user because of "hidden": true. This base model is used by the en_coqui_xtts203 and pt_coqui_br_xtts203 aliases.

        {
            "name": "Multilingual (Coqui XTTS-v2.0.3)",
            "model_id": "multilang_coqui_xtts203",
            "engine": "tts_coqui",
            "lang_id": "multilang",
            "checksum": "ae3c9981",
            "checksum_quick": "ce376c5d",
            "options": "xs",
            "features": [
                "tts_voice_cloning"
            ],
            "license": {
                "id": "CPML",
                "name": "Coqui Public Model License 1.0.0",
                "url": "https://coqui.ai/cpml.txt",
                "accept_required": true
            },
            "comp": "dir",
            "urls": [
                "https://huggingface.co/coqui/XTTS-v2/resolve/69d4f754575c4b72d991f105b4775d270438ef33/model.pth",
                "https://huggingface.co/coqui/XTTS-v2/resolve/69d4f754575c4b72d991f105b4775d270438ef33/config.json",
                "https://huggingface.co/coqui/XTTS-v2/resolve/69d4f754575c4b72d991f105b4775d270438ef33/vocab.json"
            ],
            "size": "1868302897",
            "hidden": true
        },
        {
            "name": "English (Coqui XTTS-v2.0.3)",
            "model_id": "en_coqui_xtts203",
            "model_alias_of": "multilang_coqui_xtts203",
            "lang_id": "en"
        },
        {
            "name": "Português brasileiro (Coqui XTTS-v2.0.3)",
            "model_id": "pt_coqui_br_xtts203",
            "model_alias_of": "multilang_coqui_xtts203",
            "lang_id": "pt"
        },
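
For a model that supports many languages, the alias entries can be generated rather than typed out. This is a rough sketch using only the alias fields shown above (name, model_id, model_alias_of, lang_id); make_alias and dangling_aliases are hypothetical helper names, and the second helper is just a sanity check that every alias points at a base entry present in the same list.

```python
import json


def make_alias(base_model_id, model_id, name, lang_id):
    """Build one alias entry that points at an existing base model."""
    return {
        "name": name,
        "model_id": model_id,
        "model_alias_of": base_model_id,
        "lang_id": lang_id,
    }


def dangling_aliases(models):
    """Return model_ids of aliases whose base model is missing from models."""
    known = {m["model_id"] for m in models}
    return [
        m["model_id"]
        for m in models
        if "model_alias_of" in m and m["model_alias_of"] not in known
    ]


if __name__ == "__main__":
    base_id = "multilang_coqui_xtts203"
    aliases = [
        make_alias(base_id, "en_coqui_xtts203", "English (Coqui XTTS-v2.0.3)", "en"),
        make_alias(base_id, "pt_coqui_br_xtts203", "Português brasileiro (Coqui XTTS-v2.0.3)", "pt"),
    ]
    # ensure_ascii=False keeps non-ASCII names such as "Português" readable
    print(json.dumps(aliases, indent=4, ensure_ascii=False))
```

For a fine-tuned XTTS model, the same approach should apply with a base entry of your own (its own unique model_id and file:// URLs) and one alias per language you want to expose.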
