Add extra Arabic diacritic and TTS models #141

Kentoseth · 2024-06-04T17:50:37Z

Hi there,

Thanks again for this wonderful project. I think we previously discussed that the format of the models for your app should be .ort. Fortunately for us, we now have more of these for Arabic.

Here are 2 diacritic models in .onnx format:

https://github.com/nipponjo/arabic_vocalizer

(I found some limitations with libtashkeel and opened an issue with the author to clarify: mush42/libtashkeel#2)

I think the only complicated part here will be in the selection option (not present) of the vocalizer model. Right now it seems to default to the only model available.

And here is the .onnx TTS model:

https://github.com/nipponjo/tts_arabic

I wasn't able to detect the model file in the repo though.

(I don't know if the .onnx format will be an issue, as it is an intermediate model and not the production option)


mkiol · 2024-06-05T17:02:47Z

I'm very happy that there is more Arabic support :) I will definitely check out these models.

> I think the only complicated part here will be in the selection option (not present) of the vocalizer model. Right now it seems to default to the only model available.

Yes, this part is missing and has to be implemented.
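
As a rough illustration of what such a selection option could look like, here is a minimal Python sketch. It is not Speech Note's actual code: the `VocalizerModel` type, the `select_vocalizer` function, and the model IDs and paths are all hypothetical. It only shows the idea of honoring a user's choice while keeping the current behavior (use whatever model is available) as the fallback.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VocalizerModel:
    model_id: str  # hypothetical identifier, not a real Speech Note model ID
    path: str      # hypothetical local path to the model file


def select_vocalizer(
    models: List[VocalizerModel],
    preferred_id: Optional[str] = None,
) -> VocalizerModel:
    """Return the user's preferred model, else fall back to the first one."""
    if not models:
        raise ValueError("no vocalizer models installed")
    if preferred_id is not None:
        for model in models:
            if model.model_id == preferred_id:
                return model
    # Fallback: with a single installed model this matches the current
    # "default to the only model available" behavior.
    return models[0]


# Hypothetical installed models, for illustration only.
installed = [
    VocalizerModel("vocalizer_a", "models/a.onnx"),
    VocalizerModel("vocalizer_b", "models/b.onnx"),
]
chosen = select_vocalizer(installed, preferred_id="vocalizer_b")
```

Falling back silently when the preferred ID is unknown is just one possible design; an app could equally report the missing model to the user.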

> I don't know if the .onnx format will be an issue

No, it is not an issue. ONNX is used by Piper and Mimic 3, so all the needed libraries are already integrated and packed into the Flatpak package.

Kentoseth · 2024-06-05T20:54:56Z

I've been discussing this with the libtashkeel author: mush42/libtashkeel#2 (comment)

He informed me that the piper model you are using from piper-phonemize is an MVP model and he has since updated to a better model.

It may be best to drop the MVP model entirely and use the .onnx available here:

https://github.com/mush42/libtashkeel/blob/main/libtashkeel_base/data/ort/model.onnx

To summarize, if you drop the MVP model, then there will be three new diacritics models available and one new Arabic TTS model for the app.

mkiol · 2024-06-07T16:29:47Z

Thanks a lot for all the insights!

Indeed, Speech Note currently uses a C++ re-implementation of tashkeel borrowed from the Piper project. This version doesn't work with the latest ONNX model. To enable it, I need to integrate the newest libtashkeel. The problem is that libtashkeel is written in Rust, so I need to add a new compiler to my toolchain. It is a lot of hassle, but it is perfectly doable. I will try to do something for the next version (or the one after that).

mkiol added the "enhancement" (New feature or request) label on Jun 5, 2024
