Fine-Tune Other languages #5

@anlcakmak19

Hi,
I’m Anıl from the Turkish TTS community. I’ve been fine-tuning several TTS models, including Dia, Orpheus, and KaniTTS. These models work reasonably well, but many of them struggle in live use, especially when handling parallel requests and meeting real-time inference speed.

From the outside, your work looks significantly faster (almost 100x), which is why I’m very interested in your approach.

I currently have ~250 hours of Turkish speech data. The data was scraped from YouTube, segmented, and transcribed. Audio durations range from 3 seconds to ~20 seconds.

I have a few questions:

Do you think ~250 hours of Turkish data is sufficient to teach a new language on top of an existing base model?

Are you planning to share the training / fine-tuning code?

Do you have a roadmap for releasing a multilingual model?

Thanks in advance for your work and for sharing your insights.
Looking forward to your response.
