Split up long sentences to avoid errors from Google Cloud TTS #103

Open
bertfrees opened this issue Jul 23, 2024 · 1 comment
Labels
acus, enhancement (New feature or request)

Comments

bertfrees commented Jul 23, 2024

Occasionally, Google Cloud TTS returns the following error:

Some sentences generate audio that is too long. Consider splitting up long sentences with sentence breaking punctuation (e.g. periods), and/or removing SSML <break> tags.

Since sentence detection is currently based on ".", sentences can indeed become very long, e.g. if they contain a lot of commas, colons and/or semicolons.

bertfrees added the enhancement and acus labels on Jul 23, 2024

bertfrees commented

Modifying EuroSentenceDetector so that it splits on e.g. ";" in addition to "." would be one possible solution for this issue, but it might no longer be semantically correct to call the results "sentences".
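
To illustrate the first idea, here is a rough, language-agnostic sketch in Python (not the actual EuroSentenceDetector code). It only falls back to weaker punctuation when a chunk is still too long, so ordinary sentences stay intact. The function name `split_long_sentence`, the delimiter order and the `MAX_CHARS` threshold are all assumptions made for the example; the real limit enforced by Google Cloud TTS is not stated in this issue.

```python
import re

# Hypothetical threshold, chosen only for illustration.
MAX_CHARS = 200


def split_long_sentence(sentence, max_chars=MAX_CHARS, delimiters=";:,"):
    """Split an over-long sentence on one delimiter at a time, strongest first.

    Chunks that already fit within max_chars are returned unchanged.
    """
    if len(sentence) <= max_chars or not delimiters:
        return [sentence]
    first, rest = delimiters[0], delimiters[1:]
    # Split on whitespace that follows the delimiter, so the punctuation
    # stays attached to the preceding chunk.
    pattern = r"(?<=%s)\s+" % re.escape(first)
    pieces = [p.strip() for p in re.split(pattern, sentence) if p.strip()]
    result = []
    for piece in pieces:
        # Chunks that are still too long are split on the next delimiter.
        result.extend(split_long_sentence(piece, max_chars, rest))
    return result
```

With a character-based threshold the chunks are only a proxy for audio duration, which is what the error is actually about, so the threshold would probably need tuning.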

Another approach could be to make it the responsibility of GoogleRestTTSEngine to split the SSML into smaller parts if this error is encountered, similar to how DefaultSSMLMarkSplitter does it.
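
The second idea could look roughly like the following Python sketch (again not the project's Java code). It tries the whole input first and only splits and retries when the engine reports that the audio is too long. `AudioTooLongError`, `synthesize` and `split` are hypothetical placeholders: how GoogleRestTTSEngine surfaces the error, and how DefaultSSMLMarkSplitter splits, is not shown in this issue. The sketch also works on plain text; a real implementation would have to keep each SSML part well-formed.

```python
class AudioTooLongError(Exception):
    """Hypothetical stand-in for the 'audio that is too long' error."""


def synthesize_with_fallback(text, synthesize, split):
    """Return a list of audio chunks for text, splitting only when needed.

    synthesize(text) is assumed to return audio or raise AudioTooLongError.
    split(text) is assumed to return a list of smaller parts.
    """
    try:
        return [synthesize(text)]
    except AudioTooLongError:
        parts = split(text)
        if len(parts) <= 1:
            # Nothing left to split on, so surface the original error.
            raise
        audio = []
        for part in parts:
            # Each part may itself be too long, so recurse.
            audio.extend(synthesize_with_fallback(part, synthesize, split))
        return audio
```

The advantage of this variant is that sentence detection stays untouched and the extra splitting only happens for the rare inputs that actually trigger the error.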
