Investigate better ways to determine which language scripts require or omit sentence-separating whitespace #950

Open
nordzilla opened this issue Dec 2, 2024 · 0 comments

Comments

@nordzilla
Contributor

Description

#945 introduces functionality to support CJK languages via Intl.Segmenter. For now, however, we have hard-coded the respective language tags zh, ja, and ko.
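
For context, here is a minimal sketch of sentence segmentation with Intl.Segmenter. It is not the code from #945; the function name `splitSentences` and the sample text are assumptions made for illustration only.

```javascript
// Hypothetical helper (not taken from #945): split text into sentences
// using the standard Intl.Segmenter API with sentence granularity.
function splitSentences(text, languageTag) {
  const segmenter = new Intl.Segmenter(languageTag, { granularity: "sentence" });
  // segment() returns an iterable of { segment, index, input } objects.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

// Japanese sample with no whitespace between the two sentences.
const sample = "これは文です。これも文です。";
console.log(splitSentences(sample, "ja"));
```

Because the segmenter relies on sentence-final punctuation rather than whitespace, it can split text whose sentences are not separated by spaces.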

We should investigate and implement a more robust solution for determining the sentence-separation characteristics of our supported languages.
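
One possible direction, sketched under assumptions: key the decision on the language's likely script rather than on the language tag itself, using `Intl.Locale.prototype.maximize()` to resolve the script. The function name `omitsSentenceWhitespace` and the script set are hypothetical. The set is seeded only from the three tags named above, and whether script is the right proxy for sentence-separation behavior is exactly what this issue would need to investigate.

```javascript
// Hypothetical set of ISO 15924 script codes treated as omitting
// sentence-separating whitespace. Derived only from zh, ja, and ko;
// this is an assumption, not a verified list.
const SCRIPTS_WITHOUT_SENTENCE_WHITESPACE = new Set(["Hans", "Hant", "Jpan", "Kore"]);

// Hypothetical helper: decide by likely script instead of by language tag.
function omitsSentenceWhitespace(languageTag) {
  // maximize() adds likely subtags, e.g. "ja" becomes "ja-Jpan-JP".
  const { script } = new Intl.Locale(languageTag).maximize();
  return script !== undefined && SCRIPTS_WITHOUT_SENTENCE_WHITESPACE.has(script);
}

for (const tag of ["zh", "zh-TW", "ja", "ko", "en"]) {
  console.log(tag, omitsSentenceWhitespace(tag));
}
```

A script-based check would cover regional variants such as zh-TW without listing each tag, but it still depends on a hand-maintained set, so it may only be a partial improvement over the current hard-coded list.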

@nordzilla changed the title from "Investigate better ways to implement which language scripts require or omit sentence-separating whitespace" to "Investigate better ways to determine which language scripts require or omit sentence-separating whitespace" on Dec 2, 2024