non ascii phrases aren't correctly determined #6

Open
Tarnak-public opened this issue Jun 4, 2021 · 0 comments

Comments

@Tarnak-public

When using a custom model with non-English phrases (specifically, Polish words with accented characters), is_spam() did not classify texts correctly.
As a workaround, I strip accents from the text both during training and when checking (code: https://gist.github.com/AdoHaha/a76157c6de5155bf6b0adc77988724d9), which works well.
Could you add a normalization parameter to the code, or otherwise fix the handling of accented characters?
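
For reference, here is a minimal sketch of the kind of normalization I mean. It is not the exact code from the gist, and `strip_accents` is a hypothetical helper name, not part of this library. It uses only the standard `unicodedata` module and assumes that folding Polish letters to their ASCII base letters is acceptable for the model.

```python
import unicodedata

# "ł" and "Ł" have no Unicode decomposition, so NFKD leaves them unchanged;
# they are mapped by hand before normalizing.
_NO_DECOMPOSITION = str.maketrans({"ł": "l", "Ł": "L"})


def strip_accents(text: str) -> str:
    """Hypothetical helper: fold accented letters to their base letters."""
    decomposed = unicodedata.normalize("NFKD", text.translate(_NO_DECOMPOSITION))
    # Drop the combining marks (acute, ogonek, dot above, ...) left by NFKD.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Zażółć gęślą jaźń"))
```

The same function has to be applied to the training texts and to every text passed to is_spam(), otherwise the model sees different tokens at training and checking time.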
