add option to disable language-based diacritic stripping #2

Open
patch opened this issue Feb 3, 2014 · 0 comments


patch commented Feb 3, 2014

Some of the stemming algorithms strip specific diacritical marks from the entire word. This kind of word normalization, performed in addition to stemming, isn't always desired. Let's add an object attribute to optionally disable it.

For example, the to-be-implemented German stemmer replaces ä with a, ö with o, and ü with u.
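
A rough sketch of what such a switch might look like, written in Python purely for illustration. The class name `ToyGermanStemmer`, the attribute name `strip_diacritics`, and the single suffix rule are all assumptions made for this example; none of them are the project's actual API, and the suffix step is a stand-in rather than the real German algorithm.

```python
# Hypothetical sketch: ToyGermanStemmer and strip_diacritics are illustrative
# names, not part of the project. The suffix rule is a toy placeholder.

# Maps the three umlauts named in the issue to their base letters.
UMLAUT_MAP = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})


class ToyGermanStemmer:
    def __init__(self, strip_diacritics=True):
        # Object attribute controlling the whole-word normalization step.
        self.strip_diacritics = strip_diacritics

    def stem(self, word):
        word = word.lower()
        if self.strip_diacritics:
            # Normalization applied to the entire word, separate from stemming.
            word = word.translate(UMLAUT_MAP)
        # Toy stemming step: drop a trailing "en" from longer words.
        if len(word) > 4 and word.endswith("en"):
            word = word[:-2]
        return word


default_stemmer = ToyGermanStemmer()
keep_marks = ToyGermanStemmer(strip_diacritics=False)

print(default_stemmer.stem("Türen"))  # umlaut replaced, suffix removed
print(keep_marks.stem("Türen"))       # umlaut kept, suffix removed
```

Defaulting the attribute to true in this sketch keeps the algorithm's standard behavior unless the caller opts out; whether that is the right default for the project is an open design choice.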
