Comparing text with Unicode characters #51

Open
MiattoRocha opened this issue Jan 4, 2019 · 4 comments
MiattoRocha commented Jan 4, 2019

Hello people.

I'm designing a system and want to use SearchJS for my advanced search solution, but I'm having trouble comparing strings with accents. Since I'm working for a Latin company, our first users will be using the website in Portuguese (é, ê, ã, á, â, à, õ, ó, ô, ç, etc.).
When using the Text option set to true, it would be nice to have accent folding.
I looked for it online and found one option: JavaScript has String.prototype.toLocaleLowerCase(), and using it instead could be a way to internationalize (i18n) SearchJS.

What do you think about this?

Owner

deitch commented Jan 7, 2019

I like the idea, although we should leave it as an option. Some may want to match only if the diacritics match.

> javascript has the String.prototype.toLocaleLowerCase(), using it instead could be a solution to i18n the SearchJS

I don't think that does it. My little bit of experience with it shows that it keeps the accents.
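
A quick check along these lines illustrates the point. This is only a sketch with a made-up Portuguese sample string: `toLocaleLowerCase` applies locale-aware case mapping and leaves the diacritics in place.

```javascript
// toLocaleLowerCase changes case using locale rules; it does not remove accents.
const loweredSample = "AÇÃO É FÁCIL".toLocaleLowerCase("pt-BR");

console.log(loweredSample);                    // "ação é fácil"
console.log(loweredSample === "acao e facil"); // false: the accents are still there
```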

When you say "accent folding", you mean that, e.g. any of àáâãäå would match a, etc.?
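
For illustration, one common way to get this kind of folding in plain JavaScript is to decompose the string with `String.prototype.normalize("NFD")` and drop the combining marks. The sketch below is not SearchJS code and not the sample linked in this thread; `foldAccents`, `textEquals`, and the `ignoreDiacritics` option are hypothetical names, with folding kept opt-in so that exact diacritic matching stays the default.

```javascript
// Hypothetical helper, not part of SearchJS: decompose each character (NFD)
// and remove the combining diacritical marks that the decomposition exposes.
function foldAccents(text) {
  return text
    .normalize("NFD")                 // "é" becomes "e" followed by U+0301
    .replace(/[\u0300-\u036f]/g, ""); // drop the combining diacritical marks block
}

// Hypothetical case-insensitive comparison with an opt-in flag.
// Without the flag, strings only match if their diacritics match too.
function textEquals(a, b, options = {}) {
  const prepare = (s) => {
    const lowered = s.toLowerCase();
    return options.ignoreDiacritics ? foldAccents(lowered) : lowered;
  };
  return prepare(a) === prepare(b);
}

console.log(foldAccents("àáâãäå"));                                 // "aaaaaa"
console.log(textEquals("Ação", "acao"));                            // false
console.log(textEquals("Ação", "acao", { ignoreDiacritics: true })); // true
```

This relies only on built-in string methods, so it should work in the browser as well as in Node.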

Owner

deitch commented Jan 7, 2019

There is a good sample here (for reference)

@MiattoRocha
Author

Yeah, I'm talking about matching strings with the same base character, exactly like your àáâãäå example.
I see, so toLocaleLowerCase isn't a way to solve this.

That is a good sample, but which do you think is best: implementing a solution like that, or using a lib to handle the replacement?
Using a small but solid lib could avoid the maintenance of adding a new letter every time someone needs one.

Owner

deitch commented Jan 9, 2019

> Using a small but solid lib could avoid the maintenance of adding a new letter every time someone needs one.

As long as we could package it in. SearchJS does work in the browser now, and I wouldn't want to lose that ability.

The sheer number of languages with diacritical marks would make it a maintenance challenge. I speak Hebrew fluently, and the same thing exists (fortunately, most people write it without the additional accent characters).
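
To give a rough feel for the maintenance trade-off, here is a sketch (an assumption, not something proposed in this thread) that strips every Unicode combining mark via the `\p{M}` property escape instead of listing letters one by one. It needs an engine with ES2018 Unicode property escapes; `stripMarks` is a hypothetical name. It handles pointed Hebrew as well as Portuguese, but letters that have no canonical decomposition, such as ø, ł, or đ, pass through unchanged, so a lookup table or a library would still be needed for those.

```javascript
// Sketch: decompose (NFD), then remove every combining mark (\p{M}).
// Requires Unicode property escapes (ES2018) and the "u" regex flag.
const stripMarks = (text) => text.normalize("NFD").replace(/\p{M}/gu, "");

// Hebrew "shalom" with vowel points (niqqud), written as escapes so the
// exact code points are unambiguous, and the same word without points.
const pointedHebrew = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD";
const plainHebrew = "\u05E9\u05DC\u05D5\u05DD";

console.log(stripMarks(pointedHebrew) === plainHebrew); // true
console.log(stripMarks("coração"));                     // "coracao"

// Letters without a canonical decomposition are not folded by this approach.
console.log(stripMarks("øłđ"));                         // "øłđ"
```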

Do you know of any good libs to use?
