-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rename patched filters so the normalizer uses the default ones #409
Rename patched filters so the normalizer uses the default ones #409
Conversation
0e369eb
to
d3c20ca
Compare
invenio_vocabularies/contrib/funders/mappings/os-v2/funders/funder-v2.0.0.json
Outdated
Show resolved
Hide resolved
d6f3378
to
ae3b4f2
Compare
@@ -14,8 +14,8 @@ | |||
"type": "custom", | |||
"char_filter": ["strip_special_chars"], | |||
"filter": [ | |||
"lowercase", | |||
"asciifolding", | |||
"lowercasepreserveoriginal", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wgresshoff could I ask you to move all the changes to funder-v3.0.0.json files?
in case of breaking changes like these we need to upgrade the version of the mapping, otherwise instance upgrades are complex
so we need to (for both os-v1 and os-v2)
- copy the mappings to funder-v3.0.0.json
- move the changes you proposed here to funder-v3.0.0.json file
- revert funders-v2.0.0.json changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since in practice this doesn't break existing indices created with the old mapping but only affects search behavior/accuracy, I feel we can merge without bumping the mapping version.
This means that for deployment, one needs to either:
- Do the usual recreation of indices (with downtime)
- Do a live migration with some special handling since the old and new index name will be the same (only the timestamp suffix changes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used the recreation of indices with downtime and the missing funders were reindexed without problem.
ae3b4f2
to
6cbca0b
Compare
❤️ Thank you for your contribution!
Close #408
Description
The normalizer that was introduced in the funders index with v5.1.0 generates two tokens when used on documents that contain diacritics in the acronym. This behaviour occurs when in the filter definition
"preserve_original": true
is used. This is ok for analyzers, so this PR just renames the filters and takes care the analyzers use the new ones and the normailzer uses the original ones.Checklist
Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:
Frontend
Reminder
By using GitHub, you have already agreed to the GitHub’s Terms of Service including that: