Correct STRLEN function for UTF-8 multibyte characters #1584

Merged 8 commits on Nov 1, 2024

Conversation

DuDaAG
Contributor

@DuDaAG DuDaAG commented Oct 25, 2024

The STRLEN function now counts the number of Unicode code points in the input (previously it counted the number of bytes in the UTF-8 serialization). For example, `STRLEN("Bäh")` now correctly returns `3`.
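To illustrate the difference between the byte count and the code point count, here is a minimal sketch. It is not the code from this PR: `countCodePoints` is a hypothetical helper name, and the sketch assumes the input is valid UTF-8. It relies on the fact that every code point in UTF-8 has exactly one byte that is not a continuation byte (continuation bytes have the bit pattern `10xxxxxx`).

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper for illustration only (not QLever's implementation).
// Counts Unicode code points in a string that is assumed to be valid UTF-8
// by counting every byte that is NOT a continuation byte (10xxxxxx).
std::size_t countCodePoints(std::string_view utf8) {
  std::size_t count = 0;
  for (unsigned char c : utf8) {
    if ((c & 0xC0) != 0x80) {
      ++count;
    }
  }
  return count;
}
```

With a precomposed `ä` (U+00E4, encoded as the two bytes `0xC3 0xA4`), the string "Bäh" occupies 4 bytes, while the helper above counts 3 code points. Note that code points are not the same as user-perceived characters: a base letter followed by a combining mark would count as two.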

Member

@joka921 joka921 left a comment

Thank you very much. I will let the tests run through once more and then merge this.

codecov bot commented Oct 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.10%. Comparing base (2ebca4d) to head (a83d74b).
Report is 17 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1584      +/-   ##
==========================================
+ Coverage   88.97%   89.10%   +0.13%     
==========================================
  Files         368      371       +3     
  Lines       33819    34462     +643     
  Branches     3826     3899      +73     
==========================================
+ Hits        30090    30708     +618     
- Misses       2473     2482       +9     
- Partials     1256     1272      +16     


Member

@joka921 joka921 left a comment

Two very minor suggestions to make the tools happy.

src/engine/sparqlExpressions/StringExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/StringExpressions.cpp (outdated, resolved)
@DuDaAG DuDaAG requested a review from joka921 October 31, 2024 07:57
@sparql-conformance
Conformance check passed ✅

Test Status Changes 📊

| Number of Tests | Previous Status | Current Status |
|-----------------|-----------------|----------------|
| 2               | Failed          | Intended       |

Details: https://qlever.cs.uni-freiburg.de/sparql-conformance-ui?cur=a83d74b51114a97dce246be584f2d213c86893fa&prev=9510d463f1591bd2d36400dbc991777af60ae71d

Member

@joka921 joka921 left a comment

Thank you very much!

@joka921 joka921 changed the title from "strlen with UTF-8 support" to "Correct STRLEN function for UTF-8 multibyte characters" on Nov 1, 2024
@joka921 joka921 merged commit 7bd2438 into ad-freiburg:master Nov 1, 2024
3 of 18 checks passed
sonarcloud bot commented Nov 1, 2024
