Correct `STRLEN` function for UTF-8 multibyte characters #1584

DuDaAG · 2024-10-25T07:43:58Z

The `STRLEN` function now correctly counts the number of Unicode code points in the input (previously it counted the number of bytes in the UTF-8 serialization). For example, `STRLEN("Bäh")` now correctly returns `3`.
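To illustrate the difference between bytes and code points, here is a minimal sketch, not the code from this PR. The helper name `countCodePoints` is hypothetical, and the sketch assumes the input is already valid UTF-8. It counts every byte that is not a continuation byte, which for valid UTF-8 equals the number of code points.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not QLever's implementation): count the Unicode code
// points in a UTF-8 string. Every code point starts with exactly one byte
// that is NOT a continuation byte; continuation bytes look like 10xxxxxx.
// Assumption: the input is valid UTF-8, no validation is performed.
constexpr std::size_t countCodePoints(std::string_view utf8) {
  std::size_t count = 0;
  for (unsigned char byte : utf8) {
    if ((byte & 0xC0) != 0x80) {
      ++count;  // Lead byte or ASCII byte: start of a new code point.
    }
  }
  return count;
}
```

With the UTF-8 bytes of "Bäh" written out explicitly as `"B\xC3\xA4" "h"`, the byte length (`std::string_view::size()`) is 4, while `countCodePoints` yields 3, matching the new `STRLEN` result described above.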

joka921

Thank you very much. I will let the tests run through once more and then merge this.

codecov · 2024-10-28T10:57:55Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.10%. Comparing base (2ebca4d) to head (a83d74b).
Report is 17 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1584      +/-   ##
==========================================
+ Coverage   88.97%   89.10%   +0.13%     
==========================================
  Files         368      371       +3     
  Lines       33819    34462     +643     
  Branches     3826     3899      +73     
==========================================
+ Hits        30090    30708     +618     
- Misses       2473     2482       +9     
- Partials     1256     1272      +16

joka921

Two very minor suggestions to make the tools happy.

src/engine/sparqlExpressions/StringExpressions.cpp

Co-authored-by: Johannes Kalmbach <[email protected]>

sparql-conformance · 2024-10-31T10:04:29Z

Conformance check passed ✅

Test Status Changes 📊

Number of Tests	Previous Status	Current Status
2	Failed	Intended

Details: https://qlever.cs.uni-freiburg.de/sparql-conformance-ui?cur=a83d74b51114a97dce246be584f2d213c86893fa&prev=9510d463f1591bd2d36400dbc991777af60ae71d

joka921

Thank you very much!

sonarcloud · 2024-11-01T13:29:10Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

DuDaAG added 5 commits October 24, 2024 11:18

Adapt STRLEN for UTF-8

47feb71

Test

4877e8f

Revert test

aefd889

find pull-request

b2cb8c6

Fix test

96b1959

joka921 approved these changes Oct 28, 2024

joka921 requested changes Oct 28, 2024

src/engine/sparqlExpressions/StringExpressions.cpp (outdated, resolved)

src/engine/sparqlExpressions/StringExpressions.cpp (outdated, resolved)

Update src/engine/sparqlExpressions/StringExpressions.cpp

b7806b8

Co-authored-by: Johannes Kalmbach <[email protected]>

DuDaAG requested a review from joka921 October 31, 2024 07:57

Update src/engine/sparqlExpressions/StringExpressions.cpp

a83d74b

Co-authored-by: Johannes Kalmbach <[email protected]>

Format

e73d1ab

joka921 approved these changes Nov 1, 2024

joka921 changed the title ~~strlen with UTF-8 support~~ Correct STRLEN function for UTF-8 multibyte characters Nov 1, 2024

joka921 merged commit 7bd2438 into ad-freiburg:master Nov 1, 2024
3 of 18 checks passed
