Skip to content

Conversation

torotake
Copy link
Contributor

@torotake torotake commented Aug 31, 2025

PR Details

Add UTF-16 code unit aware helpers (utf16UnitCountInString, truncateUTF16Units) and update sheet name validation to use code unit length.

Description

This change introduces two new internal helpers:

  • utf16UnitCountInString: counts UTF-16 code units (BMP runes =1, supplementary runes =2).
  • truncateUTF16Units: safely truncates a string to a maximum number of UTF-16 code units without splitting surrogate pairs.

Sheet name validation (checkSheetName) now enforces the 31 unit limit using UTF-16 code units instead of plain rune count, aligning behavior with Excel’s internal UTF-16 length semantics (emoji / supplementary characters consume two units). This prevents accepting names that Excel would later reject or truncating inside a surrogate pair in downstream consumers.

Related Issue

#2205

Motivation and Context

The Excel limit is effectively a UTF-16 unit, not a Unicode scalar number.
The previous length check did not properly handle strings containing surrogate pairs, resulting in problematic Excel files.
This will be fixed so that correct Excel files are output.

How Has This Been Tested

  • Count correctness for BMP-only strings.
  • Mixed BMP + supplementary (emoji) sequences.
  • Boundary where max falls exactly before / inside a surrogate pair.
  • Zero and negative max cases (empty result).
  • No-op when already within limit.
  • Truncation preserves whole characters (no half surrogate).

Testing Environment:
Go version: go version go1.24.2 darwin/arm64
OS: macOS 15.5

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Stricter rejection of sheet names that previously slipped through when they exceeded the true UTF-16 unit limit via supplementary characters. This aligns behavior with Excel and is considered a correctness fix.

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

Copy link
Member

@xuri xuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR. I've left some comments.

@xuri xuri added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 1, 2025
@xuri xuri moved this to Bugfix in Excelize v2.10.0 Sep 1, 2025
@xuri xuri linked an issue Sep 1, 2025 that may be closed by this pull request
2 tasks
Copy link

codecov bot commented Sep 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.24%. Comparing base (845a274) to head (eacfa6e).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2206   +/-   ##
=======================================
  Coverage   99.24%   99.24%           
=======================================
  Files          32       32           
  Lines       30574    30581    +7     
=======================================
+ Hits        30343    30350    +7     
  Misses        153      153           
  Partials       78       78           
Flag Coverage Δ
unittests 99.24% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@xuri xuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some changes based on your branch, the code review issues have been fixed.

@xuri xuri merged commit 2e1d05f into qax-os:master Sep 1, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
Status: Bugfix
Development

Successfully merging this pull request may close these issues.

Issue with cell character count limitation
2 participants