fix: correct Chinese and special characters display in HTML renderer #305
Issue Description
The current HTML renderer does not display non-ASCII characters (such as Chinese, Japanese, and Korean) correctly. This is because:
encoding is explicitly specified in the data URL
Root Cause
The issue occurs because Base64-encoded HTML content needs proper character-encoding handling whether or not the charset is explicitly specified. atob() decodes Base64 into a binary string in which each character holds one byte (a code point from 0 to 255), so reading that string directly treats the bytes as Latin-1 and garbles multi-byte characters. The bytes must instead be decoded with the correct charset. For more information about Base64, see the MDN documentation.
Changes Made
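The decoding step described under Root Cause can be sketched roughly as follows. This is a minimal illustration, not necessarily the code in this PR: the helper name decodeBase64Html and the UTF-8 default are assumptions made for the example, while atob(), Uint8Array.from(), and TextDecoder are standard Web APIs (also available as globals in recent Node.js versions).

```javascript
// Hypothetical helper, not the PR's actual code: decode a Base64 payload
// into text using the given charset (assumed default: UTF-8).
function decodeBase64Html(base64, charset = "utf-8") {
  // atob() yields a binary string: one character (code 0-255) per byte.
  const binary = atob(base64);
  // Turn each character code back into the raw byte it stands for.
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  // Interpret the bytes with the real charset instead of Latin-1.
  return new TextDecoder(charset).decode(bytes);
}

// Build a sample payload: UTF-8 bytes of the HTML, Base64-encoded.
const html = "<p>你好, こんにちは, 안녕하세요</p>";
const utf8 = new TextEncoder().encode(html);
const base64 = btoa(String.fromCharCode(...utf8));

console.log(atob(base64) === html);             // false: raw atob() output is garbled
console.log(decodeBase64Html(base64) === html); // true
```

Falling back to UTF-8 when no charset is given is one plausible default for HTML content; a renderer that reads the charset from the data URL could pass it as the second argument instead.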
Key Improvements
Uint8Array.from() for more concise and efficient buffer creation
Testing
Tested with HTML files containing:
All characters now display correctly regardless of whether charset is explicitly specified in the data URL.