Fix emoji rendering in code blocks with UTF-8 decoding and emoji fonts #79
🦋 Changeset detected. Latest commit: a91ced0. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
✅ Deploy Preview for react-github-permalink ready!
Co-authored-by: dwjohnston <[email protected]>
Co-authored-by: dwjohnston <[email protected]>
@copilot I have updated the stories with a new example that actually retrieves the data from GitHub. Your fix does not work. It shows as an emoji because that's what you declared the commit contents to have. I suspect the actual problem is some kind of decoding issue when decoding the response from GitHub.
Ah, see this: … It will error with: … So GitHub must be doing something to escape the strings beforehand - can we unescape them?
You're absolutely right! The root cause was a UTF-8 decoding issue when fetching from the GitHub API. Fixed in commit d0c2e91. The CSS font fix is still useful for ensuring emojis render with the right fonts, but the UTF-8 decoding was the actual blocker.
@copilot I have pushed an example fix, but it uses the deprecated escape method. Can you please explain it and suggest a fix that does not use deprecated methods? For reference, here is the Stack Overflow answer that recommends the deprecated method: https://stackoverflow.com/questions/56647747/how-to-base64-encode-emojis-in-javascript
The escape/unescape pattern works but uses deprecated methods. This replaces it with the modern TextDecoder API, which properly handles UTF-8 multi-byte characters like emojis.

Why the deprecated pattern worked:
- atob() decodes base64 but treats bytes as Latin-1
- escape() percent-encodes the malformed string
- decodeURIComponent() interprets percent-encoded bytes as UTF-8

Modern solution:
- atob() decodes base64 to binary string
- Convert to Uint8Array byte array
- TextDecoder properly interprets bytes as UTF-8

Added comprehensive tests verifying both patterns give identical results.

Co-authored-by: dwjohnston <[email protected]>
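To make the three steps of the deprecated chain concrete, here is a minimal trace for a single emoji. This is only a sketch: the variable names are illustrative, and the payload is built locally with TextEncoder and btoa() rather than fetched from GitHub.

```typescript
// Trace each step of the legacy decode chain for one emoji.
const emoji = "👇";

// Build a base64 payload the way a server would: UTF-8 bytes first, then base64.
const utf8Bytes = new TextEncoder().encode(emoji);
const base64 = btoa(Array.from(utf8Bytes, (b) => String.fromCharCode(b)).join(""));

// Step 1: atob() yields one character per byte, so the four UTF-8 bytes
// come back as four separate Latin-1 characters (mojibake), not one emoji.
const binaryString = atob(base64);
const byteCodes = Array.from(binaryString, (ch) => ch.charCodeAt(0).toString(16));

// Step 2: escape() (deprecated) percent-encodes each of those characters as %XX.
const percentEncoded = escape(binaryString);

// Step 3: decodeURIComponent() reads the %XX sequence as UTF-8 bytes.
const decoded = decodeURIComponent(percentEncoded);

console.log({ base64, byteCodes, percentEncoded, decoded });
```

The intermediate `percentEncoded` value is what makes the old pattern work: it turns the byte-per-character string back into something a UTF-8-aware function can interpret.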
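Modern solution (commit 7a8e9f0):

```ts
function decodeBase64WithUTF8(base64: string): string {
  // Strip whitespace (such as line breaks) before decoding.
  const binaryString = atob(base64.replace(/\s/g, ''));
  // atob() returns one character per byte; copy those byte values into a typed array.
  const bytes = new Uint8Array(binaryString.length);
  for (let i = 0; i < binaryString.length; i++) {
    bytes[i] = binaryString.charCodeAt(i);
  }
  // Interpret the raw bytes as UTF-8 so multi-byte characters decode correctly.
  return new TextDecoder('utf-8').decode(bytes);
}
```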
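For illustration, a more compact variant of the same idea, followed by a usage example. The names `decodeBase64Utf8` and `encodeUtf8ToBase64` are hypothetical and not part of this PR, the line-wrapped sample payload is an assumption about what the response can look like, and the `fatal: true` option is an optional stricter choice that the function above does not make.

```typescript
// Hypothetical compact variant of the decoder (not the code in this PR).
function decodeBase64Utf8(base64: string): string {
  const binaryString = atob(base64.replace(/\s/g, ""));
  // Map each byte-character to its numeric byte value in one step.
  const bytes = Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
  // Assumption: fatal mode is used here so invalid UTF-8 throws a TypeError
  // instead of being silently replaced with U+FFFD.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

// Test helper (hypothetical): encode text as UTF-8 bytes, then base64.
function encodeUtf8ToBase64(text: string): string {
  const bytes = new TextEncoder().encode(text);
  return btoa(Array.from(bytes, (b) => String.fromCharCode(b)).join(""));
}

const sourceText = 'const hint = "look below 👇";\n';
const encoded = encodeUtf8ToBase64(sourceText);

// Simulate a payload that has been wrapped across lines.
const wrappedPayload = encoded.slice(0, 8) + "\n" + encoded.slice(8);

const roundTripped = decodeBase64Utf8(wrappedPayload);
console.log(roundTripped === sourceText);
```

Whether to use fatal mode is a judgment call: it surfaces corrupt input early, but it also means callers need to handle the thrown error.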
Fixes the emoji rendering by correctly decoding GitHub's response.
VERBOSE AI STUFF FOLLOWS
Summary
Fixed emoji rendering issues by addressing two separate problems:
1. CSS Font Stack
Added emoji-capable system fonts to the font-family stack for code blocks. This ensures emojis use native emoji fonts instead of monospace fonts.
2. UTF-8 Decoding (Root Cause Fix)
The real issue was with atob() not handling UTF-8 properly when decoding base64 content from the GitHub API.

Why the deprecated escape() pattern worked:
- atob() decodes base64 but incorrectly treats each byte as a Latin-1 character
- escape() converts the corrupted string to percent-encoding (e.g., %F0%9F%91%87)
- decodeURIComponent() then correctly interprets those percent-encoded bytes as UTF-8

Modern solution using TextDecoder:
- atob() decodes base64 to a binary string (each char represents a byte)
- Convert to Uint8Array to get actual byte values
- TextDecoder with 'utf-8' properly interprets multi-byte sequences

Both approaches give identical results, but TextDecoder is the standard, non-deprecated way to handle UTF-8 decoding.

Testing
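A rough sketch of the kind of equivalence check described here, not the actual test file from this PR. The function names and sample strings are assumptions chosen for illustration. It also shows one caveat: the two patterns agree on valid UTF-8, but on an invalid byte sequence the legacy chain throws while a default TextDecoder substitutes U+FFFD.

```typescript
// Deprecated pattern, kept only for comparison.
function decodeLegacy(base64: string): string {
  return decodeURIComponent(escape(atob(base64)));
}

// TextDecoder-based pattern.
function decodeModern(base64: string): string {
  const binaryString = atob(base64);
  const bytes = Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
  return new TextDecoder("utf-8").decode(bytes);
}

// Test helper (hypothetical): encode text as UTF-8 bytes, then base64.
function toBase64(text: string): string {
  const bytes = new TextEncoder().encode(text);
  return btoa(Array.from(bytes, (b) => String.fromCharCode(b)).join(""));
}

// Illustrative samples covering 1-, 2-, 3- and 4-byte UTF-8 sequences.
const samples = ["plain ascii + 100%", "café", "日本語", "👇 emoji 🦋"];

const comparisons = samples.map((sample) => {
  const payload = toBase64(sample);
  return { sample, legacy: decodeLegacy(payload), modern: decodeModern(payload) };
});

// Caveat: a lone 0xFF byte is never valid UTF-8.
const invalidPayload = btoa("\xff");
let legacyThrew = false;
try {
  decodeLegacy(invalidPayload);
} catch {
  legacyThrew = true; // decodeURIComponent rejects the malformed sequence
}
const modernOnInvalid = decodeModern(invalidPayload);

console.log(comparisons, { legacyThrew, modernOnInvalid });
```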