fix: display chinese character avatar #51855

Open: Phreeman33 wants to merge 5 commits into base: master

Conversation

@Phreeman33 Phreeman33 commented Apr 2, 2025

Summary

This PR builds upon #42534 and fixes an issue where Chinese character avatars were not displayed properly. It also applies fonts tailored to the regional variations in Chinese writing conventions. The code was modified by a noob and has only undergone basic testing, so further testing and review may be necessary.

Screenshots

\u89d2 in SC
角(SC)
\u89d2 in TC/HK/JP/KR
角(TC HK JP KR)
\u8005 in KR
者(KR)
\u8005 in SC/TC/HK/JP
者(SC TC HK JP)
\u8aaa in JP/KR
說(JP KR)
\u8aaa in SC/TC/HK
說(SC TC HK)

TODO

  • ...

Checklist

@Phreeman33 Phreeman33 requested a review from a team as a code owner April 2, 2025 05:50
@Phreeman33 Phreeman33 requested review from ArtificialOwl, skjnldsv and yemkareems and removed request for a team April 2, 2025 05:50
Comment on lines 40 to 50
	public function __construct(
		LoggerInterface $logger,
		private User $user,
		private IConfig $config,
	) {
		$this->logger = $logger;
		$this->user = $user;
		$this->config = $config;
	}

	/**
	 * Returns the user display name.
	 */
	abstract public function getDisplayName(): string;

	/**
	 * Returns the first letter of the display name, or "?" if no name given.
Contributor

These changes seem to be unrelated. Can you revert them, or is there a good reason to change the way the display name is fetched?

Author

These changes are indeed not necessary. The main reason is that when I examined the getDisplayName() function, I found that it simply wraps a single line, return $this->user->getDisplayName();, and I question how much benefit this approach actually provides. However, after further reflection, these modifications seem rather hasty, and I'm not entirely sure whether this function is used elsewhere.

Member

Considering the GuestAvatar constructor is different, I'd like to keep the DI out of the abstract class :)

@susnux susnux added bug 3. to review Waiting for reviews feature: language l10n and translations feature: profile PRs or issues related to the Profile feature (e.g. Profile page, API, etc.) labels Apr 2, 2025
@susnux susnux added this to the Nextcloud 32 milestone Apr 2, 2025
@@ -95,6 +97,24 @@ protected function getAvatarVector(int $size, bool $darkTheme): string {
		return str_replace($toReplace, [$size, $fill, $fgFill, $text], $this->svgTemplate);
	}

	protected function getFont(string $userDisplayName): string {
		if (preg_match('/\p{Han}/u', $userDisplayName) === 1) {
			switch ($this->config->getUserValue($this->user->getUID(), 'core', 'lang', '')) {
Member

That's a good starting point, but that still means that if I set my lang to English this will still generate a wrong avatar 🤔

Contributor

IIRC there is no other way to detect this. With the regex we only know whether there is a CJK character, but not which language it belongs to: the variants all map to the same Unicode code point and differ only in rendering, which is defined by the language.
Maybe we should fall back to one of those fonts if the text is CJK.
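
To illustrate the point above, here is a minimal sketch, assuming a hypothetical helper `containsHan()` that is not part of the PR. It shows what the `\p{Han}` check can tell us (a Han character is present) and what it cannot (which language it is meant for), since the string only stores the code point.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the PR: true if $text contains at least
 * one character from the Unicode Han script.
 */
function containsHan(string $text): bool {
	// The /u modifier makes PCRE treat pattern and subject as UTF-8.
	return preg_match('/\p{Han}/u', $text) === 1;
}

// U+89D2 is a single code point, whether it is read as Chinese, Japanese or Korean.
$char = "\u{89D2}";

var_dump(containsHan($char));                 // bool(true): a Han character is present
var_dump(containsHan('Alice'));               // bool(false): no Han character
var_dump(mb_ord($char, 'UTF-8') === 0x89D2);  // bool(true): only the code point is stored
```

Note that `\p{Han}` matches Han ideographs only; kana and Hangul belong to separate Unicode scripts, so a name written purely in those would not be caught by this check.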

Member

can't we merge the fonts? 🤔

Author

The first fallback option is to determine the font based on location, and the second option is to reference information from the browser or operating system. Currently, I have only implemented the most basic solution.

No matter how many fallback options there may be, if all language reference targets are non-CJK languages but the user still wants to use Chinese character avatars, then the NotoSansSC font rendering is the only option available. As a manual workaround, the user can simply change the language setting to the desired CJK language for rendering, generate the avatar, and then switch back.

Contributor

@skjnldsv see https://en.wikipedia.org/wiki/Han_unification

The problem is that they share the same code point for different "glyphs". Also, even if merging were possible, it would result in a font file of around 100 MB, which would lead to memory issues with our PHP config of 512 MB (and other performance issues).

Also @Phreeman33, I think this is a good solution, but as I noted below, please test only the $text that is rendered for CJK characters. If it contains CJK characters and the language is not set to any CJK language, just fall back to NotoSansSC. The locale should not be used: it only controls the formatting of numbers, money, dates and times, and is unrelated to the language of the text or the writing.
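
The selection rule described above could look roughly like the following sketch. It is not the PR's actual implementation: the function name `pickAvatarFont()`, the language codes, and every font name other than NotoSansSC are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the font selection discussed above.
 * Language codes and all font names except NotoSansSC are assumptions.
 *
 * @param string $text     the text actually rendered in the avatar
 * @param string $userLang the user's configured language (not the locale)
 */
function pickAvatarFont(string $text, string $userLang): string {
	// Text without Han characters keeps the default font (name assumed).
	if (preg_match('/\p{Han}/u', $text) !== 1) {
		return 'NotoSans-Regular';
	}

	// Map CJK languages to a regional font; any other language falls back to SC.
	return match ($userLang) {
		'zh_TW' => 'NotoSansTC-Regular',
		'zh_HK' => 'NotoSansHK-Regular',
		'ja' => 'NotoSansJP-Regular',
		'ko' => 'NotoSansKR-Regular',
		default => 'NotoSansSC-Regular',
	};
}
```

With this shape, a user whose language is English but whose avatar letter is a Han character still gets a font that contains the glyph, at the cost of always seeing the Simplified Chinese design.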

Member

We could also use https://github.com/PhenX/php-font-lib to check whether the font contains all the glyphs of a string 🤔
Ideally I would also check only the first two letters from the getAvatarText method. That should avoid checking too many characters.
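
As a rough sketch of checking only the rendered text: the hypothetical `firstLetter()` below stands in for the avatar text (it follows the "first letter, or ? if no name given" docblock quoted earlier, not the real method body), and the script check is then applied to that single character rather than to the whole display name.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for the avatar text: first character, or "?" if empty. */
function firstLetter(string $displayName): string {
	$name = trim($displayName);
	if ($name === '') {
		return '?';
	}
	// mb_substr() counts characters, not bytes, so a multi-byte UTF-8
	// character is returned whole instead of being cut after one byte.
	return mb_strtoupper(mb_substr($name, 0, 1, 'UTF-8'), 'UTF-8');
}

$text = firstLetter("\u{89D2}\u{8272}");
// Only the single rendered character is tested, not the full name.
$needsCjkFont = preg_match('/\p{Han}/u', $text) === 1;
```

Using the byte-based `substr()` here instead would return only the first byte of a three-byte character, which is not valid UTF-8 on its own.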

Contributor

> Ideally I would also just check the first two letters from the getAvatarText method. That should avoid checking too many characters.

Already done below, it gets only passed the text that is rendered (the letter).

> We could also use https://github.com/PhenX/php-font-lib to check if the font contains all the glyph of a string 🤔

As said, this does not work: the characters are the same and differ only in design depending on the language (all of these are based on the same code point):
https://en.wikipedia.org/wiki/Han_unification#/media/File:Source_Han_Sans_Version_Difference.svg

Member

Ok, I guess I'm lacking the knowledge for this :)

Contributor

> Ok, I guess I'm lacking the knowledge for this :)

#25529 for a deep dive into the topic ;)

> can't we merge the fonts?

Afaik there's a limit on the number of glyphs in a font file

Contributor

@susnux susnux left a comment

@Phreeman33 could you please revert all other changes to keep the abstract getDisplayName method?
The only changes needed in this PR are:

  • getFont
  • generateAvatar
  • the regular fonts

@Phreeman33
Author

@susnux
Done! Please review the code.

Contributor

Hello there,
Thank you so much for taking the time and effort to create a pull request to our Nextcloud project.

We hope that the review process is going smoothly and is helpful for you. We want to ensure your pull request is reviewed to your satisfaction. If you have a moment, our community management team would very much appreciate your feedback on your experience with this PR review process.

Your feedback is valuable to us as we continuously strive to improve our community developer experience. Please take a moment to complete our short survey by clicking on the following link: https://cloud.nextcloud.com/apps/forms/s/i9Ago4EQRZ7TWxjfmeEpPkf6

Thank you for contributing to Nextcloud and we hope to hear from you soon!

(If you believe you should not receive this message, you can add yourself to the blocklist.)

Labels
3. to review Waiting for reviews bug feature: language l10n and translations feature: profile PRs or issues related to the Profile feature (e.g. Profile page, API, etc.) feedback-requested
Development

Successfully merging this pull request may close these issues.

First Character of Chinese or other multi-bytes characters display name
4 participants