I18N: Support more valid locales: language-script and language-script-region #724

Open
hanguokai opened this issue Nov 15, 2024 · 9 comments
Labels
i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. implemented: safari Implemented in Safari needs-triage: chrome Chrome needs to assess this issue for the first time neutral: firefox Not opposed or supportive from Firefox topic: localization

Comments

@hanguokai
Member

hanguokai commented Nov 15, 2024

Locale Format

A locale identifier consists of a language subtag, optionally followed by a script subtag, a region subtag, and zero or more variant subtags.

For example,

  • zh-CN and zh-TW are language + region. They mean Chinese in different regions.
  • zh-Hans and zh-Hant are language + script. They mean Simplified Chinese and Traditional Chinese without regions.
  • zh-Hans-CN, zh-Hans-SG, zh-Hant-HK and zh-Hant-TW are language + script + region. They mean Simplified Chinese and Traditional Chinese in different regions.

For historical reasons, zh-CN and zh-TW are often used to represent Simplified Chinese and Traditional Chinese, but this is not rigorous or accurate.
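
To make the subtag structure concrete, here is a minimal sketch using the standard `Intl.Locale` API. The helper name `describeLocale` is made up for illustration. `maximize()` adds likely subtags from the engine's locale data, which shows the script that a legacy tag such as zh-CN conventionally implies:

```javascript
// Hypothetical helper: split a BCP 47 tag into its main subtags.
function describeLocale(tag) {
  const locale = new Intl.Locale(tag);
  return {
    language: locale.language,
    script: locale.script, // undefined when the tag has no script subtag
    region: locale.region, // undefined when the tag has no region subtag
    // Likely subtags filled in from the engine's locale data (CLDR).
    maximized: locale.maximize().toString(),
  };
}

console.log(describeLocale("zh-CN"));
console.log(describeLocale("zh-Hant"));
console.log(describeLocale("zh-Hant-HK"));
```

Note that the script for zh-CN is only inferred by `maximize()`; the tag itself carries no script information, which is why treating it as "Simplified Chinese" is a convention rather than something the tag states.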

The current level of support

Goal

All browsers support all valid locales, including:

  • language + region
  • language + script
  • language + script + region

Note: The languages supported by the browser and the languages supported by the extension store should be two different things, but there is some correlation.

Other discussions

@github-actions github-actions bot added needs-triage: chrome Chrome needs to assess this issue for the first time needs-triage: firefox Firefox needs to assess this issue for the first time needs-triage: safari Safari needs to assess this issue for the first time labels Nov 15, 2024
@hanguokai hanguokai added implemented: safari Implemented in Safari topic: localization i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. and removed needs-triage: safari Safari needs to assess this issue for the first time labels Nov 15, 2024
@carlosjeurissen
Contributor

Also relevant: #131, which compiles a list of which locale tags are currently supported by which browsers and stores:
https://docs.google.com/spreadsheets/d/1N0f9Nl-PM_nafpJEfYbS8xV9Qvb6o6DvuBFWLTXRr2Y/edit#gid=1518318975

We should also align on properly formatted locale tags:
#642

Extension stores should also support this. Currently, the Naver Whale extension store does not support loading extensions whose locale tags include script subtags (like sr-Latn or zh-Hans).

@carlosjeurissen
Contributor

@hanguokai In terms of support, what do you expect from browsers? Say the browser uses "zh-CN" as its locale and the extension only has "zh-Hans" and "zh-Hant": should it actively search for a best match and pick "zh-Hans"? Or should it just be able to accept extensions whose locale files use language tags containing a script subtag?
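
For illustration only, the "best match" option could look like the sketch below. It is not any browser's actual algorithm; `pickByScript` is a hypothetical helper that relies on the standard `Intl.Locale.prototype.maximize()` to infer the likely script and then compares language and script:

```javascript
// Hypothetical matcher: pick the extension locale whose language and
// (likely) script agree with the browser locale.
function pickByScript(browserTag, availableTags) {
  const canonical = (tag) => Intl.getCanonicalLocales(tag)[0];

  // Prefer an exact tag match when one exists.
  const exact = availableTags.find((tag) => canonical(tag) === canonical(browserTag));
  if (exact !== undefined) return exact;

  // Otherwise compare language + script after adding likely subtags.
  const wanted = new Intl.Locale(browserTag).maximize();
  return availableTags.find((tag) => {
    const candidate = new Intl.Locale(tag).maximize();
    return candidate.language === wanted.language && candidate.script === wanted.script;
  });
}

console.log(pickByScript("zh-CN", ["zh-Hans", "zh-Hant"]));
console.log(pickByScript("zh-HK", ["zh-Hans", "zh-Hant"]));
```

The function returns `undefined` when nothing matches, leaving the choice of a default locale to the caller.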

@Rob--W Rob--W added neutral: firefox Not opposed or supportive from Firefox and removed needs-triage: firefox Firefox needs to assess this issue for the first time labels Nov 21, 2024
@hanguokai
Member Author

@carlosjeurissen and others: in the last meeting, you seemed to be discussing how to match and fall back. These are implementation details, and I am not very concerned about them yet (currently, no one has sponsored an implementation). I think the people who designed the locale/I18N systems must have considered these issues, so we shouldn't reinvent anything here.

I found a bit of relevant information (I18N experts are welcome to provide more useful information and reference implementations). BCP 47:

My high-level thoughts are:

  1. Browsers support message files in these locale formats. For specific parts that are not supported (browser implementations may vary), the browser can ignore them, or at least not throw errors or refuse to publish the extension to the store.
  2. Make the best possible matches and fallbacks.
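
As one existing reference point, BCP 47 includes RFC 4647, whose "lookup" scheme falls back by progressively dropping trailing subtags. The sketch below is a simplified version of that idea; the function name and signature are illustrative, and it does not claim to be what any browser implements:

```javascript
// Simplified RFC 4647 "lookup": drop trailing subtags until a candidate
// is available, otherwise return the default.
function lookup(requestedTag, availableTags, defaultTag) {
  const parts = requestedTag.toLowerCase().split("-");
  while (parts.length > 0) {
    const candidate = parts.join("-");
    // Compare case-insensitively, but return the tag in its original casing.
    const found = availableTags.find((tag) => tag.toLowerCase() === candidate);
    if (found !== undefined) return found;
    parts.pop();
    // RFC 4647 also removes a single-character subtag left at the end.
    if (parts.length > 0 && parts[parts.length - 1].length === 1) parts.pop();
  }
  return defaultTag;
}

console.log(lookup("zh-Hant-HK", ["en", "zh-Hans", "zh-Hant"], "en"));
console.log(lookup("zh-CN", ["en", "zh-Hans", "zh-Hant"], "en"));
```

Truncation alone maps zh-Hant-HK to zh-Hant, but it cannot map zh-CN to zh-Hans; that step needs script knowledge such as CLDR likely subtags.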

For developers, when browser support is incomplete or inconsistent, a workaround is to ship multiple copies of the same content. For example, developers can provide zh-CN, zh-TW, and zh-HK as well as zh-Hans and zh-Hant in the extension. Note that Chinese is just a typical example here; many other languages have script or regional differences.
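
A build step could automate that duplication. The sketch below only computes which copies are needed; the alias table (including treating zh_HK as Traditional Chinese) and the function name are assumptions for illustration, and the actual file copying is left to the build tool:

```javascript
// Hypothetical alias table: source directory under _locales/ mapped to
// the extra directories that should receive a copy of its messages.json.
const ALIASES = {
  zh_Hans: ["zh_CN"],
  zh_Hant: ["zh_TW", "zh_HK"],
};

// Return the copy operations needed, skipping targets that already exist
// so hand-written translations are never overwritten.
function planLocaleCopies(existingDirs, aliases) {
  const existing = new Set(existingDirs);
  const copies = [];
  for (const [source, targets] of Object.entries(aliases)) {
    if (!existing.has(source)) continue;
    for (const target of targets) {
      if (!existing.has(target)) copies.push({ from: source, to: target });
    }
  }
  return copies;
}

console.log(planLocaleCopies(["en", "zh_Hans", "zh_Hant", "zh_TW"], ALIASES));
```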

@carlosjeurissen
Contributor

@hanguokai If matching and fallbacks are not a concern, what do you expect browsers to change here? What are the actionable items for this issue?

@hanguokai
Member Author

@carlosjeurissen Yes, how best to fall back is another issue. Currently, the browser just doesn't support these locales. For example, if you set your OS/browser to zh_Hant_HK or zh_Hans_HK and put an exactly matching messages file in the extension's _locales/ directory, i18n.getMessage() doesn't return messages from that exactly matching file. So my goal is to support more common countries/regions, languages, and their combinations, at least when they match 100%.
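
To check this in a given browser, a sketch like the following may help. The first two helpers convert between BCP 47 tags and the underscore form used for `_locales/` directory names. The third takes the extension `i18n` namespace as a parameter and reports the UI language next to the predefined `@@ui_locale` message; all three function names are made up for illustration:

```javascript
// Convert a BCP 47 tag to the _locales directory form (underscores).
function toLocaleDirName(tag) {
  return Intl.getCanonicalLocales(tag)[0].replace(/-/g, "_");
}

// Convert a directory name (or OS-style identifier) back to a BCP 47 tag.
function fromLocaleDirName(dirName) {
  return Intl.getCanonicalLocales(dirName.replace(/_/g, "-"))[0];
}

// Report what the browser resolved. `i18n` is expected to be chrome.i18n
// or browser.i18n; it is passed in so the function can be tried with a stub.
function reportResolvedLocale(i18n) {
  return {
    uiLanguage: i18n.getUILanguage(),
    messagesLocale: i18n.getMessage("@@ui_locale"),
  };
}

console.log(toLocaleDirName("zh-hant-hk"));
console.log(fromLocaleDirName("zh_Hans_HK"));
```

Inside an extension, calling `reportResolvedLocale(chrome.i18n)` with the OS set to zh_Hant_HK and a `_locales/zh_Hant_HK/messages.json` present would show whether the exact match is picked up.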

BTW: I read the file structure and data content of CLDR and they have a hierarchy of structure and override handling, similar to extensions.

@carlosjeurissen
Contributor

carlosjeurissen commented Mar 29, 2025

@hanguokai Right, I believe this is by design. The browser has translations for only a limited set of languages, and it tries to find the closest match based on your OS/browser preference. This is unrelated to some browsers allowing you to customise the Accept-Language header.

For extensions, browsers use the language the browser UI resolved to, so as to avoid mixing languages. Correct me if I am wrong, but I believe this is how it works in all browsers currently. For missing messages, a specific fallback algorithm is used; see #296.

This is one of the motivations of introducing i18n.getPreferredSystemLanguages() and i18n.getSystemUILanguage() as specified in https://github.com/w3c/webextensions/blob/main/proposals/i18n-system-languages.md which could be used for your i18n.setLanguage() proposal (#641) and some i18n.getMessage() with a specified language (#274)

@hanguokai
Member Author

@carlosjeurissen I never talked about the HTTP Accept-Language header or the current browser implementation. There is not much to discuss about this issue itself; it is just a feature to be supported (implemented) in the future.

@carlosjeurissen
Contributor

carlosjeurissen commented Mar 30, 2025

@hanguokai Currently, no browser throws warnings or errors when an extension is loaded with locale tags that include script subtags (like zh_Hant_CN).

The only extension store rejecting such tags is the Whale Extension Store. See this post I opened in 2021 on their forum: https://forum.whale.naver.com/topic/39749/. The Whale Extension Store only accepts primary subtags (en, zh) or primary subtags combined with a region subtag (zh_CN, es_419). It simply does not accept extension packages that include tags like ca_valencia and zh_Hant.
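
A pre-publish check modelled on that description could look like the sketch below. The pattern is an assumption derived only from the examples above (a primary subtag, optionally followed by a two-letter or three-digit region); it is not the store's actual validation logic:

```javascript
// Assumed model of the described store behaviour: a 2-3 letter language,
// optionally followed by "_" and a 2-letter or 3-digit region.
const STORE_ACCEPTED = /^[a-z]{2,3}(_([A-Z]{2}|[0-9]{3}))?$/;

// Hypothetical helper: would this _locales directory name pass the model?
function likelyAcceptedByStore(dirName) {
  return STORE_ACCEPTED.test(dirName);
}

for (const name of ["en", "zh_CN", "es_419", "ca_valencia", "zh_Hant"]) {
  console.log(name, likelyAcceptedByStore(name));
}
```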

Just trying to understand you here. Do you wish to change the browsers' behaviour to directly use the OS language for extension messages, independently of what language the browser UI is displayed in? (Instead of the current behaviour, in which the browser UI language is used for extension messages.)

@hanguokai
Member Author

Do you wish to change the browsers behaviour to directly use the OS language for extension messages independently of what language the browser UI is displayed in? (Instead of the current behaviour in which the browser UI language is used for extension messages)

Good question! But my answer is not a simple yes or no, it depends on the situation.

When the browser follows the OS locale and the browser supports the OS's language

In this situation (this is the case for most users), I believe both the browser and extensions should follow OS locale, because the OS provides the most comprehensive locale information (e.g. the OS allows you to set zh-Hant-HK, but the browser only supports zh-Hant).


When the user manually specifies the browser language, or the browser does not support the OS's language

In this situation, the user has explicitly specified a language different from the OS's, in which case the extension should follow the browser's language, not the OS's.
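
The two situations can be summarised as a small decision function. This is only a sketch of the rule described here: the function name and its inputs are hypothetical, and "the browser supports the OS's language" is approximated by comparing language and likely script via `Intl.Locale.prototype.maximize()`:

```javascript
// Hypothetical decision rule for which locale an extension should follow.
function localeForExtension({ osLocale, browserLocale, userSetBrowserLanguage }) {
  // An explicit user choice in the browser always wins.
  if (userSetBrowserLanguage) return browserLocale;

  const os = new Intl.Locale(osLocale).maximize();
  const browser = new Intl.Locale(browserLocale).maximize();
  // Approximation: the browser "supports" the OS language when language
  // and script agree after adding likely subtags.
  const browserCoversOs = os.language === browser.language && os.script === browser.script;

  // Follow the more detailed OS locale only when the browser agrees with it.
  return browserCoversOs ? osLocale : browserLocale;
}

console.log(localeForExtension({
  osLocale: "zh-Hant-HK",
  browserLocale: "zh-Hant",
  userSetBrowserLanguage: false,
}));
```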


How to specify the browser language

Chrome on desktop:

  • Command-line options: chrome --lang=locale
  • On Windows: chrome://settings/ allows users to change the browser UI language
  • On Mac and Linux: there is no browser settings UI for this; the browser follows the OS.

Safari is similar to Chrome on Mac: there is no browser settings UI for this, and the browser follows the OS.

Firefox allows users to change the browser UI language in its settings, like Chrome on Windows.

The difference between locale and language

Strictly speaking, locale not only contains language, it also contains information such as region, script, calendar, currency, collation, etc. Operating systems usually allow users to set all of this information, but no browser provides a UI for users to set all of this information. At most, browsers can only set the language plus a few regions.

So, in the context of browsers and extensions, locale usually refers to the language rather than the complete locale information.

About this issue

  1. This issue itself does not require any new API.
  2. Regarding the extension store, it is certainly better if the extension store supports more locales. But even if the extension store does not support a specific locale, this should not affect the support of the extension.

3 participants