Skip to content

Conversation

sffc
Copy link
Member

@sffc sffc commented Aug 27, 2025

Checklist

  • Required: Issue filed: ICU-23177
  • Required: The PR title must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
  • Required: Each commit message must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

@sffc sffc requested a review from Copilot August 27, 2025 03:43
@sffc sffc marked this pull request as ready for review August 27, 2025 03:43
@sffc
Copy link
Member Author

sffc commented Aug 27, 2025

API proposal has been sent. This is the C++ version. I will try to add the Java version tomorrow.

Copilot

This comment was marked as outdated.

Comment on lines +796 to +804
UnicodeString* fWideAmPms;
int32_t fWideAmPmsCount;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does increase the stack size of DateFormatSymbols, but it is already so big that the delta is only about 2-3%.

@sffc sffc requested a review from Copilot August 27, 2025 23:20
Copilot

This comment was marked as outdated.

@pedberg-icu
Copy link
Contributor

API proposal has been sent. This is the C++ version. I will try to add the Java version tomorrow.

@sffc But this is not yet in the ICU 78 API Proposals document

@@ -2396,23 +2450,23 @@ DateFormatSymbols::initializeData(const Locale& locale, const char *type, UError
assignArray(fStandaloneNarrowMonths, fStandaloneNarrowMonthsCount, fShortMonths, fShortMonthsCount);
}

// Load AM/PM markers; if wide or narrow not available, use short
// Load AM/PM markers.
// Narrow falls back to Abbr which falls back to Wide.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not the way the dayPeriod fallback works in CLDR. There, within a given style (format or stand-alone), wide and narrow both fall back to abbreviated; and stand-alone abbreviated falls back to format abbreviated. The root locale only has the format abbreviated forms.

@sffc

This comment was marked as resolved.

@sffc sffc requested review from pedberg-icu and Copilot August 29, 2025 01:01
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds width-aware day period support for AM/PM strings in SimpleDateFormat and DateFormatSymbols, allowing for abbreviated (default), wide, and narrow formatting styles. It also fixes data loading by properly handling separate AmPmMarkersAbbr and AmPmMarkers resources.

  • Implements width-aware AM/PM string formatting in SimpleDateFormat using pattern length (a, aaaa, aaaaa)
  • Adds new APIs to DateFormatSymbols for getting/setting AM/PM strings with width specification
  • Refactors resource loading to distinguish between abbreviated and wide AM/PM markers

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

Show a summary per file
File Description
SimpleDateFormat.java Updates AM/PM formatting logic to use width-specific strings based on pattern length
DateFormatSymbols.java Adds ampmsWide field and width-aware getter/setter methods for AM/PM strings
dtfmtsym.h Adds C++ API declarations for width-aware AM/PM string methods
dtfmtsym.cpp Implements C++ width-aware AM/PM string handling and resource loading fixes
smpdtfmt.cpp Updates C++ SimpleDateFormat to use width-specific AM/PM formatting and parsing
Test files Adds comprehensive tests for new AM/PM width functionality and updates existing test expectations

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@pedberg-icu
Copy link
Contributor

I am traveling but should have a chance to take a closer look at this sometime in the next 3 days or so.

@pedberg-icu
Copy link
Contributor

@sffc This all looks good to me, with my one comment being that we should change the loading order & fallback for both C and J so that we load abbreviated first, and then have both wide and narrow fall back to abbreviated if they are missing. That way C and J would be consistent with each other and with the CLDR fallback order (in which wide/narrow fallback to abbreviated, and root only provides abbreviated). I will go ahead and approve in case you need to get this in and make those changes later, but I would like those changes at some point.

pedberg-icu
pedberg-icu previously approved these changes Sep 6, 2025
Copy link
Contributor

@pedberg-icu pedberg-icu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me, with my one comment being that we should change the loading order & fallback for both C and J so that we load abbreviated first, and then have both wide and narrow fall back to abbreviated if they are missing. That way C and J would be consistent with each other and with the CLDR fallback order (in which wide/narrow fallback to abbreviated, and root only provides abbreviated). I will go ahead and approve in case you need to get this in and make those changes later, but I would like those changes at some point.

@macchiati
Copy link
Member

Hmmm. I wonder if there is a way with conformance tests to verify the order and targets of aliases.
Just a thought: Perhaps we can have one or more artificial locales in CLDR with constant data having "holes" in it to force fallback. It wouldn't be accessible in production releases, but could be used in conformance tests to verify that the order and targets of inheritance matches order and targets of CLDR aliases. @sffc @echeran

@sffc
Copy link
Member Author

sffc commented Sep 9, 2025

I don't believe that the manual fallback code as written in C++ actually does anything. The data loading sink should see the alias in root and do the fallback itself. I consider the C++ code to be merely a robust coding style, consistent with the rest of the symbols loading. In Java, most of the symbols also do not have explicit fallbacks written in code.

@sffc
Copy link
Member Author

sffc commented Sep 9, 2025

Just a thought: Perhaps we can have one or more artificial locales in CLDR with constant data having "holes" in it to force fallback. It wouldn't be accessible in production releases, but could be used in conformance tests to verify that the order and targets of inheritance matches order and targets of CLDR aliases. @sffc @echeran

Having a test locale in CLDR to exercise these sorts of things would be valuable. We could leverage en-XA perhaps.

@@ -436,20 +436,27 @@ class U_I18N_API_CLASS DateFormatSymbols final : public UObject {
/**
* Gets AM/PM strings with the specified width. For example: "A" and "P".
* @param count Filled in with length of the array.
* @param context The usage context: FORMAT, STANDALONE. Currently ignored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should note that the values actually returned are the format styles.


/**
* Sets AM/PM strings with the specified width. For example: "A" and "P".
* @param ampms The new AM/PM strings. (not adopted; caller retains ownership)
* @param count The number of strings in the array.
* @param context The usage context: FORMAT, STANDALONE. Currently ignored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the values returned are the format styles.

@sffc sffc requested a review from pedberg-icu September 9, 2025 23:09
Copy link
Contributor

@pedberg-icu pedberg-icu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update!

@jira-pull-request-webhook
Copy link

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

@sffc sffc merged commit 71b9755 into unicode-org:main Sep 10, 2025
103 checks passed
@sffc sffc deleted the ICU-23177-ampm branch September 10, 2025 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants