
@mkavulich
Collaborator

This is the second round of proposed rules changes based on our ongoing discussion to make both rules and names as consistent as possible. Major updates include:

Standard Names

  • New "base_names" section, with subsections
  • Convert instances of "surface_X" to "X_at_surface"
  • Convert some instances of "air_X" or "X_of_air"
  • Explicitly state what mass mixing ratios are with respect to in long_name
  • Convert "mixing_ratio" to "water_vapor_mixing_ratio_wrt_moist_air"
  • Convert "surface_albedo" to simply "albedo", "surface_roughness_length" to simply "roughness_length"

Rules

  • Update construction template to include [non-instant time] and [non-current time], swap [at level] and [in medium]
  • More detailed description of transformations and how they can work on multiple base names
  • Add a few more transformations

write_standard_name_table.py

  • Updated to allow sub-sections in the standard names list
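
As a rough illustration of what handling sub-sections involves, the sketch below walks nested sections recursively with Python's standard `xml.etree.ElementTree`. It is not the code in `write_standard_name_table.py`; the `section` element name, the sample XML, and the `walk_sections` helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of a standard names file with one nested sub-section.
SAMPLE = """
<standard_names name="example">
  <section name="base_names">
    <section name="dynamics">
      <standard_name name="absolute_vorticity"/>
    </section>
    <standard_name name="albedo"/>
  </section>
</standard_names>
"""


def walk_sections(element, depth=0):
    """Yield (depth, section name, [standard names]) for each section.

    Only direct children are inspected at each level, so a name is
    reported under the section that immediately contains it.
    """
    for section in element.findall("section"):
        names = [sn.get("name") for sn in section.findall("standard_name")]
        yield depth, section.get("name"), names
        # Recurse so that sub-sections are reported one level deeper.
        yield from walk_sections(section, depth + 1)


root = ET.fromstring(SAMPLE)
sections = list(walk_sections(root))
for depth, title, names in sections:
    # Deeper sections get a longer heading marker, as a table writer might do.
    print("#" * (depth + 2), title, names)
```

The depth value is what a table writer could use to choose a heading level for each sub-section.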

You can see these changes in the form of a Google doc for better visualization here: https://docs.google.com/document/d/19ysUCWDhv53W8fQbElW7opr_1Pm7ck95QRUyKM_qy4E/edit?tab=t.0

Note: with this update, we have 347 standard names that are fully compliant with the rules we have set out. A big portion of the remainder are "flag"-type variables, indices, etc, which are not fully accounted for in the rules yet.

long_name="Molecular oxygen, O₂">
</standard_name>
<standard_name name="ozone"
long_name="Ozone, O₃">
Collaborator

Will these non-ASCII characters work? (also line 246)

Collaborator Author

long_name should be able to handle any unicode characters since it is not used in any FORTRAN code context. Even if long names were eventually used in comments or even character variables, that would be fine according to the FORTRAN 90 standard*, so long as it's a character set supported by the OS. I think anything in UTF-32 should be allowed for long names, since Python can handle those characters and it's essentially universally supported among OS's.

This does bring up the fact that we haven't defined a character set for the standard names; I'll add that to the to-do list. So long as we make anything that may possibly be used as a Fortran variable/subroutine/other code strictly within the allowed Fortran character set, we should be good.

* The relevant section on page 18 is here:

> Additional characters may be representable in the processor, but may appear only in comments (3.3.1.1, 3.3.2.1), character constants (4.4.4), input/output records (9.1.1), and character string edit descriptors (10.2.1).

Collaborator

Can we talk about this at the next standard names meeting, please? My immediate reaction is to stick with the basic ASCII characters and nothing else, but maybe I can be convinced otherwise ;-) At the very least, we need to run a test with CCPP (SCM, UFS, ...) to see whether the parsers (prebuild and capgen) can handle those characters.

Collaborator Author

From today's discussion, we will restrict standard names (including long names) to the ASCII character set.
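
A restriction like this could be checked automatically. The following is a minimal sketch, assuming the names live in `name` and `long_name` attributes of `standard_name` elements as in the snippets in this thread; the sample XML and the `non_ascii_attributes` helper are hypothetical and not part of the repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; \u2083 is the subscript three from the ozone long name.
SAMPLE = """
<standard_names>
  <standard_name name="ozone" long_name="Ozone, O\u2083"/>
  <standard_name name="albedo" long_name="Albedo"/>
</standard_names>
"""


def non_ascii_attributes(root, attributes=("name", "long_name")):
    """Return (standard name, attribute, value) for every non-ASCII value."""
    problems = []
    for elem in root.iter("standard_name"):
        for attr in attributes:
            value = elem.get(attr, "")
            # str.isascii() (Python 3.7+) is True only for pure-ASCII strings.
            if not value.isascii():
                problems.append((elem.get("name"), attr, value))
    return problems


problems = non_ascii_attributes(ET.fromstring(SAMPLE))
for name, attr, value in problems:
    print(f"{name}: {attr} contains non-ASCII characters: {value!r}")
```

Run against the full XML file in a CI job, a check of this kind would flag entries such as the O₂ and O₃ long names discussed above.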

<type kind="kind_phys" units="K m s-1">real</type>
</standard_name>
-<standard_name name="surface_upward_specific_humidity_flux_for_mellor_yamada_janjic_surface_layer_scheme">
+<standard_name name="upward_flux_of_water_vapor_mixing_ratio_wrt_moist_air_at_surface_for_mellor_yamada_janjic_surface_layer_scheme"
Collaborator

We should probably define MYJ as the abbreviation to use in the standard name, and the full name in the long name? Like PBL and GWD in my open PR?

Collaborator Author

I think this is a good idea, but I will add it to my next round of changes to avoid scope creep. There are probably more abbreviations to introduce beyond this, like MYNN.

<type kind="kind_phys" units="m">real</type>
</standard_name>
-<standard_name name="surface_upward_latent_heat_flux_from_coupled_process">
+<standard_name name="upward_latent_heat_flux_at_surface_from_coupled_process">
Collaborator

What is the difference between for_coupling and from_coupled_process (not for vs from, but coupling vs coupled process)?

Collaborator Author

This is a good point: In general I think the use of "coupling" and "coupled" in variable names is ambiguous and problematic. Probably needs a deeper dive after this PR.

Collaborator

ok

Collaborator Author

@mkavulich mkavulich Feb 20, 2025

We will use "for coupling" suffix, and include a detailed description of what that means (pending further potential discussion).

Also potentially need new suffixes depending on CESM use case, dependent on group names? Maybe drop "coupling" altogether in favor of "timing"/"order" suffixes

Collaborator

@gold2718 gold2718 left a comment

Great (and great big) piece of work!
A few requested changes and a lot of questions along with a few suggestions.
Also, should all long names begin with a capitalized word? Right now, there is a mix.

Collaborator Author

@mkavulich mkavulich left a comment

@gold2718 @climbfuji Thanks for your thorough reviews. I've addressed most of your comments/suggestions, and those that I haven't I replied with a follow-up comment or question. Let me know if anything else needs resolving.

Collaborator

@nusbaume nusbaume left a comment

Great work @mkavulich! I had a couple of change requests, but nothing that requires a full re-review.

comment="These names are used as bases for other names, but may\n
also be considered standard names on their own. See the\n
full list of standard names for further details.\n">
<standard_name name="absolute_vorticity"
Collaborator

Since these can be considered formal standard names should we include the units here as well? (I'm happy to hold this off for a future PR if that would be easier).

Collaborator

@gold2718 gold2718 left a comment

This looks better but not all the issues are resolved (e.g., vis to visible), at least as of cd5ba37.

@gold2718
Collaborator

Will #86 be merged in before final review?

@mkavulich
Collaborator Author

@gold2718 I did not plan to resolve conflicts between the main branch and release branch at this time, since the changes are already so divergent. But this is a good idea; I will make a follow-up PR right now to bring in the relevant changes since this branch was cut.

I thought I had responded to your latest comments but I guess I forgot; I made the changes you requested re: vis to visible, was there anything else that needed attention or are you okay with this being merged?

@svahl991
Collaborator

svahl991 commented Mar 3, 2025

This PR changes many of the names that we just adapted in JEDI last fall during a code sprint that took considerable effort. Before merging this, can we create a tag or version of this repository that JEDI can reference?

@gold2718
Collaborator

gold2718 commented Mar 4, 2025

> @gold2718 @climbfuji Thanks for your thorough reviews. I've addressed most of your comments/suggestions, and those that I haven't I replied with a follow-up comment or question. Let me know if anything else needs resolving.

Please just open issues for any items that still need changing (based on comments on this PR) so that they are not lost. I do not see one for the weight to scaling_factor change.

@gold2718
Collaborator

gold2718 commented Mar 4, 2025

> This PR changes many of the names that we just adapted in JEDI last fall during a code sprint that took considerable effort. Before merging this, can we create a tag or version of this repository that JEDI can reference?

No objections here. v0.1.2?

@mkavulich
Collaborator Author

> This PR changes many of the names that we just adapted in JEDI last fall during a code sprint that took considerable effort. Before merging this, can we create a tag or version of this repository that JEDI can reference?

@svahl991 Just a note, this PR is not being merged to main, but to a branch release/v1 as part of a multi-stage effort to make the names and rules consistent. We will not be applying these changes to the main branch until there has been substantial input from all interested parties.

We had discussed creating a tag back in the fall for JEDI's "stable" version but last time we had that discussion there were still some changes being debated. Is the current state of main the preferred version to make this tag? Or would it be a previous hash?

@mkavulich
Collaborator Author

@gold2718 I have opened some issues that I think cover all of your concerns; I have been tracking them in a Google doc, but making sure they are tracked here as well is a good idea.

@svahl991
Collaborator

svahl991 commented Mar 4, 2025

> Just a note, this PR is not being merged to main, but to a branch release/v1 as part of a multi-stage effort to make the names and rules consistent.

Thanks @mkavulich . That's an important detail I missed. So, to make sure I understand:

  1. The main branch is still the "official" ESM (formerly CCPP) name for the time being.
  2. The release/v1 branch is building up a major re-work of the rules and names, but is not yet "official".
  3. The plan is to build this re-work up in the release branch, get consensus, and merge it into main at some point in the future.

I apologize if I missed and/or forgot past communications about this.

Collaborator

@gold2718 gold2718 left a comment

All looks good now, thanks!
(and thanks to @mkavulich for getting all those issues open)

* `volumetric_soil_moisture_between_soil_bottom_and_water_table`: Volumetric soil moisture between soil bottom and water table
* `real(kind=kind_phys)`: units = m3 m-3
-* `water_vapor_mixing_ratio_wrt_moist_air_at_2m`: mixing ratio of the mass of water vapor to the mass of moist air, at two meters above surface
+* `water_vapor_mixing_ratio_wrt_moist_air_at_2m`: Specific humidity (water vapor mass mixing ratio with respect to moist air) at two meters above surface
Collaborator

We have found in our work on JEDI that the term "specific humidity" does not have an internationally agreed-upon meaning. I suggest not using it at all to avoid confusion. Or, if it must be used, to qualify it. (When the UKMO uses the term "specific humidity" they mean water_vapor_mixing_ratio_wrt_moist_air_and_condensed_water, which is two lines below this one.)

@mkavulich
Collaborator Author

mkavulich commented Mar 4, 2025

@svahl991 Your understanding is correct. This re-factor is part of a lead-up to standing up a more formal governance outside of the CCPP framework's purview; we don't yet have anything official but I have been putting out feelers to ensure that all interested groups are represented. As an active contributor from JCSDA, I'll certainly be sure you're in the loop when we start scheduling meetings.

With regards to specific humidity, you'll note that this PR removes specific_humidity from all standard names, exactly because of the problems you mentioned (conflicting/non-standardized definitions), see here for example. But because it is a common term that people are likely to search for, we decided it should stay in long_name entries where relevant, with a parenthetical qualifier of the exact definition as you see here. I could see where it might still be an issue though, so I will open an issue where we can discuss this in more detail.

@mkavulich mkavulich merged commit 5a7b80f into ESCOMP:release/v1 Mar 4, 2025
3 checks passed
mkavulich added a commit that referenced this pull request Dec 8, 2025
## Description
This PR merges the `release/v1` branch into the `main` branch.

The `release/v1` branch was split off from `main` about a year ago, with the intention of making major rules and name changes to improve the consistency and maintainability of both the rules and the names, without causing major inconvenience or disruption to those currently making use of the main branch. After a year of discussion and changes, it is time to bring these changes to their final resting place in the main branch.

The major breaking and/or non-backward-compatible changes can be summarized as follows (see the subsections below for specific details about these changes):

 - the physics `kind` field is removed
 - the `long_name` field is changed to `description`
 - Several changes to particular terms and components of existing names have been made


After this PR is merged, a `v1.0.0` tag will be created, representing the first true "versioned" version of the ESM Standard Names. While more rule changes are likely in the future to resolve open and future issues, this should be a more stable jumping-off point to allow updates and reconciliation with the names in the CCPP physics repository, which has not been resolved in many years now.

For those who have not been following along with the discussion and changes related to the **v1** branch, here is a summary of each of the changes made on this branch:

### #85 First rules update, fixing misspelled standard names
This first change introduced some changes to the Rules document based on discussion in the CCPP framework regular meetings. These rules changes can be summarized as follows:
 - Introduced a more rigorous and standardized formula for constructing new standard names, with specific rules and definitions of each component of the name, attempting to cover all possible cases
 - Introduced the concept of "suffixes" to complement prefixes, with `mixing_ratio_wrt_Y` being the first example
 - Introduced the concept of "reserved phrases", for now only including "CCPP" as a reference to CCPP-specific variables

In addition, a large number of misspellings within the existing names and rules were fixed.

### #87 Second rules update in v1 branch, update several name types
This second change introduced the concept of "base names", representing the main entity from which a standard name is constructed. Some existing prefixes (`surface_X` and `air_X`) were converted to suffixes (`X_at_surface` and `X_of_air`) to improve consistency with other existing names and rules, and some superfluous `surface` wording was removed from several names. The definitions for mixing ratios were improved, and the rules for constructing new names were updated and improved.
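
The prefix-to-suffix conversion can be sketched in a few lines of Python. This is only an illustration of the pattern, not the procedure used to produce the actual renames: the `surface_prefix_to_suffix` helper is hypothetical, and the assumption that `_for_` and `_from_` qualifiers stay at the end of the name is inferred from examples in PR #87 such as `upward_latent_heat_flux_at_surface_from_coupled_process`.

```python
import re

# Assumed (not exhaustive): qualifiers that remain at the very end of a name.
_TRAILING_QUALIFIER = re.compile(r"_(?:for|from)_")


def surface_prefix_to_suffix(name):
    """Rewrite `surface_X` as `X_at_surface`, keeping trailing qualifiers last."""
    prefix = "surface_"
    if not name.startswith(prefix):
        return name  # nothing to convert
    rest = name[len(prefix):]
    match = _TRAILING_QUALIFIER.search(rest)
    if match is None:
        return rest + "_at_surface"
    # Insert the suffix just before the first trailing qualifier.
    return rest[:match.start()] + "_at_surface" + rest[match.start():]


converted = surface_prefix_to_suffix(
    "surface_upward_latent_heat_flux_from_coupled_process"
)
print(converted)
```

A mechanical rewrite like this does not cover the cases where `surface` was dropped entirely (for example `surface_albedo` becoming `albedo`), so those still need to be handled by hand.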

### #104 Add technical specification, substitute abbreviations, include base name definitions
This third rules change included some info about technical specifications of the standard names repository, and some formatting improvements. Instances of the term `weight` were changed to `scaling_factor` to avoid potential confusion with the physical property of weight. `long_name` descriptions were added for all the new "base names" with a few minor exceptions. Some new abbreviations were defined to help shorten names. Some unused and duplicate entries were removed. Finally, CCPP-specific variables were consolidated into their own section.

### #116 Rename long_name --> description, update description rules, expand list of abbreviations
This fourth change renamed the `long_name` field to `description`, clarified that this field should be unique, and improved or fixed some existing descriptions. Some more duplicate entries were removed. Finally, additional abbreviations were defined to continue shortening standard names.

### #124 Remove `kind` entry, clarify rules for units and disallowed terms
This fifth and final change to the v1 branch (aside from another PR to resolve intervening changes from the main branch) updated the rules to clarify disallowed terms and the role of `units`, and removed the `kind` entry. The `constants` section was organized alphabetically, and some names involving `dry_air` were changed.
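
For downstream users who keep their own XML in the older layout, the two field changes (`long_name` renamed to `description`, `kind` removed) might be applied with something like the sketch below. It assumes the layout seen in the diff snippets, with `long_name` on `standard_name` and `kind` on the child `type` element; the sample entry and the `migrate` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry in the pre-v1 layout.
OLD = """
<standard_names>
  <standard_name name="albedo" long_name="Albedo">
    <type kind="kind_phys" units="1">real</type>
  </standard_name>
</standard_names>
"""


def migrate(root):
    """Rename long_name to description and drop kind, modifying root in place."""
    for entry in root.iter("standard_name"):
        if "long_name" in entry.attrib:
            entry.set("description", entry.attrib.pop("long_name"))
        for type_elem in entry.findall("type"):
            # pop with a default so entries without `kind` are left untouched
            type_elem.attrib.pop("kind", None)
    return root


root = migrate(ET.fromstring(OLD))
entry = root.find("standard_name")
print(ET.tostring(root, encoding="unicode"))
```

This only covers the field-level changes; renamed terms and components of the names themselves need a separate mapping.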

## Issues
Resolves
 - #48 
 - #68 
 - #94 
 - #102

Also references already-closed issues:
 - #92 
 - #93 
 - #95 
 - #103 

---------

Co-authored-by: Jesse Nusbaumer <[email protected]>
Co-authored-by: Dom Heinzeller <[email protected]>