source_id registration of E3SM-2-1 #1218

Open
chengzhuzhang opened this issue Jan 29, 2024 · 35 comments
Labels
Awaiting data publication awaiting data to be published on ESGF

Comments

@chengzhuzhang

chengzhuzhang commented Jan 29, 2024

label = E3SM 2.1
label_extended = E3SM 2.1 (Energy Exascale Earth System Model)
source_id = E3SM-2-1
institution_id = E3SM-Project
release_year = 2024
activity_participation = [CMIP]

aerosol:
description = MAM4 w/ new resuspension, marine organics, secondary organics, and dust (atmos physics grid)
nominal_resolution = 100 km

atmos:
description = EAM (v2.0, cubed sphere spectral-element grid; 5400 elem., 30x30 per cube face. Dynamics: degree 3 (p=3) polynomials within each spectral element, 112 km average resolution. Physics: 2x2 finite volume cells within each spectral element, 1.5 degree (168 km) average grid spacing; 72 vertical layers w/ top at 60 km).
nominal_resolution = 112 km

atmosChem:
description = Troposphere specified oxidants (except passive ozone with the lower boundary sink) for aerosols. Stratosphere linearized interactive ozone (LINOZ v2) (atmos physics grid)
nominal_resolution = 100 km

land:
description = ELM (v1.0, atmos physics grid, satellite phenology mode), MOSART (v1.0, 0.5 deg lat/lon grid)
nominal_resolution = 100 km

landIce:
description = none
nominal_resolution = none

ocean:
description = MPAS-Ocean (E3SMv2.0, EC30to60E2r2 unstructured SVTs mesh with 236853 cells and 719506 edges, variable resolution 60 km to 30 km; 60 levels; top grid cell 0-10 m)
nominal_resolution = none

ocnBgchem:
description = none
nominal_resolution = none

seaIce:
description = MPAS-Seaice (E3SMv2.0, MPAS-Ocean grid; 5 ice categories, 7 ice layers, 5 snow layers)
nominal_resolution = none

@durack1
Member

durack1 commented Jan 29, 2024

@chengzhuzhang I presume all model component *v2.0* entries should be updated to v2.1 following the https://github.com/E3SM-Project/E3SM/releases/tag/v2.1.0 notation?

@chengzhuzhang
Author

Thanks for catching it. I copied the information from v2.0.
Below is the E3SM v2.1 tag.
https://github.com/E3SM-Project/E3SM/releases/tag/v2.1.0
I'm tagging @rljacob to double-check whether each E3SM component has separate versioning, or whether we can use v2.1 for all components.

@rljacob

rljacob commented Jan 29, 2024

We don't separately version the components. So you can use 2.1 if you have to. Do we need to specify a version for components?

@durack1
Member

durack1 commented Jan 29, 2024

@rljacob for many model configs that contributed to CMIP6, they were using code bases that are managed separately. As E3SM seems to have migrated all component codebases into https://github.com/E3SM-Project/E3SM/ and has unified releases, this is less of an identity issue.

Having said that, it would be useful to provide any obvious component version identifier updates, so that if the information is lifted to tabulate E3SM v2.1 (à la AR6 Table AII.5), all the relevant information is included.

@chengzhuzhang
Author

chengzhuzhang commented Jan 30, 2024

Hi @durack1, thanks. In this case, I think we can just reference E3SMv2.1 for all components, or leave the version number out. I have updated the entries as follows to include v2.1.

label = E3SM 2.1
label_extended = E3SM 2.1 (Energy Exascale Earth System Model)
source_id = E3SM-2-1
institution_id = E3SM-Project
release_year = 2024
activity_participation = [CMIP]

aerosol:
description = MAM4 w/ new resuspension, marine organics, secondary organics, and dust (atmos physics grid)
nominal_resolution = 100 km

atmos:
description = EAM (E3SMv2.1, cubed sphere spectral-element grid; 5400 elem., 30x30 per cube face. Dynamics: degree 3 (p=3) polynomials within each spectral element, 112 km average resolution. Physics: 2x2 finite volume cells within each spectral element, 1.5 degree (168 km) average grid spacing; 72 vertical layers w/ top at 60 km).
nominal_resolution = 112 km

atmosChem:
description = Troposphere specified oxidants (except passive ozone with the lower boundary sink) for aerosols. Stratosphere linearized interactive ozone (LINOZ v2) (atmos physics grid)
nominal_resolution = 100 km

land:
description = ELM (E3SMv2.1, atmos physics grid, satellite phenology mode), MOSART (E3SMv2.1, 0.5 deg lat/lon grid)
nominal_resolution = 100 km

landIce:
description = none
nominal_resolution = none

ocean:
description = MPAS-Ocean (E3SMv2.1, EC30to60E2r2 unstructured SVTs mesh with 236853 cells and 719506 edges, variable resolution 60 km to 30 km; 60 levels; top grid cell 0-10 m)
nominal_resolution = none

ocnBgchem:
description = none
nominal_resolution = none

seaIce:
description = MPAS-Seaice (E3SMv2.1, MPAS-Ocean grid; 5 ice categories, 7 ice layers, 5 snow layers)
nominal_resolution = none

@durack1
Member

durack1 commented Jan 30, 2024

@chengzhuzhang ok, looks good. Pulling this across for registration, I'll correct a couple of things: the atmos nominal_resolution becomes 100 km (the closest other options are 50 or 250 km), and ocean and seaIce also need a nominal_resolution, so 50 km is the best fit here. Let me know if you disagree with any of this.

Curious that you have a river model (MOSART) embedded in the model on a different grid (0.5deg vs ~1.0deg atmos). We had anticipated that a landWater realm would be needed at some stage, it seems that we're there - so will need to consider this for CMIP6Plus specs (ping @matthew-mizielinski @wolfiex)
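
The two corrections in this comment can be sketched in Python. This is only an illustration: the candidate list is limited to the three values named in the comment (50, 100, 250 km), the nearest-value rule is an assumption rather than the official CMIP6 procedure, the function names are hypothetical, and the descriptions are abbreviated.

```python
# Candidate nominal_resolution values named in the comment (km).
# The full CMIP6 controlled vocabulary has more entries; this subset is an assumption.
CANDIDATES_KM = (50, 100, 250)


def nearest_nominal_resolution(mean_km: float) -> str:
    """Return the candidate closest to mean_km (nearest-value rule, an assumption)."""
    best = min(CANDIDATES_KM, key=lambda c: abs(c - mean_km))
    return f"{best} km"


def realms_missing_resolution(realms: dict) -> list:
    """List realms that have a description but whose nominal_resolution is 'none'."""
    return [
        name
        for name, (description, resolution) in realms.items()
        if description != "none" and resolution == "none"
    ]


# Abbreviated copy of the submitted entries: realm -> (description, nominal_resolution)
submitted = {
    "aerosol": ("MAM4", "100 km"),
    "atmos": ("EAM", "112 km"),
    "atmosChem": ("LINOZ v2", "100 km"),
    "land": ("ELM, MOSART", "100 km"),
    "landIce": ("none", "none"),
    "ocean": ("MPAS-Ocean", "none"),
    "ocnBgchem": ("none", "none"),
    "seaIce": ("MPAS-Seaice", "none"),
}

print(nearest_nominal_resolution(112))       # 100 km
print(realms_missing_resolution(submitted))  # ['ocean', 'seaIce']
```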

@chengzhuzhang
Author

chengzhuzhang commented Jan 30, 2024

> @chengzhuzhang ok, looks good. Pulling this across for registration, I'll correct a couple of things: the atmos nominal_resolution becomes 100 km (the closest other options are 50 or 250 km), and ocean and seaIce also need a nominal_resolution, so 50 km is the best fit here. Let me know if you disagree with any of this.

Thanks for checking across our registration! The changes look good.

> Curious that you have a river model (MOSART) embedded in the model on a different grid (0.5deg vs ~1.0deg atmos). We had anticipated that a landWater realm would be needed at some stage, it seems that we're there - so will need to consider this for CMIP6Plus specs (ping @matthew-mizielinski @wolfiex)

River and atm/land are on different grids before E3SM v3. For the upcoming v3 simulations, land and river will be on the same grid, but atm will be on a different grid.

@durack1
Member

durack1 commented Jan 30, 2024

@chengzhuzhang E3SM-2-1 is registered, peek at CMIP6_source_id.html to peruse the contents - note that I had to trim the atmosphere description, it exceeded our 1023 max chars.
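
A minimal sketch of the length check mentioned here, assuming the limit applies to the character count of each description string. The constant and function names are hypothetical, and cutting at the last whole word is an illustration, not the method actually used for the registry.

```python
MAX_DESCRIPTION_CHARS = 1023  # limit quoted in the comment


def trim_description(text: str, limit: int = MAX_DESCRIPTION_CHARS) -> str:
    """Return text unchanged if it fits, else cut it to at most `limit` characters."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Prefer a word boundary when one exists inside the allowed span
    head, sep, _ = cut.rpartition(" ")
    return head if sep else cut


print(len(trim_description("word " * 400)) <= MAX_DESCRIPTION_CHARS)  # True
```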

I'll leave this open so that when your first data is published, we can toggle "Registered" across to "Published" and close out this issue

durack1 added the "Awaiting data publication" label Jan 30, 2024
@chengzhuzhang
Author

Awesome, thank you @durack1

@durack1
Member

durack1 commented Jan 30, 2024

the cmip6-cmor-tables should update overnight, so when @TonyB9000 starts the CMORizing process, everything should be in place.

@durack1
Member

durack1 commented Feb 13, 2024

@TonyB9000

First pass of E3SM-2-1 CMIP6 generation should complete by early next week (MPAS cmorizing can take a week on its own.) I tend not to engage publication until CMIP6 generation is completed for all cases. 736 of 997 CMIP6 sets are completed.

@chengzhuzhang
Author

Just to add: we have to hold publication until we hear from the science group.

@TonyB9000

Good thing they are not in a hurry. I always underestimate the time required to cmorize the MPAS datasets. As of 3 PM Friday, we are at 785 of 997 sets completed. Some runs were "broken" when acme1 name resolution somehow got lost, and some failures due to e2c/mpas.py/XArray issues were encountered. I will continue in the appropriate Git discussion.

@durack1
Member

durack1 commented Apr 17, 2024

@TonyB9000 just circling around on this, is the E3SM-2-1 data likely to be published soon?

@TonyB9000

@durack1 @chengzhuzhang We are at 97% completion (970/997 datasets completed). Of the 296 MPAS sets, 27 are failing to generate, and we (Tom, Jill and I) continue to investigate. As for "permission to publish" the successful CMIP6 sets (from the "science group"), I will have to defer to Jill.

@chengzhuzhang
Author

yeah, I don't think we are going to publish soon. We will update once the data is ready and we get permission to publish...

@TonyB9000

@chengzhuzhang Hi Jill. As ALL v2_1 CMIP6 are generated, and (I think) we have the "E3SM-2-1" source_id registration in-hand, all that remains for publication is the haggle over the paper title. Should we close this issue?

@chengzhuzhang
Author

@chengzhuzhang let's keep it open until the paper title is finalized and updated, we should hear about it soon from Kat, I think

@durack1
Member

durack1 commented Jun 28, 2024

@TonyB9000 @chengzhuzhang as there are no data available on ESGF yet (see here), the status remains "cohort":"Registered"; this will be toggled to "Published" once we have data live, with a version number. So I will leave this open until your paper title and the data publication status are updated.
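
The status change described here could look like the following sketch. The entry layout is an assumption: only the "cohort" key and its two values come from this thread, while the `publish` helper and the `first_version` field are hypothetical, and the date used is a placeholder rather than the real version.

```python
def publish(entry: dict, first_version: str) -> dict:
    """Return a copy of entry with cohort switched from Registered to Published."""
    if entry.get("cohort") != "Registered":
        raise ValueError("only a Registered entry can be toggled to Published")
    updated = dict(entry)  # leave the original entry untouched
    updated["cohort"] = "Published"
    updated["first_version"] = first_version  # hypothetical field name
    return updated


entry = {"source_id": "E3SM-2-1", "cohort": "Registered"}
published = publish(entry, first_version="20240101")  # placeholder date
print(published["cohort"])  # Published
```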

@durack1
Member

durack1 commented Aug 2, 2024

@TonyB9000 any update when this data will be published and live?

@TonyB9000

@durack1 @chengzhuzhang I'm happy you asked. It was only a few days ago that we received word of the final paper references that must be edited into the datafile metadata (10800 files). I have a routine for doing that, and can launch it very soon. We have decided to publish the data "as usual" from our local esgf-datanode and index. However, within the next few months, these CMIP6 datasets will be copied to ANL and republished there, as our "Tier 2" storage utility is being abandoned in 3-6 months - making any existing (E3SM LLNL) publications unavailable. All of our previous CMIP datasets have already been copied and re-published using the ANL index node. Please feel free to poke us next week for updates on this - much is in flux.

@TonyB9000

@durack1 @chengzhuzhang Change of plans (as usual ...) The v2_1 CMIP datasets have been transferred to ANL (/lcrc/group/e3sm2/DSM/Staging/Data/) and will be published to the ANL index node, as soon as we are advised of the procedure to access the publication disk ("eagle" filesystem) programmatically.

@durack1
Member

durack1 commented Aug 12, 2024

@TonyB9000 ok great, when these files are live, please ping this thread back, I'll harvest the date of publication, update the license info and close this out

@durack1
Member

durack1 commented Sep 24, 2024

@TonyB9000 just circling back on this one, it seems these data are still not published - see here

@TonyB9000

@durack1 (I can't believe it has been 5 weeks of struggling with ANL/ALCF and ANL/LCRC to get publication operational for E3SM.)

I am hoping to get v2_1 published before end of Sept. However, it will not include ocean 3D variables until a masking fix that affects ALL v2/v2_1 ocean 3D vars is applied in regeneration of those CMIP6 sets. This affects 80 v2_1 sets, and 320 v2 sets.

(Xylar was able to puzzle through the issue - an obscure behavior involving NaNs and FillValues in certain conditions.)

@durack1
Member

durack1 commented Sep 25, 2024

@TonyB9000 no problem, just wanted to check. Once we have some data live, please ping back on this thread so I can close out the open issue - we need a first version (e.g., 20240925) to update the license info for the data

@chengzhuzhang
Author

@durack1 it looks like the ESGF ANL node is back online. Please check here for the E3SM-2-1 data: https://esgf-node.cels.anl.gov/search?project=CMIP6&activeFacets=%7B%22source_id%22%3A%22E3SM-2-1%22%7D
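
For readers unfamiliar with the percent-encoded query, this small sketch decodes the link using only the Python standard library. It makes no request to the node; it just shows which project and facet the URL selects.

```python
import json
from urllib.parse import parse_qs, urlsplit

# The search link from the comment, split across two string literals
url = (
    "https://esgf-node.cels.anl.gov/search?project=CMIP6"
    "&activeFacets=%7B%22source_id%22%3A%22E3SM-2-1%22%7D"
)

query = parse_qs(urlsplit(url).query)          # percent-decoding happens here
facets = json.loads(query["activeFacets"][0])  # the facet value is a JSON object

print(query["project"][0], facets)  # CMIP6 {'source_id': 'E3SM-2-1'}
```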

@durack1
Member

durack1 commented Dec 17, 2024

These data are published on the ESGF node at ANL, but are not findable in the CMIP6 project index across the LLNL, DKRZ, CEDA (or other) nodes. So while you have E3SM-2-1 data live, no one really knows about it.

@TonyB9000

@durack1 @chengzhuzhang Understood. Several weeks ago, Sasha Ames explained that he needed to work on federating these indexes. I haven't heard anything since.

@durack1
Member

durack1 commented Dec 18, 2024

@TonyB9000 the challenge is getting the ANL (and ORNL) folks to engage to get this work done; it requires work on those systems, which to date hasn't been prioritized. @sashakames does not have accounts on these systems to make the changes required.

This definitely needs a nudge along and engagement across the ESGF2-US community, and should be added to the list of items for early 2025.

@TonyB9000

@durack1 @chengzhuzhang I pulsed Sasha on this, and he replied:

This issue is outside of our control: it’s up to the Argonne team to expose their Solr to us via an ingress/firewall config. I’ve mentioned in our project meetings that this will be an issue. The alternative is to wait until we have a single index for all three labs, but the wait time might be unacceptable.

Sounds like this needs to be elevated. (Who oversees these labs, anyway? ;) )

@durack1
Member

durack1 commented Dec 18, 2024

As E3SM is an ESGF "customer" and ANL has taken responsibility to get these data published and accessible, it's going to need a nudge early in the new year. Let's circle back on this the first full week back; @sashakames can advise us how best to escalate the dialogue to get this done.

@sashakames

I'd hope it doesn't take PM pressure. My reminders to people can only go so far... I'll put a note on the running meeting agenda with Forrest so it does not get forgotten.

@chengzhuzhang
Author

@sashakames many thanks for helping push this along.
