@CoryMartin-NOAA
Contributor

@CoryMartin-NOAA CoryMartin-NOAA commented Jan 5, 2026

Description

This PR addresses the renaming of the stat files from GSI to include the .tar suffix as well as some other changes to properly handle GSI ncdiags in the anlstat job, such as combining variables into one output file.

Resolves #4317
Related to #4400
Resolves NOAA-EMC/GDASApp#2018
Resolves NOAA-EMC/GDASApp#2022

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this change expected to change outputs (e.g. value changes to existing outputs, new files stored in COM, files removed from COM, filename changes, additions/subtractions to archives)? YES/NO (If YES, please indicate to which system(s))
    • GFS
    • GEFS
    • SFS
    • GCAFS
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? YES

How has this been tested?

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary

@RussTreadon-NOAA
Contributor

Thank you, @CoryMartin-NOAA. I am currently cloning CoryMartin-NOAA:feature/gsi-diags-fix on Dogwood. I'll build, link, and run a failed g-w CI case from PR #4386.

@CoryMartin-NOAA
Contributor Author

Thanks, @RussTreadon-NOAA. This PR will also need the GDAS branch in it. FYI, I think this combination should solve two issues: (1) the error you found (should be solved by the GDAS PR), and (2) a silent error where the code would run but produce nothing because the files are now named cnvstat.tar rather than cnvstat.
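
For illustration, here is a minimal sketch of how the silent failure in (2) could be made visible, assuming the stat archives follow a `<prefix>.<name>.tar` naming pattern. The helper `find_stat_archives` and its arguments are hypothetical and are not part of pygfs or wxflow.

```python
from pathlib import Path


def find_stat_archives(com_dir, prefix, names=("cnvstat", "oznstat")):
    """Return (found, missing) lists of stat archives with the .tar suffix.

    Hypothetical helper for illustration only; not part of pygfs or wxflow.
    Files without the .tar suffix (the old naming) are reported as missing.
    """
    found, missing = [], []
    for name in names:
        path = Path(com_dir) / f"{prefix}.{name}.tar"
        # Sort each expected archive into the list matching its existence
        (found if path.is_file() else missing).append(path)
    return found, missing
```

A caller could log or raise on a non-empty `missing` list instead of continuing with nothing to process.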

@CoryMartin-NOAA
Contributor Author

Output should be in (for example)
runtests/COMROOT/C96C48_hybatmDA/gdas.20211221/06/products/atmos/anlmon/gdas.t06z.atmos_gsi_stats.txt

and look something like:

==========================================================================================
Observation Space: sfc_gsi     Analysis Time: 2021-12-21T09:00:00Z     nlocs=109202
------------------------------------------------------------------------------------------
Obs Space                Variable                 Group          Use           Stat     | All Bins |    1000+ | 1000-900 |  900-800 |  800-600 |  600-400 |  400-300 |  300-250 |  250-200 |  200-150 |  150-100 |   100-50
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
sfc_gsi                  stationPressure          ombg           assimilated   mean     |  19.5647 |  27.3723 |  10.4172 |  10.1808 |  76.4917 |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN
sfc_gsi                  stationPressure          ombg           monitored     mean     |  280.775 |  783.833 |  263.065 | -510.818 | -683.833 |  240.639 |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN
sfc_gsi                  stationPressure          ombg           rejected      mean     |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN |      NaN

@RussTreadon-NOAA
Contributor

C96_gcafs_cycled

gcdas_anlstat for 2021122018 fails with the error

2026-01-05 18:17:08,587 - INFO     - analysis_stats: Preparing to copy /lfs/h2/emc/ptmp/russ.treadon/COMROOT/C96_gcafs_cycled_pr4401/gcdas.20211220/18//analysis/atmos/gcdas.t18z.oznstat.tar to /lfs/h2/emc/stmp/russ.treadon/RUNDIRS/C96_gcafs_cycled_pr4401/gcdas.2021122018/anlstat.396196/atmos_gsi/atmos_gsi_diags/gcdas.t18z.oznstat.tar
2026-01-05 18:17:08,588 - WARNING  - file_utils  : WARNING: No files/directories were included for copy_opt command
2026-01-05 18:17:08,589 - WARNING  - file_utils  : WARNING: No files/directories were included for copy_opt command
2026-01-05 18:17:08,589 - WARNING  - file_utils  : WARNING: No files/directories were included for copy_opt command
2026-01-05 18:17:08,589 - INFO     - analysis_stats: Converting GSI guess diag files to IODA files
2026-01-05 18:17:08,590 - INFO     - analysis_stats: Converting GSI analysis diag files to IODA files
2026-01-05 18:17:08,591 - INFO     - analysis_stats: Combining conventional GSI IODA files by obspace
2026-01-05 18:17:08,591 - INFO     - analysis_stats: Combining conventional GSI IODA files for obspace sondes
Traceback (most recent call last):
  File "/lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/pr4401/scripts/exglobal_analysis_stats.py", line 39, in <module>
    AnlStats.convert_gsi_diags()
  File "/lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/pr4401/sorc/wxflow/src/wxflow/logger.py", line 252, in wrapper
    retval = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/pr4401/ush/python/pygfs/task/analysis_stats.py", line 264, in convert_gsi_diags
    gsios.combine_obsspace(FileList, combined_outfile, False)
  File "/lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/pr4401/ush/python/gsincdiag_to_ioda/combine_obsspace.py", line 55, in combine_obsspace
    obsspace = ios.ObsSpace(FileList[0])
                            ~~~~~~~~^^^
IndexError: list index out of range
+ JGLOBAL_ANALYSIS_STATS[48]export err=1

The C96_gcafs_cycled case runs with

DO_AERO_ANL=YES
DO_JEDIATMVAR=NO

Given this, the following logic in exglobal_analysis_stats.py

    # Create list based on DA components                                                                                                                                    
    config.STAT_ANALYSES = []
    if config.DO_AERO_ANL:
        config.STAT_ANALYSES.append('aero')
    if config.DO_JEDISNOWDA:
        config.STAT_ANALYSES.append('snow')
    if config.DO_JEDIATMVAR:
        config.STAT_ANALYSES.append('atmos')
    else:
        config.STAT_ANALYSES.append('atmos_gsi')

sets 'STAT_ANALYSES': ['aero', 'atmos_gsi'].

This is problematic. The C96_gcafs_cycled case does NOT run the GSI, so there are no GSI diagnostic files to process.
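
As a side note, the `IndexError` above comes from indexing an empty file list. A minimal sketch of a guard that would report this condition more clearly is shown below; `require_diag_files` is a hypothetical helper, not an existing function in pygfs or gsincdiag_to_ioda, and the choice of `FileNotFoundError` is an assumption.

```python
def require_diag_files(file_list, obsspace):
    """Return the files as a list, or raise a descriptive error if none exist.

    Hypothetical guard for illustration; it would be called before any code
    that indexes the first element of the list.
    """
    files = list(file_list)
    if not files:
        raise FileNotFoundError(
            f"No GSI IODA files found for obs space '{obsspace}'; "
            "was the GSI run for this cycle?"
        )
    return files
```

This does not address the root cause (requesting `atmos_gsi` when the GSI is not run); it only replaces an opaque traceback with a message that names the obs space.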

@CoryMartin-NOAA : How do we want to handle the case in which neither JEDI nor GSI based atmospheric DA is run?

@CoryMartin-NOAA
Contributor Author

Thanks for catching this, @RussTreadon-NOAA. In that case, it should not run this at all. I'll make the necessary changes later today or tomorrow.

@RussTreadon-NOAA
Contributor

@CoryMartin-NOAA : The following set of ad hoc changes yields successful anlstat runs in various g-w CI cases

dev/scripts/exglobal_analysis_stats.py

@@ -28,14 +28,14 @@ if __name__ == '__main__':
         config.STAT_ANALYSES.append('snow')
     if config.DO_JEDIATMVAR:
         config.STAT_ANALYSES.append('atmos')
-    else:
+    if config.DO_GSIATMVAR:
         config.STAT_ANALYSES.append('atmos_gsi')
 
     # Instantiate the analysis stats task
     AnlStats = AnalysisStats(config)
 
     # Initialize JEDI variational analysis
-    if not config.DO_JEDIATMVAR:
+    if config.DO_GSIATMVAR:
         AnlStats.convert_gsi_diags()
     AnlStats.initialize()
     for anl in config.STAT_ANALYSES:

dev/parm/config/gcafs/config.anlstat and dev/parm/config/gfs/config.anlstat

@@ -10,4 +10,13 @@ source "${EXPDIR}/config.resources" anlstat
 
 export TASK_CONFIG_YAML="${PARMgfs}/gdas/anlstat/anlstat_config.yaml.j2"
 
+export DO_GSIATMVAR="YES"
+if [[ "${DO_JEDIATMVAR:-NO}" == "YES" ]] ; then
+    export DO_GSIATMVAR="NO"
+fi
+if [[ "${DO_AERO_ANL:-NO}" == "YES" ]] ; then
+    export DO_JEDIATMVAR="NO"
+    export DO_GSIATMVAR="NO"
+fi
+
 echo "END: config.anlstat"

Variable DO_GSIATMVAR is added to exglobal_analysis_stats.py to differentiate between GSI and JEDI atmospheric DA. Logic is added to the gcafs and gfs config.anlstat to initialize DO_GSIATMVAR and toggle DO_JEDIATMVAR and/or DO_GSIATMVAR based on the settings of other variables.
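
To make the resulting behavior easier to check, the patched logic can be sketched in Python as below. This is only an illustration of the two diffs above: the function names `resolve_da_flags` and `stat_analyses` are hypothetical, and booleans stand in for the YES/NO shell strings.

```python
def resolve_da_flags(do_jediatmvar=False, do_aero_anl=False):
    """Mirror the config.anlstat patch; return (DO_JEDIATMVAR, DO_GSIATMVAR)."""
    # GSI is the default unless JEDI atmospheric DA is requested
    do_gsiatmvar = not do_jediatmvar
    # With aerosol analysis on, the patch turns both atmospheric options off
    if do_aero_anl:
        do_jediatmvar = False
        do_gsiatmvar = False
    return do_jediatmvar, do_gsiatmvar


def stat_analyses(do_aero_anl, do_jedisnowda, do_jediatmvar, do_gsiatmvar):
    """Mirror the patched STAT_ANALYSES construction in the ex-script."""
    analyses = []
    if do_aero_anl:
        analyses.append('aero')
    if do_jedisnowda:
        analyses.append('snow')
    if do_jediatmvar:
        analyses.append('atmos')
    if do_gsiatmvar:
        analyses.append('atmos_gsi')
    return analyses
```

For the C96_gcafs_cycled settings (DO_AERO_ANL=YES, DO_JEDIATMVAR=NO), `resolve_da_flags(False, True)` gives `(False, False)`, so `stat_analyses(True, False, False, False)` returns `['aero']` and `atmos_gsi` is no longer requested. Under this logic, any configuration with DO_AERO_ANL=YES disables both atmospheric options.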

While these changes yield the desired behavior, they are kludgy. I'm open to more elegant solutions.

Note: At present, the gcafs and gfs config.anlstat are identical. If this remains the case moving forward, we could soft link the gcafs config.anlstat to point at the gfs file. Several gcafs config files already point at gfs config files.

@RussTreadon-NOAA
Contributor

WCOSS2 g-w CI update

Of the 15 g-w CI cases started on Dogwood, only C96_atm3DVar_extended is still running. All other cases completed or reached a point beyond which they cannot run due to DEAD jobs.

DEAD jobs

/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C48_S2SW_extended_pr4401
202103231200       gfs_arch_tar_gfsa                    89304226                DEAD                 271         2          40.0
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C96_atm3DVar_extended_pr4401
202112210600              gfs_sfcanl                    89297457                DEAD                 271         2          71.0

/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C96C48_hybatmsnowDA_pr4401
202112210000            gdas_snowanl                    89294257                DEAD                 271         2          70.0
202112210000       enkfgdas_esnowanl                    89294810                DEAD                 271         2          70.0

/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C96C48mx500_S2SW_cyc_gfs_pr4401
202112210000             gfs_snowanl                    89295612                DEAD                 271         2          74.0
202112210000        enkfgfs_esnowanl                    89297489                DEAD                 271         2          71.0
202112210000            gdas_snowanl                    89295613                DEAD                 271         2          73.0
202112210000       enkfgdas_esnowanl                    89297491                DEAD                 271         2          70.0

No anlstat jobs failed when the above local patch is included.

Known problems

Failures related to sorc/gdas.cd being ahead of g-w
The snowanl and esnowanl failures are likely because sorc/gdas.cd in this PR includes changes from g-w PR #4386, while the corresponding workflow changes from #4386 are not in this PR.

@emcbot emcbot added the CI-Ursa-Building **Bot use only** CI testing is cloning/building on Ursa label Jan 7, 2026
@DavidHuber-NOAA
Contributor

Oh, right, this needs to be on Hera. I am now killing the Ursa CI suite and will relaunch on Hera shortly.

@CoryMartin-NOAA
Contributor Author

Thanks, @DavidHuber-NOAA. Yes, it's still a mystery why GOCART is so slow on Ursa...

@emcbot emcbot added the CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera label Jan 7, 2026
@DavidHuber-NOAA DavidHuber-NOAA removed the CI-Ursa-Building **Bot use only** CI testing is cloning/building on Ursa label Jan 7, 2026
@emcbot emcbot added CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera labels Jan 7, 2026
@NOAA-EMC NOAA-EMC deleted a comment from emcbot Jan 7, 2026
@DavidHuber-NOAA DavidHuber-NOAA removed the CI-Gaeac6-Building **Bot use only** CI testing is cloning/building on Gaea C6 label Jan 7, 2026
@emcbot

emcbot commented Jan 7, 2026

C96C48_ufs_hybatmDA FAILED on Hera (pipeline ID: 6845)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4401_82280662_6845/RUNTESTS/EXPDIR/C96C48_ufs_hybatmDA_82280662-6845

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4401_82280662_6845/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_82280662-6845/logs/2024022400/enkfgdas_atmensanlobs.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4401_82280662_6845/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_82280662-6845/logs/2024022400/gdas_atmanlvar.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4401_82280662_6845/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_82280662-6845/logs/2024022400/gfs_atmanlvar.log

View Error Logs: (enkfgdas_atmensanlobs.log) (gdas_atmanlvar.log) (gfs_atmanlvar.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Jan 7, 2026
@RussTreadon-NOAA
Contributor

@DavidHuber-NOAA and @CoryMartin-NOAA

The C96C48_ufs_hybatmDA failures are due to work on GDASApp PR #2030 and jcb-gdas PR #207. Let me revert changes to the radiance bias correction tarball.

@RussTreadon-NOAA
Contributor

@DavidHuber-NOAA and @CoryMartin-NOAA : gdas.t18z.rad_varbc_params.tar has been restored to its original state.

@CoryMartin-NOAA
Contributor Author

@RussTreadon-NOAA thanks, how about a combined tar file with 'old' and 'new' names in it?

@RussTreadon-NOAA
Contributor

@CoryMartin-NOAA: Agreed. I just created a tarball on Hera with both naming conventions.

@DavidHuber-NOAA
Contributor

I will reboot the jobs on Hera manually and continue testing.

@DavidHuber-NOAA DavidHuber-NOAA added CI-Hera-Running (CM) CI testing is being run locally on Hera. and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Jan 7, 2026
@DavidHuber-NOAA
Contributor

@RussTreadon-NOAA @CoryMartin-NOAA the same three jobs failed again on reboot. Is this expected? Does the GDASApp need to be rebuilt or the link script re-run first?

@CoryMartin-NOAA
Contributor Author

@DavidHuber-NOAA the *init jobs probably need to be rerun also to stage the files in $DATA

@DavidHuber-NOAA
Contributor

Ah, that makes sense. Thanks.

@DavidHuber-NOAA
Contributor

All tests passed. Merging.

@DavidHuber-NOAA DavidHuber-NOAA added CI-Hera-Passed (cm) Manual CI passed on Hera and removed CI-Hera-Running (CM) CI testing is being run locally on Hera. labels Jan 7, 2026
@DavidHuber-NOAA DavidHuber-NOAA merged commit de48f20 into NOAA-EMC:develop Jan 7, 2026
5 checks passed

Labels

CI-Hera-Passed (cm) Manual CI passed on Hera GFS Change This PR, if merged, will change results for the GFS.

6 participants