Add scripts to preprocess the GHCN snow data #2019
Pull request overview
This PR adds preprocessing capabilities for GHCN (Global Historical Climatology Network) snow depth observations by introducing a Python converter script that transforms GHCN CSV data into IODA-format NetCDF files for use in snow data assimilation workflows.
- Adds `ghcn_snod2ioda.py` script to convert GHCN snow depth CSV data to IODA format
- Implements data filtering for valid snow depth observations and date selection
- Adds YAML configuration template for GHCN data preparation workflow
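The filtering and metadata merge described in the overview can be sketched roughly as follows. This is a minimal illustration, not the code in `ghcn_snod2ioda.py`: it assumes the GHCN CSV columns are ordered station ID, date (`YYYYMMDD`), element, value, that snow depth is the `SNWD` element reported in millimetres, and that `-9999` marks missing values. The function name `select_snow_depth`, the `stations` dictionary layout, and the sample rows are hypothetical, and the sketch stops short of writing IODA output.

```python
import csv
import io

MISSING = -9999  # assumed missing-value sentinel


def select_snow_depth(obs_csv, stations, date):
    """Return snow depth records (in metres) for one date, joined with station metadata.

    obs_csv  : file-like object with rows of ID, DATE, ELEMENT, VALUE, ... (assumed layout)
    stations : dict mapping station ID -> (lat, lon, elevation), hypothetical structure
    date     : 'YYYYMMDD' string selecting the observation day
    """
    records = []
    for row in csv.reader(obs_csv):
        if len(row) < 4:
            continue  # skip malformed rows
        station_id, obs_date, element, value = row[0], row[1], row[2], row[3]
        if element != "SNWD" or obs_date != date:
            continue  # keep only snow depth on the requested date
        try:
            depth_mm = float(value)
        except ValueError:
            continue  # non-numeric value
        if depth_mm == MISSING or depth_mm < 0:
            continue  # drop missing or physically invalid depths
        meta = stations.get(station_id)
        if meta is None:
            continue  # no metadata for this station, so no location to assign
        lat, lon, elev = meta
        records.append({
            "station": station_id,
            "lat": lat,
            "lon": lon,
            "elev": elev,
            "snod_m": depth_mm / 1000.0,  # assumed mm -> m conversion
        })
    return records


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "USC00012345,20250101,SNWD,254,,,7,0700\n"
    "USC00012345,20250101,TMAX,-50,,,7,0700\n"
    "USC00099999,20250101,SNWD,-9999,,,7,\n"
    "USC00012345,20250102,SNWD,300,,,7,0700\n"
)
stations = {
    "USC00012345": (40.0, -105.0, 1600.0),
    "USC00099999": (45.0, -110.0, 2000.0),
}
selected = select_snow_depth(sample, stations, "20250101")
```

Only the first sample row survives here: the others are dropped for being a different element, a missing value, or a different date.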
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| `ush/snow/ghcn_snod2ioda.py` | New converter script that reads GHCN CSV data, merges with station metadata, filters observations, and outputs IODA-format NetCDF files |
| `parm/snow/prep/prep_ghcn.yaml.j2` | Configuration template defining file staging and conversion workflow for GHCN snow data preprocessing |
@jiaruidong2017 Thanks for doing this. I added a Copilot review above. I left it in case there's anything you want to respond to, but I would probably ignore most of it. In particular, it looks like you've copied the GHCN to IODA converter from the JEDI repo. Did you need to make any changes to it to run outside of JEDI? Copilot has suggested some clean-up (all good suggestions), but I'm wondering if we want to try to keep this version the same as the one in the JEDI repo. @CoryMartin-NOAA What do you think?
Thanks @ClaraDraper-NOAA for launching the Copilot review. I agree that we should keep this version consistent with the one in the JEDI repo unless we plan to apply the same changes in the JEDI repo as well. Let's wait for @CoryMartin-NOAA’s suggestions.
I think we can largely ignore the Copilot suggestions and keep it consistent with what is in ioda-converters. IMO all of the ioda-converters are not completely robust, so I would be supportive of rewriting this in C++ (eventually) so that we can leverage IODA directly and not through the multi-layered Python API.
This PR adds the `def_jedi_utils.py` Python script from ioda-converters to this repository. This contributes to NOAA-EMC/GDASApp#2019
Modify the GHCN obs path to use the data from the `prep` subdirectory under the running directory. This PR contributes to NOAA-EMC/GDASApp#2019 Co-authored-by: Cory Martin <[email protected]>
CoryMartin-NOAA
left a comment
@jiaruidong2017 I think we just need the jcb-gdas hash updated here, then I can merge this, and then update the GDASApp hash in your workflow PR
# Description
This PR updates the `da-utils` commit hash to include the required utilities. This PR also addresses Copilot comments in `ghcn_snod2ioda.py`. In particular, it fixes the use of AttrData and DimDict as mutable module-level dictionaries, which is error-prone, especially since DimDict is modified within the class. These structures are now handled in a safer, more maintainable way.

This PR is a supplementary update to PR #2019 and contributes to NOAA-EMC/global-workflow#4386

# Issues
Resolves #2018

# Automated CI tests to run in Global Workflow
<!-- Which Global Workflow CI tests are required to adequately test this PR? -->
- [ ] atm_jjob <!-- JEDI atm single cycle DA !-->
- [ ] C96C48_ufs_hybatmDA <!-- JEDI atm cycled DA !-->
- [ ] C96C48_hybatmsnowDA <!-- JEDI snow cycled DA !-->
- [ ] C96_gcafs_cycled <!-- JEDI aerosol cycled DA !-->
- [ ] C48mx500_3DVarAOWCDA <!-- JEDI low-res marine 3DVar cycled DA !-->
- [ ] C48mx500_hybAOWCDA <!-- JEDI marine hybrid envar cycled DA !-->
- [ ] C96C48_ufsgsi_hybatmDA <!-- JEDI atm Var with GSI EnKF cycled DA !-->
- [ ] C96C48_hybatmDA <!-- GSI atm cycled DA !-->
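To illustrate why mutable module-level dictionaries such as AttrData and DimDict are error-prone when a class modifies them, here is a small sketch. It does not reproduce the actual change made in `ghcn_snod2ioda.py`; the names `SHARED_DIMS`, `SAFE_DIMS`, `SharedConverter`, and `IsolatedConverter` are hypothetical, and copying the defaults per instance is only one of several possible fixes.

```python
import copy

# Hypothetical module-level defaults standing in for DimDict / AttrData.
SHARED_DIMS = {"Location": 0}
SAFE_DIMS = {"Location": 0}


class SharedConverter:
    """Pitfall: every instance aliases and mutates the same module-level dict."""

    def __init__(self, nlocs):
        self.dims = SHARED_DIMS          # alias, not a copy
        self.dims["Location"] = nlocs    # silently changes the module-level dict


class IsolatedConverter:
    """One possible fix: each instance works on its own copy of the defaults."""

    def __init__(self, nlocs):
        self.dims = copy.deepcopy(SAFE_DIMS)
        self.dims["Location"] = nlocs


a = SharedConverter(10)
b = SharedConverter(20)
leaked = a.dims["Location"]   # a now sees the value written by b

c = IsolatedConverter(10)
d = IsolatedConverter(20)
kept = c.dims["Location"]     # c keeps its own value
```

With the shared dictionary, creating a second instance overwrites the first instance's dimension size; with per-instance copies, the module-level defaults stay untouched.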
Modify the GHCN obs path to use the data from the `prep` subdirectory under the running directory. This PR contributes to #2019 Co-authored-by: Cory Martin <[email protected]>
# Description
This PR adds a `ghcn_snod2ioda.py` script to process GHCN data in CSV format into IODA-format files for snow DA in the global-workflow.

# Companion PRs
This PR depends on NOAA-EMC/jcb-gdas#220, NOAA-EMC/DA-utils#53, and NOAA-EMC/global-workflow#4388

This PR contributes to NOAA-EMC/global-workflow#4386

# Issues
Resolves #2018

# Automated CI tests to run in Global Workflow
<!-- Which Global Workflow CI tests are required to adequately test this PR? -->
- [ ] atm_jjob <!-- JEDI atm single cycle DA !-->
- [ ] C96C48_ufs_hybatmDA <!-- JEDI atm cycled DA !-->
- [ ] C96C48_hybatmsnowDA <!-- JEDI snow cycled DA !-->
- [ ] C96_gcafs_cycled <!-- JEDI aerosol cycled DA !-->
- [ ] C48mx500_3DVarAOWCDA <!-- JEDI low-res marine 3DVar cycled DA !-->
- [ ] C48mx500_hybAOWCDA <!-- JEDI marine hybrid envar cycled DA !-->
- [ ] C96C48_ufsgsi_hybatmDA <!-- JEDI atm Var with GSI EnKF cycled DA !-->
- [ ] C96C48_hybatmDA <!-- GSI atm cycled DA !-->

---------

Co-authored-by: Cory Martin <[email protected]>
Description
This PR adds a `ghcn_snod2ioda.py` script to process GHCN data in CSV format into IODA-format files for snow DA in the global-workflow.
Companion PRs
This PR depends on NOAA-EMC/jcb-gdas#220, NOAA-EMC/DA-utils#53, and NOAA-EMC/global-workflow#4388
This PR contributes to NOAA-EMC/global-workflow#4386
Issues
Resolves #2018
Automated CI tests to run in Global Workflow