Add Preporcessing Scripts and a README to the sponges tools #96

Open
wants to merge 14 commits into
base: main

Conversation

theresa-morrison
Contributor

This PR includes three pre-processing scripts I wrote and the filling script from Andrew. Together, these scripts manipulate the GLORYS data on uda into a subset region that is consistent with the input expected by write_nudging_data.py.

The README lists the steps for submitting these scripts. It would be great if they could be merged and generalized to work for multiple domains.

The README also includes the overrides for MOM6 and lines that should be added to an xml to use the sponge data that is produced by these tools.

Theresa Morrison and others added 5 commits September 23, 2024 14:37
Partial readme including steps for using the preprocessing scripts
and what changes need to be done in the xml to use temperature and
salinity sponges in MOM6.

Preprocessing scripts subset and average daily data from uda, fill
the monthly data, and merge the monthly T and S averages into one
file for each year.
Add more details to the readme
Add paths variables for archive and work.
```
sbatch fill_glorys_nn_monthly.sh <YEAR> <MONTH>
```

3. Finally, once the filled data for every month in a given yeas have been created, the merge script can be used.
Collaborator

change to "every month in a given year has been created"

Contributor

@theresa-morrison, I think this PR is almost ready, except for a small comment from @uwagura that hasn’t been addressed yet.

Contributor Author

yeah, I was waiting to see if more typos would be found before submitting a commit. I will take care of this!

@@ -0,0 +1,38 @@
#!/bin/tcsh
Collaborator

very minor, but if these are cshell scripts then maybe we should change the file extension to .csh

Contributor Author

I can change the extension, would we prefer to have them not be cshell scripts?

Contributor

@theresa-morrison, since you're using C shell syntax (e.g., set), it might be simpler to rename the script to *.csh to reflect that.

Contributor Author

done!

Theresa Morrison added 2 commits September 23, 2024 16:04
Change file extensions for cshell scripts
# Regionally-slice and convert daily to monthly GLORYS reanalysis on archive beforehand.

# dmget all of the files for this month from archive.
dmget /archive/tnm/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/so/GLORYS_so_arctic_${year}_${month}.nc
Contributor

These two dmgets could be combined into one (at least, I've always assumed dmget is happier getting everything at once instead of in multiple commands).

foreach filename (/uda/Global_Ocean_Physics_Reanalysis/global/daily/so/${year}/so_mercatorglorys12v1_gl12_mean_${year}${month}*.nc)
echo $filename
set short_name='so_arctic_'$day
ncks -d latitude,39.,91. --mk_rec_dmn time $filename ${apath}/so_${year}_${month}/${short_name}'_bd.nc'
Contributor

Ideally these temporary files would be written to $TMPDIR and only the final result would be copied to /archive

Contributor Author

Done - only the final file is written to archive.
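
For reference, the scratch-space pattern discussed in this thread could look roughly like the sketch below. It is written in POSIX sh rather than the tcsh used in this PR, and `stage_and_publish`, its arguments, and the `result.nc` convention are hypothetical names chosen for illustration. In the real scripts the producer step would be the `ncks` and `ncra` calls writing into the scratch directory.

```shell
#!/bin/sh
# Sketch: build intermediate files in scratch space and copy only the
# final product to its destination. All names here are illustrative.

# stage_and_publish FINAL_PATH COMMAND [ARGS...]
# Runs COMMAND with a fresh scratch directory appended as its last argument.
# COMMAND is expected to leave its final product at <scratch>/result.nc.
stage_and_publish() {
    final_path=$1
    shift
    # Use $TMPDIR when the scheduler provides it, otherwise fall back to /tmp.
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/glorys.XXXXXX") || return 1
    rc=1
    # The producer writes the daily slices and the average into scratch.
    if "$@" "$scratch" && [ -f "$scratch/result.nc" ]; then
        if mkdir -p "$(dirname "$final_path")" &&
            cp -f "$scratch/result.nc" "$final_path"; then
            rc=0
        fi
    fi
    # Intermediate files are removed and never reach the destination.
    rm -rf "$scratch"
    return $rc
}
```

A call would pass the archive path and the name of a function or script that does the slicing and averaging, so only the monthly file ends up on /archive.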

set day = `expr $day + 1`
echo $day
end
ncra -O --cnk_plc=r1d --cnk_dmn=time,1 ${apath}/so_${year}_${month}/so_arctic_*.nc ${apath}/GLORYS_so_arctic_${year}_${month}.nc
Contributor

I know this is doing the time average, but for clarity can you add a comment stating that, and also describing what --cnk_plc=r1d --cnk_dmn=time,1 is doing?

Contributor Author

@theresa-morrison Sep 24, 2024

This was meant to help the averaging be faster, but it's not needed any more.

Contributor

@yichengt900 left a comment

@theresa-morrison, thank you for uploading the example scripts for generating the necessary files for our sponge tools. I have tested them and can confirm that they work as expected. I'll be able to approve this PR once most of @uwagura and @andrew-c-ross's comments have been addressed. Please feel free to reach out if you need any assistance with those comments.

- change get_so and get_thetao to use a TMPDIR as scratch space
- remove cnk options since they aren't needed
- combine dmget statement
# Regionally-slice and convert daily to monthly GLORYS reanalysis on archive beforehand.

# dmget all of the files for this month from archive.
dmget ${apath}/so/GLORYS_so_arctic_${year}_${month}.nc ${apath}/thetao/GLORYS_thetao_arctic_${year}_${month}.nc
Contributor Author

@andrew-c-ross I think this combines the dmget. Nothing is being dmget in my testing, so I'm not sure if it is working.

Contributor Author

update: now working, it is writing over my files as expected.

A typo in get_thetao meant so was being used instead. This has been fixed and
now a variable ${var} has been added. This means that lines 18 to 29 should
be the same in get_so and get_thetao.
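
The idea of driving both scripts from a single `${var}` can be sketched as below. This is a POSIX sh illustration, not the tcsh in this PR; the helper names `daily_glob` and `monthly_name` are hypothetical, and the thetao input path is assumed to mirror the so path shown in the diff.

```shell
#!/bin/sh
# Sketch: derive every variable-specific name from one variable so the
# body of get_so and get_thetao can be identical. Names are illustrative.

# daily_glob VAR YEAR MONTH -> pattern for the daily input files on /uda.
# Assumption: thetao files follow the same layout as the so files.
daily_glob() {
    printf '%s\n' "/uda/Global_Ocean_Physics_Reanalysis/global/daily/$1/$2/$1_mercatorglorys12v1_gl12_mean_$2$3*.nc"
}

# monthly_name VAR YEAR MONTH -> file name of the monthly average.
monthly_name() {
    printf '%s\n' "GLORYS_$1_arctic_$2_$3.nc"
}
```

With helpers like these, only the value passed as VAR differs between the salinity and temperature runs.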
cp -f ${wpath}/GLORYS_so_arctic_${year}.nc ${wpath}/GLORYS_arctic_${year}.nc

# Append temperature data to renamed salinity data
ncks -A ${wpath}/GLORYS_thetao_arctic_${year}.nc ${wpath}/GLORYS_arctic_${year}.nc
Contributor

@theresa-morrison, sorry for being picky, but do you think it’s a good idea to reduce redundant file copies and minimize repetitive processing? We could try something like the following:

#!/bin/tcsh
#SBATCH --ntasks=1
#SBATCH --job-name=fill_glorys_arctic
#SBATCH --time=2880
#SBATCH --partition=batch

# Usage: sbatch merge_so_thetao_year.csh <YEAR>

module load cdo
module load nco/5.0.1
module load gcp

set year = $1
set wpath = '/work/Theresa.Morrison/datasets/glorys/GLOBAL_MULTIYEAR_PHY_001_030/monthly/filled'

# Define the file variables for salinity and temperature
set so_file = "${wpath}/GLORYS_so_arctic_${year}.nc"
set thetao_file = "${wpath}/GLORYS_thetao_arctic_${year}.nc"
set final_file = "${wpath}/GLORYS_arctic_${year}.nc"

# Concatenate monthly averages into single files for salinity and temperature
foreach var (so thetao)
    ncrcat -O ${wpath}/GLORYS_${var}_arctic_${year}_*.nc ${wpath}/GLORYS_${var}_arctic_${year}.nc
end

# Append temperature data to salinity file directly without copying
ncks -A ${thetao_file} ${so_file}

# Rename the combined file to final name
mv -f ${so_file} ${final_file}

Contributor Author

I don't mind, I appreciate the suggestions!

Theresa Morrison and others added 3 commits September 24, 2024 15:38
- simplify code based on YCT suggestion
- update usage comment and job name
- fix typo
- change file names from .sh to .csh

## Using these files in MOM6

To use the sponges generated by these scripts in MOM6 we reccomend the following settings:
Contributor

OOPS, It's my bad but I found another one: "recommend"......

@theresa-morrison
Contributor Author

There are a few other changes I would like to make before this is merged.
(1) I think that once the merged file is created the individual monthly filled files should be removed
(2) make domain name a variable so that it can be changed in just one place

There is more that could be improved to streamline these scripts, but those are the two that I think make sense before merging.
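
A rough sketch of the two changes listed above, under stated assumptions: it is POSIX sh rather than the tcsh used in this PR, `cleanup_monthly` and the `DOMAIN` environment variable are hypothetical, and the file naming follows the `GLORYS_<var>_arctic_<year>_<month>.nc` pattern seen in the diffs.

```shell
#!/bin/sh
# Sketch: keep the domain name in one variable and remove the monthly
# filled files only after the merged yearly file exists.

# Single place to change the region name (assumed default: arctic).
domain=${DOMAIN:-arctic}

# cleanup_monthly WPATH YEAR
# Deletes the monthly so/thetao files for YEAR once the merged file is there.
cleanup_monthly() {
    wpath=$1
    year=$2
    merged="${wpath}/GLORYS_${domain}_${year}.nc"
    # Refuse to delete inputs if the merged file is missing or empty.
    if [ ! -s "$merged" ]; then
        echo "missing $merged; keeping monthly files" >&2
        return 1
    fi
    for var in so thetao; do
        # Matches only the monthly files, which carry a _<month> suffix.
        rm -f "${wpath}/GLORYS_${var}_${domain}_${year}"_*.nc
    done
}
```

Checking for the merged file first means a failed `ncks -A` or `mv` step would leave the monthly inputs in place for a rerun.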
