Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diagnostic for calculating Lamb Weathertypes #3691

Open
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

thomaskroi1996
Copy link

@thomaskroi1996 thomaskroi1996 commented Jun 27, 2024

Hello everyone!

At the Wegener Center in Graz, Austria, we have been doing some research with Lamb Weathertypes for quite some while now, and recently started developing a diagnostic for ESMValTool to make things easier, and potentially be of use for others!

The features are:

Calculate the 27 Lamb Weathertypes over a region specified in the corresponding recipe
Plot means, anomalies as well as standard deviations for those weathertypes for psl, tas and prcp
Combining the weathertypes based on precipitation patterns over an area specified in the corresponding recipe as done here: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2020JD032824 (Table 2)
Create the same plots for those combined weathertypes

The weathertypes are calculated for ERA5 and model data, and the precipitation data for correlation calculations are taken from ERA5 and E-OBS.

The branch we are working on is called weathertyping_wegc.

We are happy to discuss this project with the community, get some feedback and features you would like to see!

Thank you very much and with kind regards,

Thomas Kroißenbrunner and Martin Jury

Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.

New or updated recipe/diagnostic

New or updated data reformatting script


@thomaskroi1996 thomaskroi1996 linked an issue Jun 27, 2024 that may be closed by this pull request
@valeriupredoi
Copy link
Contributor

@thomaskroi1996 glad to see this in PR form, well done 👏 🍺 I can perform a technical review of it, but for a scientific review, I'd suggest we think who's the closest expert amongst us to perform the review. Just a couple questions before the review, please:

  • are you happy to have this reviewed now, or you still working on it? If work is still to be performed, then I'd recommend you convert this PR to a Draft PR, and when it's ready, make it back ready for review;
  • to find the best scientific reviewers, it'd be good if you could suggest one or more folks that already have experience with this type of diagnostic, if not maybe we can ask the list (scientific reviewers, which I can invoke via a GitHub tag)

Cheers 🍺

@thomaskroi1996
Copy link
Author

thomaskroi1996 commented Jun 27, 2024

@valeriupredoi Hey there! I am still doing some work, and definitely have to do all the tests, therfore I will convert it to a draft now, thank you so much!
As for the 2nd point, I can talk to my supervisor next week, and then maybe we have some suggestions for scientific reviewers, in which case I will reply to you here again!

Cheers! 🍻

@thomaskroi1996 thomaskroi1996 marked this pull request as draft June 27, 2024 14:45
@thomaskroi1996 thomaskroi1996 marked this pull request as ready for review August 9, 2024 08:10
@thomaskroi1996
Copy link
Author

@valeriupredoi Hello! :) I have now finished all the main work now, and implemented everything we wanted to! Now I think it is time for the technical review, correct? Thank you so much for all the help!

@valeriupredoi
Copy link
Contributor

many thanks @thomaskroi1996 - and my apologies, I was on summer holidays until a few days ago, let me do a tech review now, cheers 🍺

Copy link
Contributor

@valeriupredoi valeriupredoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first off - the nagging bits: could you please have a look at the code formatting - we are currently using flake8 to check for it, so it'd be good if you ran flake8 inside your dir once, to see the bits it complains about, alternatively please have a look at the list of issues it reports on the CircleCI output here

Copy link
Contributor

@valeriupredoi valeriupredoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi again @thomaskroi1996 very many thanks for this considerable amount of work! I had a look over the added files - all looks great already, and have a couple relatively minor points, in addition to the few review remarks I posted inline code:

  • you mention E-OBS, do you plan on writing a cmorizer script for that OBS dataset too?
  • API documentation for your public functions looks good but it would have been nice to have it in Numpy-style format - no biggie though, because we let the users choose the way they document their code, since we don't build API docs from diagnostic
  • (the more important point) would it be possible to compact all those data-typed recipes in one single one, or, at least, a couple of them? I see they are differentiated by the input data - you could allow multiple diagnostics typed on various datasets into one single recipe - I reckon that'd be a bit more convenient both for the users, and for us, when we test a new ESMValTool version - we'd not have to run a busload of recipes, but just one 😀

from esmvaltool.diag_scripts.shared import (
run_diagnostic,
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you could please run isort on this, that'll fix the import order; also - are you importing everything from wt_utils ? - could do the otherwise bad from wt_utils import * if so, and save the reader looking at a huge imports list 😁

--------

A diagnostic to calculate Lamb weathertypes over a given region. Furthermore,
correlations between weathertypes and precipitation patterns over a given are can be calculated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
correlations between weathertypes and precipitation patterns over a given are can be calculated
correlations between weathertypes and precipitation patterns over a given area can be calculated

@valeriupredoi
Copy link
Contributor

valeriupredoi commented Sep 19, 2024

@ESMValGroup/scientific-lead-development-team any of you good folk want to have a look at this new diagnostic, and sign up for a scientific review, please? It looks very cool, but unfortunately, I am no expert in weather types, all I know is "rainy" in Britain 🇬🇧 😁

@bettina-gier
Copy link
Contributor

I'll look at it tomorrow, not an expert in weather types but can still see if each part is scientifically sound, esp if there's a linked paper to compare it to!

@bettina-gier bettina-gier self-assigned this Sep 19, 2024
@valeriupredoi
Copy link
Contributor

I'll look at it tomorrow, not an expert in weather types but can still see if each part is scientifically sound, esp if there's a linked paper to compare it to!

wonderful, many thanks, Tina 🍺

Copy link
Contributor

@bettina-gier bettina-gier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great contribution to the ESMValTool! I gotta admit I didn't read all the python code in the utils file, but the formulas seem like the paper you linked and the plots my run got looked like the ones you put in the documentation - units look fine to me and patterns reasonable. So general scientific review gets a big check. I've got some general points though:

On the technical side (YO @valeriupredoi I thought I was doing the sci review!):

  1. Your provenance doesn't seem right, I got the files in the wrong folder (where I started the diagnostics), and I'm not sure which of your output files they actually belong to, calling them the same as your output files is recommended. I think provenance for the plots is missing (best to include it in your plotting functions).
  2. You're outputting both pdf and png files hardcoded, while we have the input_file_type variable in the User config, would be cool to use that instead

On the scientific side, I've got less required changes since the output looks reasonable, but maybe some more quality of life:

  1. Your two observational datasets seem pretty hardcoded, how much of that is that required or could you (within reasonable time) check if you could support other observations/reanalysis? The more modular the better.
  2. Could you put more information on the selection of the extracted region, what is necessary for the computation, what for the output? The recipes use two different extract_regions. It might be nice to know for the user which of these they can change.
  3. Similarly, some of your recipes use different timeframes - obviously to account for the available data - but none of your filenames or plot titles mention the period used for the computation, just "mean". I think this would be nice to add. Maybe also the name of the dataset in the plottitle, the first two plots in the documentation is very similar until you read the caption.

doc/sphinx/source/recipes/recipe_weathertyping.rst Outdated Show resolved Hide resolved

* correlation_threshold: correlation threshold for selecting similar WT pairs, only needed if automatic_slwt==True and predefined_slwt==False. default=0.9
* rmse_threshold: rmse threshold for selecting similar WT pairs, only needed if automatic_slwt==True and predefined_slwt==False. default=0.002
* plotting: if true, create plots of means, anomalies and std for psl, tas, prcp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* plotting: if true, create plots of means, anomalies and std for psl, tas, prcp
* plotting: if true, create plots of means, anomalies and std for psl, tas, pr

(2) see headers of reformat scripts for non-obs4MIPs data for download
instructions.*

* E-OBS: European Climate Assessment & Dataset gridded observationl daily precipitation sum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* E-OBS: European Climate Assessment & Dataset gridded observationl daily precipitation sum
* E-OBS: European Climate Assessment & Dataset gridded daily precipitation sum

add ERA5 as well


description: |
A diagnostic to calculate Lamb weathertypes over a given region. Furthermore,
correlations between weathertypes and precipitation patterns over a given are can be calculated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
correlations between weathertypes and precipitation patterns over a given are can be calculated
correlations between weathertypes and precipitation patterns over a given area can be calculated

automatic_slwt: true
#predefined_slwt: {1: [1, 2, 7, 8, 10, 11, 18, 19, 20, 25, 26], 2: [3, 13], 3: [4, 5, 6, 9, 14, 15, 16, 17], 4: [24, 23]}
plotting: true
script: ../diag_scripts/weathertyping/weathertyping.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
script: ../diag_scripts/weathertyping/weathertyping.py
script: weathertyping/weathertyping.py

ESMValTool automatically looks in the diag_scripts folder for the diagnostics scripts, no need for a relative path from the recipe

Comment on lines +118 to +121
plt.savefig(f'{output_path}/{dataset_name}_{ensemble}_'
f'{wt_string}_rel_occurrence.png')
plt.savefig(f'{output_path}/{dataset_name}_{ensemble}_'
f'{wt_string}_rel_occurrence.pdf')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to only output one plot filetype using the cfg['output_file_type'] from the user_config. You could use the get_plot_filename function from esmvaltool.diag_scripts.shared

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Figures should be saved in the plot_dir, either in both .pdf and .png format (preferred), or respect the output_file_type specified in the [User configuration file]."
I found this in the "Making a New Diagnostic or Recipe", which is why I decided to hardcode it in the first place! Writing only one plot with output_file_type would certainly save some time though!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, that shows you how much I read the documentation =D I personally prefer using the output_file_type but if it says so in the documentation you can keep it if you want to make as few changes as possible.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:D Thank you for the quick reply! I think it certainly makes sense to let the user have a say in the output file type! I will also talk to my supervisor, and then decide! Thank you!

['Lamb Weathertypes'],
False, False)

log_provenance(f'{dataset}_wt_prov', cfg, provenance_record)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All your calls to log_provenance are only giving filenames without the paths, resulting in the provenance files being created in the folder where I start the tool, not together with the files.

@thomaskroi1996
Copy link
Author

@valeriupredoi @bettina-gier Hello both of you and thank you so much for the replies! Not sure how I missed that, so sorry for the late reply!

Thank you so much for the input, I will correct everything as soon as i have some time on my hands! Unfourtunately I have switched to a different project, but it should be fine to allocate some hours throughout the next weeks to work on the mistakes/improvements!

@thomaskroi1996
Copy link
Author

@valeriupredoi @bettina-gier Hello! Currently the Circle CI test seems to fail because I use "from wt_utils import *", just thought i would comment that here!

Also the minor things are dealt with now, leaving only including the possibility for more observational datasets (for which I will talk to @mwjury), and compacting the recipes!

I also need to rerun the recipes so that I can update the documentation with the new plots etc., however since I currently need our computational resources for another project, hopefully next week I can tend to it!

Thanks again for the great input! :)

@valeriupredoi
Copy link
Contributor

@thomaskroi1996 take your time, no rush at all! I can help sometime next week too, knock off some of those Flake8 issues. You are absolutely correct - and my bad - Flake8 flags star imports - always discouraged people do that, but looks like Flake8 is now the Punisher 😁

@thomaskroi1996
Copy link
Author

@valeriupredoi Alright, over the next couple of days I will import only the necessary functions, and I only just discovered that there were more issues with the code, even though flake8 didn't output any anymore, I should have also run pytest!

One issue is "too many local variables". Those started popping up because I tried to fix "too many arguments" by passing Dicts and then reassigning them to local variables in the functions! Those are all marked as minor issues, should I fix those issues? In my opinion I think the way I have it now makes it a bit more readable and less clustered, and if I remember correctly I once saw a PR where it was said that a few of these issues are okay!

@valeriupredoi
Copy link
Contributor

One issue is "too many local variables". Those started popping up because I tried to fix "too many arguments" by passing Dicts and then reassigning them to local variables in the functions! Those are all marked as minor issues, should I fix those issues? In my opinion I think the way I have it now makes it a bit more readable and less clustered, and if I remember correctly I once saw a PR where it was said that a few of these issues are okay!

That's great you tried to clean up, but don't worry too much about the "too many variables" linter message, scientific computing for ESMs is bound to have lots of variables anyway 🍺

@thomaskroi1996
Copy link
Author

One issue is "too many local variables". Those started popping up because I tried to fix "too many arguments" by passing Dicts and then reassigning them to local variables in the functions! Those are all marked as minor issues, should I fix those issues? In my opinion I think the way I have it now makes it a bit more readable and less clustered, and if I remember correctly I once saw a PR where it was said that a few of these issues are okay!

That's great you tried to clean up, but don't worry too much about the "too many variables" linter message, scientific computing for ESMs is bound to have lots of variables anyway 🍺

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New recipe and diagnostic for calculating Lamb Weathertypes
4 participants