Make dask.yml path configurable #2369

Open
malininae opened this issue Mar 13, 2024 · 6 comments
Labels
dask related to improvements using Dask enhancement New feature or request

Comments

@malininae
Contributor

@Karen-A-Garcia and I are operationalizing ESMValTool and running a few recipes sequentially. Although our Dask expertise is quite limited, we've run into a problem: this recipe doesn't seem to work with any Dask setup we've tried, while the other recipes do work. We'll try to figure out what the deal is, but I can see the value of adding the optional configurable dask.yml path to the config_user.yml, sort of like config_developer.yml is set up. Any objections? If someone could volunteer, great; if not, I can do it in a dreamland called 'after April 15th'.

@malininae malininae added enhancement New feature or request dask related to improvements using Dask labels Mar 13, 2024
@bouweandela
Member

bouweandela commented Apr 3, 2024

It looks like that recipe is using a preprocessor function that is not yet lazy (see #674 and SciTools/iris#5795), so your best bet is the default threaded scheduler (you get this by removing the file ~/.esmvaltool/dask.yml). If you run out of memory, try reducing the number of workers (num_workers) in your Dask configuration, e.g. by creating a file ~/.config/dask/dask.yml with the content num_workers: 4. See the Dask docs for more information.
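
For anyone who wants to script that last step, here is a minimal sketch using only the Python standard library. The helper name `write_dask_num_workers` is made up for illustration and is not part of Dask or ESMValTool; it just writes the one-line file described above and overwrites any existing file of the same name.

```python
from pathlib import Path


def write_dask_num_workers(num_workers, config_dir=Path.home() / ".config" / "dask"):
    """Write a minimal Dask config file that limits the number of workers.

    Hypothetical helper: it creates <config_dir>/dask.yml containing a single
    `num_workers` entry and overwrites an existing file of that name.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be a positive integer")
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    config_file = config_dir / "dask.yml"
    config_file.write_text(f"num_workers: {num_workers}\n")
    return config_file


# Example (not executed here, because it writes to your home directory):
# write_dask_num_workers(4)
```

Passing a different `config_dir` makes it easy to try this out in a scratch directory first.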

@bouweandela
Member

I can see the value of adding the optional configurable dask.yml path to the config_user.yml, sort of like config_developer.yml is set up. Any objections?

Improving how users can configure Dask is indeed something we would like to do, see #2040 for previous discussion on the topic. However, the plan is to tackle this as part of a larger overhaul of the configuration, as users often find the current way of configuring things, where settings are spread out over multiple files, confusing. See #2371 for the plan.

@k-a-webb

I'd like to encourage the option of having a configurable Dask configuration.
Ideally it would be easy to select a specific dask configuration (or no configuration) for individual recipes, just as you can choose a specific config_user file.

At the moment, if one of my recipes does not play nice with dask, it's too much effort to interrupt my workflow to move/rename the ~/.esmvaltool/dask.yml file. For example, I'd have to run all recipes in groups according to their required dask configuration. It's faster to run everything in parallel without dask! This is a shame given the effort to implement dask within ESMValTool.

While I appreciate that there is an effort to make this happen (as part of #2371, as I understand it), I think there could be an immediate benefit to resolving this issue.
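
Until something like that exists, the move/rename workaround mentioned above could be scripted roughly as follows. This is only a sketch under stated assumptions: `dask_config_disabled` is a made-up helper, not an ESMValTool feature, and the default path is the ~/.esmvaltool/dask.yml location mentioned earlier in this thread.

```python
import contextlib
from pathlib import Path

# Assumed default location, taken from earlier in this thread.
DEFAULT_DASK_CONFIG = Path.home() / ".esmvaltool" / "dask.yml"


@contextlib.contextmanager
def dask_config_disabled(config_path=DEFAULT_DASK_CONFIG):
    """Temporarily rename the Dask config file, then restore it.

    Hypothetical helper: while the `with` block runs, the file is moved to
    `<name>.disabled`, so a tool that looks for it finds nothing.
    """
    config_path = Path(config_path)
    backup = config_path.with_name(config_path.name + ".disabled")
    moved = config_path.exists()
    if moved:
        config_path.rename(backup)
    try:
        yield
    finally:
        # Restore the file even if the wrapped code raised an exception.
        if moved:
            backup.rename(config_path)


# Example (not executed here):
# import subprocess
# with dask_config_disabled():
#     subprocess.run(["esmvaltool", "run", "recipe.yml"], check=True)
```

Because the file is shared by every run, this only helps when recipes run one after another; recipes started in parallel would all see the renamed file, which is exactly the grouping problem described above.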

@schlunma
Contributor

Hi @k-a-webb, I fully agree with you, making dask configurable per recipe would be a very convenient feature!

There is currently a pull request in review that needs to be merged before this issue can be tackled (#2448). Would you be willing to have a look at it from a user's perspective? We already had a round of technical reviews, so we are mainly interested in the user-friendliness of the proposed approach.

That would be very helpful and speed up the process. Thank you! 👍

@k-a-webb

Hi, apologies for the delayed response.
I spent a day muddling through trying to install and test the new configuration, but ran into a few hiccups. I then went on vacation, and have not had time to make progress upon my return. I hope that my inability to do a dedicated test of the pull request does not cause a delay in its progress. If I find the time to document my issues, or make further progress on the test, I will be in touch.

@schlunma
Contributor

Thanks @k-a-webb for looking into this and also sorry for the late answer (I just returned from vacation). Is there anything we can help with regarding installation? To test this, you would need an installation from source of ESMValTool and ESMValCore.

Our plan was to merge this by the end of September, but if you still want to test this and need more time, that's also no problem at all. Thanks!!

4 participants