Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate RTW user config file #3392

Closed
alistairsellar opened this issue Oct 24, 2023 · 7 comments · Fixed by #3615
Closed

Validate RTW user config file #3392

alistairsellar opened this issue Oct 24, 2023 · 7 comments · Fixed by #3615
Assignees
Labels
enhancement Recipe Test Workflow (RTW) Items relevant to the Recipe Test Workflow

Comments

@alistairsellar
Copy link
Contributor

alistairsellar commented Oct 24, 2023

Is your feature request related to a problem? Please describe.
The recipe test workflow creates a user config file that encapsulates settings (e.g. DRS) defined in the cylc workflow config. This user config file should be validated in, or immediately after, the configure task, so that the workflow fails faster in the event of a config error.

Part of #2786

Would you be able to help out?
Yes

@alistairsellar alistairsellar added enhancement Recipe Test Workflow (RTW) Items relevant to the Recipe Test Workflow labels Oct 24, 2023
@alistairsellar alistairsellar self-assigned this Oct 24, 2023
@alistairsellar
Copy link
Contributor Author

There doesn't seem to be an esmvaltool command line argument to validate a config file, so I think the best solution is to import esmvalcore.config.CFG. According to the docs:

All values entered into the config are validated to prevent mistakes, for example, it will warn you if you make a typo in the key:

CFG['output_directory'] = '~/esmvaltool_output'
InvalidConfigParameter: `output_directory` is not a valid config parameter. 

@ehogan
Copy link
Contributor

ehogan commented Oct 26, 2023

Possible scope creep, but whatever you develop as part of this issue could be added as a command line option? I think this could be achieved by adding a validate or equivalent method to the esmvalcore._main.Config class?

@alistairsellar
Copy link
Contributor Author

Yes, scope creep. Will keep it in mind and open a separate issue if it looks like a straightforward extension. New subcommand esmvaltool config validate would need wider consultation of course.

@alistairsellar
Copy link
Contributor Author

My thinking thus far was to use esmvalcore.config.CFG directly in
configure.py
most likely as a parallel to the dictionary update step that I just removed with #3398. This would then validate at the point of adding the values from the RTW environment. Or even perhaps inside get_config_values_from_task_env instead of creating plain dict. Will show you what I mean next time we're on E1.

@ehogan
Copy link
Contributor

ehogan commented Oct 26, 2023

I wonder whether it would be worth validating the configuration file written to disk, since that file will be used by the process tasks? 🤔 It can still be done in configure.py 😊

@chrisbillowsMO
Copy link

chrisbillowsMO commented May 22, 2024

Changes to User Configuration Files

Be aware that user configuration files are changing:

Useful Objects/Files in ESMValCore

  • Key files are in esmvalcore/config/
  • _config_object.py is home to Config() and CFG
  • _config_validators.py is home to the validation functions and the _validators dict
  • Config() inherits from ValidatedConfig() which is where (I think) the validators are maybe, actually called.
  • ValidatedConfig() live in _validated_config.py`.
  • ValidatedConfig() inherits from MutableMapping from collections.abc` - which was as deep as I went so far.

Basics

Some basics for myself in case useful to others.

By default, instantiating a CFG object loads and validates the user configuration file from your default location:

from esmvalcore.config._config_object import CFG
my_local_CFG = CFG
print(my_local_CFG)

If you try and change a value to an invalid value you'll get an error:

my_local_CFG['run_diagnostic'] = 100 # expects a bool

To load a specific user configuration file:

non_default_usr_cfg = CFG._load_user_config(Path("path/to/non_default_user_config.yml"))
print(non_default_usr_cfg)

The CFG object instantiates a Config() object under the hood and can be used directly (not recommended).

my_local_Config = Config()
my_local_Config.load_from_file(Path("path/to/user_config_file_.yml)")
print(my_local_Config)

Important

When a user config file is loaded it is automatically validated and will throw an error e.g. InvalidConfigParameter: Key 'exit_on_warning': Could not convert '100' to 'bool'

Solution 1 - Instantiate a CFG object from the user config YAML file we create

This could easily be used to validate the user config file created in configure.py. HOWEVER, this will throw on the first error invalid value encountered; multiple errors would require multiple re-runs and multiple indivual file corrections.

Exploring better:

Solution 2 - Call the Validators manually

There is a constant _validators that can be accessed as an attribute on CFG or Configure() objects.

CFG._validate
Configure._validate

or imported directly:

from esmvalcore.config._config_validators import ValidationError, _validators

It returns a dict where the key is a user configuration field and the value is the validation function for that field.

{
    'auxiliary_data_dir': <function validate_path at 0x7f0e1964e950>, 
    'compress_netcdf': <function validate_bool at 0x7f0e1964e7a0>
    'config_developer_file': <function validate_config_developer at 0x7f9a0e787370>, 
    'download_dir': <function validate_path at 0x7f9a0e786950>, 
    'drs': <function validate_drs at 0x7f9a0e7872e0>, 
    'exit_on_warning': <function validate_bool at 0x7f9a0e7867a0>
    ...
}

The validation functions live in _config_validators.py.

Currently configure.py has config_values = get_config_values_from_task_env(). We could iterate over the dictionary config_values and run the appropriate validation function within a try/except clause, allowing us to capture all the errors and raise at the end - displaying ALL invalid fields to the user at once.

E.g.

def validate_user_config_file(user_config_content):
    """Collects errors from running ``ESMValCore.config._validators`` functions.
    
    Parameters
    ----------
    user_config_content: dict
        An ESMValTool user configuration file loaded into memory as a Python
        dictionary.
    """
    errors = []
    for cfg_key, cfg_value in user_config_content.items():
        try:
            validatation_function = _validators[cfg_key]
        except KeyError as err:
            errors.append(f'Key Error for {cfg_key.upper()}. May not be a valid ESMValTool user configuration key\nERROR: {err}\n') 
        else:
            try:
                print(f'Validating {cfg_key.upper()} with value "{cfg_value}" using function {validatation_function.__name__.upper()}.')
                validatation_function(cfg_value)
            except ValidationError as err:
                errors.append(f'Validation error for {cfg_key.upper()} with value "{cfg_value}"\nERROR: {err}\n')
    if errors:
        print(errors)
        raise ValidationError("\n".join(errors))

Key Errors

In addition to ValidationErrors this code also catches (intentionally raised) KeyErrors. I haven't yet figured out how ESMValCore responds to rogue fields in the user config file. There seems to be something relating to this in ValidatedConfig() but given

I think this code suggest that _validate is an exhaustive list of all valid configuration fields:

    def __setitem__(self, key, val):
        """Map key to value."""
        if key not in self._validate:
            raise InvalidConfigParameter(
                f"`{key}` is not a valid config parameter."
            )

And _valid is populated from _validators - if so we can assume our KeyErrors are "definitely" rogue fields. (This is something to delve into more deeply later if we decide to go this route).

Raise Now vs Raise Later

CFG, Configure() and ValidatedConfig() seem (forgive my possible ignorance) to be built on the principle to raise errors immediately. Changing this on first glance would seem to be a major refactor e.g. altering every validation method individually (again, I may be wrong here).

If we do want a CLI option in ESMValCore per ESMValGroup/ESMValCore#2246 one way to achieve this might be a class decorator. I've played around with this a tiny bit and seems possible but a thought for another day.

Discussion with @ehogan

We discussed that Solution No. 1 is too inelegant and runs against the "fail quickly" imperative behind the initial.

I am therefore going to implement a fully tested version of Solution 2 - raising Key Errors with possible rogue fields noted as well as Validation Errors.

We'll get it to that stage and then at Draft PR potentially seek wise counsel on some of the broader considerations.

@chrisbillowsMO
Copy link

chrisbillowsMO commented May 23, 2024

This is how it behaves when run within the workflow.

With Valid values

The configure tasks runs as expected with current get_config_values_from_task_env values. This output is added to std.out

Validating AUXILIARY_DATA_DIR with value "" using function VALIDATE_PATH.
Validating CHECK_LEVEL with value "DEFAULT" using function VALIDATE_CHECK_LEVEL.
Validating DOWNLOAD_DIR with value "" using function VALIDATE_PATH.
Validating DRS with value "{'ana4mips': 'default', 'CMIP3': 'default', 'CMIP5': 'BADC', 'CMIP6': 'BADC', 'CORDEX': 'default', 'native6': 'default', 'OBS': 'default', 'obs4MIPs': 'default', 'OBS6': 'default'}" using function VALIDATE_DRS.
Validating EXTRA_FACETS_DIR with value "[]" using function VALIDATE_PATHLIST.
Validating MAX_PARALLEL_TASKS with value "4" using function VALIDATE_INT_OR_NONE.
Validating OUTPUT_DIR with value "XXXXXXX" using function VALIDATE_PATH.
Validating REMOVE_PREPROC_DIR with value "False" using function VALIDATE_BOOL.
Validating ROOTPATH with value "{'ana4mips': '', 'CMIP3': '', 'CMIP5': 'XXXXXXX', 'CMIP6': 'XXXXXXX', 'CORDEX': '', 'native6': '', 'OBS': 'XXXXXXX', 'obs4MIPs': 'XXXXXXX', 'OBS6': '', 'RAWOBS': ''}" using function VALIDATE_ROOTPATH.
All validation checks passed.

One Invalid Value

Made this change to get_config_values_from_task_env:

-        "download_dir": "",
+        "download_dir": 100, 

The configure tasks now fails. This is the std.err output:

    raise ValidationError("\n".join(errors))
esmvalcore.config._config_validators.ValidationError: There were validation errors in your user configuration file. See details below.

Validation error for DOWNLOAD_DIR with value "100"
ERROR: Expected a path, but got 100

[FAIL] rtw-env configure.py # return-code=1
2024-05-23T14:54:37Z CRITICAL - failed/ERR

Multiple Invalid Values & A Key Error

Made these change to get_config_values_from_task_env:

+        "INVALID_KEY" : "value",

-        "download_dir": "",
+        "download_dir": 100, 

-        "max_parallel_tasks": int(os.environ["MAX_PARALLEL_TASKS"]),
+        "max_parallel_tasks": "lots and lots", 

-        "remove_preproc_dir": False,
+        "remove_preproc_dir": 100, 

The configure task again fails and this is the std.err.

File "XXXXXXXX/configure.py", line 127, in validate_user_config_file
    raise ValidationError("\n".join(errors))
esmvalcore.config._config_validators.ValidationError: There were validation errors in your user configuration file. See details below.

Key Error for INVALID_KEY. May not be a valid ESMValTool user configuration key
ERROR: 'INVALID_KEY'

Validation error for DOWNLOAD_DIR with value "100"
ERROR: Expected a path, but got 100

Validation error for MAX_PARALLEL_TASKS with value "lots and lots"
ERROR: Could not convert 'lots and lots' to int

Validation error for REMOVE_PREPROC_DIR with value "100"
ERROR: Could not convert `100` to `bool`

[FAIL] rtw-env configure.py # return-code=1
2024-05-23T14:48:56Z CRITICAL - failed/ERR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Recipe Test Workflow (RTW) Items relevant to the Recipe Test Workflow
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants