
Integrate BMI init config automatic generation into workflows #607

Merged

Conversation

robertbartel
Contributor

Integrate use of the BMI init config generation tools from the ngen-config-gen package (in the ngen-cal repo), deriving the BMI init config dataset from the realization config and hydrofabric datasets at the AWAITING_DATA job exec step.

@robertbartel robertbartel added enhancement New feature or request maas MaaS Workstream labels May 7, 2024
@robertbartel robertbartel force-pushed the f/enable_bmi_init_gen_in_workflow/main branch from 5b4720b to 6e7ce15 (May 13, 2024 13:37)
See https://www.lynker-spatial.com/copyright.html for license details on hydrofabric data.
"""

_module_to_model_map: Dict[str, Any] = {"CFE": Cfe, "PET": Pet}
Member

What are your thoughts on moving this out of the class and passing this in as a parameter to __init__? My thinking is this will provide a smoother pathway for adding support for new modules as they are introduced.

Contributor Author

Hmm, not sure. I see the benefit of parameterizing things; I added the other_builder_hook_types init param to future proof along those lines. But I don't see any reason why we ever wouldn't want these available to an instance, certainly for the moment but probably also in the long term. So they should be hard-coded somewhere. And where is better than here?

Member

Yeah that is a fair point. My thinking was that the code that uses this would parametrize it. So, that would most likely be in a service instead of a library. My thinking is it will be easier / less painful to update service code rather than library code.

Contributor Author

My thinking is it will be easier / less painful to update service code rather than library code.

Maybe, but only if we only need to do that once. If we are putting BmiInitConfigAutoGenerator into a library, we intend it to at least eventually be used in more than just one place in one service (it's a fair question whether we should do this at all, but for the moment we weren't asking that).

Again though, I assume for this that an unrecognized model_type_name in the realization config should be treated as invalid, from which follows that certain builders should always be available to an instance.
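The trade-off being debated above can be sketched in a few lines: keep the known builders hard-coded as a default so they are always available to an instance, while still accepting an override via `__init__` for service code. This is a hypothetical illustration, not the actual DMOD class; `object` stands in for the real `Cfe`/`Pet` builder types.

```python
from typing import Any, Dict, Optional

# Hypothetical default map; in the real code this holds Cfe, Pet, etc.
_DEFAULT_MODULE_TO_MODEL_MAP: Dict[str, Any] = {"CFE": object, "PET": object}

class BmiInitConfigAutoGeneratorSketch:
    def __init__(self, module_to_model_map: Optional[Dict[str, Any]] = None):
        # Always start from the hard-coded defaults, so known builders
        # are available to every instance; callers may extend/override.
        self._module_to_model_map = dict(_DEFAULT_MODULE_TO_MODEL_MAP)
        if module_to_model_map:
            self._module_to_model_map.update(module_to_model_map)
```

This keeps library behavior stable while leaving service code a seam for adding new modules without a library release.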

Member
@aaraney left a comment

In general, this looks really good. I left a few minor comments that shouldn't take long to get through.


_module_to_model_map: Dict[str, Any] = {"CFE": Cfe, "PET": Pet}
""" Map of config strings to builders, for modules with builders than can be easily init with no more info. """
_no_init_config_modules = {"SLOTH"}
Member

This should also be moved out. I think just for the first go, we can keep the builders and "ignored" modules as separate __init__ params, but it might be useful at some point to combine these ideas into a dataclass.

Contributor Author

Similarly to the above, I didn't see a scenario when we could avoid excluding SLOTH. So hard-coding the exclusion seemed appropriate.

I suppose (and again, this lines up with the earlier param) this all assumes that an unrecognized model_type_name in the realization config is invalid, rather than something the user wants ignored. I feel like that is more appropriate in this context: fail immediately, instead of silently trusting that the user didn't make a mistake (e.g., a typo) and turning any actual errors here into failures at a later step. But I'm open to discussing that further.

Member

Same thinking as above. It will be easier to update service code rather than library code.
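The fail-fast behavior argued for above can be illustrated with a small sketch: generate configs for supported modules, skip modules known to need no init config (like SLOTH), and raise immediately for anything unrecognized rather than silently ignoring it. The names `SUPPORTED_MODULES` and `check_module` are assumptions for illustration, not the actual DMOD identifiers.

```python
SUPPORTED_MODULES = {"CFE", "PET"}        # modules with init config builders
NO_INIT_CONFIG_MODULES = {"SLOTH"}        # modules needing no init config

def check_module(model_type_name: str) -> bool:
    """Return True if an init config should be generated, False if the
    module is known to need none; raise for anything unrecognized."""
    if model_type_name in NO_INIT_CONFIG_MODULES:
        return False
    if model_type_name in SUPPORTED_MODULES:
        return True
    # Fail immediately on a typo or unsupported module, rather than
    # deferring the error to a later workflow step.
    raise ValueError(f"Unrecognized BMI module '{model_type_name}' in realization config")
```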

self.output_dirs: Dict[int, Path] = dict()

# TODO: (later) find a better way to do this
hf_gpkg = Path.home().joinpath("Developer/noaa/data/hydrofabric/v201/nextgen_01.gpkg")
Member

These will need to change before we merge.

Contributor Author

Ah, so ... maybe not.

These tests are skipped for now. I ran them locally, and I wanted to contribute them, but I'd intended on leaving them out of the automated runs for the time being (I suppose I need to open an issue for doing it later but haven't yet). Mainly that was to figure out how to best handle making the necessary hydrofabric data available.

If you object, then we can address the issue now. In short, I think the best option is using Mike's hydrofabric subsetting tools and committing something smaller to the repo, but I'm not terribly familiar with those. I'm also not sure if there'd be any significance/benefit to choosing any particular subset. I.e., it just would take more time, so I was trying to defer that.

Contributor Author

I've opened #617 for tracking the task of updating the tests.

"Developer/noaa/dmod/data/example_realization_configs/ex_realization_config_03.json")
noah_params_dir = Path.home().joinpath("Developer/noaa/data/noah_owp_ex_params_dir_1")

self.output_dirs[0] = hf_model_data.parent.joinpath(f"{self.__class__.__name__}_out_0")
Member

Any reason not to use tempfile.TemporaryDirectory here instead?

Contributor Author

Hmm, well, as discussed above, I'd planned on revisiting the tests a bit later, and these paths might be revised then also. For the moment at least, since the tests are more manual and incomplete, this was to make it easier to go examine the created files.
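For reference, the `tempfile.TemporaryDirectory` approach suggested above looks roughly like the following in a standard unittest lifecycle. This is a generic sketch (the test body is a placeholder), not the actual DMOD test class.

```python
import tempfile
import unittest
from pathlib import Path

class ExampleGeneratorTests(unittest.TestCase):
    def setUp(self):
        # Created fresh before each test; removed again in tearDown.
        self._tmp = tempfile.TemporaryDirectory()
        self.output_dir = Path(self._tmp.name)

    def tearDown(self):
        self._tmp.cleanup()

    def test_output_dir_exists(self):
        self.assertTrue(self.output_dir.is_dir())
```

The trade-off noted in the reply is that a temp dir is deleted after each test, which makes manually inspecting generated files harder while the tests are still semi-manual.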


self.output_dirs[0] = hf_model_data.parent.joinpath(f"{self.__class__.__name__}_out_0")
self._prep_output_dir(self.output_dirs[0])
self.generators[0] = BmiInitConfigAutoGenerator(ngen_realization=NgenRealization.parse_file(real_cfg_file),
Member

Just a minor tweak here, since setUp and tearDown are run before and after each test respectively, it might make sense to memoize reading the realization, HF, and model attributes files from disk and just create a new BmiInitConfigAutoGenerator each setUp.

Contributor Author

Sure, we can do that. I've added something.

configs = []
cat_id, config = next(gen_pyobj)
configs.append(config)
while True:
Member

Thoughts on using a for loop instead?

Suggested change:

```diff
-while True:
+configs = []
+gen_pyobj = generator.generate_configs()
+cat_id, config = next(gen_pyobj)
+configs.append(config)
+for cid, config in gen_pyobj:
+    if cid == cat_id:
+        configs.append(config)
+    else:
+        break
```

Contributor Author

Sure. Don't remember exactly why I did things that way. I've made a change.
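As a runnable sketch of the adopted pattern: consume the first `(catchment_id, config)` pair, then use a for loop to collect the remaining configs for that same catchment, stopping at the first pair belonging to a different one. The fake generator below stands in for `generator.generate_configs()`.

```python
def fake_generate_configs():
    # Stand-in for generator.generate_configs(): yields (cat_id, config)
    # pairs grouped by catchment.
    yield ("cat-1", "cfe-config")
    yield ("cat-1", "pet-config")
    yield ("cat-2", "cfe-config")

gen_pyobj = fake_generate_configs()
cat_id, config = next(gen_pyobj)
configs = [config]
for cid, config in gen_pyobj:
    if cid != cat_id:
        break  # first pair for the next catchment; stop collecting
    configs.append(config)
```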

@robertbartel robertbartel force-pushed the f/enable_bmi_init_gen_in_workflow/main branch from 103f1ad to a676d20 (May 14, 2024 19:04)
robertbartel and others added 17 commits May 14, 2024 16:49
Creating util class leveraging ngen.config_gen that creates configs as
needed for a particular realization config and hydrofabric, within the
dmod.modeldata package.
Adding function to BmiInitConfigAutoGenerator to get a list of the
supported BMI modules (i.e., the name strings used to configure them
within a realization config) for which the class can generate init
configs.
Update data service with capability to derive/generate BMI init config
dataset in certain conditions.
Memoize/cache creation of NgenRealization and hydrofabric data objects.
Use for loop instead of "while True".
Assert in find_hydrofabric_files that there is only a single gpkg file to protect against unsupported case.

Co-authored-by: Austin Raney <[email protected]>
Moving to use of "@lru_cache(maxsize=None)" instead of simply "@cache"
to maintain compatibility with 3.8.
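For context on that last commit: `functools.cache` was only added in Python 3.9, and it is simply `lru_cache(maxsize=None)` under the hood, so code that must still run on 3.8 uses the older spelling directly. A minimal illustration:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # equivalent to @functools.cache on Python 3.9+
def expensive(n: int) -> int:
    # Placeholder for an expensive computation or file parse.
    return n * n
```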
@aaraney aaraney merged commit 1a367da into NOAA-OWP:master May 15, 2024
8 checks passed
@robertbartel robertbartel deleted the f/enable_bmi_init_gen_in_workflow/main branch May 15, 2024 19:27