Home
Welcome to the formulation-selector wiki!
With several models and modules to choose from for a given modeling objective, deciding among them can be overwhelming. RaFTS is here to support data-informed modeling decisions.
As NOAA OWP and its federal and academic partners build the model-agnostic Next Generation Water Resources Modeling Framework (NextGen), we will need to know how to optimally select model formulations and estimate parameter values for ungauged catchments. This problem becomes intractable when considering the unique combinations of current and future model formulations combined with the innumerable possible parameter combinations across the continent. To simplify the model selection problem, we apply an analytical tool that predicts hydrologic formulation performance using community-generated data.
The Regionalization and Formulation Testing & Selection (RaFTS) tool readily predicts how models might perform, or how hydrologic signatures might appear, across ungauged catchments based on catchment attributes such as climate characteristics and geology. RaFTS does this by training an algorithm on the relationships between the response variables (model metrics or hydrologic signatures) and the predictor variables (corresponding catchment attributes). The trained RaFTS algorithm(s) then predict model formulation performance or hydrologic signatures in ungauged catchments based on known catchment attributes. The RaFTS tool is designed such that as the hydrologic modeling community generates more results, better decisions can be made on where model formulations would be best suited. Here we present the baseline results on formulation selection and demonstrate how the hydrologic modeling community may participate in improving and/or using the RaFTS tool to inform NextGen.
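The train-then-predict idea described above can be sketched as follows. This is a minimal illustration using scikit-learn's `RandomForestRegressor`, not the RaFTS implementation: the attribute columns, the synthetic performance metric, and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Hypothetical predictor variables for 200 gauged catchments.
# Columns: aridity index, mean elevation (m), clay fraction.
low, high = [0.2, 50.0, 0.05], [3.0, 3000.0, 0.6]
attrs_gauged = rng.uniform(low, high, size=(200, 3))

# Hypothetical response variable: a synthetic model performance score
# that degrades with aridity, plus a little noise.
metric_gauged = 0.9 - 0.2 * attrs_gauged[:, 0] + rng.normal(0.0, 0.05, size=200)

# Learn the relationship between catchment attributes and the metric.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(attrs_gauged, metric_gauged)

# Ungauged catchments have known attributes but no observed metric.
attrs_ungauged = rng.uniform(low, high, size=(5, 3))
predicted_metric = model.predict(attrs_ungauged)
print(predicted_metric)
```

In practice the predictors would come from retrieved catchment attributes and the response from formulation metrics or hydrologic signatures, rather than from synthetic data.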
The full RaFTS framework consists of multiple tools written in Python and R. More details may be found in the RaFTS processing scripts wiki section. The tools are as follows:
The fsds_proc Python tool converts metrics or hydrologic signature data into a standardized format (a NetCDF file) that is used by subsequent processing and algorithm applications.
The proc.attr.hydfab R tool retrieves the desired catchment attributes to use for training algorithms. Currently, the user can request catchment attribute data from HydroATLAS version 1 and/or the USGS NHD network attributes available through nhdplusTools. This hydrofabric processor retrieves the desired attribute data for a catchment and saves it in a standard format, using the catchment COMID as the standardized location identifier.
The fs_algo Python tool trains a random forest and/or multilayer perceptron on the catchment attributes acquired using proc.attr.hydfab to predict formulation metrics or hydrologic signatures. The tool saves the trained algorithms and stores evaluation metrics as metadata alongside them. The fs_algo module also lets users specify multiple values for each hyperparameter if a grid search for the optimal hyperparameters is desired during algorithm training.
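A hyperparameter grid search of the kind described for algorithm training might look like the sketch below. It uses scikit-learn's `GridSearchCV` on synthetic data and shows only the random forest case. The grid values, variable names, and scoring choice are assumptions for illustration and do not reflect the fs_algo configuration format.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)

# Synthetic stand-ins: 120 catchments, 4 attributes, one response metric.
X = rng.normal(size=(120, 4))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(0.0, 0.1, size=120)

# Hypothetical grid: every combination of these values is evaluated.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid=param_grid,
    cv=3,          # 3-fold cross-validation per combination
    scoring="r2",  # assumed evaluation metric for this example
)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

The same pattern applies to a multilayer perceptron by swapping in `MLPRegressor` with a grid over its own hyperparameters, such as `hidden_layer_sizes`.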
The prediction steps are custom and specific to the end user's goals. This is where the end user takes the trained algorithms they deem suitable and uses them to make additional predictions. While this code base has some examples of what this could look like, they are not formalized processing steps like those described above, and they reference paths to datasets that are not part of this repo.
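As a rough picture of such a prediction step, the sketch below saves a trained algorithm, reloads it, and predicts for new catchments. Using `joblib` for persistence is an assumption, as are the file name and the synthetic data. The actual storage format and paths used by RaFTS are not specified here.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Synthetic training data standing in for attributes and a response metric.
X_train = rng.normal(size=(100, 3))
y_train = X_train[:, 0] + 0.5 * X_train[:, 2]
trained = RandomForestRegressor(n_estimators=20, random_state=1).fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; a real workflow would use its own paths.
    model_path = Path(tmp_dir) / "rf_example.joblib"
    joblib.dump(trained, model_path)    # persist the trained algorithm
    reloaded = joblib.load(model_path)  # later: load it for prediction

# Attributes for new locations must use the same columns, in the same
# order, as the training data.
X_new = rng.normal(size=(4, 3))
predictions = reloaded.predict(X_new)
print(predictions)
```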