GuyLitt-NOAA edited this page Jan 9, 2025 · 7 revisions

Welcome to the formulation-selector wiki!

Regionalization and Formulation Testing & Selection (RaFTS) 🛟

With several models and modules to choose from for a given modeling objective, deciding among them can be overwhelming. RaFTS supports data-informed modeling decisions.


As NOAA OWP and its federal and academic partners build the model-agnostic Next Generation Water Resources Modeling Framework (NextGen), we will need to know how to optimally select model formulations and estimate parameter values for ungauged catchments. This problem becomes intractable given the many combinations of current and future model formulations and the innumerable possible parameter sets across the continent. To simplify the model selection problem, we apply an analytical tool that predicts hydrologic formulation performance using community-generated data.

The Regionalization and Formulation Testing & Selection (RaFTS) tool predicts how models might perform, or how hydrologic signatures might appear, across ungauged catchments based on catchment attributes such as climate characteristics and geology. RaFTS does this by training an algorithm on the relationships between the response variables (model metrics or hydrologic signatures) and the predictor variables (the corresponding catchment attributes). The trained RaFTS algorithm(s) then predict model formulation performance or hydrologic signatures in ungauged catchments from known catchment attributes. The tool is designed so that, as the hydrologic modeling community generates more results, better decisions can be made about where each model formulation is best suited. Here we present the baseline results on formulation selection and demonstrate how the hydrologic modeling community may participate in improving and/or using the RaFTS tool to inform NextGen.

RaFTS Overview

The full RaFTS framework consists of multiple tools written in Python and R. More details may be found in the RaFTS processing scripts wiki section. The tools are:

1. Metrics/hydrologic signature dataset processor

The fsds_proc Python tool converts metrics or hydrologic signature data into a standardized format (a NetCDF file) that is used by subsequent processing and algorithm applications.
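As a rough illustration of what standardizing metrics data can involve, the sketch below reshapes a wide table (one column per metric) into uniform long-form records. It uses only the Python standard library and does not write NetCDF. It is not the fsds_proc API: `MetricRecord`, `standardize`, the column names, and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRecord:
    """One standardized row: a metric value for one formulation at one location."""
    gage_id: str
    formulation: str
    metric: str
    value: float


def standardize(rows, id_col, formulation, metric_cols):
    """Convert wide rows (one column per metric) into long-form records."""
    records = []
    for row in rows:
        for col in metric_cols:
            raw = row.get(col)
            if raw in (None, ""):
                continue  # skip missing values rather than guessing them
            records.append(
                MetricRecord(str(row[id_col]), formulation, col, float(raw))
            )
    return records


# Made-up input, shaped like the output of csv.DictReader.
raw_rows = [
    {"basin": "gage_A", "NSE": "0.62", "KGE": "0.55"},
    {"basin": "gage_B", "NSE": "0.71", "KGE": ""},
]
records = standardize(
    raw_rows, id_col="basin", formulation="example_model", metric_cols=["NSE", "KGE"]
)
```

Whatever the input layout, every downstream step then sees the same four fields, which is the point of a standardized intermediate format.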

2. Hydrofabric attribute processor

The proc.attr.hydfab R tool retrieves the catchment attributes used to train algorithms. Users can currently request catchment attribute data from HydroATLAS version 1 and/or the USGS NHD network attributes from nhdplusTools. The hydrofabric processor retrieves the requested attribute data for a catchment and saves it in a standard format, using the catchment COMID as the standardized location identifier.
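Because the COMID is the shared location identifier, response values and catchment attributes can be aligned by joining on it. The sketch below shows one way such a join might look in plain Python. The function name, attribute names, identifiers, and values are all hypothetical, and it does not reproduce the storage format that proc.attr.hydfab writes.

```python
def join_on_comid(response_by_comid, attrs_by_comid, attr_names):
    """Inner-join responses and attributes on COMID.

    Returns (comids, X, y, dropped): aligned rows in sorted COMID order, plus
    the COMIDs that had a response but no complete attribute record.
    """
    comids, X, y, dropped = [], [], [], []
    for comid in sorted(response_by_comid):
        attrs = attrs_by_comid.get(comid)
        if attrs is None or any(attrs.get(name) is None for name in attr_names):
            dropped.append(comid)
            continue
        comids.append(comid)
        X.append([attrs[name] for name in attr_names])  # fixed column order
        y.append(response_by_comid[comid])
    return comids, X, y, dropped


# Made-up identifiers and values, for illustration only.
response = {"comid_3": 0.41, "comid_1": 0.66, "comid_2": 0.58}
attributes = {
    "comid_1": {"aridity": 0.3, "forest_frac": 0.7},
    "comid_3": {"aridity": 0.9, "forest_frac": 0.1},
}
comids, X, y, dropped = join_on_comid(response, attributes, ["aridity", "forest_frac"])
```

Returning the dropped identifiers makes it easy to see which locations still need attribute data before training.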

3. Algorithm training and prediction

The fs_algo Python tool trains a random forest and/or multilayer perceptron on the catchment attributes acquired with proc.attr.hydfab to predict formulation metrics or hydrologic signatures. The tool saves the trained algorithms and generates evaluation metrics as metadata for them. The fs_algo module also lets users specify multiple values per hyperparameter if a grid search for the best-performing hyperparameter combination is desired during training.
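The grid-search idea can be sketched as follows: every combination of the listed hyperparameter values is used to fit a model, each fit is scored on held-out data, and the best combination is kept. To stay dependency-free, a small k-nearest-neighbour regressor stands in for the random forest or multilayer perceptron. The function names, the grid, and the synthetic data are assumptions for illustration, not the fs_algo interface.

```python
from itertools import product
from math import dist, sqrt


def knn_predict(train_X, train_y, x, k, weighted):
    """Predict the response at x from the k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    if not weighted:
        return sum(yi for _, yi in nearest) / len(nearest)
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (dist(xi, x) + 1e-9) for xi, _ in nearest]
    return sum(w * yi for w, (_, yi) in zip(weights, nearest)) / sum(weights)


def rmse(train_X, train_y, val_X, val_y, k, weighted):
    """Root-mean-square error on the validation set for one hyperparameter set."""
    errs = [
        (knn_predict(train_X, train_y, x, k, weighted) - yi) ** 2
        for x, yi in zip(val_X, val_y)
    ]
    return sqrt(sum(errs) / len(errs))


def grid_search(train_X, train_y, val_X, val_y, grid):
    """Score every combination in `grid`; return (best_params, best_score)."""
    names = list(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = rmse(train_X, train_y, val_X, val_y, **params)
        if best is None or score < best[1]:
            best = (params, score)
    return best


# Synthetic data: two attributes in [0, 1]; the response depends on the first.
X = [(i / 11, (i * 7 % 12) / 11) for i in range(12)]
y = [0.8 - 0.5 * a for a, _ in X]
train_X, train_y = X[::2], y[::2]    # even rows for training
val_X, val_y = X[1::2], y[1::2]      # odd rows held out for validation

grid = {"k": [1, 2, 3], "weighted": [False, True]}
best_params, best_score = grid_search(train_X, train_y, val_X, val_y, grid)
```

The cost grows with the product of the list lengths, so the number of values listed per hyperparameter directly controls training time.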

4 & 5. Examples of how to predict the response variables in other locations

These steps are custom and specific to the end user's goals. Here the end user takes the trained algorithms they deem suitable and uses them to make additional predictions. The code base includes some examples of what this could look like, but they are not formalized processing steps like those above, and they reference paths to datasets that are not part of this repo.
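As a generic illustration of this prediction step (not one of the repo's examples), the sketch below restores a saved model and applies it to catchments with known attributes. A toy nearest-neighbour model serialized as JSON stands in for a trained algorithm. The function names, attribute names, identifiers, and values are hypothetical.

```python
import json
from math import sqrt


def euclid(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def save_model(attr_names, X, y):
    """Serialize a toy nearest-neighbour model together with its attribute order."""
    return json.dumps({"attr_names": attr_names, "X": X, "y": y})


def predict(model_json, attrs_by_comid):
    """Predict the response for each catchment from its known attributes."""
    model = json.loads(model_json)
    preds = {}
    for comid, attrs in attrs_by_comid.items():
        # Order the attributes exactly as they were ordered during training.
        x = [attrs[name] for name in model["attr_names"]]
        nearest = min(
            range(len(model["X"])), key=lambda i: euclid(model["X"][i], x)
        )
        preds[comid] = model["y"][nearest]
    return preds


# Made-up training data and ungauged catchments, for illustration only.
saved = save_model(["aridity", "forest_frac"], [[0.2, 0.9], [0.8, 0.1]], [0.7, 0.4])
ungauged = {
    "comid_10": {"forest_frac": 0.8, "aridity": 0.25},
    "comid_11": {"aridity": 0.9, "forest_frac": 0.2},
}
predictions = predict(saved, ungauged)
```

Storing the attribute names alongside the model guards against feeding predictors in a different order than the one used in training.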