Commit

Merge pull request #257 from jmccreight/feat_docs_for_release
update docs for release
jmccreight authored Dec 15, 2023
2 parents be79e76 + d736df5 commit d2d286d
Showing 52 changed files with 2,092 additions and 1,537 deletions.
3 changes: 3 additions & 0 deletions .gitignore
Expand Up @@ -56,6 +56,9 @@ autotest/codecov
autotest/.coverage
autotest/temp

# test_data
test_data/.test_data_version*

# sphinx
generated/

Expand Down
139 changes: 53 additions & 86 deletions README.md
Expand Up @@ -20,53 +20,33 @@
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents**

- [Purpose](#purpose)
- [About](#about)
- [Installation](#installation)
- [Contributing](#contributing)
- [Example Notebooks](#example-notebooks)
- [Overview of Repository Contents](#overview-of-repository-contents)
- [Getting started / Example notebooks](#getting-started--example-notebooks)
- [Community engagement](#community-engagement)
- [Disclaimer](#disclaimer)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Purpose
## About

The purpose of this repository is to refactor and redesign the [PRMS modeling
system](https://www.usgs.gov/software/precipitation-runoff-modeling-system-prms)
while maintaining its functionality. Code modernization is a step towards
unification with [MODFLOW 6 (MF6)](https://github.com/MODFLOW-USGS/modflow6).
Welcome to the pywatershed repository!

The following motivations are taken from our [AGU poster from December
2022](https://agu2022fallmeeting-agu.ipostersessions.com/default.aspx?s=05-E1-C6-40-DF-0D-4D-C7-4E-DE-D2-61-02-05-8F-0A)
which provides additional details on motivations, project status, and current
directions of this project as of approximately January 2023.
Pywatershed is a Python package for simulating hydrologic processes, motivated by
the need to modernize important, legacy hydrologic models at the USGS,
particularly the
[Precipitation-Runoff Modeling System](https://www.usgs.gov/software/precipitation-runoff-modeling-system-prms)
(PRMS, Markstrom et al., 2015) and its role in
[GSFLOW](https://www.usgs.gov/software/gsflow-coupled-groundwater-and-surface-water-flow-model)
(Markstrom et al., 2008).
The goal of modernization is to make these legacy models more flexible as process
representations, to support testing of alternative hydrologic process
conceptualizations, and to facilitate the incorporation of cutting-edge
modeling techniques and data sources. Pywatershed is a place for experimentation
with software design, process representation, and data fusion in the context
of well-established hydrologic process modeling.

Goals of the USGS Enterprise Capacity (EC) project include:

* A sustainable integrated, hydrologic modeling framework for the U.S.
Geological Survey (USGS)
* Interoperable modeling across the USGS, partner agencies, and academia

Goals for EC Watershed Modeling:

* Couple the Precipitation-Runoff Modeling System (PRMS, e.g. Regan et al,
2018)  with MODFLOW 6 (MF6, e.g. Langevin et al, 2017) in a sustainable
way
* Redesign PRMS to be more modern and flexible
* Prioritize process representations in the current National Hydrological
Model (NHM) based on PRMS 5.2.1

Prototype an EC watershed model: "pywatershed"

* Redesign PRMS quickly in python
* Couple to MF6 via BMI/XMI interface (Hughes et al, 2021; Hutton et al, 2020)
* Establish a prototyping ground for EC codes that couples to the compiled
framework: low cost proof of concepts (at the price of potentially less
computational performance) * Enable process representation hypothesis testing
* Use cutting-edge techniques and technologies to improve models
* Machine learning, automatic differentiation
* Address challenges of modeling across space and time scales
* Transition prototype watershed model to compiled EC code
For more information on the goals and status of pywatershed, please see the [pywatershed docs](https://pywatershed.readthedocs.io/).


## Installation
Expand All @@ -81,7 +61,7 @@ all platforms.
The `pywatershed` package is [available on
conda-forge](https://anaconda.org/conda-forge/pywatershed). The installation
is the quickest way to get up and running but provides only the minimal set of
dependencies (not including jupyter nor all packages needed for running the
dependencies (not including Jupyter nor all packages needed for running the
example notebooks, also not suitable for development purposes).

We recommend the following installation procedures to get fully-functional
Expand All @@ -92,7 +72,7 @@ repository before installing `pywatershed` itself. Mamba will be much faster
than Anaconda (but the conda command could also be used).

If you wish to use the stable release, you will use `main` in place of
`<branch>` in the following commands. If you want to follow developemnt, you'll
`<branch>` in the following commands. If you want to follow development, you'll
use `develop` instead.

Without using `git` (directly), you may:
Expand Down Expand Up @@ -120,68 +100,55 @@ you will also need to activate this environment by name.)


We install the `environment_w_jupyter.yml` to provide all known dependencies
including those for running the eample notebooks. (The `environment.yml`
does not contain jupyter or jupyterlab because this interferes with installation
on WholeTale, see Example Notebooks seection below.)
including those for running the example notebooks. (The `environment.yml`
does not contain Jupyter or JupyterLab because this interferes with installation
on WholeTale, see Getting Started section below.)

## Contributing

See the [developer documentation](./DEVELOPER.md) for instructions on setting up
a development environment. See the [contribution guide](./CONTRIBUTING.md) to
contribute to this project.
## Getting started / Example notebooks

## Example Notebooks
Please note that you can browse the API reference, developer info, and index
in the [pywatershed docs](https://pywatershed.readthedocs.io/). But
*the best way to get started with pywatershed is to dive into the example
notebooks*.

For introductory example notebooks, look in the
[`examples/`](https://github.com/EC-USGS/pywatershed/tree/main/examples)
directory in the repository. Numbered starting at 00, these are meant to be
completed in order. Non-numbered notebooks coveradditional topics. These
notebooks are note yet covered by testing and so may be expected to have some
issues until they are added to testing. In `examples/developer/` there are
notebooks of interest to developers who may want to learn about running the
software tests.

Though no notebook outputs are saved in Github, these notebooks can easily
navigated to and run in WholeTale containers (free but sign-up or log-in
required). This is a very easy and quick way to get started without needing to
install pywatershed requirements yourself. WholeTale is an NSF funded project
and supports logins from many institutions, e.g. the USGS, and you may not need
to register.

There are containers for both the `main` and `develop` branches.
completed in order. Notebook outputs are not saved in GitHub, but you can run
these notebooks locally or using WholeTale (an NSF-funded project supporting
logins from many institutions; free, but sign-up or log-in required),
where the pywatershed environment is all ready to go:

[![WholeTale](https://raw.githubusercontent.com/whole-tale/wt-design-docs/master/badges/wholetale-explore.svg)](https://dashboard.wholetale.org)

* [WholeTale container for latest release (main
branch)](https://dashboard.wholetale.org/run/64ae29e8a887f48b9f173678?tab=metadata)
* [WholeTale container for develop
branch](https://dashboard.wholetale.org/run/64ae25c3a887f48b9f1735c8?tab=metadata)
* [Run latest release in WholeTale](https://dashboard.wholetale.org/run/64ae29e8a887f48b9f173678?tab=metadata)
* [Run the develop branch in WholeTale](https://dashboard.wholetale.org/run/64ae25c3a887f48b9f1735c8?tab=metadata)

WholeTale will give you a jupyter-lab running in the root of this
WholeTale will give you a JupyterLab running in the root of this
repository. You can navigate to `examples/` and then open and run the notebooks
of your choice. The develop container may require the user to update the
repository (`git pull origin`) to stay current with development.

## Overview of Repository Contents
Non-numbered notebooks in `examples/` cover additional topics. These
notebooks are not yet covered by testing and you may encounter some
issues. In `examples/developer/` there are notebooks of interest to
developers who may want to learn about running the software tests.
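
As a rough illustration of the ordering described above, the following sketch separates numbered notebooks (to be worked through in order) from non-numbered ones. It is not part of pywatershed: the function name `order_notebooks` and the assumption that numbered notebooks carry a leading digit prefix such as `00_` in their file names are hypothetical.

```python
import re
from pathlib import Path


def order_notebooks(examples_dir):
    """Split notebooks into numbered (sorted by prefix) and non-numbered.

    Assumption: numbered notebooks start with digits, e.g. "00_...ipynb".
    """
    numbered, other = [], []
    for nb in sorted(Path(examples_dir).glob("*.ipynb")):
        match = re.match(r"(\d+)", nb.name)
        if match:
            # Keep the integer prefix so that sorting is numeric, not lexical.
            numbered.append((int(match.group(1)), nb.name))
        else:
            other.append(nb.name)
    return [name for _, name in sorted(numbered)], other


if __name__ == "__main__":
    ordered, extras = order_notebooks("examples")
    print("Work through in order:", ordered)
    print("Additional topics:", extras)
```

Run from the repository root, this prints the suggested reading order; if no `examples/` directory is present it simply prints two empty lists.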

The contents of directories at this level is described. Therein you may discover
another README.md for more information.

```
.github/: Github actions, scripts and Python environments for continuous integration (CI) and releasing,
asv_benchmarks/: preformance benchmarking by ASV
autotest/: pywatershed package testing using pytest
autotest_exs/: pywatershed example notebook testing using pytest
bin/:PRMS executables distributed
doc/:Package/code documentation source code
evaluation/: tools for evaluation of pywatershed
examples/:How to use the package, mostly jupyter notebooks
prms_src/:PRMS source used for generating executables in bin/
pywatershed/:Package source
reference/:Ancillary materials for development
resources/:Static stuff like images
test_data/:Data used for automated testing
```
## Community engagement

We value your feedback! Please use [discussions](https://github.com/EC-USGS/pywatershed/discussions)
or [issues](https://github.com/EC-USGS/pywatershed/issues) on Github.
For more in-depth contributions, please start by reading over
the pywatershed
[DEVELOPER.md](https://github.com/EC-USGS/pywatershed/blob/develop/DEVELOPER.md) and
[CONTRIBUTING.md](https://github.com/EC-USGS/pywatershed/blob/develop/CONTRIBUTING.md)
guidelines.

Thank you for your interest.


## Disclaimer

Expand Down
2 changes: 1 addition & 1 deletion autotest/test_control.py
Expand Up @@ -51,7 +51,7 @@ def test_control_simple(control_simple):
assert control_simple.time_step == ts
assert control_simple.start_time == time_dict["start_time"]
assert control_simple.end_time == time_dict["end_time"]
assert control_simple.current_time is None
assert control_simple.current_time == control_simple.init_time
assert control_simple.itime_step == -1
prev_time = control_simple.current_time
n_times = control_simple.n_times
Expand Down
89 changes: 68 additions & 21 deletions autotest/test_netcdf_output.py
Expand Up @@ -32,6 +32,7 @@ def control(domain):
control.edit_n_time_steps(n_time_steps)
control.options["budget_type"] = "error"
del control.options["netcdf_output_var_names"]
del control.options["netcdf_output_dir"]
return control


Expand All @@ -40,6 +41,15 @@ def control(domain):
# optional variables to budgets

check_vars = {
"PRMSSolarGeometry": [
"soltab_horad_potsw",
"soltab_potsw",
],
"PRMSAtmosphere": [
"tminf",
"potet",
"swrad",
],
"PRMSCanopy": [
"hru_intcpstor",
"hru_intcpstor_change",
Expand All @@ -65,7 +75,24 @@ def control(domain):
def test_process_budgets(domain, control, params, tmp_path, budget_sum_param):
tmp_dir = pl.Path(tmp_path)
# print(tmp_dir)
model_procs = [pywatershed.PRMSCanopy, pywatershed.PRMSChannel]
model_procs = [
pywatershed.PRMSSolarGeometry,
pywatershed.PRMSAtmosphere,
pywatershed.PRMSCanopy,
pywatershed.PRMSChannel,
]

# set up input_dir with copies of the PRMS inputs and outputs
domain_output_dir = domain["prms_output_dir"]
input_dir = tmp_path / "input"
input_dir.mkdir()
control.options["input_dir"] = input_dir

# Could limit this to just the variables in model_procs
for ff in domain_output_dir.resolve().glob("*.nc"):
shutil.copy(ff, input_dir / ff.name)
for ff in domain_output_dir.parent.resolve().glob("*.nc"):
shutil.copy(ff, input_dir / ff.name)

# Deal with parameter around what budget sum vars to write and check
if budget_sum_param == "some":
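
The input staging added in this hunk (copying every `*.nc` file from the PRMS output directory and from its parent into a fresh `input/` directory) can be sketched in isolation as follows. This is a minimal standalone version using only the standard library; the helper name `stage_netcdf_inputs` is hypothetical and does not exist in pywatershed or its tests.

```python
import shutil
from pathlib import Path


def stage_netcdf_inputs(source_dirs, input_dir):
    """Copy all *.nc files found directly in each source dir into input_dir.

    Later sources overwrite earlier ones if file names collide.
    Returns the copied file names in copy order.
    """
    input_dir = Path(input_dir)
    input_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in source_dirs:
        # glob("*.nc") is non-recursive, matching the loops in the test.
        for ff in sorted(Path(src).resolve().glob("*.nc")):
            shutil.copy(ff, input_dir / ff.name)
            copied.append(ff.name)
    return copied
```

In the test above, the equivalent call would pass the domain output directory and its parent as `source_dirs`, and `tmp_path / "input"` as `input_dir`.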
Expand All @@ -80,9 +107,7 @@ def test_process_budgets(domain, control, params, tmp_path, budget_sum_param):
else:
raise ValueError("unexpected value")

# dont need any PRMS inputs for the model specified, so this is sufficient
input_dir = domain["prms_output_dir"]
control.options["input_dir"] = input_dir
control.options["netcdf_output_dir"] = tmp_dir

# TODO: Eliminate potet and other variables from being used
model = Model(
Expand All @@ -91,27 +116,26 @@ def test_process_budgets(domain, control, params, tmp_path, budget_sum_param):
parameters=params,
)

# we are going to harvest the data from memory and store here
check_dict = {proc: {} for proc in check_vars.keys()}

# test outputting specific vars by only using check_vars
output_vars = [
item for sublist in list(check_vars.values()) for item in sublist
]
output_vars = None

with pytest.warns(UserWarning):
with pytest.raises(ValueError):
model.initialize_netcdf(
tmp_dir,
pl.Path("foo"),
budget_args=budget_args,
output_vars=output_vars,
)

with pytest.raises(RuntimeError):
model.initialize_netcdf(
tmp_dir,
budget_args=budget_args,
output_vars=output_vars,
)
model.initialize_netcdf(
output_dir=tmp_dir, # should allow a matching argument to control
budget_args=budget_args,
output_vars=output_vars,
)

for tt in range(n_time_steps):
model.advance()
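
The behavior these assertions exercise (an `output_dir` argument that conflicts with `control.options["netcdf_output_dir"]` raises `ValueError`, while a matching argument is accepted) can be illustrated with a small standalone sketch. This is not pywatershed's implementation: `resolve_output_dir` is a hypothetical helper, and raising `ValueError` when neither source supplies a directory is an assumption made here for simplicity.

```python
from pathlib import Path


def resolve_output_dir(output_dir=None, options=None):
    """Reconcile an explicit output_dir argument with a control-style option.

    Hypothetical helper for illustration only.
    """
    options = options or {}
    from_opts = options.get("netcdf_output_dir")
    if output_dir is None and from_opts is None:
        # Assumption: the error type for this case is not shown in the diff.
        raise ValueError("no output directory given as argument or option")
    if output_dir is not None and from_opts is not None:
        if Path(output_dir) != Path(from_opts):
            raise ValueError("output_dir argument conflicts with control option")
    return Path(output_dir if output_dir is not None else from_opts)
```

Checking for agreement, rather than letting one source silently win, makes a misconfigured test fail at initialization instead of writing files to an unexpected location.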
Expand All @@ -122,11 +146,24 @@ def test_process_budgets(domain, control, params, tmp_path, budget_sum_param):
for vv in pp_vars:
if tt == 0:
# use the output data to figure out the shape
check_dict[pp][vv] = np.zeros(
(n_time_steps, model.processes[pp][vv].shape[0])
)
if isinstance(
model.processes[pp][vv], pywatershed.TimeseriesArray
):
spatial_len = model.processes[pp][vv].data.shape[1]
else:
spatial_len = model.processes[pp][vv].shape[0]

check_dict[pp][vv][tt, :] = model.processes[pp][vv]
check_dict[pp][vv] = np.zeros((n_time_steps, spatial_len))

if isinstance(
model.processes[pp][vv], pywatershed.TimeseriesArray
):
check_dict[pp][vv][tt, :] = model.processes[pp][vv].current
else:
check_dict[pp][vv][tt, :] = model.processes[pp][vv]

if pp in ["PRMSSolarGeometry", "PRMSAtmosphere"]:
continue

for bb in check_budget_sum_vars:
if tt == 0:
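
The branching added in this hunk distinguishes variables that hold every time step (read through `.data` for the shape and `.current` for the active step) from plain arrays that hold only the current step. A dependency-free sketch of that pattern is shown below. `FakeTimeseries` is a hypothetical stand-in, not `pywatershed.TimeseriesArray`, and nested lists replace NumPy arrays so the example runs with the standard library alone.

```python
class FakeTimeseries:
    """Hypothetical stand-in: all time steps in .data, active step in .current."""

    def __init__(self, data):
        self.data = data  # list of rows, i.e. shape (n_time, n_space)
        self.current = data[0]

    def advance(self, tt):
        self.current = self.data[tt]


def spatial_len(var):
    """Spatial length: second dimension of a time series, else first."""
    if isinstance(var, FakeTimeseries):
        return len(var.data[0])
    return len(var)


def current_values(var):
    """Values for the active time step, for either kind of variable."""
    if isinstance(var, FakeTimeseries):
        return list(var.current)
    return list(var)


def harvest(var, n_time_steps, step):
    """Advance with step(tt) and collect one row per time step."""
    table = [[0.0] * spatial_len(var) for _ in range(n_time_steps)]
    for tt in range(n_time_steps):
        step(tt)
        table[tt] = current_values(var)
    return table
```

For example, `harvest(ts, 2, ts.advance)` on a `FakeTimeseries` returns its rows one step at a time, mirroring how the test fills `check_dict` before comparing against the NetCDF output.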
Expand All @@ -148,7 +185,15 @@ def test_process_budgets(domain, control, params, tmp_path, budget_sum_param):
for pp, pp_vars in check_vars.items():
for vv in pp_vars:
nc_data = xr.open_dataset(tmp_dir / f"{vv}.nc")[vv]
assert np.allclose(check_dict[pp][vv], nc_data)
if vv in pywatershed.PRMSSolarGeometry.get_variables():
assert np.allclose(
check_dict[pp][vv], nc_data[0:n_time_steps, :]
)
else:
assert np.allclose(check_dict[pp][vv], nc_data)

if pp in ["PRMSSolarGeometry", "PRMSAtmosphere"]:
continue

for bb in check_budget_sum_vars:
nc_data = xr.open_dataset(tmp_dir / f"{pp}_budget.nc")[bb]
Expand Down Expand Up @@ -194,14 +239,12 @@ def test_separate_together_var_list(
]

# set up input_dir with copies of the PRMS inputs and outputs
test_output_dir = tmp_dir / "test_results"
domain_output_dir = domain["prms_output_dir"]
input_dir = tmp_path / "input"
input_dir.mkdir()
control.options["input_dir"] = input_dir
control.options["netcdf_output_var_names"] = output_vars
control.options["netcdf_output_separate_files"] = separate
del control.options["netcdf_output_dir"]

# Could limit this to just the variables in model_procs
for ff in domain_output_dir.resolve().glob("*.nc"):
Expand All @@ -218,6 +261,7 @@ def test_separate_together_var_list(
# passing no output_dir arg and none in opts throws an error
model.initialize_netcdf()

test_output_dir = tmp_dir / "test_results"
control.options["netcdf_output_dir"] = test_output_dir
model = Model(
model_procs,
Expand Down Expand Up @@ -257,7 +301,10 @@ def test_separate_together_var_list(
assert nc_file.exists()

ds = xr.open_dataset(nc_file, decode_timedelta=False)
proc_vars = set(proc.get_variables())
if output_vars is None:
proc_vars = set(proc.get_variables())
else:
proc_vars = set(check_vars[proc_key])
nc_vars = set(ds.data_vars)
assert proc_vars == nc_vars
for vv in proc.variables: