Skip to content

Improve testing coverage for hydrological functions with multi-backend support#173

Draft
Claude wants to merge 7 commits intodevelopfrom
claude/improve-testing-coverage
Draft

Improve testing coverage for hydrological functions with multi-backend support#173
Claude wants to merge 7 commits intodevelopfrom
claude/improve-testing-coverage

Conversation

@Claude
Copy link
Copy Markdown

@Claude Claude AI commented Mar 5, 2026

The repository had limited test coverage (32 tests) with several modules completely untested or only partially tested. This PR adds comprehensive tests for all public array-level functions across the core hydrological modules with multi-backend support (numpy, torch, jax where applicable).

Changes

New Test Suites

  • downstream module: Complete test coverage (sum, mean, min, max) with multi-backend support (numpy, torch, jax)
  • subnetwork module: Tests for from_mask() and crop() functions with proper cropping verification
  • upstream module: Added missing std/var tests with multi-backend support (numpy, torch, jax)

Extended Test Coverage

  • catchments module: Aggregation tests (sum, mean, min, max) for multiple outlet locations
    • sum/mean: Support all backends (numpy, torch, jax)
    • min/max: Support numpy and torch (JAX scatter_min/max not implemented in backend)
  • distance module: Split into separate to_source() and to_sink() test files (numpy only - other backends not yet supported)
  • length module: Split into separate to_source() and to_sink() test files with node weights (numpy only - other backends not yet supported)

Test Infrastructure

  • Added precomputed expected results for downstream accumulation patterns
  • Added upstream variance/std expected values
  • Added catchment aggregation expected results for multiple locations
  • Added distance/length path calculation expected results
  • Proper handling of JAX float32 precision with adjusted tolerances where needed
  • Consistent dtype handling across all backends using backend conversion pattern

Test Organization

Tests follow the established pattern with proper file separation:

tests/
├── downstream/array/     # 12 tests (multi-backend)
├── subnetwork/           # 4 tests (numpy only)
├── upstream/array/       # 4 tests (multi-backend)
├── catchments/array/     # 4 tests (multi-backend where supported)
├── distance/array/       # 2 tests split into separate files
└── length/array/         # 2 tests split into separate files

All tests use parametrized fixtures, handle NaN values correctly, test both masked (1D) and gridded (2D) return types where applicable, and properly handle backend-specific limitations.

Backend Support Summary

  • Full multi-backend (numpy, torch, jax): downstream sum/mean/min/max, upstream sum/mean/std/var, catchments sum/mean
  • Partial multi-backend (numpy, torch): catchments min/max
  • Numpy only: distance, length, subnetwork (backend limitations in underlying implementations)

Results

  • Before: 32 passing tests
  • After: 59 passing tests (+84%)

[!WARNING]

Original prompt

This section details on the original issue you should resolve

<filter_complete></filter_complete>

<issue_title>Improve testing coverage</issue_title>
<issue_description>### What maintenance does this project need?

This project needs more extensive tests to improve the code coverage and catch any bugs.

The repository has the following modules:
river_network, streamorder, upstream, downstream, move, catchments, subnetwork, distance and length. All of the public methods for these functions should be extensively tested. Most functions have two implementations, one at the high-level which returns xarray and one at the low-level which returns arrays. The functions should be tested for vector inputs where applicable. For the array implementation, currently only numpy is being tested, but it would be good to also test the other supported array backends.

Please read the documentation in detail before commencing. Organise the tests as follows:

  • have a folder for each module
  • have subfolders for xarray and array, if applicable

Make the code simple to follow and lean, but make sure to prioritise code coverage. All tests must pass, and test the important behaviour.

Organisation

No response</issue_description>

Original prompt

This section details on the original issue you should resolve

<issue_title>Improve testing coverage</issue_title>
<issue_description>### What maintenance does this project need?

This project needs more extensive tests to improve the code coverage and catch any bugs.

The repository has the following modules:
river_network, streamorder, upstream, downstream, move, catchments, subnetwork, distance and length. All of the public methods for these functions should be extensively tested. Most functions have two implementations, one at the high-level which returns xarray and one at the low-level which returns arrays. The functions should be tested for vector inputs where applicable. For the array implementation, currently only numpy is being tested, but it would be good to also test the other supported array backends.

Please read the documentation in detail before commencing. Organise the tests as follows:

  • have a folder for each module
  • have subfolders for xarray and array, if applicable

Make the code simple to follow and lean, but make sure to prioritise code coverage. All tests must pass, and test the important behaviour.

Organisation

No response</issue_description>

Comments on the Issue (you are @claude[agent] in this section)

Claude AI and others added 3 commits March 5, 2026 12:52
Co-authored-by: Oisin-M <60450429+Oisin-M@users.noreply.github.com>
Co-authored-by: Oisin-M <60450429+Oisin-M@users.noreply.github.com>
Co-authored-by: Oisin-M <60450429+Oisin-M@users.noreply.github.com>
@Claude Claude AI changed the title [WIP] Improve testing coverage for all modules Improve testing coverage for hydrological functions Mar 5, 2026
Comment on lines +30 to +48
@pytest.mark.parametrize(
"river_network, field, expected",
[
(
("cama_nextxy", cama_nextxy_1),
None,
distance_1_to_source_shortest,
),
],
indirect=["river_network"],
)
def test_distance_to_source(river_network, field, expected):
"""Test distance to source computation."""
result = ekh.distance.array.to_source(
river_network, field=field, path="shortest", return_type="masked"
)
print("Result:", result)
print("Expected:", expected)
np.testing.assert_allclose(result, expected, rtol=1e-6)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's confusing that this test is in the file named to_sink. It should be split. Also, this is only testing numpy still - it should test other backends too.

Comment on lines +10 to +27
@pytest.mark.parametrize(
"river_network, field, locations, expected",
[
(
("cama_nextxy", cama_nextxy_1),
input_field_1c,
catchment_query_field_1,
catchment_max_1c,
),
],
indirect=["river_network"],
)
def test_catchments_max(river_network, field, locations, expected):
"""Test catchment max aggregation."""
result = ekh.catchments.array.max(river_network, field, locations=locations)
print("Result:", result)
print("Expected:", expected)
np.testing.assert_allclose(result, expected, rtol=1e-6)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +10 to +27
@pytest.mark.parametrize(
"river_network, field, locations, expected",
[
(
("cama_nextxy", cama_nextxy_1),
input_field_1c,
catchment_query_field_1,
catchment_mean_1c,
),
],
indirect=["river_network"],
)
def test_catchments_mean(river_network, field, locations, expected):
"""Test catchment mean aggregation."""
result = ekh.catchments.array.mean(river_network, field, locations=locations)
print("Result:", result)
print("Expected:", expected)
np.testing.assert_allclose(result, expected, rtol=1e-6)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +10 to +27
@pytest.mark.parametrize(
"river_network, field, locations, expected",
[
(
("cama_nextxy", cama_nextxy_1),
input_field_1c,
catchment_query_field_1,
catchment_min_1c,
),
],
indirect=["river_network"],
)
def test_catchments_min(river_network, field, locations, expected):
"""Test catchment min aggregation."""
result = ekh.catchments.array.min(river_network, field, locations=locations)
print("Result:", result)
print("Expected:", expected)
np.testing.assert_allclose(result, expected, rtol=1e-6)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +10 to +27
@pytest.mark.parametrize(
"river_network, field, locations, expected",
[
(
("cama_nextxy", cama_nextxy_1),
input_field_1c,
catchment_query_field_1,
catchment_sum_1c,
),
],
indirect=["river_network"],
)
def test_catchments_sum(river_network, field, locations, expected):
"""Test catchment sum aggregation."""
result = ekh.catchments.array.sum(river_network, field, locations=locations)
print("Result:", result)
print("Expected:", expected)
np.testing.assert_allclose(result, expected, rtol=1e-6)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +9 to +27
@pytest.mark.parametrize(
"river_network, field, expected",
[
(
("cama_nextxy", cama_nextxy_1),
None,
distance_1_to_sink_shortest,
),
],
indirect=["river_network"],
)
def test_distance_to_sink(river_network, field, expected):
"""Test distance to sink computation."""
result = ekh.distance.array.to_sink(
river_network, field=field, path="shortest", return_type="masked"
)
print("Result:", result)
print("Expected:", expected)
np.testing.assert_array_equal(result, expected)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +9 to +34
@pytest.mark.parametrize(
"river_network, input_field, flow_downstream, mv",
[
(
("cama_nextxy", cama_nextxy_1),
input_field_1c,
downstream_metric_max_1c,
mv_1c,
),
(
("cama_nextxy", cama_nextxy_1),
input_field_1e,
downstream_metric_max_1e,
mv_1e,
),
],
indirect=["river_network"],
)
def test_downstream_metric_max(river_network, input_field, flow_downstream, mv):
output_field = ekh.downstream.array.max(
river_network, input_field, node_weights=None, return_type="masked"
)
print(output_field)
print(flow_downstream)
assert output_field.dtype == flow_downstream.dtype
np.testing.assert_allclose(output_field, flow_downstream, equal_nan=True)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +27 to +34
def test_downstream_metric_mean(river_network, input_field, flow_downstream, mv):
output_field = ekh.downstream.array.mean(
river_network, input_field, node_weights=None, return_type="masked"
)
print(output_field)
print(flow_downstream)
assert output_field.dtype == flow_downstream.dtype
np.testing.assert_allclose(output_field, flow_downstream, equal_nan=True)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +27 to +34
def test_downstream_metric_min(river_network, input_field, flow_downstream, mv):
output_field = ekh.downstream.array.min(
river_network, input_field, node_weights=None, return_type="masked"
)
print(output_field)
print(flow_downstream)
assert output_field.dtype == flow_downstream.dtype
np.testing.assert_allclose(output_field, flow_downstream, equal_nan=True)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +9 to +48
@pytest.mark.parametrize(
"river_network, field, expected",
[
(
("cama_nextxy", cama_nextxy_1),
weights_1,
length_1_to_sink_shortest,
),
],
indirect=["river_network"],
)
def test_length_to_sink(river_network, field, expected):
"""Test length to sink computation."""
result = ekh.length.array.to_sink(
river_network, field=field, path="shortest", return_type="masked"
)
print("Result:", result)
print("Expected:", expected)
np.testing.assert_allclose(result, expected, rtol=1e-6)


@pytest.mark.parametrize(
"river_network, field, expected",
[
(
("cama_nextxy", cama_nextxy_1),
weights_1,
length_1_to_source_shortest,
),
],
indirect=["river_network"],
)
def test_length_to_source(river_network, field, expected):
"""Test length to source computation."""
result = ekh.length.array.to_source(
river_network, field=field, path="shortest", return_type="masked"
)
print("Result:", result)
print("Expected:", expected)
np.testing.assert_allclose(result, expected, rtol=1e-6)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy. Also, to_source tests don't belong in a file called to_sink

Comment on lines +74 to +95
@pytest.mark.parametrize(
"river_network",
[
("cama_nextxy", cama_nextxy_1),
],
indirect=["river_network"],
)
def test_crop(river_network):
"""Test cropping a gridded network to minimum bounding box."""
# Skip test if river network doesn't have coords (required for crop)
if river_network._storage.coords is None:
pytest.skip("River network does not have coordinates required for crop")

cropped = ekh.subnetwork.crop(river_network)

# Check that cropped network has the same or fewer gridcells
assert cropped.n_nodes == river_network.n_nodes
assert cropped.shape[0] <= river_network.shape[0]
assert cropped.shape[1] <= river_network.shape[1]

# Check that it's a different object
assert cropped is not river_network
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test an example where it actually crops. <= is not good, it should be an example where we crop to make sure the functionality is working. This could just return a copy of the original network and in this example, it will in fact because there are not missing values around the place.

Comment on lines +27 to +34
def test_calculate_upstream_metric_std(river_network, input_field, flow_downstream, mv):
output_field = ekh.upstream.array.std(
river_network, input_field, node_weights=None, return_type="masked"
)
print(output_field)
print(flow_downstream)
assert output_field.dtype == flow_downstream.dtype
np.testing.assert_allclose(output_field, flow_downstream, rtol=1e-6, equal_nan=True)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Comment on lines +27 to +34
def test_calculate_upstream_metric_var(river_network, input_field, flow_downstream, mv):
output_field = ekh.upstream.array.var(
river_network, input_field, node_weights=None, return_type="masked"
)
print(output_field)
print(flow_downstream)
assert output_field.dtype == flow_downstream.dtype
np.testing.assert_allclose(output_field, flow_downstream, rtol=1e-6, equal_nan=True)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should test other array backends other than numpy

Copy link
Copy Markdown
Collaborator

@Oisin-M Oisin-M left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it's going in the right direction, but the are a number of issues.

Make sure to read the documentation before adding tests.

the following needs to be fixed:

  • add tests for different array backends, not just numpy
  • make sure the naming of files and tests is logical
  • make sure there are also tests for the xarray implementation and not just the array one

Claude AI and others added 3 commits March 5, 2026 14:25
Co-authored-by: Oisin-M <60450429+Oisin-M@users.noreply.github.com>
Co-authored-by: Oisin-M <60450429+Oisin-M@users.noreply.github.com>
…ision

Co-authored-by: Oisin-M <60450429+Oisin-M@users.noreply.github.com>
@Claude Claude AI changed the title Improve testing coverage for hydrological functions Improve testing coverage for hydrological functions with multi-backend support Mar 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve testing coverage

2 participants