Skip to content

fix: don't mark missing multiscales metadata as processed#96

Open
wietzesuijker wants to merge 2 commits intoEOPF-Explorer:mainfrom
wietzesuijker:fix/s2-multiscales-metadata-none-handling
Open

fix: don't mark missing multiscales metadata as processed#96
wietzesuijker wants to merge 2 commits intoEOPF-Explorer:mainfrom
wietzesuijker:fix/s2-multiscales-metadata-none-handling

Conversation

@wietzesuijker
Copy link
Contributor

@wietzesuijker wietzesuijker commented Dec 12, 2025

Problem

add_multiscales_metadata_to_parent() can legitimately return None when inputs are insufficient to generate multiscales metadata, e.g.

  • fewer than 2 resolutions are available
  • CRS is missing
  • bounds are missing

Previously, this “skipped” outcome was still recorded in processed_groups["/measurements/reflectance"], making runs look complete even though no multiscales metadata was written.

Fix

  • Type add_multiscales_metadata_to_parent() as xr.DataTree | None.
  • If it returns None, log an info message and do not record it in processed_groups.
  • Preserve parent-group attrs during consolidation by avoiding recreation of groups that already exist on disk (prevents overwriting attrs like multiscales / zarr_conventions).
  • Make multiscales metadata writing more robust when .rio.* metadata is unavailable (fixture-like inputs):
    • prefer rioxarray for CRS/bounds when available
    • fall back to provided crs
    • fall back to estimating bounds from 1D x/y coords when .rio.bounds() is unavailable

Notes

  • Multiscales metadata is written as parent-group attributes (no dataset write).
  • processed_groups intentionally tracks leaf dataset groups written (used later for consolidation), so it should not be updated for the attrs-only step.

Testing

  • uv run pytest -q tests/test_s2_multiscale.py tests/test_s2_multiscale_geo_metadata.py
  • uv run pytest -q tests/test_cli_e2e.py::test_convert_s2_optimized

Comment on lines 266 to 268

if dt_multiscale is None:
log.warning("Multiscales metadata not added", base_path=base_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here processed_groups is never updated...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's intentional, @emmanuelmathot. I added 2b43aa0 to clarify why:

multiscales metadata is written as parent-group attrs (no dataset write). processed_groups is used to track dataset groups written (and is later used for consolidation), so it is intentionally not updated for the attrs-only step.

The naming is perhaps not the easiest here. I added some suggestions to the PR description above

@wietzesuijker wietzesuijker force-pushed the fix/s2-multiscales-metadata-none-handling branch from 6fca5cf to 2b43aa0 Compare December 12, 2025 17:21
- Don’t treat attrs-only multiscales writes as processed dataset groups\n- Preserve existing group attrs during root consolidation (e.g. /measurements/reflectance multiscales)\n- Add CRS/bounds fallbacks so multiscales attrs can be written from fixtures
@wietzesuijker wietzesuijker force-pushed the fix/s2-multiscales-metadata-none-handling branch from 2b43aa0 to 5f09a93 Compare December 12, 2025 19:02
@codecov-commenter
Copy link

codecov-commenter commented Dec 12, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 61.81818% with 21 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/eopf_geozarr/s2_optimization/s2_multiscale.py 39.39% 20 Missing ⚠️
...pf_geozarr/data_api/geozarr/multiscales/geozarr.py 92.85% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants