
evidence: restore HDF5-to-tile smoke runner + model.py axis-order fix #45

Merged
jakobtfaber merged 1 commit into main from worktree-smoke-runner-restore-v2 on May 4, 2026


@jakobtfaber
Contributor

Summary

  • Restores the lost dsa110_continuum/evidence/hdf5_calibrator_tile_smoke.py (1539 lines) plus its CLI wrapper, test suite, and evidence-tree README — only the compiled .pyc files survived
  • Hybrid recovery: 38 of 63 chronological Codex apply_patch envelopes replayed via the real codex binary, then 9 Claude-implemented symbols spliced from extracted patch envelopes (validated against the bytecode disassembly)
  • Bundles a fix for the calibration/model.py axis-order bug surfaced by the smoke runner test suite: the manual MODEL_DATA path silently broke on MS files written with the alternate (npol, nchan) axis order

Why this is bundled

The model.py fix and the smoke runner restoration share the same Codex session lineage, and the test that catches the bug lives in test_hdf5_calibrator_tile_smoke.py. Splitting them would force one PR either to ship a known-broken path or to skip the regression test that guards it.

Test plan

  • pytest tests/test_hdf5_calibrator_tile_smoke.py — 14/14 pass
  • pytest -k 'calibration or model or smoke or applycal or bandpass' — 105/105 pass
  • ruff check clean for changed files (one pre-existing D414 in model.py:990 left as-is per CLAUDE.md tracking-separately rule)
  • Real 3C48 end-to-end evidence smoke run: deferred, not blocking. The FLAG-fix path was already validated against real .b tables in #44 (calibration: receptor-aware FLAG fraction QA gate)

Recovery provenance

Codex session lineage:

  • Build: ~/.codex/sessions/2026/04/30/rollout-2026-04-30T00-44-50-…jsonl (47 smoke patches)
  • Follow-up performance-recovery sessions: 9 sessions, 16 additional patches

Claude session lineage:

  • 0eb30783-4208-…jsonl and 1930b697-13f1-…jsonl (Edit/Write tool calls for the 9 symbols not visible in apply_patch envelopes)

Bytecode validated against dsa110_continuum/evidence/__pycache__/hdf5_calibrator_tile_smoke.cpython-312.pyc (52-symbol top-level structure, exact line-number match for spliced functions).
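
The bytecode check described above can be approximated with the standard library alone. The following is a minimal sketch, not the script used for this recovery: `top_level_symbols` and `demo` are hypothetical names, and it assumes a CPython 3.7+ `.pyc` (16-byte header) read by the same interpreter version that wrote it, since the marshal format is version-specific. It lists only module-level functions and classes with their first line numbers; module-level lambdas or comprehensions would also show up as code objects.

```python
import marshal
import py_compile
import tempfile
import types
from pathlib import Path

# Assumption: CPython 3.7+ layout (magic, flags, then mtime/size or source hash).
PYC_HEADER_BYTES = 16


def top_level_symbols(pyc_path):
    """Return (name, first_line) for each code object defined at module level."""
    payload = Path(pyc_path).read_bytes()[PYC_HEADER_BYTES:]
    module_code = marshal.loads(payload)
    symbols = [
        (const.co_name, const.co_firstlineno)
        for const in module_code.co_consts
        if isinstance(const, types.CodeType)
    ]
    return sorted(symbols, key=lambda item: item[1])


def demo():
    """Compile a tiny throwaway module and list its top-level symbols."""
    source = (
        "def alpha():\n"
        "    return 1\n"
        "\n"
        "\n"
        "class Beta:\n"
        "    pass\n"
        "\n"
        "\n"
        "def gamma():\n"
        "    return 3\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "sample.py"
        src.write_text(source)
        pyc = py_compile.compile(
            str(src), cfile=str(Path(tmp) / "sample.pyc"), doraise=True
        )
        return top_level_symbols(pyc)


if __name__ == "__main__":
    for name, line in demo():
        print(f"{line:5d}  {name}")
```

Comparing this listing for the surviving `.pyc` against the same listing for a freshly compiled restored source is one way to confirm that spliced functions start on the expected lines.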

🤖 Generated with Claude Code

… history

The dsa110_continuum/evidence/hdf5_calibrator_tile_smoke.py module and
companion test file went missing from the working tree, leaving only their
__pycache__/*.pyc bytecode and prior run artifacts under outputs/.

This commit restores the full smoke runner via a hybrid recovery:
- Replayed 38 of 63 chronological apply_patch envelopes harvested from the
  Codex session JSONLs covering the build (#38) plus follow-up performance
  recovery work, using the real codex apply_patch binary
- Bytecode-disassembled the .cpython-312.pyc to identify the 9 symbols that
  were Claude-implemented (and therefore not in any apply_patch envelope):
  _load_conversion_api, SmokeRunConfig, StageResult, SmokeRunManifest,
  _validated_run_id, create_work_run_dir, _audit_group_files,
  _cached_transit_time, _cached_isot_time
- Spliced those symbols back from extracted patch envelope content, with
  the documented decorator/lru_cache parameters from the corresponding
  refactor patches

Bundled fix for calibration/model.py axis-order bug surfaced by the smoke
runner test suite: _calculate_manual_model_data now identifies the channel
axis by matching against the SPW channel count, supports both (nchan, npol)
and (npol, nchan) DATA cell layouts, and broadcasts the point-source model
to every correlation while preserving MS DATA axis order. The previous
hard-coded (nchan, npol) assumption silently corrupted MODEL_DATA on MS
files written with the alternate axis order.
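
The axis handling described above can be illustrated with a small pure-Python sketch. It is not the code in calibration/model.py: `find_channel_axis` and `broadcast_point_source` are hypothetical names, nested lists stand in for the arrays the real function presumably uses, and raising on an ambiguous shape (nchan equal to npol) is an assumption about how such a case could be treated.

```python
from typing import List, Sequence, Tuple


def find_channel_axis(cell_shape: Tuple[int, int], nchan: int) -> int:
    """Return the index (0 or 1) of the channel axis in a 2-D DATA cell."""
    matches = [axis for axis, size in enumerate(cell_shape) if size == nchan]
    if len(matches) != 1:
        # Either no axis matches, or nchan == npol and the layout is ambiguous.
        raise ValueError(
            f"cannot identify channel axis in {cell_shape} for nchan={nchan}"
        )
    return matches[0]


def broadcast_point_source(
    model_per_channel: Sequence[complex], cell_shape: Tuple[int, int]
) -> List[List[complex]]:
    """Copy a per-channel model to every correlation, keeping the cell's axis order."""
    nchan = len(model_per_channel)
    chan_axis = find_channel_axis(cell_shape, nchan)
    npol = cell_shape[1 - chan_axis]
    if chan_axis == 0:
        # (nchan, npol): one row per channel, repeated across correlations.
        return [[value] * npol for value in model_per_channel]
    # (npol, nchan): one row per correlation, each holding the full spectrum.
    return [list(model_per_channel) for _ in range(npol)]


if __name__ == "__main__":
    spectrum = [1 + 0j, 2 + 0j, 3 + 0j]
    print(broadcast_point_source(spectrum, (3, 2)))  # (nchan, npol) layout
    print(broadcast_point_source(spectrum, (2, 3)))  # (npol, nchan) layout
```

Matching on the SPW channel count rather than on axis position is what lets one code path serve both layouts; the cost is that a cell with equal channel and correlation counts cannot be resolved by shape alone.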

Tests: 14/14 smoke runner tests pass, full calibration/model/photometry
subset (105 tests) green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 4, 2026 20:08
Contributor

Copilot AI left a comment


Pull request overview

This PR restores the HDF5-to-calibrator-tile smoke evidence runner (module + CLI wrapper + tests + evidence README) and updates the manual MODEL_DATA writer in calibration/model.py to correctly preserve (nchan, npol) vs (npol, nchan) axis ordering.

Changes:

  • Reintroduce dsa110_continuum.evidence.hdf5_calibrator_tile_smoke (discovery + run orchestration) along with a lightweight scripts/ entrypoint.
  • Add a focused pytest suite covering discovery/preflight helpers, stage logging, and the MODEL_DATA axis-order regression.
  • Fix _calculate_manual_model_data() to detect the channel axis by SPW channel count and write MODEL_DATA in the same per-row axis order as DATA.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| dsa110_continuum/evidence/hdf5_calibrator_tile_smoke.py | Restores the smoke evidence workflow implementation (discovery + run pipeline), plus supporting utilities. |
| dsa110_continuum/calibration/model.py | Fixes manual MODEL_DATA generation to respect alternate DATA axis order. |
| tests/test_hdf5_calibrator_tile_smoke.py | Adds tests for discovery behavior, stage logging, cal-table validation, and the MODEL_DATA regression. |
| scripts/hdf5_calibrator_tile_smoke.py | Adds a CLI wrapper entrypoint to invoke the evidence runner. |
| outputs/hdf5_calibrator_tile_smoke/README.md | Documents the evidence workflow and discovery usage. |
| dsa110_continuum/evidence/__init__.py | Introduces the evidence package initializer. |


Comment on lines +294 to +298
def create_immutable_run_dir(config: SmokeRunConfig, run_id: str | None = None) -> Path:
    """Create a new run directory and refuse in-place overwrite."""
    chosen_run_id = run_id or config.run_id or _utc_run_id(config.calibrator, config.group_id)
    run_dir = config.evidence_root / chosen_run_id
    run_dir.mkdir(parents=True, exist_ok=False)
_run_stage(manifest, "preflight", preflight)

with _stage_log(run_dir / "logs" / "01_convert.log"):
    conversion = _run_stage(manifest, "conversion", lambda: _conversion_stage(config, run_dir))
Comment on lines +56 to +58
The current implementation covers the discovery/preflight phase only. The H17
execution phase still needs the production conversion/calibration/imaging wiring
described in issue #38.
group,
files,
config.fwhm_deg,
file_audit=_audit_group_files(config.group_id, files),
Comment on lines +1217 to +1221
bp_table, g_table = validate_fresh_cal_tables(tables, run_dir)
_write_json(
    run_dir / "calibration" / "fresh_calibration_tables.json",
    {"tables": [str(t) for t in tables], "bp_table": str(bp_table), "g_table": str(g_table)},
)
"imaging",
lambda: _imaging_stage(
config,
work_dir,
@jakobtfaber jakobtfaber merged commit 9841710 into main May 4, 2026
5 checks passed