evidence: restore HDF5-to-tile smoke runner + model.py axis-order fix by jakobtfaber · Pull Request #45 · dsa110/dsa110-continuum

jakobtfaber · 2026-05-04T20:08:07Z

Summary

Restores the lost dsa110_continuum/evidence/hdf5_calibrator_tile_smoke.py (1539 lines) plus its CLI wrapper, test suite, and evidence-tree README — only the compiled .pyc files survived
Hybrid recovery: 38 of 63 chronological Codex apply_patch envelopes replayed via the real codex binary, then 9 Claude-implemented symbols spliced from extracted patch envelopes (validated against the bytecode disassembly)
Bundles a fix for the calibration/model.py axis-order bug surfaced by the smoke runner test suite: the manual MODEL_DATA path silently broke on MS files written with the alternate (npol, nchan) axis order

Why this is bundled

The model.py fix and the smoke runner restoration share the same Codex session lineage, and the test that catches the bug lives in test_hdf5_calibrator_tile_smoke.py. Splitting them would force one PR either to ship a known-broken path or to omit a regression test from the restored test file.

Test plan

pytest tests/test_hdf5_calibrator_tile_smoke.py — 14/14 pass
pytest -k 'calibration or model or smoke or applycal or bandpass' — 105/105 pass
ruff check clean for changed files (one pre-existing D414 in model.py:990 left as-is per CLAUDE.md tracking-separately rule)
Real 3C48 evidence smoke run end-to-end (deferred — not blocking, FLAG-fix path already validated against real .b tables in calibration: receptor-aware FLAG fraction QA gate #44)

Recovery provenance

Codex session lineage:

Build: ~/.codex/sessions/2026/04/30/rollout-2026-04-30T00-44-50-…jsonl (47 smoke patches)
Follow-up performance-recovery sessions: 9 sessions, 16 additional patches

Claude session lineage:

0eb30783-4208-…jsonl and 1930b697-13f1-…jsonl (Edit/Write tool calls for the 9 symbols not visible in apply_patch envelopes)

Bytecode validated against dsa110_continuum/evidence/__pycache__/hdf5_calibrator_tile_smoke.cpython-312.pyc (52-symbol top-level structure, exact line-number match for spliced functions).
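A minimal sketch of how a top-level symbol check against a surviving .pyc could look. This is not the validation script used for this PR; `load_pyc_code` and `top_level_symbols` are hypothetical helper names, and the 16-byte header size assumes CPython 3.7 or later. Because the marshal format is version-specific, a cpython-312 .pyc has to be read with a CPython 3.12 interpreter.

```python
import marshal
import types
from pathlib import Path

# CPython 3.7+ .pyc files start with a 16-byte header (magic number, flags,
# then either mtime + source size or a source hash). The marshalled module
# code object follows the header.
PYC_HEADER_BYTES = 16


def load_pyc_code(pyc_path):
    """Load the module-level code object from a .pyc file.

    Must run under the same CPython minor version that wrote the file.
    """
    raw = Path(pyc_path).read_bytes()
    return marshal.loads(raw[PYC_HEADER_BYTES:])


def top_level_symbols(module_code):
    """Map each top-level function or class name to its first source line.

    Functions and class bodies defined at module level appear as nested
    code objects in the module's co_consts tuple.
    """
    return {
        const.co_name: const.co_firstlineno
        for const in module_code.co_consts
        if isinstance(const, types.CodeType)
    }
```

Comparing this mapping for the surviving .pyc against the same mapping for a freshly compiled recovered source file gives a quick structural check: the name sets should match, and the line numbers should agree for any function whose position was restored exactly.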

🤖 Generated with Claude Code

… history

The dsa110_continuum/evidence/hdf5_calibrator_tile_smoke.py module and companion test file went missing from the working tree, leaving only their __pycache__/*.pyc bytecode and prior run artifacts under outputs/. This commit restores the full smoke runner via a hybrid recovery:

- Replayed 38 of 63 chronological apply_patch envelopes harvested from the Codex session JSONLs covering the build (#38) plus follow-up performance recovery work, using the real codex apply_patch binary
- Bytecode-disassembled the .cpython-312.pyc to identify the 9 symbols that were Claude-implemented (and therefore not in any apply_patch envelope): _load_conversion_api, SmokeRunConfig, StageResult, SmokeRunManifest, _validated_run_id, create_work_run_dir, _audit_group_files, _cached_transit_time, _cached_isot_time
- Spliced those symbols back from extracted patch envelope content, with the documented decorator/lru_cache parameters from the corresponding refactor patches

Bundled fix for calibration/model.py axis-order bug surfaced by the smoke runner test suite: _calculate_manual_model_data now identifies the channel axis by matching against the SPW channel count, supports both (nchan, npol) and (npol, nchan) DATA cell layouts, and broadcasts the point-source model to every correlation while preserving MS DATA axis order. The previous hard-coded (nchan, npol) assumption silently corrupted MODEL_DATA on MS files written with the alternate axis order.

Tests: 14/14 smoke runner tests pass, full calibration/model/photometry subset (105 tests) green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

This PR restores the HDF5-to-calibrator-tile smoke evidence runner (module + CLI wrapper + tests + evidence README) and updates the manual MODEL_DATA writer in calibration/model.py to correctly preserve (nchan, npol) vs (npol, nchan) axis ordering.

Changes:

Reintroduce dsa110_continuum.evidence.hdf5_calibrator_tile_smoke (discovery + run orchestration) along with a lightweight scripts/ entrypoint.
Add a focused pytest suite covering discovery/preflight helpers, stage logging, and the MODEL_DATA axis-order regression.
Fix _calculate_manual_model_data() to detect the channel axis by SPW channel count and write MODEL_DATA in the same per-row axis order as DATA.
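The axis detection in the last item can be illustrated with a small NumPy sketch. It is not the code in calibration/model.py: `broadcast_point_source_model` is a hypothetical name, the per-row model is assumed to be already computed as an (nrow, nchan) complex array, and raising on the ambiguous case where both cell axes equal the SPW channel count is an assumption about reasonable behavior, not something the PR states.

```python
import numpy as np


def broadcast_point_source_model(data_shape, nchan_spw, model_per_channel):
    """Build a MODEL_DATA-like array with the same axis order as DATA.

    data_shape        -- (nrow, a, b); one of a, b is the channel axis
    nchan_spw         -- channel count of the spectral window
    model_per_channel -- complex visibilities, shape (nrow, nchan_spw)
    """
    nrow, a, b = data_shape
    model = np.asarray(model_per_channel, dtype=np.complex64)
    if model.shape != (nrow, nchan_spw):
        raise ValueError(f"expected model shape {(nrow, nchan_spw)}, got {model.shape}")

    a_is_chan = a == nchan_spw
    b_is_chan = b == nchan_spw
    if a_is_chan and b_is_chan:
        # Assumption: refuse to guess when nchan happens to equal npol.
        raise ValueError("ambiguous layout: both cell axes match the SPW channel count")
    if not (a_is_chan or b_is_chan):
        raise ValueError("no cell axis matches the SPW channel count")

    if a_is_chan:
        expanded = model[:, :, np.newaxis]  # (nrow, nchan, 1) for (nchan, npol) cells
    else:
        expanded = model[:, np.newaxis, :]  # (nrow, 1, nchan) for (npol, nchan) cells

    # Copy the same per-channel model into every correlation.
    return np.broadcast_to(expanded, data_shape).copy()
```

With a hard-coded (nchan, npol) assumption, the second branch never runs, so cells stored as (npol, nchan) would receive a model laid out along the wrong axis. Matching against the SPW channel count selects the correct branch for either layout.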

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `dsa110_continuum/evidence/hdf5_calibrator_tile_smoke.py` | Restores the smoke evidence workflow implementation (discovery + run pipeline), plus supporting utilities. |
| `dsa110_continuum/calibration/model.py` | Fixes manual `MODEL_DATA` generation to respect alternate DATA axis order. |
| `tests/test_hdf5_calibrator_tile_smoke.py` | Adds tests for discovery behavior, stage logging, cal-table validation, and the MODEL_DATA regression. |
| `scripts/hdf5_calibrator_tile_smoke.py` | Adds a CLI wrapper entrypoint to invoke the evidence runner. |
| `outputs/hdf5_calibrator_tile_smoke/README.md` | Documents the evidence workflow and discovery usage. |
| `dsa110_continuum/evidence/__init__.py` | Introduces the evidence package initializer. |


+def create_immutable_run_dir(config: SmokeRunConfig, run_id: str | None = None) -> Path:
+    """Create a new run directory and refuse in-place overwrite."""
+    chosen_run_id = run_id or config.run_id or _utc_run_id(config.calibrator, config.group_id)
+    run_dir = config.evidence_root / chosen_run_id
+    run_dir.mkdir(parents=True, exist_ok=False)
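The "refuse in-place overwrite" behavior in the hunk above comes from `exist_ok=False`: `Path.mkdir` raises `FileExistsError` when the target directory already exists. A minimal standalone sketch, using a hypothetical `create_run_dir` helper that takes the evidence root and run ID directly instead of a `SmokeRunConfig`:

```python
from pathlib import Path


def create_run_dir(evidence_root: Path, run_id: str) -> Path:
    """Create evidence_root/run_id, failing if that run already exists."""
    run_dir = evidence_root / run_id
    # parents=True creates a missing evidence root; exist_ok=False makes a
    # second call with the same run_id raise FileExistsError, so an earlier
    # run's evidence is never silently reused or overwritten.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir
```

A caller that wants to rerun would have to pick a new run ID, which keeps each evidence directory tied to exactly one run.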


+        _run_stage(manifest, "preflight", preflight)
+
+        with _stage_log(run_dir / "logs" / "01_convert.log"):
+            conversion = _run_stage(manifest, "conversion", lambda: _conversion_stage(config, run_dir))


+The current implementation covers the discovery/preflight phase only. The H17
+execution phase still needs the production conversion/calibration/imaging wiring
+described in issue #38.


+        group,
+        files,
+        config.fwhm_deg,
+        file_audit=_audit_group_files(config.group_id, files),


+    bp_table, g_table = validate_fresh_cal_tables(tables, run_dir)
+    _write_json(
+        run_dir / "calibration" / "fresh_calibration_tables.json",
+        {"tables": [str(t) for t in tables], "bp_table": str(bp_table), "g_table": str(g_table)},
+    )


+                "imaging",
+                lambda: _imaging_stage(
+                    config,
+                    work_dir,


Copilot AI review requested due to automatic review settings May 4, 2026 20:08

Copilot started reviewing on behalf of jakobtfaber May 4, 2026 20:08

Copilot AI reviewed May 4, 2026

jakobtfaber merged commit 9841710 into main May 4, 2026
5 checks passed

jakobtfaber mentioned this pull request May 4, 2026

Issue #26: add structured epoch gaincal status and promotion records #34

Closed

jakobtfaber merged 1 commit into main from worktree-smoke-runner-restore-v2
