Merged
6 changes: 3 additions & 3 deletions README.md
@@ -102,8 +102,8 @@ The toolkit does not handle deployment: it writes to the directory configured via
|---|---|
| `toolkit inspect paths --config dataset.yml --year 2023` | Shows absolute output paths (useful in notebooks). Example: `toolkit inspect paths --config project-example/dataset.yml --json` |
| `toolkit inspect schema-diff --config dataset.yml` | Compares the RAW schema across configured years |
| `toolkit blocker-hints --config dataset.yml` | Mismatches between config and actual outputs |
| `toolkit review-readiness --config dataset.yml` | Readiness check for candidate review |
| `toolkit review-readiness --config dataset.yml` | Readiness check for candidate review (recommended) |
| `toolkit blocker-hints --config dataset.yml` | ⚠️ Deprecated: use `review-readiness` |
| `toolkit status --dataset <name> --year <year> --latest --config dataset.yml` | Latest completed run |
| `toolkit profile raw --config dataset.yml` | Diagnostic profile of the RAW layer (encoding, delimiter, columns) |

@@ -265,7 +265,7 @@ toolkit/
| Problem | Solution |
|---|---|
| `toolkit: command not found` | Use `python -m toolkit.cli.app` instead of `toolkit` |
| `run all` fails | `toolkit blocker-hints --config dataset.yml` + check that the source is reachable |
| `run all` fails | `toolkit review-readiness --config dataset.yml` + check that the source is reachable |
| "where are the produced parquet files?" | `toolkit inspect paths --config dataset.yml --year <year>` or look in `root/data/` |
| "schema error across different years" | `toolkit inspect schema-diff --config dataset.yml` to see the RAW drift |
| I want just one layer, not everything | `toolkit run clean` or `toolkit run mart` — skips upstream layers if already present |
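The `review-readiness` check is also scriptable: with `--json` the CLI emits a machine-readable payload. A minimal sketch of consuming it in a notebook — the sample payload below is hypothetical, shaped after the fields the command prints (`readiness`, `ok_count`, `fail_count`, `checks`):

```python
import json

# Hypothetical payload, shaped after the fields that
# `toolkit review-readiness --json` emits; the values are made up.
sample = json.loads("""
{
  "dataset": "test_ds",
  "year": 2023,
  "readiness": "needs-review",
  "ok_count": 3,
  "fail_count": 1,
  "checks": [
    {"check": "config_valid", "ok": true, "detail": ""},
    {"check": "clean_outputs_readable", "ok": false, "detail": "0 outputs readable"}
  ]
}
""")

# List the failing checks so a script can decide what to rerun.
failed = [c["check"] for c in sample["checks"] if not c["ok"]]
print(sample["readiness"], failed)  # → needs-review ['clean_outputs_readable']
```

In CI, a nonzero `fail_count` could gate a merge; the command itself exits 0 as long as the analysis ran.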
29 changes: 21 additions & 8 deletions tests/test_cli_blocker_hints.py
@@ -1,18 +1,29 @@
"""Tests for toolkit blocker-hints CLI command.

DEPRECATED: delegated to review-readiness. Tests updated for the new
implementation, which verifies parquet readability (not just existence).

contract: blocker-hints CLI public interface (--json output format, exit codes)
policy: missing config is blocker (not warning); relative path resolution from config dir
"""

import json
from pathlib import Path

import duckdb
import pytest
from typer.testing import CliRunner

from toolkit.cli.app import app


def _write_real_parquet(path: Path) -> None:
"""Write a minimal real parquet file via DuckDB (non dummy text)."""
conn = duckdb.connect()
conn.execute(f"COPY (SELECT 1 AS id) TO '{path}' (FORMAT PARQUET)")
conn.close()


# ---------------------------------------------------------------------------
# contract — CLI public interface
# ---------------------------------------------------------------------------
@@ -172,8 +183,8 @@ def test_blocker_hints_missing_config(self) -> None:
)

@pytest.mark.policy
def test_blocker_hints_detects_clean_dir_missing_when_mart_exists(self, tmp_path: Path, monkeypatch) -> None:
"""policy: mart dir exists but clean dir is missing is a blocker (run-order inconsistency)."""
def test_blocker_hints_detects_missing_layers(self, tmp_path: Path, monkeypatch) -> None:
"""policy: mart dir without clean/raw is a blocker (delegato a review-readiness)."""
project_dir = tmp_path / "project"
project_dir.mkdir()
config_path = project_dir / "dataset.yml"
@@ -200,7 +211,7 @@ def test_blocker_hints_detects_clean_dir_missing_when_mart_exists(self, tmp_path
(project_dir / "sql" / "clean.sql").write_text("select 1 as value", encoding="utf-8")
(sql_dir / "test_table.sql").write_text("select * from clean_input", encoding="utf-8")

# Only mart dir exists, not clean dir
# Only mart dir exists, not clean/raw dirs
mart_dir = project_dir / "out" / "data" / "mart" / "test_ds" / "2023"
mart_dir.mkdir(parents=True, exist_ok=True)
(mart_dir / "manifest.json").write_text(
@@ -224,7 +235,9 @@
)

assert result.exit_code == 0
assert "clean_dir_missing" in result.output
# Delegates to review-readiness: reports missing raw + clean as blockers
assert "blocker" in result.output.lower()
assert result.output.lower().count("blocker") >= 1

@pytest.mark.policy
def test_blocker_hints_resolves_relative_path_from_config_dir(
@@ -272,15 +285,15 @@

clean_dir = project_dir / "out" / "data" / "clean" / "test_ds" / "2023"
clean_dir.mkdir(parents=True, exist_ok=True)
(clean_dir / "test_ds_2023_clean.parquet").write_text("dummy", encoding="utf-8")
_write_real_parquet(clean_dir / "test_ds_2023_clean.parquet")
(clean_dir / "manifest.json").write_text(
json.dumps({"outputs": [{"file": "test_ds_2023_clean.parquet"}]}, indent=2),
encoding="utf-8",
)

mart_dir = project_dir / "out" / "data" / "mart" / "test_ds" / "2023"
mart_dir.mkdir(parents=True, exist_ok=True)
(mart_dir / "test_table.parquet").write_text("dummy", encoding="utf-8")
_write_real_parquet(mart_dir / "test_table.parquet")
(mart_dir / "manifest.json").write_text(
json.dumps({"outputs": [{"file": "test_table.parquet"}]}, indent=2),
encoding="utf-8",
@@ -357,15 +370,15 @@ def test_blocker_hints_no_blockers_when_all_present(self, tmp_path: Path, monkey

clean_dir = project_dir / "out" / "data" / "clean" / "test_ds" / "2023"
clean_dir.mkdir(parents=True, exist_ok=True)
(clean_dir / "test_ds_2023_clean.parquet").write_text("dummy parquet", encoding="utf-8")
_write_real_parquet(clean_dir / "test_ds_2023_clean.parquet")
(clean_dir / "manifest.json").write_text(
json.dumps({"outputs": [{"file": "test_ds_2023_clean.parquet"}]}, indent=2),
encoding="utf-8",
)

mart_dir = project_dir / "out" / "data" / "mart" / "test_ds" / "2023"
mart_dir.mkdir(parents=True, exist_ok=True)
(mart_dir / "test_table.parquet").write_text("dummy parquet", encoding="utf-8")
_write_real_parquet(mart_dir / "test_table.parquet")
(mart_dir / "manifest.json").write_text(
json.dumps({"outputs": [{"file": "test_table.parquet"}]}, indent=2),
encoding="utf-8",
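Throughout these tests, dummy text files are replaced with real parquet written via DuckDB, because the delegated review-readiness check verifies that outputs are readable, not merely present. A toy sketch of why a text file fails such a check — here only the 4-byte `PAR1` magic markers are inspected, whereas the real check reads the file through DuckDB:

```python
from pathlib import Path
import tempfile

PARQUET_MAGIC = b"PAR1"  # real parquet files begin and end with these 4 bytes

def looks_like_parquet(path: Path) -> bool:
    """Cheap sanity check: does the file carry the parquet magic markers?

    Illustration only — the readiness check in this PR goes further and
    actually reads the file through DuckDB.
    """
    data = path.read_bytes()
    return len(data) > 8 and data.startswith(PARQUET_MAGIC) and data.endswith(PARQUET_MAGIC)

with tempfile.TemporaryDirectory() as tmp:
    dummy = Path(tmp) / "dummy.parquet"
    dummy.write_text("dummy", encoding="utf-8")  # what the old tests wrote
    print(looks_like_parquet(dummy))  # → False: the file exists but is not parquet
```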
14 changes: 11 additions & 3 deletions tests/test_mcp_toolkit_client.py
@@ -4,6 +4,7 @@
import shutil
from pathlib import Path

import duckdb
import pytest

from toolkit.mcp.toolkit_client import (
@@ -18,6 +19,13 @@
)


def _write_real_parquet(path: Path) -> None:
"""Write a minimal real parquet file via DuckDB."""
conn = duckdb.connect()
conn.execute(f"COPY (SELECT 1 AS id) TO '{path}' (FORMAT PARQUET)")
conn.close()


def test_mcp_toolkit_client_works_from_repo_layout(tmp_path: Path, monkeypatch) -> None:
src = Path("project-example")
dst = tmp_path / "project-example"
@@ -92,13 +100,13 @@ def test_mcp_blocker_hints_empty_when_all_present(tmp_path: Path, monkeypatch) -

clean_dir = dst / "_smoke_out" / "data" / "clean" / "project_example" / "2022"
clean_dir.mkdir(parents=True, exist_ok=True)
(clean_dir / "project_example_2022_clean.parquet").write_bytes(b"")
_write_real_parquet(clean_dir / "project_example_2022_clean.parquet")

mart_dir = dst / "_smoke_out" / "data" / "mart" / "project_example" / "2022"
mart_dir.mkdir(parents=True, exist_ok=True)
# The config declares 2 mart tables
(mart_dir / "rd_by_regione.parquet").write_bytes(b"")
(mart_dir / "rd_by_provincia.parquet").write_bytes(b"")
_write_real_parquet(mart_dir / "rd_by_regione.parquet")
_write_real_parquet(mart_dir / "rd_by_provincia.parquet")

hints_payload = blocker_hints(str(config_path), 2022)
assert hints_payload["hint_count"] == 0
2 changes: 2 additions & 0 deletions toolkit/cli/app.py
@@ -11,6 +11,7 @@
from toolkit.cli.cmd_scaffold import register as register_scaffold
from toolkit.cli.cmd_batch import register as register_batch
from toolkit.cli.cmd_blocker_hints import register as register_blocker_hints
from toolkit.cli.cmd_review_readiness import register as register_review_readiness
from toolkit.cli.cmd_init import register as register_init

app = typer.Typer(no_args_is_help=True, add_completion=False)
@@ -25,6 +26,7 @@
register_scaffold(app)
register_batch(app)
register_blocker_hints(app)
register_review_readiness(app)
register_init(app)


18 changes: 9 additions & 9 deletions toolkit/cli/cmd_blocker_hints.py
@@ -1,11 +1,7 @@
"""CLI command: toolkit blocker-hints

Exports blocker_hints as a public CLI interface, instead of calling
the internal module toolkit.mcp.toolkit_client.
"""CLI command: toolkit blocker-hints (DEPRECATED, use review-readiness)

Usage:
toolkit blocker-hints --config candidates/terna-electricity-by-source/dataset.yml --year 2023
toolkit blocker-hints --config candidates/terna-electricity-by-source/dataset.yml --year 2023 --json
toolkit review-readiness --config candidates/terna-electricity-by-source/dataset.yml --year 2023
"""

from __future__ import annotations
@@ -26,13 +22,17 @@ def blocker_hints(
"""
Show diagnostic hints for common mismatches between the declared config and the outputs.

Blockers are errors that prevent the candidate from working.
Warnings signal possible problems that do not block execution.
DEPRECATED: use 'toolkit review-readiness' instead.

Exit code:
0 — hints generated (the command succeeds even when blockers are found)
1 — config not found or analysis error
"""
if not as_json:
typer.echo(
"⚠️ DEPRECATED: 'toolkit blocker-hints' sara' rimosso. Usa 'toolkit review-readiness'.",
err=True,
)
try:
# Use load_config like other CLI commands (run, init, status) so that
# relative paths are resolved from the config file's base_dir, not from
@@ -95,4 +95,4 @@


def register(app: typer.Typer) -> None:
app.command("blocker-hints")(blocker_hints)
app.command("blocker-hints", hidden=True)(blocker_hints)
102 changes: 102 additions & 0 deletions toolkit/cli/cmd_review_readiness.py
@@ -0,0 +1,102 @@
"""CLI command: toolkit review-readiness

Usage:
toolkit review-readiness --config candidates/terna-electricity-by-source/dataset.yml --year 2023
toolkit review-readiness --config candidates/terna-electricity-by-source/dataset.yml --year 2023 --json
"""

from __future__ import annotations

from pathlib import Path

from toolkit.mcp.schema_ops import review_readiness as _review_readiness
from toolkit.core.config import load_config

import typer


def review_readiness(
config: str = typer.Option(..., "--config", "-c", help="Path to dataset.yml"),
year: int | None = typer.Option(None, "--year", "-y", help="Dataset year (default: last declared year)"),
as_json: bool = typer.Option(False, "--json", help="Emit JSON output"),
) -> None:
"""Check di prontezza per review candidate: layer, output e coerenza run record.

Classifica il candidate come:
- ready: tutti i check passano — pronto per review
- needs-review: qualche check fallito ma recuperabile
- incomplete: troppi check falliti — non pronto

Exit code:
0 — readiness generata
1 — config non trovato o errore nell'analisi
"""
try:
load_config(config, strict_config=False)
config_path_resolved = str(Path(config).resolve())
result = _review_readiness(config_path_resolved, year)
except FileNotFoundError:
typer.echo(f"error: config file not found: {config}", err=True)
raise typer.Exit(code=1)
except Exception as exc:
exc_msg = str(exc).lower()
if "no such file or directory" in exc_msg or "non trovata" in exc_msg:
typer.echo(f"error: config file not found: {config}", err=True)
else:
typer.echo(f"error: {type(exc).__name__}: {exc}", err=True)
raise typer.Exit(code=1)

if as_json:
import json
typer.echo(json.dumps(result, indent=2, ensure_ascii=False))
return

# Human-readable output
dataset = result.get("dataset", "?")
config_path = result.get("config_path", "?")
year_val = result.get("year", "?")
readiness = result.get("readiness", "?")
ok_count = result.get("ok_count", 0)
fail_count = result.get("fail_count", 0)

readiness_icon = {"ready": "✅", "needs-review": "⚠️", "incomplete": "🔴"}.get(readiness, "?")
typer.echo(f"dataset: {dataset}")
typer.echo(f"config: {config_path}")
typer.echo(f"year: {year_val}")
typer.echo(f"readiness: {readiness_icon} {readiness}")
typer.echo(f"checks: {ok_count}/{ok_count + fail_count} ok")
typer.echo("")

checks = result.get("checks", [])
if not checks:
typer.echo("nessun check disponibile")
return

typer.echo("checks:")
for check in checks:
name = check.get("check", "?")
ok = check.get("ok", False)
detail = check.get("detail", "")
icon = "✅" if ok else "🔴"
typer.echo(f" {icon} [{name}]")
if isinstance(detail, list):
for item in detail:
if isinstance(item, dict):
item_icon = "✅" if item.get("readable") else "🔴"
typer.echo(f" {item_icon} {item.get('name', '?')} ({item.get('rows', '?')} righe)")
else:
typer.echo(f" {item}")
elif detail:
typer.echo(f" {detail}")

typer.echo("")
if readiness == "ready":
typer.echo("✅ Pronto per review — tutti i check passano.")
elif readiness == "needs-review":
typer.echo(f"⚠️ {fail_count} check falliti — verificare prima del merge.")
else:
typer.echo(f"🔴 {fail_count} check falliti — candidate non pronto.")


def register(app: typer.Typer) -> None:
app.command("review-readiness")(review_readiness)
2 changes: 1 addition & 1 deletion toolkit/cli/cmd_run.py
@@ -527,4 +527,4 @@ def register(app: typer.Typer) -> None:
run_sub.command("cross-year")(run_cross_year_cmd) # alias hyphen
run_sub.command("init")(run_init)
run_sub.command("full")(run_full)
app.add_typer(run_sub, name="run")
app.add_typer(run_sub, name="run", help="Esegue la pipeline RAW → CLEAN → MART per un dataset.")
2 changes: 1 addition & 1 deletion toolkit/cli/cmd_scaffold.py
@@ -113,4 +113,4 @@ def scaffold_clean(
def register(app: typer.Typer) -> None:
scaffold_app = typer.Typer(no_args_is_help=True, add_completion=False)
scaffold_app.command("clean")(scaffold_clean)
app.add_typer(scaffold_app, name="scaffold")
app.add_typer(scaffold_app, name="scaffold", help="Genera scheletro candidate: dataset.yml, SQL template.")
2 changes: 1 addition & 1 deletion toolkit/cli/inspect/__init__.py
@@ -19,4 +19,4 @@ def register(app: typer.Typer) -> None:
inspect_app.command("schema")(schema)
inspect_app.command("url")(url)
inspect_app.command("probe")(probe)
app.add_typer(inspect_app, name="inspect")
app.add_typer(inspect_app, name="inspect", help="Ispeziona path, schema, readiness e URL del dataset.")
14 changes: 12 additions & 2 deletions toolkit/cli/inspect/schema_ops.py
@@ -10,16 +10,26 @@


def schema(
config_path: str = typer.Argument(..., help="Path al dataset.yml", metavar="CONFIG"),
config_path: str = typer.Argument("", metavar="CONFIG", help="Path al dataset.yml (posizionale)"),
config: str = typer.Option(None, "--config", "-c", help="Path al dataset.yml", hidden=True),
layer: str = typer.Option("clean", "--layer", "-l", help="Layer: raw, clean, mart"),
year: int = typer.Option(0, "--year", "-y", help="Anno (default: ultimo)"),
json_output: bool = typer.Option(False, "--json", help="Output JSON"),
) -> None:
"""Mostra lo schema (colonne + tipi) di raw, clean o mart.

Chiama la stessa implementazione del tool MCP toolkit_show_schema.

Il path config puo' essere passato come argomento posizionale
(es. toolkit inspect schema path/to/dataset.yml)
o con l'opzione --config / -c.
"""
result = show_schema(config_path, layer, year or None)
resolved_config = config or config_path
if not resolved_config:
typer.echo("error: specificare il path al dataset.yml (argomento o --config)", err=True)
raise typer.Exit(code=1)

result = show_schema(resolved_config, layer, year or None)
status = result.get("status", "ok" if result.get("columns") else "empty")

if json_output:
6 changes: 3 additions & 3 deletions toolkit/mcp/README.md
@@ -8,8 +8,8 @@ Local, read-only MCP server for quickly inspecting resolved paths, schemas and
- `toolkit_show_schema(config_path, layer="clean", year=0)`
- `toolkit_run_summary(config_path, year=0)` — aggregate statistics (totals, successes, average duration)
- `toolkit_summary(config_path, year=0)` — diagnostic dashboard (layers + runs + warnings)
- `toolkit_blocker_hints(config_path, year=0)`
- `toolkit_review_readiness(config_path, year=0)`
- `toolkit_blocker_hints(config_path, year=0)` — ⚠️ deprecated, use `toolkit_review_readiness`
- `toolkit_review_readiness(config_path, year=0)` — (recommended)
- `toolkit_list_runs(config_path, year=0, since=None, until=None, status=None, limit=20, cross_year=False)`
- `toolkit_schema_diff(config_path)` — cross-year comparison of raw schema signals (encoding, columns, etc.)
- `toolkit_csv_preview(csv_path, limit=20)` — schema + CSV preview via the profiler pipeline (`sniff_source_file` + `profile_with_read_cfg`); output aligned with `RawProfile` (delim, encoding, decimal, skip, robust_read_suggested)
@@ -55,5 +55,5 @@ Replace the `command` path with the actual Python of the local clone that will use
- `toolkit_csv_preview` reads a CSV using the same pipeline as `profile_raw` (`sniff_source_file` + `profile_with_read_cfg`); it returns the schema + first N rows + mapping_suggestions — useful for inspecting raw files without running the pipeline
- `toolkit_run_summary` aggregates all run records per dataset/year
- `toolkit_summary` includes `run.latest_run_record` (full payload of the latest run)
- `toolkit_blocker_hints` highlights practical mismatches between resolved outputs and run state
- `toolkit_blocker_hints` ⚠️ deprecated, points to `toolkit_review_readiness`
- `toolkit_review_readiness` runs readiness checks for candidate review: valid config, layers present, readable outputs, run-record consistency