diff --git a/README.md b/README.md
index 800e4b9..3425c98 100644
--- a/README.md
+++ b/README.md
@@ -131,7 +131,7 @@ La struttura resta invariata. Non serve capire tutto subito: qui trovi la base p
 pip install dataciviclab-toolkit
 toolkit run all --config dataset.yml
 toolkit validate all --config dataset.yml
-toolkit status --dataset --year --latest --config dataset.yml
+toolkit inspect summary --dataset --year --latest --config dataset.yml
 ```
 
 I notebook del template usano anche:
diff --git a/WORKFLOW.md b/WORKFLOW.md
index 77127ab..bd3b156 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -35,7 +35,7 @@ GitHub resta il posto dove deve restare la traccia utile.
 1. valida la config con `py -m pytest tests/test_contract.py`
 2. esegui `toolkit run all --config dataset.yml`
 3. esegui `toolkit validate all --config dataset.yml`
-4. esegui `toolkit status --dataset --year --latest --config dataset.yml`
+4. esegui `toolkit inspect summary --dataset --year --latest --config dataset.yml`
 5. usa `toolkit inspect paths --config dataset.yml --year --json`
 6. usa i notebook per ispezionare RAW, CLEAN, MART e QA
 
diff --git a/docs/contributing.md b/docs/contributing.md
index c5b4e8d..700a5be 100644
--- a/docs/contributing.md
+++ b/docs/contributing.md
@@ -44,7 +44,7 @@ Su Windows, se `sh` non e disponibile nel `PATH`, usa una shell POSIX come Git B
 ```powershell
 toolkit run all --config dataset.yml
 toolkit validate all --config dataset.yml
-toolkit status --dataset --year --latest --config dataset.yml
+toolkit inspect summary --dataset --year --latest --config dataset.yml
 ```
 
 ## Dove scrivere cosa
@@ -85,7 +85,7 @@ La destinazione su Drive mantiene gli stessi path relativi sotto `root`, quindi
 ```sh
 toolkit run all --config dataset.yml
 toolkit validate all --config dataset.yml
-toolkit status --dataset --year --latest --config dataset.yml
+toolkit inspect summary --dataset --year --latest --config dataset.yml
 toolkit inspect paths --config dataset.yml --year --json
 ```
 
@@ -120,7 +120,7 @@ Queste fasi non sono una catena rigida: spesso bastano 2-4 issue piccole per far
 | Sources/RAW | `dataset.yml`, `docs/sources.md`, `docs/decisions.md` | `toolkit run raw --config dataset.yml`, poi `toolkit inspect paths --config dataset.yml --year --json` | `01_inspect_raw.ipynb` |
 | CLEAN | `sql/clean.sql`, `dataset.yml`, `docs/data_dictionary.md` | `toolkit run clean --config dataset.yml`, poi `toolkit inspect paths --config dataset.yml --year --json` | `02_inspect_clean.ipynb` |
 | MART | `sql/mart/*.sql`, `dataset.yml` | `toolkit run mart --config dataset.yml`, poi `toolkit inspect paths --config dataset.yml --year --json` | `03_explore_mart.ipynb` |
-| Release | `README.md`, `docs/overview.md`, `docs/data_dictionary.md` | `toolkit status --dataset --year --latest --config dataset.yml` | `00_quickstart.ipynb` |
+| Release | `README.md`, `docs/overview.md`, `docs/data_dictionary.md` | `toolkit inspect summary --dataset --year --latest --config dataset.yml` | `00_quickstart.ipynb` |
 | Maintenance | `dataset.yml`, `sql/`, `docs/`, `tests/test_contract.py` | `toolkit run all --config dataset.yml` | `01_inspect_raw.ipynb`, `02_inspect_clean.ipynb`, `03_explore_mart.ipynb` |
 
 I notebook usano `toolkit inspect paths --config dataset.yml --year --json` come contratto stabile per localizzare gli output.
diff --git a/scripts/smoke.sh b/scripts/smoke.sh
index f984a6c..1dc1568 100644
--- a/scripts/smoke.sh
+++ b/scripts/smoke.sh
@@ -106,5 +106,5 @@ echo "YEAR=${YEAR}"
 
 run_toolkit run all --config "${DATASET_FILE}"
 run_toolkit validate all --config "${DATASET_FILE}"
-run_toolkit status --dataset "${DATASET_NAME}" --year "${YEAR}" --latest --config "${DATASET_FILE}"
+run_toolkit inspect summary --dataset "${DATASET_NAME}" --year "${YEAR}" --latest --config "${DATASET_FILE}"
 run_toolkit inspect paths --config "${DATASET_FILE}" --year "${YEAR}" --json