Weather-station analytics pipeline for Parks Canada Agency (PEI Field Unit).
This project determines weather-station redundancy across Prince Edward Island National Park and automates Canadian Fire Weather Index (FWI) calculation for localized wildfire risk management.
The project follows the OSEMN (Obtain, Scrub, Explore, Model, iNterpret) framework:
- Obtain — raw station CSVs inventoried and schema-audited
- Scrub — ingestion, timestamp normalization, hourly/daily resampling, imputation
- Explore — EDA, QA/QC summaries, exploratory notebooks
- Model — Stanhope reference calibration, FWI chain, PCA redundancy analysis
- iNterpret — probabilistic uncertainty quantification and recommendations
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-dev.txt
Or use:
make install
pea_met_network.cleaning is the end-to-end pipeline entrypoint. It loads raw station
CSVs from data/raw/, normalizes timestamps, resamples to hourly and daily
frequencies, applies imputation, and writes cleaned datasets to
data/processed/.
python -m pea_met_network
python -m pea_met_network --output-dir /custom/path
The pipeline runs from raw input to finished output with no manual steps. If the raw data directories are missing, it reports a clear error message.
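The cleaning steps described above (timestamp normalization, hourly and daily resampling, imputation) can be illustrated with a minimal pandas sketch. This is not the code in pea_met_network.cleaning: the function name `clean_station`, the column names `timestamp` and `air_temp_c`, the choice of UTC, and the 3-hour interpolation limit are all assumptions made for illustration.

```python
import pandas as pd


def clean_station(raw: pd.DataFrame, max_gap_hours: int = 3) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Return (hourly, daily) frames from raw sub-hourly readings.

    Hypothetical helper: assumes a 'timestamp' column plus numeric columns.
    """
    frame = raw.copy()
    # Normalize timestamps to timezone-aware UTC and index by them.
    # Assumption: naive timestamps are already in UTC; real loggers may use local time.
    frame["timestamp"] = pd.to_datetime(frame["timestamp"], utc=True)
    frame = frame.set_index("timestamp").sort_index()

    # Resample to hourly means; hours without readings become NaN.
    hourly = frame.resample(pd.Timedelta(hours=1)).mean()

    # Impute only short interior gaps by time-weighted linear interpolation.
    hourly = hourly.interpolate(method="time", limit=max_gap_hours, limit_area="inside")

    # Aggregate the imputed hourly series to daily means.
    daily = hourly.resample(pd.Timedelta(days=1)).mean()
    return hourly, daily


# Illustrative readings (made up), with no observation during the 02:00 hour.
raw = pd.DataFrame(
    {
        "timestamp": [
            "2024-07-01 00:10",
            "2024-07-01 00:40",
            "2024-07-01 01:20",
            "2024-07-01 03:05",
        ],
        "air_temp_c": [14.0, 16.0, 17.0, 21.0],
    }
)
hourly, daily = clean_station(raw)
print(hourly)
print(daily)
```

In this sketch the missing 02:00 hour is filled by interpolation between its neighbours. A simple mean is only one possible aggregation; variables such as precipitation would normally be summed rather than averaged.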
analysis.ipynb contains the full analytical narrative with sections for
EDA, redundancy analysis, FWI logic, and uncertainty quantification. Each
section includes visualizations and markdown explanations.
To run:
jupyter lab analysis.ipynb
- Cleaned datasets — hourly and daily resampled data for all PCA stations
- FWI values — full FWI chain (FFMC → DMC → DC → ISI → BUI → FWI)
- Redundancy results — PCA biplot and clustering analysis of station overlap
- Uncertainty distributions — probabilistic quantification of imputation and model uncertainty
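As an illustration of the last links in the FWI chain listed above (ISI, BUI, FWI), here is a minimal sketch using the standard Canadian Forest Fire Weather Index System equations as commonly published (Van Wagner 1987). It takes FFMC, DMC, and DC as given inputs rather than computing them from weather data. The function names are hypothetical and are not taken from this repository, and the constants should be checked against the reference before any operational use.

```python
import math


def initial_spread_index(ffmc: float, wind_kmh: float) -> float:
    """ISI from the Fine Fuel Moisture Code and wind speed in km/h."""
    # Convert FFMC back to fine fuel moisture content (percent).
    moisture = 147.2 * (101.0 - ffmc) / (59.5 + ffmc)
    wind_effect = math.exp(0.05039 * wind_kmh)
    moisture_effect = (
        91.9 * math.exp(-0.1386 * moisture) * (1.0 + moisture**5.31 / 4.93e7)
    )
    return 0.208 * wind_effect * moisture_effect


def buildup_index(dmc: float, dc: float) -> float:
    """BUI from the Duff Moisture Code and Drought Code."""
    if dmc == 0.0 and dc == 0.0:
        return 0.0  # avoid division by zero when both codes are zero
    if dmc <= 0.4 * dc:
        bui = 0.8 * dmc * dc / (dmc + 0.4 * dc)
    else:
        bui = dmc - (1.0 - 0.8 * dc / (dmc + 0.4 * dc)) * (
            0.92 + (0.0114 * dmc) ** 1.7
        )
    return max(bui, 0.0)


def fire_weather_index(isi: float, bui: float) -> float:
    """FWI from the Initial Spread Index and Buildup Index."""
    if bui <= 80.0:
        duff_effect = 0.626 * bui**0.809 + 2.0
    else:
        duff_effect = 1000.0 / (25.0 + 108.64 * math.exp(-0.023 * bui))
    intermediate = 0.1 * isi * duff_effect
    if intermediate > 1.0:
        return math.exp(2.72 * (0.434 * math.log(intermediate)) ** 0.647)
    return intermediate


# Example inputs chosen for illustration only.
isi = initial_spread_index(ffmc=87.7, wind_kmh=25.0)
bui = buildup_index(dmc=8.5, dc=19.0)
fwi = fire_weather_index(isi, bui)
print(round(isi, 1), round(bui, 1), round(fwi, 1))
```

The upstream codes (FFMC, DMC, DC) depend on the previous day's values and on temperature, humidity, wind, and rain, so a full implementation has to iterate over the daily series in order.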
make lint
make test
make check
pea-met-network/
├── analysis.ipynb # Analytical narrative notebook
├── data/
│ ├── raw/
│ ├── processed/
│ └── external/
├── docs/
├── notebooks/
├── specs/
├── src/
├── tests/
├── IMPLEMENTATION_PLAN.md
├── Makefile
├── README.md
├── pyproject.toml
├── requirements.txt
└── requirements-dev.txt
DATA-3210: Advanced Concepts in Data — Semester Project
Client: Parks Canada Agency (PEI Field Unit)
Required themes:
- Python-based data pipeline and QA/QC
- Station redundancy analysis using PCA and/or clustering
- FWI calculation and validation
- Probabilistic uncertainty quantification
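To make the station redundancy theme concrete, the following sketch shows one way PCA can flag overlapping stations: if a few principal components explain almost all of the variance across stations, some stations carry little independent information. It uses NumPy's SVD on synthetic data. The function name `pca_explained_variance`, the station series, and the data itself are assumptions for illustration, not results or code from this project.

```python
import numpy as np


def pca_explained_variance(readings: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (explained variance ratios, component loadings).

    `readings` has one row per time step and one column per station.
    """
    # Standardize each station so scale differences do not dominate.
    z = (readings - readings.mean(axis=0)) / readings.std(axis=0)
    # SVD of the standardized matrix gives the principal axes directly.
    _, singular_values, components = np.linalg.svd(z, full_matrices=False)
    variance = singular_values**2
    return variance / variance.sum(), components


# Synthetic example: station B is a linear rescaling of station A
# (fully redundant), while station C varies independently.
rng = np.random.default_rng(0)
station_a = rng.normal(15.0, 5.0, size=200)
station_b = 2.0 * station_a + 1.0
station_c = rng.normal(15.0, 5.0, size=200)
readings = np.column_stack([station_a, station_b, station_c])

ratios, components = pca_explained_variance(readings)
print(ratios.round(3))
```

Here the smallest component explains essentially no variance, which reflects the exact redundancy between stations A and B. With real station data the signal is less clean, and clustering or a biplot of the loadings helps show which stations group together.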