Pathology review is time-intensive, and spotting metastatic tissue across large slide volumes is hard to scale. This project delivers a fast, explainable patch-level assistant built on PatchCamelyon (PCam): a ResNet-based classifier with Grad-CAM evidence, uncertainty estimates, and domain-shift warnings, wrapped in a single-screen Streamlit UI for demo and triage workflows.
- Create a virtual env and install deps:

  ```shell
  python3 -m venv .venv && source .venv/bin/activate
  pip install -r requirements.txt
  ```

- Add model weights (optional):
  - Place `outputs/best_model.pt` locally, or
  - Set `MODEL_URL` to a direct download link for the checkpoint.
- Run the UI:

  ```shell
  streamlit run src/ui/app.py
  ```
The UI ships with bundled samples in `assets/samples/` so the demo works without the full dataset. If no model weights are found, the app falls back to a heuristic demo mode and still runs end-to-end.

If you want to load dataset samples or retrain, place the full dataset locally in `data/pcam/` (not tracked in git). You can also use a tiny subset in `data/sample/` for smoke tests.
The repo expects the original PCam structure:

```
data/pcam/
  pcam/
    training_split.h5
    validation_split.h5
    test_split.h5
  Labels/Labels/
    camelyonpatch_level_2_split_train_y.h5
    camelyonpatch_level_2_split_valid_y.h5
    camelyonpatch_level_2_split_test_y.h5
  Metadata/Metadata/
    train_metadata.csv
    valid_metadata.csv
    test_metadata.csv
  camelyonpatch_level_2_split_train_mask/
    camelyonpatch_level_2_split_train_mask.h5
```
Full dataset source (Kaggle): https://www.kaggle.com/datasets/andrewmvd/metastatic-tissue-classification-patchcamelyon
`outputs/` is gitignored to keep the repo light. For real predictions, provide a checkpoint locally:

- `outputs/best_model.pt` (recommended), or
- set `MODEL_URL` in your environment or Streamlit secrets to auto-download.

Example:

```shell
export MODEL_URL="https://your-hosted-file/best_model.pt"
```
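The resolution order above (local checkpoint, then `MODEL_URL`, then demo mode) can be sketched as plain lookup logic. This is an illustrative sketch, not the app's actual code; the function name and return values are assumptions.

```python
import os
from pathlib import Path

def resolve_checkpoint(local_path="outputs/best_model.pt"):
    """Hypothetical sketch of the weight-resolution order described above."""
    if Path(local_path).is_file():
        return ("local", local_path)          # bundled/local checkpoint wins
    url = os.environ.get("MODEL_URL")
    if url:
        return ("download", url)              # the app would fetch this URL
    return ("demo", None)                     # heuristic demo mode, no weights
```

With neither a local file nor `MODEL_URL` set, the app still runs end-to-end in demo mode.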
- Images are stored in HDF5 under dataset key `x`.
- Labels are stored in HDF5 under dataset key `y`.
- Train masks are optional and stored under dataset key `mask`.
- Metadata CSV rows align with the HDF5 sample indices.
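Reading this layout with `h5py` looks roughly as follows. Note the real PCam files store images and labels in separate `.h5` files per split; a single tiny synthetic file is used here so the snippet runs without the dataset.

```python
import h5py
import numpy as np

# Write a tiny synthetic file mirroring the dataset keys described above:
# images under "x" (96x96 RGB patches), labels under "y".
path = "tiny_pcam_demo.h5"
with h5py.File(path, "w") as f:
    f.create_dataset("x", data=np.zeros((4, 96, 96, 3), dtype=np.uint8))
    f.create_dataset("y", data=np.array([0, 1, 0, 1], dtype=np.uint8))

# Reading mirrors what a dataset loader would do: lazy, per-sample indexing.
with h5py.File(path, "r") as f:
    patch = f["x"][0]       # one 96x96 RGB patch
    label = int(f["y"][0])  # its binary label
print(patch.shape, label)
```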
```
src/
  data/     # dataset loaders
  models/   # model defs
  train/    # training scripts
  eval/     # metrics + evaluation
  explain/  # Grad-CAM
  ood/      # uncertainty + domain shift
  api/      # inference service
  ui/       # demo UI
scripts/    # helper scripts (sample creation, inspection)
```
Inspect data:

```shell
python3 scripts/inspect_dataset.py --split train --num-samples 5 --metadata
```

Train (fast smoke run):

```shell
python3 -m src.train.train --epochs 1 --max-train 2048 --max-val 512
```

Train (better baseline with balancing + aug):

```shell
python3 -m src.train.train --epochs 2 --max-train 50000 --max-val 10000 --pretrained --augment --pos-weight --balanced-sampler
```
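The `--pos-weight` and `--balanced-sampler` flags both counter class imbalance; the arithmetic behind them is the standard recipe sketched below (the exact wiring inside `src/train` is not shown, and the toy labels are illustrative).

```python
# Toy label list: 3 negatives, 1 positive.
labels = [0, 0, 0, 1]
n_pos = sum(labels)
n_neg = len(labels) - n_pos

# BCE pos_weight: up-weight the positive class by the neg/pos ratio,
# so missed positives cost proportionally more.
pos_weight = n_neg / n_pos

# Balanced sampler: weight each sample by 1 / (count of its class),
# so both classes are drawn with equal probability in expectation.
class_counts = {0: n_neg, 1: n_pos}
sample_weights = [1.0 / class_counts[y] for y in labels]
print(pos_weight, sample_weights)
```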
Evaluate:

```shell
python3 -m src.eval.evaluate --checkpoint outputs/best_model.pt --split valid
```

Compute OOD stats (feature centroid):

```shell
python3 scripts/compute_ood_stats.py --checkpoint outputs/best_model.pt --max-samples 5000
```
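The feature-centroid idea: fit a centroid on in-distribution features, then flag inputs whose distance to it falls in the far tail. A NumPy sketch with synthetic features stands in below; the contents of the stats file and the 99th-percentile cutoff are assumptions.

```python
import numpy as np

# Stand-in for penultimate-layer features of in-distribution samples.
rng = np.random.default_rng(0)
train_feats = rng.normal(0.0, 1.0, size=(5000, 8))

# Fit the centroid and a distance threshold on the training features.
centroid = train_feats.mean(axis=0)
dists = np.linalg.norm(train_feats - centroid, axis=1)
threshold = np.percentile(dists, 99)   # flag the farthest 1% as a cutoff

def is_ood(feat):
    """Flag a feature vector whose centroid distance exceeds the cutoff."""
    return np.linalg.norm(feat - centroid) > threshold
```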
Inference demo (prediction + Grad-CAM + uncertainty + OOD):

```shell
python3 scripts/infer_demo.py --checkpoint outputs/best_model.pt --feature-stats outputs/feature_stats.npz --split test --index 0
```
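The Grad-CAM evidence map combines activations and gradients in a fixed way: channel weights are the spatially averaged gradients, and the map is the ReLU of the weighted sum of activation channels. The sketch below shows just that combination step on synthetic arrays; hooking into the actual model (`src/explain`) is omitted.

```python
import numpy as np

# Synthetic stand-ins for a conv layer's activations and the gradients
# of the class score with respect to them, both C x H x W.
rng = np.random.default_rng(0)
acts = rng.random((16, 6, 6))
grads = rng.random((16, 6, 6))

# Grad-CAM: alpha_k = spatial mean of gradients per channel, then
# ReLU of the alpha-weighted sum over channels, normalized to [0, 1].
weights = grads.mean(axis=(1, 2))
cam = np.maximum((weights[:, None, None] * acts).sum(axis=0), 0.0)
cam = cam / (cam.max() + 1e-8)
print(cam.shape)
```

In the app this coarse map would be upsampled to the 96x96 patch and overlaid as the heatmap.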
Run UI:

```shell
streamlit run src/ui/app.py
```

Generate bundled UI samples (optional, from local dataset):

```shell
python3 scripts/make_ui_samples.py --count 50
```

Metrics report (ROC-AUC + sensitivity/specificity):

```shell
python3 scripts/report_metrics.py --checkpoint outputs/best_model.pt --split valid --threshold 0.5 --max-samples 5000
```
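Sensitivity and specificity follow directly from the confusion counts at the chosen threshold; the toy counts below are illustrative, and the script computes the same quantities from real predictions.

```python
# Toy confusion counts at some threshold.
tp, fn = 45, 5    # metastatic patches: correctly caught vs missed
tn, fp = 90, 10   # normal patches: correctly cleared vs false alarms

sensitivity = tp / (tp + fn)   # recall on the metastatic class
specificity = tn / (tn + fp)   # recall on the normal class
print(sensitivity, specificity)
```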
Calibrate temperature (optional):

```shell
python3 scripts/calibrate_temperature.py --checkpoint outputs/best_model.pt --split valid --max-samples 5000
```
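Temperature scaling divides the logits by a scalar T chosen to minimize validation NLL, softening overconfident probabilities without changing the ranking. A NumPy grid-search sketch is below; the script's actual optimizer and data are not shown, and the synthetic logits are deliberately overconfident.

```python
import numpy as np

# Synthetic, overconfident binary logits with one confident mistake.
logits = np.array([4.0, 3.5, -4.0, 4.2, -3.8, 3.9])
labels = np.array([1,   0,    0,   1,    0,   1])

def nll(t):
    """Mean negative log-likelihood of sigmoid(logit / t)."""
    p = 1.0 / (1.0 + np.exp(-logits / t))
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))

# Grid search stands in for the usual gradient-based fit of T.
grid = np.linspace(0.5, 10.0, 96)
best_t = grid[np.argmin([nll(t) for t in grid])]
print(best_t)
```

For overconfident models the fitted T comes out above 1, shrinking all probabilities toward 0.5.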
Threshold sweep (optional):

```shell
python3 scripts/threshold_sweep.py --checkpoint outputs/best_model.pt --split valid --max-samples 5000
```
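A threshold sweep evaluates the sensitivity/specificity trade-off at each candidate cutoff. The sketch below scores each threshold with Youden's J (sensitivity + specificity - 1) on toy predictions; the script's exact criterion and data are assumptions.

```python
import numpy as np

# Toy predicted probabilities and true labels.
probs  = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.7])
labels = np.array([0,   0,   0,    1,   1,   1])

best = None
for t in np.linspace(0.05, 0.95, 19):
    pred = probs >= t
    sens = (pred & (labels == 1)).sum() / (labels == 1).sum()
    spec = (~pred & (labels == 0)).sum() / (labels == 0).sum()
    j = sens + spec - 1          # Youden's J: 1.0 means perfect separation
    if best is None or j > best[0]:
        best = (j, t)
print(best)
```

In triage settings the operating point is often chosen for high sensitivity rather than max J, accepting more false alarms to miss fewer positives.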
Generate gallery images (optional):

```shell
python3 scripts/make_gallery.py --checkpoint outputs/best_model.pt
```