10 changes: 10 additions & 0 deletions .gitignore
@@ -1,3 +1,13 @@
*.pyc
__pycache__

**/lightning_logs/
**/checkpoints/
.DS_Store
.env
.vscode/
.idea/
.ipynb_checkpoints/
.venv/
.ruff_cache/
.claude/
144 changes: 119 additions & 25 deletions README.md
@@ -1,77 +1,171 @@
# DINOv2-3D: Self-Supervised 3D Vision Transformer Pretraining

A configuration-first (and therefore easily understandable and trackable) repository for a 3D implementation of DINOv2. Based on the implementations from Lightly (thank you!) and integrated with PyTorch Lightning. The 3D capabilities of this implementation come largely through MONAI's functionality.
A configuration-driven repository for 3D DINOv2 self-supervised learning. Built with [Lighter](https://github.com/project-lighter/lighter), PyTorch Lightning, and MONAI.

## What you can do with this Repo
- Train your own 3D DINOv2 on CT, MRI, PET data, etc. with very little configuration other than what's been provided.
- Use the state-of-the-art PRIMUS transformer from medical segmentation to pretrain your DINOv2
- Make a baseline for DINOv2 to improve and build on.
- Change elements of the framework through modular extensions.
## What You Can Do with This Repo
- Train your own 3D DINOv2 on CT, MRI, PET data, etc. with minimal configuration
- Use state-of-the-art PRIMUS transformer for medical imaging pretraining
- Make a baseline for DINOv2 to improve and build on
- Change elements of the framework through modular extensions (see the sketch below)
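
For instance, swapping in a different backbone could look something like the sketch below; the `model::backbone` key path and the ViT settings are assumptions for illustration, not taken from this repo's configs (`configs/models/vit.yaml` plays the real equivalent role):

```yaml
# my_backbone.yaml -- hypothetical override file, merged after the base configs
model:
  backbone:
    _target_: monai.networks.nets.ViT  # instantiate MONAI's ViT from config
    in_channels: 1
    img_size: [96, 96, 96]
    patch_size: [16, 16, 16]
```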

## Features
- DINOv2-style self-supervised learning with teacher-student models
- Block masking for 3D volumes
- Flexible 3D augmentations (global/local views) courtesy of MONAI
- PyTorch Lightning training loop
- YAML-based experiment configuration that is explainable at a glance due to its abstraction!

- PyTorch Lightning training loop
- YAML-based experiment configuration powered by Lighter

## Installation

1. Clone the repository:
```bash
git clone https://github.com/AIM-Harvard/DINOv2-3D-Med.git
cd DINOv2_3D
cd DINOv2-3D-Med
```
2. Create a virtual environment with UV(recommended):

2. Create a virtual environment with UV (recommended):
```bash
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```

3. Install dependencies:
```bash
uv sync
```

If you do not want to use uv, you could just as easily do a `pip install -e .` in the repo directory
If you do not want to use uv, you can use `pip install -e .` instead.

## Usage

### Training
Run the training script with the default training config:

Run training with the default configuration:
```bash
python -m scripts.run fit --config_file=./configs/train.yaml,./configs/models/primus.yaml,./configs/datasets/amos.yaml
lighter fit configs/train.yaml configs/models/primus.yaml configs/datasets/amos.yaml
```

Here, `train.yaml` contains the heart of the configuration, `primus.yaml` provides the backbone to use for DINOv2, and `amos.yaml` provides the path to the dataset to be used.
Override parameters directly from the CLI:
```bash
lighter fit configs/train.yaml configs/models/primus.yaml configs/datasets/amos.yaml \
    trainer::max_epochs=50 \
    model::base_lr=0.0005 \
    data::batch_size=4
```

### Prediction

```bash
lighter predict configs/predict.yaml
```

### Configuration
- All experiment settings (model, trainer, data) are defined in YAML configs.
- `configs/train.yaml`: Main training configuration with complete setup
- `configs/predict.yaml`: Configuration for inference/prediction tasks

Lighter uses YAML configs with powerful features:

- **Variable references**: `%vars::hidden_size` - reference shared variables
- **Cross-section references**: `%trainer::max_epochs` - reference other config sections
- **Python expressions**: `$int(%trainer::max_epochs * 0.03)` - compute values dynamically
- **Object instantiation**: `_target_: module.ClassName` - create objects from config
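
A minimal sketch of how these features combine in a single file (the keys `hidden_size` and `warmup_epochs` are made-up names for illustration):

```yaml
vars:
  hidden_size: 768  # shared variable

trainer:
  max_epochs: 100

model:
  _target_: project.models.ExampleModel              # hypothetical class instantiated from config
  hidden_size: "%vars::hidden_size"                   # variable reference
  warmup_epochs: "$int(%trainer::max_epochs * 0.03)"  # value computed from another section
```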

#### Config Structure

```
configs/
├── train.yaml # Main training configuration
├── predict.yaml # Inference configuration
├── dinotxt_stage.yaml # Image-text alignment training
├── models/
│ ├── primus.yaml # PRIMUS backbone
│ └── vit.yaml # MONAI ViT backbone
└── datasets/
├── amos.yaml # AMOS dataset
└── idc_dump.yaml # IDC dataset
```

Configs are composable - pass multiple files and they merge in order:
```bash
lighter fit base.yaml model.yaml dataset.yaml # Later files override earlier ones
```
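
As a tiny illustration of that merge order (file contents hypothetical):

```yaml
# base.yaml
trainer:
  max_epochs: 100

# model.yaml -- merged second, so its value wins
trainer:
  max_epochs: 50
```

Running `lighter fit base.yaml model.yaml` would therefore train for 50 epochs.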

## Path Configuration

Each config file defines its paths in the `vars:` section at the top for easy customization:

| Config | Variable | Description |
|--------|----------|-------------|
| `train.yaml` | `experiments_dir` | Output directory for checkpoints and logs |
| `dinotxt_stage.yaml` | `experiments_dir` | Output directory for checkpoints and logs |
| `predict.yaml` | `amos_dataset` | Path to AMOS dataset |
| `datasets/amos.yaml` | `amos_dataset` | Path to AMOS dataset |
| `datasets/idc_dump.yaml` | `idc_dataset` | Path to IDC dataset |

Override paths from the CLI:
```bash
lighter fit configs/train.yaml configs/models/primus.yaml configs/datasets/amos.yaml \
    vars::experiments_dir=/your/output/path

lighter fit configs/train.yaml configs/models/primus.yaml configs/datasets/idc_dump.yaml \
    vars::idc_dataset=/your/idc/data/path
```

## Data Preparation

For now, to run a straightforward DINOv2 pipeline, all you need to do is set up your data paths in a JSON file in the MONAI format.
Create a JSON file in MONAI format:

It looks something like this:
```json
{
  "training": [
    {"image": "/path/to/image1.nii.gz"},
    {"image": "/path/to/image2.nii.gz"}
  ]
}
```

If you need more complex data loading (e.g., with labels for sampling), extend the JSON:

```json
{
  "training": [
    {"image": "/path/to/image.nii.gz", "label": "/path/to/label.nii.gz"}
  ]
}
```
If you'd like to do more complex manipulations, like sampling based on a mask, you can easily extend this JSON to include a "label" in addition to the image and use MONAI transforms to sample as you like.
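
A hedged sketch of what such label-aware sampling could look like; the `transform` key path and all parameter values below are assumptions for illustration, not this repo's actual defaults:

```yaml
data:
  train_dataset:
    dataset:
      transform:
        _target_: monai.transforms.Compose
        transforms:
          - _target_: monai.transforms.LoadImaged
            keys: ["image", "label"]
          - _target_: monai.transforms.EnsureChannelFirstd
            keys: ["image", "label"]
          # Draw random crops centered on foreground voxels of the label mask
          - _target_: monai.transforms.RandCropByPosNegLabeld
            keys: ["image", "label"]
            label_key: "label"
            spatial_size: [96, 96, 96]
            pos: 1.0   # all samples centered on foreground
            neg: 0.0
            num_samples: 2
```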

Then update your dataset config or override from CLI:
```bash
lighter fit configs/train.yaml \
    "data::train_dataset::dataset::data=\$monai.auto3dseg.datafold_read('/path/to/dataset.json', basedir='/path/to/data', key='training')[0]"
```

## Project Structure

```
DINOv2-3D-Med/
├── __lighter__.py # Lighter marker (enables project.* imports)
├── configs/ # YAML configurations
├── models/ # Model architectures
│ ├── meta_arch.py # DINOv2 teacher-student architecture
│ └── backbones/ # PRIMUS, ViT, EVA backbones
├── training/ # Lightning modules
│ ├── dinov2_lightning_module.py
│ ├── dinotxt_lightning_module.py
│ └── data_module.py
├── transforms/ # Data augmentations
│ ├── dinov2_aug.py # DINOv2 3D augmentations
│ └── blockmask.py # Block masking for iBOT
├── losses/ # Loss functions
│ └── dino.py # DINOv2 + iBOT + KoLeo losses
└── utils/ # Utilities
```
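
The `__lighter__.py` marker file is what lets configs reference code in this repo via `project.*` paths; a hedged sketch (the class name below is hypothetical):

```yaml
model:
  _target_: project.models.meta_arch.DINOv2MetaArch  # hypothetical name; meta_arch.py holds the teacher-student architecture
```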

## References
- [Lighter](https://github.com/project-lighter/lighter)
- [Lightly](https://github.com/lightly-ai/lightly)
- [DINOv2 (Facebook Research)](https://github.com/facebookresearch/dinov2)
- [MONAI (Medical Open Network for AI)](https://github.com/Project-MONAI/MONAI)
- [PyTorch Lightning](https://www.pytorchlightning.ai/)


## License
Copyright &copy; 2025 Suraj Pai, Vasco Prudente

2 changes: 2 additions & 0 deletions __lighter__.py
@@ -0,0 +1,2 @@
# Lighter marker file - enables `project.*` imports
# See: https://github.com/project-lighter/lighter
14 changes: 13 additions & 1 deletion configs/datasets/amos.yaml
@@ -1 +1,13 @@
data_module#train_dataset#data: "$monai.auto3dseg.datafold_read('/mnt/data1/datasets/AMOS/amos22/dataset.json', basedir='/mnt/data1/datasets/AMOS/amos22', key='training')[0]"
# AMOS Dataset Configuration
# Multi-organ segmentation dataset

_imports_:
  monai: monai

vars:
  amos_dataset: "/mnt/data1/datasets/AMOS/amos22"

data:
  train_dataset:
    dataset:
      data: "$monai.auto3dseg.datafold_read(@vars::amos_dataset + '/dataset.json', basedir=@vars::amos_dataset, key='training')[0]"
14 changes: 13 additions & 1 deletion configs/datasets/idc_dump.yaml
@@ -1 +1,13 @@
data_module#train_dataset#dataset#data: "$monai.auto3dseg.datafold_read('/mnt/ssd1/ibro/IDC_SSL_CT/idc_dump_datalist.json', basedir='', key='training')[0]"
# IDC Dataset Configuration
# Imaging Data Commons CT dataset

_imports_:
  monai: monai

vars:
  idc_dataset: "/mnt/ssd1/ibro/IDC_SSL_CT"

data:
  train_dataset:
    dataset:
      data: "$monai.auto3dseg.datafold_read(@vars::idc_dataset + '/idc_dump_datalist.json', basedir='', key='training')[0]"