Merge pull request #127 from roboflow/fix/packaging
packaging: 📦 update transformers dependency for Qwen2.5-VL and documentation updates
SkalskiP authored Feb 4, 2025
2 parents 4ed36d8 + 4af961e commit c983d22
Showing 5 changed files with 186 additions and 67 deletions.
38 changes: 19 additions & 19 deletions README.md
@@ -1,39 +1,30 @@

<div align="center">

<h1>maestro</h1>

<br>

<div>
<a href="https://example1.com" style="margin: 0 10px;">
<img
src="https://github.com/user-attachments/assets/c9416f1f-a2bf-4590-86da-d2fc89ba559b"
width="80"
height="40"
/>
</a>
<a href="https://example2.com" style="margin: 0 10px;">
<img
src="https://github.com/user-attachments/assets/75dc7214-e82a-498d-950e-c64d90218e49"
width="80"
height="40"
/>
</a>
<a href="https://example3.com" style="margin: 0 10px;">
<img
src="https://github.com/user-attachments/assets/5d265473-b938-4501-b894-6a44a6a28a8c"
width="80"
height="40"
/>
</a>
<a href="https://example3.com" style="margin: 0 10px;">
<img
src="https://github.com/user-attachments/assets/b7ccdf39-ac77-4dbd-8608-0fa2d9dadf0a"
width="80"
height="40"
/>
</a>
</div>

<br>
@@ -44,29 +35,33 @@

## Hello

**maestro** is a tool designed to streamline and accelerate the fine-tuning process for
multimodal models. It provides ready-to-use recipes for fine-tuning popular
vision-language models (VLMs) such as **Florence-2**, **PaliGemma 2**, and
**Qwen2.5-VL** on downstream vision-language tasks.
**maestro** is a streamlined tool to accelerate the fine-tuning of multimodal models.
By encapsulating best practices from our core modules, maestro handles configuration,
data loading, reproducibility, and training loop setup. It currently offers ready-to-use
recipes for popular vision-language models such as **Florence-2**, **PaliGemma 2**, and
**Qwen2.5-VL**.

![maestro](https://github.com/user-attachments/assets/3bb9ccba-b0ee-4964-bcd6-f71124a08bc2)

## Quickstart

### Install

To get started with maestro, you’ll need to install the dependencies specific to the model you wish to fine-tune.
To begin, install the model-specific dependencies. Since some models may have clashing requirements,
we recommend creating a dedicated Python environment for each model.

```bash
pip install maestro[qwen_2_5_vl]
pip install maestro[paligemma_2]
```

**Note:** Some models may have clashing dependencies. We recommend creating a separate Python environment for each model to avoid version conflicts.
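The per-model environment recommendation above can be sketched with Python's standard `venv` module; the environment names below are arbitrary examples, not a maestro convention:

```shell
# Create an isolated environment for the Qwen2.5-VL recipe
python3 -m venv .venv-qwen_2_5_vl
source .venv-qwen_2_5_vl/bin/activate

# Install only this model's extras into it
pip install "maestro[qwen_2_5_vl]"

# Repeat with a fresh environment for another model, e.g. PaliGemma 2
deactivate
python3 -m venv .venv-paligemma_2
```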

### CLI

Kick off fine-tuning with our command-line interface, which leverages the configuration
and training routines defined in each model’s core module. Simply specify key parameters such as
the dataset location, number of epochs, batch size, optimization strategy, and metrics.

```bash
maestro qwen_2_5_vl train \
maestro paligemma_2 train \
--dataset "dataset/location" \
--epochs 10 \
--batch-size 4 \
@@ -76,8 +71,13 @@ maestro qwen_2_5_vl train \

### Python

For greater control, use the Python API to fine-tune your models.
Import the `train` function from the corresponding module and define your configuration
in a dictionary. The core modules take care of reproducibility, data preparation,
and training setup.

```python
from maestro.trainer.models.qwen_2_5_vl.core import train
from maestro.trainer.models.paligemma_2.core import train

config = {
"dataset": "dataset/location",
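Since the config is a plain dictionary, a small guard can catch missing or mistyped keys before a long training run starts. The key names mirror the CLI flags shown above; `validate_config` is a hypothetical helper sketched here, not part of maestro:

```python
# Keys expected by the train() config, matching the CLI flags above
REQUIRED_KEYS = {"dataset", "epochs", "batch_size", "optimization_strategy", "metrics"}

def validate_config(config: dict) -> None:
    """Raise early if a required key is missing or obviously invalid."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"config is missing required keys: {sorted(missing)}")
    if not isinstance(config["epochs"], int) or config["epochs"] <= 0:
        raise ValueError("epochs must be a positive integer")

config = {
    "dataset": "dataset/location",
    "epochs": 10,
    "batch_size": 4,
    "optimization_strategy": "qlora",
    "metrics": ["edit_distance"],
}

validate_config(config)  # passes silently for a well-formed config
# train(config) would then launch fine-tuning
```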
2 changes: 2 additions & 0 deletions docs/assets/maestro-logo.svg
175 changes: 143 additions & 32 deletions docs/index.md
@@ -2,50 +2,161 @@

<h1>maestro</h1>

<p>coming: when it's ready...</p>
<br>

<div>
<img
src="https://github.com/user-attachments/assets/c9416f1f-a2bf-4590-86da-d2fc89ba559b"
width="80"
height="40"
/>
<img
src="https://github.com/user-attachments/assets/75dc7214-e82a-498d-950e-c64d90218e49"
width="80"
height="40"
/>
<img
src="https://github.com/user-attachments/assets/5d265473-b938-4501-b894-6a44a6a28a8c"
width="80"
height="40"
/>
<img
src="https://github.com/user-attachments/assets/b7ccdf39-ac77-4dbd-8608-0fa2d9dadf0a"
width="80"
height="40"
/>
</div>

</div>

**maestro** is a tool designed to streamline and accelerate the fine-tuning process for
multimodal models. It provides ready-to-use recipes for fine-tuning popular
vision-language models (VLMs) such as **Florence-2**, **PaliGemma**, and
**Qwen2-VL** on downstream vision-language tasks.
## Hello

## install
**maestro** is a streamlined tool to accelerate the fine-tuning of multimodal models.
By encapsulating best practices from our core modules, maestro handles configuration,
data loading, reproducibility, and training loop setup. It currently offers ready-to-use
recipes for popular vision-language models such as **Florence-2**, **PaliGemma 2**, and
**Qwen2.5-VL**.

Pip install the maestro package in a
[**Python>=3.8**](https://www.python.org/) environment.
## Quickstart

```bash
pip install maestro
```
### Install

## quickstart
To begin, install the model-specific dependencies. Since some models may have clashing requirements,
we recommend creating a dedicated Python environment for each model.

### CLI
=== "Florence-2"

```bash
pip install maestro[florence_2]
```

VLMs can be fine-tuned on downstream tasks directly from the command line with the
`maestro` command:
=== "PaliGemma 2"

```bash
maestro florence2 train --dataset='<DATASET_PATH>' --epochs=10 --batch-size=8
```
```bash
pip install maestro[paligemma_2]
```

### SDK
=== "Qwen2.5-VL"

Alternatively, you can fine-tune VLMs using the Python SDK, which accepts the same
arguments as the CLI example above:
```bash
pip install maestro[qwen_2_5_vl]
pip install git+https://github.com/huggingface/transformers
```

```python
from maestro.trainer.common import MeanAveragePrecisionMetric
from maestro.trainer.models.florence_2 import train, Configuration
!!! warning
Support for Qwen2.5-VL in transformers is experimental.
For now, please install transformers from source to ensure compatibility.

config = Configuration(
dataset='<DATASET_PATH>',
epochs=10,
batch_size=8,
metrics=[MeanAveragePrecisionMetric()]
)
### CLI

train(config)
```
Kick off fine-tuning with our command-line interface, which leverages the configuration
and training routines defined in each model’s core module. Simply specify key parameters such as
the dataset location, number of epochs, batch size, optimization strategy, and metrics.

=== "Florence-2"

```bash
maestro florence_2 train \
--dataset "dataset/location" \
--epochs 10 \
--batch-size 4 \
--optimization_strategy "qlora" \
--metrics "edit_distance"
```

=== "PaliGemma 2"

```bash
maestro paligemma_2 train \
--dataset "dataset/location" \
--epochs 10 \
--batch-size 4 \
--optimization_strategy "qlora" \
--metrics "edit_distance"
```

=== "Qwen2.5-VL"

```bash
maestro qwen_2_5_vl train \
--dataset "dataset/location" \
--epochs 10 \
--batch-size 4 \
--optimization_strategy "qlora" \
--metrics "edit_distance"
```

### Python

For greater control, use the Python API to fine-tune your models.
Import the `train` function from the corresponding module and define your configuration
in a dictionary. The core modules take care of reproducibility, data preparation,
and training setup.

=== "Florence-2"

```python
from maestro.trainer.models.florence_2.core import train

config = {
"dataset": "dataset/location",
"epochs": 10,
"batch_size": 4,
"optimization_strategy": "qlora",
"metrics": ["edit_distance"],
}

train(config)
```

=== "PaliGemma 2"

```python
from maestro.trainer.models.paligemma_2.core import train

config = {
"dataset": "dataset/location",
"epochs": 10,
"batch_size": 4,
"optimization_strategy": "qlora",
"metrics": ["edit_distance"],
}

train(config)
```

=== "Qwen2.5-VL"

```python
from maestro.trainer.models.qwen_2_5_vl.core import train

config = {
"dataset": "dataset/location",
"epochs": 10,
"batch_size": 4,
"optimization_strategy": "qlora",
"metrics": ["edit_distance"],
}

train(config)
```
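The CLI and Python paths above take the same parameters, so a thin wrapper can translate CLI-style flags into the Python API's config dictionary. The mapping below is an illustrative sketch using `argparse`, not maestro's actual argument parser:

```python
import argparse

def cli_args_to_config(argv: list[str]) -> dict:
    """Translate maestro-style CLI flags into the Python API's config dict."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--batch-size", type=int, default=4)
    parser.add_argument("--optimization_strategy", default="qlora")
    parser.add_argument("--metrics", nargs="+", default=["edit_distance"])
    args = parser.parse_args(argv)
    return {
        "dataset": args.dataset,
        "epochs": args.epochs,
        "batch_size": args.batch_size,
        "optimization_strategy": args.optimization_strategy,
        "metrics": args.metrics,
    }

# The flags from the CLI example produce the dict from the Python example
config = cli_args_to_config([
    "--dataset", "dataset/location",
    "--epochs", "10",
    "--batch-size", "4",
    "--optimization_strategy", "qlora",
    "--metrics", "edit_distance",
])
```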
30 changes: 17 additions & 13 deletions mkdocs.yaml
@@ -1,11 +1,11 @@
site_name: maestro
site_url: https://roboflow.github.io/multimodal-maestro/
site_url: https://roboflow.github.io/maestro/
site_author: Roboflow
site_description: 'Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, Qwen2-VL.'
repo_name: roboflow/multimodal-maestro
repo_url: https://github.com/roboflow/multimodal-maestro
edit_uri: https://github.com/roboflow/multimodal-maestro/tree/main/docs
copyright: Roboflow 2024. All rights reserved.
repo_name: roboflow/maestro
repo_url: https://github.com/roboflow/maestro
edit_uri: https://github.com/roboflow/maestro/tree/main/docs
copyright: Roboflow 2025. All rights reserved.

extra:
social:
@@ -23,34 +23,38 @@ extra:

nav:
- Maestro: index.md
- Models:
- Florence-2: florence-2.md
- Tasks: tasks.md
- Metrics: metrics.md
# - Models:
# - Florence-2: florence-2.md
# - Tasks: tasks.md
# - Metrics: metrics.md


theme:
name: 'material'
logo: https://media.roboflow.com/open-source/supervision/supervision-lenny.png
favicon: https://media.roboflow.com/open-source/supervision/supervision-lenny.png
logo: assets/maestro-logo.svg
favicon: assets/maestro-logo.svg
custom_dir: docs/theme
palette:
# Palette for light mode
- scheme: default
primary: 'custom'
primary: 'black'
toggle:
icon: material/brightness-7
name: Switch to dark mode

# Palette toggle for dark mode
- scheme: slate
primary: 'custom'
primary: 'black'
toggle:
icon: material/brightness-4
name: Switch to light mode
font:
text: Roboto
code: Roboto Mono
features:
- content.tabs.link
- content.code.copy


plugins:
- search
8 changes: 5 additions & 3 deletions pyproject.toml
@@ -1,5 +1,5 @@
[build-system]
requires = ["setuptools", "setuptools-scm", "wheel"]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

[project]
@@ -78,15 +78,17 @@ florence_2 = [
paligemma_2 = [
"peft>=0.12",
"torch>=2.4.0",
"transformers<4.48.0", # does not work with 4.49.*
# PaliGemma 2 training does not work with 4.49.*
"transformers<4.48.0",
"bitsandbytes>=0.45.0"
]
qwen_2_5_vl = [
"accelerate>=1.2.1",
"peft>=0.12",
"torch>=2.4.0",
"torchvision>=0.20.0",
"transformers @ git+https://github.com/huggingface/transformers",
# PyPI doesn't allow git dependencies; uncomment once a transformers release supports Qwen2.5-VL
# "transformers @ git+https://github.com/huggingface/transformers",
"bitsandbytes>=0.45.0",
"qwen-vl-utils>=0.0.8"
]
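The `transformers<4.48.0` pin for PaliGemma 2 can be checked at runtime without extra dependencies. The comparison below handles plain `X.Y.Z` version strings only and is an illustrative sketch, not part of maestro:

```python
from importlib import metadata

def version_tuple(version: str) -> tuple[int, ...]:
    """Parse a plain X.Y.Z version string into a comparable tuple."""
    return tuple(int(part) for part in version.split(".")[:3])

def satisfies_upper_bound(installed: str, bound: str) -> bool:
    """True if installed < bound, matching a '<bound' specifier."""
    return version_tuple(installed) < version_tuple(bound)

try:
    installed = metadata.version("transformers")
    ok = satisfies_upper_bound(installed, "4.48.0")
    print(f"transformers {installed} satisfies <4.48.0: {ok}")
except metadata.PackageNotFoundError:
    print("transformers is not installed in this environment")
```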
