Draft
Changes from 21 commits
Commits
57 commits
4e053ac
Add integration example(WIP).
KanaiYuma-aist Jun 18, 2025
1b2986c
Fix config.
KanaiYuma-aist Jun 19, 2025
2857756
Fix filename for mypy.
KanaiYuma-aist Jun 19, 2025
dbe17c7
Fix commnets.
KanaiYuma-aist Jun 20, 2025
1b74aac
Fix typo.
KanaiYuma-aist Jun 23, 2025
58ac660
Merge branch 'main' into feature/integration_example
KanaiYuma-aist Jun 25, 2025
1885e48
Merge branch 'main' of github:aistairc/aiaccel into feature/integrati…
KanaiYuma-aist Jun 25, 2025
52dea6e
Merge branch 'main' into feature/integration_example
KanaiYuma-aist Jun 25, 2025
dbddbf5
Add README.
KanaiYuma-aist Jul 3, 2025
cedc08e
Merge branch 'feature/integration_example' of github:aistairc/aiaccel…
KanaiYuma-aist Jul 3, 2025
3c23b5e
Fix typo.
KanaiYuma-aist Jul 3, 2025
bbc4d84
Merge branch 'main' of github:aistairc/aiaccel into feature/integrati…
KanaiYuma-aist Jul 4, 2025
4e129f2
Add Detailed Descriptions.
KanaiYuma-aist Jul 4, 2025
0b1d447
Remove tensorboard in README.
KanaiYuma-aist Jul 7, 2025
5bf3f72
Merge branch 'main' into feature/integration_example
KanaiYuma-aist Jul 17, 2025
672fb3b
Add SaveValLossCallback.
KanaiYuma-aist Jul 18, 2025
8c8339f
Remove unnecessary files.
KanaiYuma-aist Jul 18, 2025
f8db321
Fix README.
KanaiYuma-aist Jul 18, 2025
0327954
Merge branch 'main' into feature/integration_example
KanaiYuma-aist Jul 22, 2025
3ee480e
Remove comment.
KanaiYuma-aist Jul 22, 2025
fd99f49
Merge branch 'feature/integration_example' of github:aistairc/aiaccel…
KanaiYuma-aist Jul 22, 2025
233ab80
Merge branch 'main' into feature/integration_example
KanaiYuma-aist Jul 23, 2025
4086167
Merge branch 'main' into feature/integration_example
KanaiYuma-aist Jul 24, 2025
618dbcf
Merge branch 'main' of github:aistairc/aiaccel into feature/integrati…
KanaiYuma-aist Jul 25, 2025
2230afe
Add pyproject.toml.
KanaiYuma-aist Jul 29, 2025
e6daf34
Rename hpo_config.yaml.
KanaiYuma-aist Jul 29, 2025
4fadd61
Merge branch 'main' of github:aistairc/aiaccel into feature/integrati…
KanaiYuma-aist Jul 29, 2025
1b88776
Add job_config.yaml.
KanaiYuma-aist Jul 29, 2025
b609120
Use SaveMetricCallback.
KanaiYuma-aist Jul 29, 2025
5925a69
Remove PYTHONPATH.
KanaiYuma-aist Jul 29, 2025
ff8a29d
Update job_config.yaml.
KanaiYuma-aist Jul 29, 2025
cc840b6
Merge branch 'main' of github:aistairc/aiaccel into feature/integrati…
KanaiYuma-aist Aug 4, 2025
da071b2
Fix job_config.
KanaiYuma-aist Aug 5, 2025
4bf00e3
Fix README.
KanaiYuma-aist Aug 5, 2025
5c6a070
Merge and delete common_config.
KanaiYuma-aist Aug 6, 2025
b67a378
Change to local.
KanaiYuma-aist Aug 6, 2025
4fb787d
Remove unnecessary contents.
KanaiYuma-aist Aug 6, 2025
872681f
Fix README.
KanaiYuma-aist Aug 6, 2025
bb789f4
Fix script_prologue.
KanaiYuma-aist Aug 6, 2025
bb3a226
Rename task_for_integration_example.
KanaiYuma-aist Aug 7, 2025
600e3c9
Merge branch 'main' into feature/integration_example
yoshipon Aug 10, 2025
5369d97
Update structure
yoshipon Aug 11, 2025
5f75eed
Update test
yoshipon Aug 11, 2025
7052b56
Merge remote-tracking branch 'origin/main' into feature/integration_e…
yoshipon Aug 11, 2025
5a0d0a4
Update CI
yoshipon Aug 11, 2025
ec2563d
fix mypy
yoshipon Aug 11, 2025
9fe7231
fix mypy
yoshipon Aug 11, 2025
8a8b920
Update pre-commit
yoshipon Aug 11, 2025
a5b4adb
pre-commit migrate-config
yoshipon Aug 11, 2025
63ad8f9
bugfix
yoshipon Aug 11, 2025
4fe6fd5
pre-commit autoupdate
yoshipon Aug 11, 2025
d05623e
Update
yoshipon Aug 12, 2025
5a3fb68
Merge branch 'main' into feature/integration_example
yoshipon Aug 22, 2025
b40eccd
Merge branch 'main' into feature/integration_example
yoshipon Nov 18, 2025
bd1049d
updates
yoshipon Nov 18, 2025
6041ae0
Merge branch 'main' into feature/integration_example
yoshipon Nov 18, 2025
22eabfd
Merge branch 'main' into feature/integration_example
yoshipon Nov 18, 2025
26 changes: 26 additions & 0 deletions examples/integration/README.md
@@ -0,0 +1,26 @@
# Example of Black Box Optimization on ABCI 3.0

This is an example of performing black-box optimization of the learning rate for a ResNet50 model on the MNIST dataset.

## Getting started

In an environment where aiaccel is installed, additionally install torchvision.

```bash
pip install torchvision
```


Run the following command to perform black-box optimization.
Replace PATH_TO_ENV with the path to the environment prepared above.

```bash
python -m aiaccel.hpo.apps.optimize "python -m aiaccel.jobs.cli.abci3 gpu --command_prefix 'cd \$PBS_O_WORKDIR && module load cuda/12.6/12.6.1 && module load python/3.13/3.13.2 && source PATH_TO_ENV/bin/activate &&' jobs/{job_name}.log -- python -m aiaccel.torch.apps.train resnet50/config.yaml task.optimizer_config.optimizer_generator.lr={lr} trainer.logger.name=lr_{lr} out_filename={out_filename}" --config config.yaml
```
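
The command above is a template: the `{job_name}`, `{lr}`, and `{out_filename}` fields are placeholders that are filled in for each trial. As a rough illustration only, the substitution can be pictured with plain `str.format`. This is not aiaccel's actual implementation, and the `fill_command` helper and the sample values are hypothetical.

```python
def fill_command(template: str, **fields: object) -> str:
    """Substitute per-trial values into a command template (hypothetical helper)."""
    return template.format(**fields)


# Shortened stand-in for the training part of the command above.
template = (
    "python -m aiaccel.torch.apps.train resnet50/config.yaml "
    "task.optimizer_config.optimizer_generator.lr={lr} "
    "trainer.logger.name=lr_{lr} out_filename={out_filename}"
)

# Sample values chosen for illustration only.
command = fill_command(template, lr=0.001, out_filename="results/trial_0.json")
print(command)
```

Because `{lr}` appears twice in the template, the same value ends up in both the optimizer override and the logger name, so each trial's logs can be matched to its learning rate.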

## Detailed Descriptions

The target function for optimization using aiaccel.hpo.apps.optimize is objective_integration.main.
Within objective_integration.main, aiaccel.torch.apps.train is called with the suggested learning rate, and the resulting validation loss is returned as the objective value.
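
In the files of this example, the value handed back to the optimizer is written by `SaveValLossCallback`, which dumps a single float to `out_filename` as JSON. A minimal sketch of that handoff using only the standard library follows. `write_result` and `read_result` are hypothetical names, and how aiaccel itself reads the file is not shown here.

```python
import json
import tempfile
from pathlib import Path


def write_result(path: Path, value: float) -> None:
    """Write one objective value as JSON, as the callback does with the validation loss."""
    with open(path, "w") as f:
        json.dump(value, f)


def read_result(path: Path) -> float:
    """Read the objective value back (hypothetical reader on the optimizer side)."""
    with open(path) as f:
        return float(json.load(f))


with tempfile.TemporaryDirectory() as tmp:
    out_filename = Path(tmp) / "trial_0.json"
    write_result(out_filename, 0.125)  # 0.125 is a made-up loss value
    loaded = read_result(out_filename)

print(loaded)
```

Passing the result through a file keeps the training job and the optimizer decoupled, which fits a setup where each trial runs as a separate batch job.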

Detailed descriptions of torch and optimize are available in the [aiaccel document (torch)](https://aistairc.github.io/aiaccel/user_guide/torch.html) and the [aiaccel document (optimize)](https://aistairc.github.io/aiaccel/user_guide/hpo.html).
61 changes: 61 additions & 0 deletions examples/integration/common_config.yaml
@@ -0,0 +1,61 @@
trainer:
  max_epochs: 10

  callbacks:
    - _target_: lightning.pytorch.callbacks.ModelCheckpoint
      filename: "{epoch:04d}"
      save_last: True
      save_top_k: -1
    - _target_: torchvision_task_integration.SaveValLossCallback
      output_path: ${out_filename}


datamodule:
  _target_: aiaccel.torch.lightning.datamodules.single_datamodule.SingleDataModule

  train_dataset_fn:
    _partial_: true
    _target_: torchvision.datasets.MNIST
    train: True

  val_dataset_fn:
    _partial_: true
    _target_: torchvision.datasets.MNIST
    train: False

  common_args:
    root: "./dataset"
    download: True
    transform: ${transform}

  batch_size: 128
  use_scatter: False


transform:
  _target_: torchvision.transforms.Compose
  transforms:
    - _target_: torchvision.transforms.Resize
      size: [256, 256]
    - _target_: torchvision.transforms.Grayscale
      num_output_channels: 3
    - _target_: torchvision.transforms.ToTensor
    - _target_: torchvision.transforms.Normalize
      mean: [0.5]
      std: [0.5]

task:
  _target_: torchvision_task_integration.ImageClassificationTask
  num_classes: 10

  model:
    _target_: torchvision.models.resnet50
    weights:
      _target_: hydra.utils.get_object
      path: torchvision.models.ResNet50_Weights.DEFAULT

  optimizer_config:
    _target_: aiaccel.torch.lightning.OptimizerConfig
    optimizer_generator:
      _partial_: True
      _target_: torch.optim.Adam
12 changes: 12 additions & 0 deletions examples/integration/config.yaml
@@ -0,0 +1,12 @@
params:
  _convert_: partial
  _target_: aiaccel.hpo.apps.optimize.HparamsManager
  lr:
    _target_: aiaccel.hpo.optuna.suggest_wrapper.SuggestFloat
    name: lr
    low: 1.e-6
    high: 1.e-2
    log: true

n_trials: 30
n_max_jobs: 4
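
With `log: true`, the learning rate is searched on a logarithmic scale between `1.e-6` and `1.e-2`, assuming `SuggestFloat` forwards these fields to Optuna's `suggest_float`. The sketch below shows what log-uniform sampling means, using only the standard library. It is for intuition only, is not the sampler Optuna uses, and `sample_log_uniform` is a hypothetical helper.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_log_uniform(1e-6, 1e-2, rng) for _ in range(1000)]

# Every sample lies inside the configured range (up to floating-point rounding).
print(min(samples), max(samples))
```

Sampling in the log domain gives each order of magnitude of the learning rate roughly the same share of the draws, whereas a linear scale would place almost all of them near `1.e-2`.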
3 changes: 3 additions & 0 deletions examples/integration/resnet50/config.yaml
@@ -0,0 +1,3 @@
_base_:
  - ${base_config_path}/train_base.yaml
  - ../common_config.yaml
75 changes: 75 additions & 0 deletions examples/integration/torchvision_task_integration.py
@@ -0,0 +1,75 @@
import json

import torch
from torch import nn
from torch.nn import functional as func

import lightning.pytorch as pl

from torchmetrics.classification import MulticlassAccuracy

from aiaccel.torch.lightning import OptimizerConfig, OptimizerLightningModule


class ImageClassificationTask(OptimizerLightningModule):
    def __init__(self, model: nn.Module, optimizer_config: OptimizerConfig, num_classes: int = 10):
        super().__init__(optimizer_config)

        self.model = model
        if hasattr(self.model.fc, "in_features") and isinstance(self.model.fc.in_features, int):
            self.model.fc = nn.Linear(self.model.fc.in_features, num_classes)

        self.train_accuracy = MulticlassAccuracy(num_classes=num_classes)
        self.val_accuracy = MulticlassAccuracy(num_classes=num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.model(x)  # type: ignore

    def training_step(self, batch: tuple[torch.Tensor, torch.Tensor], batch_idx: int) -> torch.Tensor:
        x, y = batch

        logits = self(x)

        loss = func.cross_entropy(logits, y)

        acc = self.train_accuracy(logits, y)
        self.log_dict(
            {
                "training/loss": loss,
                "training/acc": acc,
            },
            prog_bar=True,
        )

        return loss

    def validation_step(self, batch: tuple[torch.Tensor, torch.Tensor], batch_idx: int) -> None:
        x, y = batch

        logits = self(x)

        loss = func.cross_entropy(logits, y)

        acc = self.val_accuracy(logits, y)
        self.log_dict(
            {
                "validation/loss": loss,
                "validation/acc": acc,
            },
            prog_bar=True,
        )


class SaveValLossCallback(pl.Callback):
    def __init__(self, output_path: str) -> None:
        super().__init__()
        self.output_path = output_path

    def on_fit_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule) -> None:
        val_loss = trainer.callback_metrics.get("validation/loss")
        if val_loss is not None:
            val_loss_value = val_loss.item()
            with open(self.output_path, "w") as f:
                json.dump(val_loss_value, f)
        else:
            print("Warning: 'validation/loss' not found in callback_metrics.")