feat: support finetuning and evaluation of VLA models #7

learnerljh · 2025-12-19T12:23:05Z

Description

Please include a summary of the changes and which issue is fixed. Include relevant motivation and context.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Code refactoring
Performance improvement
Test coverage improvement

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

I have read the CONTRIBUTION guide. (required)
My change requires a change to the documentation.
I have updated the tests accordingly. (required for a bug fix or a new feature)
I have updated the documentation accordingly.
I have reformatted the code using make format. (required)
I have checked the code using make lint. (required)
I have ensured make test pass. (required)

Testing

Please describe the tests that you ran to verify your changes:

Test A
Test B

Copilot

Pull request overview

This PR adds comprehensive support for finetuning and evaluation of Vision-Language-Action (VLA) models, including OpenPI, OpenVLA, UniVLA, and SmolVLA. The changes introduce a unified CLI interface, model-specific training/evaluation configurations, and extensive infrastructure for model serving and benchmarking.

Key changes:

Unified CLI interface (vla-arena train/eval) for all VLA models with dynamic model loading
OpenPI model integration with JAX-based training, Docker deployment, and websocket-based policy serving
Training and evaluation configurations for OpenPI, OpenVLA (with OFT variant), UniVLA, and SmolVLA
Removed legacy evaluation utilities and policy implementations in favor of model-specific evaluators
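
To make the first key change concrete, here is a minimal sketch of how a `vla-arena train/eval` entry point with dynamic model loading could be wired up. It is an illustration only, not the code in `vla_arena/cli/main.py`. The `--model` and `--config` flags, the `MODEL_MODULES` registry, the `trainer` module name, and every module path other than `vla_arena.models.openpi` and its `evaluator` module are assumptions made for this example.

```python
import argparse
import importlib

# Hypothetical registry mapping a model name to its package.
# Only vla_arena.models.openpi is confirmed by the file list in this PR;
# the other paths are assumed for illustration.
MODEL_MODULES = {
    "openpi": "vla_arena.models.openpi",
    "openvla": "vla_arena.models.openvla",
    "openvla_oft": "vla_arena.models.openvla_oft",
    "univla": "vla_arena.models.univla",
    "smolvla": "vla_arena.models.smolvla",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with `train` and `eval` subcommands."""
    parser = argparse.ArgumentParser(prog="vla-arena")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in ("train", "eval"):
        sub = subparsers.add_parser(name)
        # Both flags are hypothetical; the real CLI may use other names.
        sub.add_argument("--model", required=True, choices=sorted(MODEL_MODULES))
        sub.add_argument("--config", required=True)
    return parser


def resolve_module(model: str, command: str) -> str:
    """Return the dotted module path that handles `command` for `model`."""
    suffix = "trainer" if command == "train" else "evaluator"
    return f"{MODEL_MODULES[model]}.{suffix}"


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Import lazily so that only the selected model's dependencies are loaded.
    module = importlib.import_module(resolve_module(args.model, args.command))
    # Assumes each handler module exposes main(config_path).
    return module.main(args.config)
```

Under these assumptions, `main(["eval", "--model", "openpi", "--config", "openpi.yaml"])` would import `vla_arena.models.openpi.evaluator` only when that command is actually run, which keeps JAX and PyTorch dependencies from being imported together.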

Reviewed changes

Copilot reviewed 124 out of 1401 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
vla_arena/cli/main.py	CLI entry point with train/eval subcommands
vla_arena/cli/train.py	Training orchestrator supporting PyTorch (torchrun) and JAX models
vla_arena/cli/eval.py	Evaluation launcher for model-specific evaluators
vla_arena/models/openpi/*	Complete OpenPI integration including scripts, Docker configs, and client libraries
vla_arena/configs/train/*.yaml	Training configurations for all supported models
vla_arena/configs/evaluation/*.yaml	Evaluation configurations for all supported models
vla_arena/evaluation/*	Removed legacy evaluation utilities and policy base classes
tests/*	Updated test structure and fixtures
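
The table describes `vla_arena/cli/train.py` as an orchestrator for both PyTorch (torchrun) and JAX models. One way such a split might look is sketched below. This is a guess at the general shape, not the PR's implementation: the `JAX_MODELS` set, the `build_launch_command` helper, and the script-level `--config` flag are hypothetical. The only facts taken from the PR are that OpenPI training is JAX-based and that PyTorch models are launched through torchrun.

```python
import sys

# The PR overview states that OpenPI training is JAX-based; the other
# models are assumed here to go through torchrun.
JAX_MODELS = {"openpi"}


def build_launch_command(model, script, config, nproc_per_node=1):
    """Return the argv list used to launch training for `model`.

    JAX models run as a single Python process (JAX handles local devices
    itself), while PyTorch models are started through torchrun so that one
    worker process is spawned per GPU.
    """
    if model in JAX_MODELS:
        return [sys.executable, script, "--config", config]
    return [
        "torchrun",
        "--standalone",  # single-node rendezvous
        f"--nproc_per_node={nproc_per_node}",
        script,
        "--config",  # hypothetical flag consumed by the training script
        config,
    ]
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues in config paths.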


vla_arena/models/openpi/scripts/docker/install_nvidia_container_toolkit.sh

vla_arena/models/openpi/examples/vla_arena/eval_vla_arena.py

vla_arena/models/openpi/evaluator.py

vla_arena/models/openpi/examples/vla_arena/requirements.in

vla_arena/configs/train/openvla_oft.yaml

muchvo

LGTM.

feat:support finetuning and evaluation of models

5bfeab8

learnerljh requested review from Copilot, muchvo and propellanesjc December 19, 2025 12:23

Copilot AI reviewed Dec 19, 2025


muchvo and others added 3 commits December 20, 2025 20:27

update

ce2f7c5

change_toml

dcee52b

add_copyright

1ddf438

muchvo changed the title ~~feat:support finetuning and evaluation of VLA models~~ feat: support finetuning and evaluation of VLA models Dec 21, 2025

learnerljh added 16 commits December 21, 2025 16:33

remove_unused_stuff

a7a6af9

remove_unused_stuff

f1b79aa

remove_unused_stuff

1eb67e2

merge

21a0170

merge

e15c1cf

merge

8d3f53e

repair_coverage

e559c1c

repair_pytest

d02cb39

repair_pytest

d00f4e7

repair_pytest

ba81f4c

repair_pytest

8acf313

remove_unused_stuff

acc8a24

remove_unused_stuff

20870df

ruff_auto_fic

34b8942

improve_code_quality

a81a2c3

improve_code_quality

ef28c02

muchvo approved these changes Dec 21, 2025


muchvo merged commit d12789b into PKU-Alignment:main Dec 21, 2025
4 checks passed


learnerljh commented Dec 19, 2025 • edited by muchvo


2 participants
