Simple Training Skeleton for VLMs

This repository contains code for quickly training vision-language models (VLMs) on top of pre-trained vision and language models. The model, dataset, and training code together total roughly 300 lines.

Supported Vision Models

  • openai/clip-vit-base-patch32
  • google/siglip-base-patch16-224

Supported Language Models

  • allenai/OLMo-1B-hf
  • google/gemma-2b-it
  • mistralai/Mistral-7B-Instruct-v0.3
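
As a rough illustration of how the two lists above might be combined, the sketch below validates a vision/language pairing before any weights are loaded. `VLMConfig` and `all_supported_pairs` are hypothetical names introduced here for illustration and are not part of this repository's API. Whether the repository exercises every pairing is not stated here, so treat the enumeration as an assumption.

```python
from dataclasses import dataclass

# Model IDs copied from the two lists above.
SUPPORTED_VISION_MODELS = (
    "openai/clip-vit-base-patch32",
    "google/siglip-base-patch16-224",
)
SUPPORTED_LANGUAGE_MODELS = (
    "allenai/OLMo-1B-hf",
    "google/gemma-2b-it",
    "mistralai/Mistral-7B-Instruct-v0.3",
)


@dataclass(frozen=True)
class VLMConfig:
    """Hypothetical config object; not part of this repository's API."""

    vision_model: str
    language_model: str

    def __post_init__(self):
        # Fail early, before any expensive checkpoint download.
        if self.vision_model not in SUPPORTED_VISION_MODELS:
            raise ValueError(f"Unsupported vision model: {self.vision_model}")
        if self.language_model not in SUPPORTED_LANGUAGE_MODELS:
            raise ValueError(f"Unsupported language model: {self.language_model}")


def all_supported_pairs():
    """Enumerate every vision/language combination from the two lists."""
    return [
        VLMConfig(vision, language)
        for vision in SUPPORTED_VISION_MODELS
        for language in SUPPORTED_LANGUAGE_MODELS
    ]
```

Checking the identifiers up front keeps a typo in a model name from surfacing only after a long download has started.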