nano-JEPA

nano-JEPA: a Video Joint Embedding Predictive Architecture that runs on a regular computer. Based on V-JEPA. Read the paper about our work. Consider using nano-datasets to create your dataset ;)

Setup

(base) conda create -n nano-jepa python=3.9 
(base) conda activate nano-jepa

# Install PyTorch on machines with an NVIDIA GPU (CUDA 12.1)
# (nano-jepa) conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

# Install PyTorch on CPU-only machines
(nano-jepa) conda install pytorch torchvision torchaudio cpuonly -c pytorch

(nano-jepa) python setup.py install
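
Optionally, you can sanity-check the installation. The one-liner below is not part of the repo; it only confirms that PyTorch imports and reports whether CUDA is visible.

# verify the PyTorch install (optional)
(nano-jepa) python -c "import torch; print(torch.__version__, torch.cuda.is_available())"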

Run

A set of config files is provided in this repo. Change the paths in the *.yaml files using the guidelines in the next section; a copy-and-edit sketch follows the commands below.

# unsupervised training
(nano-jepa)$ python -m app.train_nano_jepa --fname configs/pretrain/vitt.yaml

# video evaluation
(nano-jepa)$ python -m evals.eval_video_nano_jepa  --fname configs/evals/vitt16_k400_16x8x3.yaml

# image evaluation
(nano-jepa)$ python -m evals.eval_image_nano_jepa  --fname configs/evals/vitt16_in1k.yaml

# video inference
(nano-jepa)$ python -m evals.infer_video_classification --fname configs/infer/infer_vitt_k400x8x3.yaml

# visualize feature (work in progress)
(nano-jepa)$ python -m evals.eval_features 
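
If you prefer to keep the stock configs untouched, one option is to copy a provided config and point the run at the copy. This is only a suggested workflow, and the vitt_local.yaml name is an example, not a file shipped with the repo.

# copy a provided config, edit the paths in the copy, then train with it
(nano-jepa)$ cp configs/pretrain/vitt.yaml configs/pretrain/vitt_local.yaml
(nano-jepa)$ python -m app.train_nano_jepa --fname configs/pretrain/vitt_local.yaml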

System directories

Consider using the nano-datasets tool to create your local dataset.

Dataset location path (k400 dataset example):

  • A Windows user path: C:\Users\your-user\Documents\ML-datasets\video_datasets\k400\k400_file_index.csv
  • A Linux user path: /home/your-user/Documents/ML-datasets/video_datasets/k400/k400_file_index.csv

Logging location path:

  • A Windows user path: C:\Users\your-user\Documents\ML-logging
  • A Linux user path: /home/your-user/Documents/ML-logging
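
Before launching a run, it can save time to confirm that the paths written into your config actually exist. The commands below are not part of nano-jepa; they just use the Linux example paths above.

# optional: check the dataset index exists and create the logging folder
(nano-jepa)$ ls /home/your-user/Documents/ML-datasets/video_datasets/k400/k400_file_index.csv
(nano-jepa)$ mkdir -p /home/your-user/Documents/ML-logging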

Paper and Authors

Read the PDF file here

Paper details:

  • Title: "nano-JEPA: Democratizing Video Understanding with Personal Computers"
  • Authors: Adrián Rostagno, Javier Iparraguirre, Joel Ermantraut, Lucas Tobio, Segundo Foissac, Santiago Aggio, Guillermo Friedrich
  • Event: XXV WASI – Workshop Agentes y Sistemas Inteligentes, CACIC.
  • Year: 2024.

Bibtex:

@inproceedings{rostagno2024nanojepa,
     title={nano-JEPA: Democratizing Video Understanding with Personal Computers},
     author={Adrian Rostagno and Javier Iparraguirre and Joel Ermantraut and Lucas Tobio and Segundo Foissac and Santiago Aggio and Guillermo Friedrich},
     booktitle={XXV WASI – Workshop Agentes y Sistemas Inteligentes, CACIC},
     year={2024}
}

Checkpoints

Here is a list of checkpoints available for experimentation:

Model | Classes | Train videos | Val videos | Ratio | Accuracy | Epochs | Pre-train
----- | ------- | ------------ | ---------- | ----- | -------- | ------ | ---------
A (download) | 4 | 100 | 50 | 0.5 | 45.50 | 20 | nano-JEPA ViT-T 800 videos (download)
B (download) | 4 | 25 | 12 | 0.48 | 35.41 | 10 |
C (download) | 8 | 100 | 50 | 0.5 | 41.50 | 20 |
D (download) | 4 | 100 | 50 | 0.5 | 99.50 | 6 | V-JEPA ViT-L (see V-JEPA site)
E (download) | 4 | 25 | 12 | 0.48 | 91.66 | 6 |
F (download) | 8 | 100 | 50 | 0.5 | 94.25 | 6 |
