ViT CIFAR-10 test with image sizes 32 and 64
acc: 83.6 (trained from scratch at image size 32)
https://github.com/YuBeomGon/vit_cifar10/blob/master/notebooks/vit-scratch-s4.ipynb

Regarding the v2 training recipe (torchvision's latest training primitives), see
https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives/

acc: 88.0 (from scratch, v2 recipe)
https://github.com/YuBeomGon/vit_cifar10/blob/master/notebooks/vit-scratch-v2.ipynb

acc: 89.1 (fine-tuned at image size 64 from the 32 model)
https://github.com/YuBeomGon/vit_cifar10/blob/master/notebooks/vit-64-from-32.ipynb
confusion matrix
attention map visualization
checkpoint
- image resolution matters: accuracy stays around 70.0 with the original ViT setting, so an overlapping patch embedding is added so attention can be trained more effectively (see the first sketch after this list)
- lots of augmentation and normalization (q, k, v norm) are added, because the dataset is small and ViT has low inductive bias (see the attention sketch below)
- weight initialization schemes did not help in the CIFAR-10 case, so the default initialization is used
- patch embedding interpolation is used for fine-tuning at image size 64 (see the interpolation sketch below)
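Below is a minimal sketch of an overlapping patch embedding: a Conv2d projection whose kernel is larger than its stride, so neighboring patches share pixels. The kernel size, stride, and embedding dim here are illustrative assumptions, not the notebooks' exact settings.

```python
import torch
import torch.nn as nn

class OverlappingPatchEmbed(nn.Module):
    """Patch embedding whose convolution kernel is larger than its stride,
    so neighboring patches overlap."""
    def __init__(self, img_size=32, stride=4, kernel_size=8, in_chans=3, embed_dim=192):
        super().__init__()
        padding = (kernel_size - stride) // 2
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=kernel_size, stride=stride, padding=padding)
        self.num_patches = (img_size // stride) ** 2

    def forward(self, x):
        x = self.proj(x)                     # (B, D, H/stride, W/stride)
        return x.flatten(2).transpose(1, 2)  # (B, N, D)

# 32x32 CIFAR-10 image -> 8x8 = 64 overlapping patch tokens
tokens = OverlappingPatchEmbed()(torch.randn(2, 3, 32, 32))
print(tokens.shape)  # torch.Size([2, 64, 192])
```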
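A sketch of the q, k, v normalization idea: LayerNorm applied to the per-head query, key, and value vectors inside attention. The head count and dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NormedAttention(nn.Module):
    """Multi-head self-attention with LayerNorm on the per-head q, k, v vectors,
    which helps stabilize training on small datasets."""
    def __init__(self, dim=192, num_heads=3):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.q_norm = nn.LayerNorm(self.head_dim)
        self.k_norm = nn.LayerNorm(self.head_dim)
        self.v_norm = nn.LayerNorm(self.head_dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        B, N, D = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)               # each (B, heads, N, head_dim)
        q, k, v = self.q_norm(q), self.k_norm(k), self.v_norm(v)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v                     # (B, heads, N, head_dim)
        return self.proj(out.transpose(1, 2).reshape(B, N, D))

x = torch.randn(2, 64, 192)                  # 64 tokens from a 32x32 input
print(NormedAttention()(x).shape)            # torch.Size([2, 64, 192])
```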
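The note above mentions patch embedding interpolation; a closely related step when raising the input resolution with the same patch size is interpolating the learned position embeddings (8x8 grid at 32x32 -> 16x16 grid at 64x64). The sketch below shows only that position-embedding step, assuming a class token plus a learned position-embedding table.

```python
import torch
import torch.nn.functional as F

def interpolate_pos_embed(pos_embed, old_grid=8, new_grid=16):
    """Resize learned 2D position embeddings for fine-tuning at a higher resolution.

    pos_embed: (1, 1 + old_grid**2, dim) -- class token first, then patch tokens.
    """
    cls_tok, patch_pos = pos_embed[:, :1], pos_embed[:, 1:]
    dim = pos_embed.shape[-1]
    # (1, old_grid*old_grid, dim) -> (1, dim, old_grid, old_grid) for interpolation
    patch_pos = patch_pos.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch_pos = F.interpolate(patch_pos, size=(new_grid, new_grid),
                              mode='bicubic', align_corners=False)
    patch_pos = patch_pos.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_tok, patch_pos], dim=1)

pos32 = torch.randn(1, 1 + 8 * 8, 192)       # trained at image size 32
pos64 = interpolate_pos_embed(pos32)         # ready for image size 64
print(pos64.shape)                           # torch.Size([1, 257, 192])
```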
Further work
- use a LoRA / adapter-style fine-tuning scheme, as in big-model fine-tuning
- DeiT-style knowledge distillation (transfer knowledge from a big model to a tiny model)
- use layer scale (CaiT); see the sketch after this list
- loss-function tuning for the cat class
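A sketch of CaiT-style layer scale: a learnable per-channel factor, initialized to a small constant, applied to each residual branch's output. The init value 1e-4 is an assumption.

```python
import torch
import torch.nn as nn

class LayerScale(nn.Module):
    """CaiT-style layer scale: scale each residual branch output by a learnable
    per-channel factor initialized to a small constant."""
    def __init__(self, dim, init_value=1e-4):
        super().__init__()
        self.gamma = nn.Parameter(init_value * torch.ones(dim))

    def forward(self, x):
        return self.gamma * x

# usage inside a transformer block (attn/mlp are the block's sub-modules):
#   x = x + self.ls1(self.attn(self.norm1(x)))
#   x = x + self.ls2(self.mlp(self.norm2(x)))
```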