
Initial NaFlex ViT model and training support #2466


Draft · rwightman wants to merge 9 commits into main

Conversation

@rwightman (Collaborator) commented Apr 8, 2025

Working:

  • 'flex' ViT with NaFlex position embedding resize, pre-patched input, and attention padding masks (see the first sketch after this list)
  • Single-node train.py works with a custom NaFlex data pipeline via a dataset wrapper that handles random seq-len & batch-size selection and constrains images to the seq-len budget while keeping aspect ratio (with randomizations); see the second sketch after this list
  • A much faster patch embed kernel resample, torch only, that can be used in forward(); see the third sketch after this list
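A minimal sketch of the two mechanisms named in the first item, assuming a standard timm-style ViT with a learned square position-embedding grid and no class token. `resize_pos_embed` and `build_padding_mask` are hypothetical helpers for illustration, not the PR's actual code.

```python
import torch
import torch.nn.functional as F


def resize_pos_embed(pos_embed: torch.Tensor, new_hw: tuple) -> torch.Tensor:
    """Interpolate a (1, H*W, C) learned position embedding to a new (h, w) patch grid."""
    _, n, c = pos_embed.shape
    old = int(n ** 0.5)  # assumes a square pretraining grid with no class token
    grid = pos_embed.reshape(1, old, old, c).permute(0, 3, 1, 2)       # (1, C, H, W)
    grid = F.interpolate(grid, size=new_hw, mode='bicubic', align_corners=False)
    return grid.permute(0, 2, 3, 1).reshape(1, new_hw[0] * new_hw[1], c)


def build_padding_mask(seq_lens: torch.Tensor, max_len: int) -> torch.Tensor:
    """Boolean mask, True where a patch token is real and False where it is padding."""
    return torch.arange(max_len, device=seq_lens.device)[None, :] < seq_lens[:, None]


# Pre-patched input: each sample arrives as a sequence of patch tokens padded to a
# common max_len per batch; the mask tells attention which tokens to ignore.
pos = torch.randn(1, 14 * 14, 384)               # pretrained 14x14 grid, dim 384
pos_12x20 = resize_pos_embed(pos, (12, 20))      # non-square target grid, 240 tokens
mask = build_padding_mask(torch.tensor([240, 192, 256]), max_len=256)
```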
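For the dataset-wrapper item, a rough illustration of the aspect-ratio-preserving size constraint: scale the image so its patch count fits a sampled seq-len budget, then round down to whole patches. `fit_to_seq_len` is a hypothetical helper; the PR's wrapper additionally randomizes the sampled seq-len and batch size.

```python
import math


def fit_to_seq_len(height: int, width: int, seq_len: int, patch_size: int = 16):
    """Return (new_h, new_w) that preserves aspect ratio and uses at most seq_len patches."""
    ph, pw = math.ceil(height / patch_size), math.ceil(width / patch_size)
    scale = min(1.0, math.sqrt(seq_len / (ph * pw)))  # never upscale in this sketch
    new_ph, new_pw = max(1, int(ph * scale)), max(1, int(pw * scale))
    return new_ph * patch_size, new_pw * patch_size


# A 480x640 image with a budget of 256 patches of size 16 becomes 208x288 (13x18 = 234 patches).
print(fit_to_seq_len(480, 640, seq_len=256))  # (208, 288)
```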
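The third item refers to resampling the patch-embedding kernel on the fly. The PR does not show its method here; below is only a naive bicubic-interpolation stand-in to illustrate the general idea of a torch-only resample cheap enough to call inside forward().

```python
import torch
import torch.nn.functional as F


def resample_patch_embed_naive(weight: torch.Tensor, new_patch: tuple) -> torch.Tensor:
    """Interpolate a (embed_dim, in_chans, ph, pw) conv kernel to a new patch size."""
    return F.interpolate(weight, size=new_patch, mode='bicubic',
                         align_corners=False, antialias=True)


w16 = torch.randn(384, 3, 16, 16)               # pretrained 16x16 patch embed kernel
w8 = resample_patch_embed_naive(w16, (8, 8))    # reuse the same weights at patch size 8
```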

Not tested / not completed:

  • Distributed training not tested; the dataset wrapper needs more verification
  • Dataset wrapper for iterable datasets (wds, tfds, iterable hfds) needs to be added
  • More model definitions
  • Weight loading / translation for existing ViTs
  • SigLIP-2 NaFlex vision encoder weight port
  • Integration of NaFlex data pipeline components into OpenCLIP
  • Add randomization of the patch_size along with seq_len

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@rwightman marked this pull request as draft April 8, 2025 04:39