Omnivorous modeling for visual modalities

This repository contains pretrained PyTorch models and inference examples for the following papers (a minimal loading sketch follows the list):

Omnivore: A Single Model for Many Visual Modalities, CVPR 2022 [bib]
@inproceedings{girdhar2022omnivore,
  title={{Omnivore: A Single Model for Many Visual Modalities}},
  author={Girdhar, Rohit and Singh, Mannat and Ravi, Nikhila and van der Maaten, Laurens and Joulin, Armand and Misra, Ishan},
  booktitle={CVPR},
  year={2022}
}
OmniMAE: Single Model Masked Pretraining on Images and Videos, arXiv 2022 [bib]
@article{girdhar2022omnimae,
  title={OmniMAE: Single Model Masked Pretraining on Images and Videos},
  author={Girdhar, Rohit and El-Nouby, Alaaeldin and Singh, Mannat and Alwala, Kalyan Vasudev and Joulin, Armand and Misra, Ishan},
  journal={arXiv preprint arXiv:2206.08356},
  year={2022}
}
OmniVision: our training pipeline supporting multi-modal vision research.
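
As a quick orientation to the inference examples, here is a minimal sketch of loading a pretrained Omnivore model through torch.hub and classifying a single image. The hub entry point name (omnivore_swinB) and the input_type forward argument follow the repository's published inference examples; verify both against the example notebooks before relying on them.

```python
# Minimal inference sketch (assumes the "omnivore_swinB" torch.hub entry
# point and the input_type forward argument from this repository's examples).
import torch

# Load the pretrained model from this repository's hubconf.
model = torch.hub.load("facebookresearch/omnivore:main", model="omnivore_swinB")
model.eval()

# Omnivore treats an image as a single-frame video: shape (B, C, T, H, W).
image = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed RGB image
video_like = image.unsqueeze(2)       # -> (1, 3, 1, 224, 224)

with torch.no_grad():
    logits = model(video_like, input_type="image")  # ImageNet-1k logits

print(logits.shape)
```

The same forward pass accepts videos and single-view 3D (RGBD) inputs by changing input_type, which is the point of the single-model design.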

Contributing

We welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more information.

License

Omnivore is released under the CC-BY-NC 4.0 license; see LICENSE for additional details. However, the Swin Transformer implementation is additionally licensed under the Apache 2.0 license (see NOTICE for details).
