A list of projects relying on Iterative tools to achieve awesomeness.
Missing something awesome? Anyone is welcome to submit projects to this list.
- dvthis: Utility functions and project templates for DVC pipelines using R.
- nvim-dvc: Neovim plugin for DVC.
- COVID Genomics/Airflow-DVC: Airflow extension for DVC.
- COVID Genomics/dvc-fs: High-level abstraction for DVC file manipulation (listing & I/O) with basic support for PyFilesystem2.
- zincware/ZnTrack: Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.
- zincware/dask4dvc: Provides a DVC-like CLI that combines DVC with Dask Distributed, making it easier to use with HPC managers like Slurm.
- DVC Streamlit Example: Build a custom web UI with DVC and Streamlit for visually tracking & comparing model performance during R&D (adapted from TensorFlow's transfer learning tutorial).
- DVC Pipelines and Experiments Tutorial: Build maintainable Machine Learning pipelines using DVC.
- CD4ML Example: Example DVC setup with AWS S3 remote storage & GitLab CI/CD.
- DVC with PyCaret & FastAPI: End-to-end demo of data & model tracking (DVC), remote storage (Azure), efficient experimentation (PyCaret) & model deployment (FastAPI).
- Example-get-started: Train a sklearn random forest classifier for StackOverflow question tagging.
- Example-DVC-experiments: Train a TensorFlow CNN classifier for Fashion-MNIST data; used in https://dvc.org/doc/start/experiments.
- Example-versioning: Used in https://dvc.org/doc/use-cases/versioning-data-and-model-files/tutorial.
- DVC-Checkpoints-MNIST: A showcase of different ways to use checkpoints. Trains a PyTorch classifier on a CSV MNIST dataset.
- Scalable and Distributed ML Workflows with DVC + Ray on AWS: Tutorial on integrating DVC with Ray to create automated, scalable, and distributed ML pipelines.
- LensKit/lk-demo-experiment: Demo DVC experiment pipeline (DAG) using multiple public datasets, preprocessing & training, and Jupyter notebooks.
- ModelOriented/MAIR: Monitoring impact of AI regulations with a DVC pipeline.
- Kaggle-Titanic-DVC: Survival analysis DVC experiment.
- VQA-With-Multimodal-Transformers: Visual Question Answering task on the DAQUAR Dataset using multimodal transformer models with an experiment pipeline tracked in DVC Studio.
- pinellolab/pyrovelocity: Pyro-Velocity is a Bayesian, generative, and multivariate RNA velocity model to estimate uncertainty in predictions of future cell states from minimal models approximating transcript splicing dynamics.
- Barrak, A., Eghan, E.E. and Adams, B. On the Co-evolution of ML Pipelines and Source Code - Empirical Study of DVC Projects, in Proceedings of the 28th IEEE International Conference on Software Analysis, Evolution, and Reengineering, SANER 2021. Hawaii, USA.
- Barreto Simedo Pacheco, L., Rahman, M., Rabbi, F., Fathollahzadeh, P., Abdellatif, A., Shihab, E., Chen, T.P., Yang, J., and Zou, Y. DVC in Open Source ML-development: The Action and the Reaction, in Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering - Software Engineering for AI (CAIN '24). Lisbon, Portugal.