A generative model is a type of machine learning model that can generate new data that is similar to the data it has been trained on [1, 2]. This tutorial will give you a short introduction to diffusion models, which are a specific type of generative model.
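A diffusion model is based on the idea of incrementally inverting a forward process of adding noise to data. In this forward process, the input data is gradually corrupted with noise over many small steps; the model then learns to undo these steps one at a time.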
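The forward noising process can be illustrated with a few lines of NumPy. This is a minimal sketch, assuming the closed-form Gaussian forward process of denoising diffusion probabilistic models (Ho et al., see Footnotes) with a linear variance schedule. The function names, the schedule values, and the random stand-in image are illustrative assumptions and are not taken from the tutorial notebook.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, increasing linearly (assumed schedule)."""
    return np.linspace(beta_start, beta_end, num_steps)


def q_sample(x0, t, alphas_cumprod, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I).

    `t` is a 0-based step index into `alphas_cumprod`.
    Returns the noised image and the noise that was added.
    """
    noise = rng.standard_normal(x0.shape)
    abar_t = alphas_cumprod[t]
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * noise
    return x_t, noise


num_steps = 1000
betas = linear_beta_schedule(num_steps)
# abar_t: cumulative product of (1 - beta) up to and including step t
alphas_cumprod = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(32, 32))  # stand-in for a real image

x_early, _ = q_sample(x0, 10, alphas_cumprod, rng)
x_late, _ = q_sample(x0, num_steps - 1, alphas_cumprod, rng)

# Early steps stay close to the input; the last step is almost pure noise.
print("correlation with x0, early step:", np.corrcoef(x0.ravel(), x_early.ravel())[0, 1])
print("correlation with x0, last step: ", np.corrcoef(x0.ravel(), x_late.ravel())[0, 1])
```

A trained diffusion model runs this chain in reverse: starting from pure noise, it removes a small amount of noise at each step until an image remains.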
We will use denoising-diffusion-pytorch, a diffusion model library written by lucidrains. For training, we will use lightning.
First, clone and cd into the repository:
git clone https://github.com/weigertlab/diffusion_model_tutorial.git
cd diffusion_model_tutorial
If using BARD, simply open the notebook in VSCode as usual and choose the kernel diffusion.
Otherwise, all setup, including creating the environment and downloading the data, can be done by running this command:
source setup.sh
Generative models are typically trained on a dataset of (unlabeled) images. In this tutorial we will use two example datasets, showing images of
- flywing membrane, or
- dual color zebrafish retina.
Please choose one of the two datasets to train your own models. If you want to train a model on your own data, you can skip these steps. Note that the 2D images are provided as an npz file, which is a compressed NumPy archive. For your custom data, you could also use a folder of TIFF files.
1. Flywing membrane [3]
2. Dual color zebrafish retina [4]
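Since the images come as an npz file, it can help to inspect the archive before training. The following is a minimal sketch, assuming the archive stores the image stack under a single array; the key name `images`, the helper `load_images`, and the rescaling to [0, 1] are illustrative assumptions, so check `archive.files` for the names actually used in the downloaded data. The demo writes a tiny synthetic archive so that the snippet runs on its own.

```python
import os
import tempfile

import numpy as np


def load_images(npz_path, key=None):
    """Load an image stack from an .npz archive and rescale it to [0, 1].

    `key` is the array name inside the archive; if None, the first
    array listed in the archive is used (an assumption for this sketch).
    """
    with np.load(npz_path) as archive:
        name = key if key is not None else archive.files[0]
        images = archive[name].astype(np.float32)
    lo, hi = images.min(), images.max()
    # Guard against division by zero for constant images
    return (images - lo) / max(hi - lo, 1e-8)


# Demo with a synthetic archive (stand-in for the downloaded dataset)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.npz")
    fake_stack = np.random.default_rng(0).integers(0, 255, size=(4, 16, 16))
    np.savez_compressed(path, images=fake_stack)
    images = load_images(path, key="images")

print(images.shape, images.dtype)
```

For a folder of TIFF files, the same idea applies: read each file into an array, stack the arrays, and rescale them in the same way.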
See diffusion.ipynb for how to train a diffusion model on one of the two datasets.
Footnotes
1. Ho, Jonathan, Ajay Jain, and Pieter Abbeel. "Denoising diffusion probabilistic models." Advances in neural information processing systems 33 (2020): 6840-6851.
2. Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-Based Generative Modeling through Stochastic Differential Equations. arXiv preprint arXiv:2011.13456, 2020.
3. Prakash, M., Buchholz, T.-O., Schmidt, D., Krull, A., & Jug, F. (2020). Flywing (noise 0) dataset for microscopy image denoising and segmentation benchmark as used in DenoiSeg paper. Zenodo dataset.
4. Martin Weigert, Uwe Schmidt, Tobias Boothe, Andreas Müller, Alexandr Dibrov, Akanksha Jain, Benjamin Wilhelm, Deborah Schmidt, Coleman Broaddus, Sian Culley, Mauricio Rocha-Martins, Fabián Segovia-Miranda, Caren Norden, Ricardo Henriques, Marino Zerial, Michele Solimena, Jochen Rink, Pavel Tomancak, Loic Royer, Florian Jug, Eugene W. Myers. Content Aware Image Restoration: Pushing the Limits of Fluorescence Microscopy. Dataset.



