Diffusion

A basic diffusion model built from scratch: a Denoising Diffusion Probabilistic Models (DDPM) pipeline. Built on top of https://github.com/cloneofsimo/minDiffusion/tree/master. It supports DDPM both with and without additional conditions (e.g. text information).
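For readers new to DDPM, the forward (noising) process can be sketched in plain Python. This is a minimal illustration, not code from this repository: the linear schedule endpoints (1e-4 and 0.02) are the values commonly used with 1000 steps and are an assumption here, and the function names are hypothetical.

```python
import math
import random


def linear_beta_schedule(beta_start, beta_end, n_steps):
    """Return n_steps noise variances evenly spaced from beta_start to beta_end."""
    if n_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (n_steps - 1)
    return [beta_start + i * step for i in range(n_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t is the running product of (1 - beta_s) for s <= t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, alpha_bar_t, rng):
    """Draw x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    signal = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + noise_scale * e for x, e in zip(x0, eps)]
    return x_t, eps


# Assumed schedule: 1000 steps, betas from 1e-4 to 0.02.
betas = linear_beta_schedule(1e-4, 0.02, 1000)
alpha_bars = cumulative_alphas(betas)

# Noise a toy three-pixel "image" halfway through the schedule.
rng = random.Random(0)
x_t, eps = q_sample([0.5, -0.25, 1.0], alpha_bars[499], rng)
```

The model is trained to recover eps from x_t; by the last step alpha_bar is close to zero, so x_t is almost pure Gaussian noise.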

Installation

  1. Create a conda environment.
conda create --name diffusion python=3.9 -y
conda activate diffusion
  2. Install PyTorch.
pip install torch torchvision torchaudio
  3. Install additional libraries.
pip install tqdm openai-clip

Data Preparation

  1. CIFAR10

No preparation needed.

  2. Customized dataset

Place your own images in ./data and write a custom data loader for them.

Train

Run the command below. You can select the DDPM variant (DDPM with NaiveUnet or DDPM with ContextUnet) and set the hyperparameters inside the script.

python train.py
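For orientation, one DDPM training step can be sketched as follows. This is not the contents of train.py: TinyEpsModel is a hypothetical stand-in for NaiveUnet, and the schedule, learning rate, and random placeholder batch are assumptions.

```python
import torch
import torch.nn as nn


def ddpm_loss(eps_model, x0, alpha_bars):
    """Noise-prediction MSE for one batch (the simplified DDPM objective)."""
    batch = x0.shape[0]
    n_steps = alpha_bars.shape[0]
    t = torch.randint(0, n_steps, (batch,), device=x0.device)  # random timestep per sample
    eps = torch.randn_like(x0)
    a_bar = alpha_bars[t].view(batch, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps  # forward-noised input
    t_frac = t.float() / n_steps  # timestep scaled to [0, 1)
    return nn.functional.mse_loss(eps_model(x_t, t_frac), eps)


class TinyEpsModel(nn.Module):
    """Hypothetical stand-in for NaiveUnet: one conv layer, for illustration only."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(4, 3, kernel_size=3, padding=1)

    def forward(self, x, t_frac):
        # Broadcast the timestep to an extra input channel.
        t_map = t_frac.view(-1, 1, 1, 1).expand(-1, 1, x.shape[2], x.shape[3])
        return self.conv(torch.cat([x, t_map], dim=1))


betas = torch.linspace(1e-4, 0.02, 1000)  # assumed linear schedule
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

model = TinyEpsModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

x0 = torch.randn(8, 3, 32, 32)  # placeholder batch standing in for CIFAR10 images
loss = ddpm_loss(model, x0, alpha_bars)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

A conditional variant would pass the class or text condition to the model as an extra argument alongside the timestep.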

Inference

Run the command below. You can select the model (NaiveUnet or ContextUnet) and set the hyperparameters inside the script.

python inference.py
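The sampling loop that DDPM inference performs can be sketched as below. It is an illustration under assumptions rather than the code in inference.py: the function name ddpm_sample, the choice of beta_t as the reverse-step variance, and the zero-output placeholder model are all hypothetical.

```python
import torch


@torch.no_grad()
def ddpm_sample(eps_model, shape, betas):
    """Ancestral sampling: start from Gaussian noise and denoise step by step."""
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    n_steps = betas.shape[0]

    x = torch.randn(shape)
    for t in reversed(range(n_steps)):
        t_frac = torch.full((shape[0],), t / n_steps)
        eps_hat = eps_model(x, t_frac)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / (1.0 - alpha_bars[t]).sqrt() * eps_hat) / alphas[t].sqrt()
        # No noise is added on the final step.
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise
    return x


betas = torch.linspace(1e-4, 0.02, 1000)  # assumed linear schedule, 1000 steps
placeholder_model = lambda x, t_frac: torch.zeros_like(x)  # swap in a trained network
samples = ddpm_sample(placeholder_model, (4, 3, 32, 32), betas)
```

With a trained network in place of placeholder_model, samples would hold four generated 32x32 RGB images.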

Results

  1. DDPM with 1000 steps on CIFAR10, without conditions, trained for 100 epochs.

Generated images

  2. DDPM with 1000 steps on CIFAR10, with one-hot encoded class conditions, trained for 100 epochs. The rows, from first to last, are conditioned on 'automobile', 'cat', 'dog', and 'ship', respectively.

Generated images

  3. DDPM with 1000 steps on CIFAR10, with text embedding class conditions, trained for 100 epochs. The rows, from first to last, are conditioned on 'automobile', 'cat', 'dog', and 'ship', respectively.

Generated images
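As an illustration of the one-hot class conditions used for the second result, the sketch below builds one conditioning vector per generated image, grouped by row. The helper name one_hot_condition and the eight-images-per-row layout are assumptions; the class order is the standard CIFAR10 label order.

```python
import torch
import torch.nn.functional as F

# Standard CIFAR10 label order (index 0 to 9).
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def one_hot_condition(class_names, n_per_class):
    """Return a (len(class_names) * n_per_class, 10) float tensor of one-hot rows."""
    indices = [CIFAR10_CLASSES.index(name) for name in class_names]
    labels = torch.tensor(indices).repeat_interleave(n_per_class)
    return F.one_hot(labels, num_classes=len(CIFAR10_CLASSES)).float()


# One row of images per class: automobile, cat, dog, ship.
context = one_hot_condition(["automobile", "cat", "dog", "ship"], n_per_class=8)
```

Here context has shape (32, 10); the first eight rows select 'automobile', the next eight 'cat', and so on. How the vector is fed into ContextUnet is defined in the repository code and is not shown here.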

Checkpoints

  1. DDPM without conditions (NaiveUnet), trained for 100 epochs on CIFAR10.

https://drive.google.com/file/d/1e95Rkgb1DtvFPuyynMipOPprMcgrp99j/view?usp=sharing

  2. DDPM with one-hot encoded class conditions (ContextUnet), trained for 100 epochs on CIFAR10.

https://drive.google.com/file/d/1tK6a1mOisSM-holI8Ou3AFlX5MTzCr7i/view?usp=sharing

  3. DDPM with text embedding class conditions (ContextUnet), trained for 100 epochs on CIFAR10.

https://drive.google.com/file/d/1OZwl_cjNqretPH-Azj3sMoYC4jFNt_Jc/view?usp=sharing
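Once downloaded, a checkpoint could be loaded along the lines of the sketch below. It assumes each file stores a plain PyTorch state_dict, which is not confirmed here; load_weights is a hypothetical helper, and the model passed in must be built with the same architecture and hyperparameters that were used for training.

```python
import torch
import torch.nn as nn


def load_weights(model: nn.Module, checkpoint_path: str, device: str = "cpu") -> nn.Module:
    """Load a saved state_dict into `model` and switch it to eval mode."""
    # Assumption: the checkpoint file holds a state_dict, not a pickled model object.
    state_dict = torch.load(checkpoint_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()
    return model
```

If load_state_dict reports missing or unexpected keys, the checkpoint was most likely saved from a different wrapper or model configuration than the one being constructed.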