Image example

Training instructions

  1. Download and unpack blurred ImageNet from the official website.
export IMAGENET_DIR=~/flow_matching/examples/image/data/
export IMAGENET_RES=64
tar -xf ~/Downloads/train_blurred.tar.gz -C $IMAGENET_DIR
  2. Downsample ImageNet to the desired resolution.
cd ~/
git clone git@github.com:PatrykChrabaszcz/Imagenet32_Scripts.git
python Imagenet32_Scripts/image_resizer_imagent.py -i ${IMAGENET_DIR}train_blurred -o ${IMAGENET_DIR}train_blurred_$IMAGENET_RES -s $IMAGENET_RES -a box -r -j 10
  3. Set up the virtual environment by following the steps in the repository's README.md. Then activate it and install this example's requirements.
conda activate flow_matching

cd examples/image
pip install -r requirements.txt
  4. [Optional] Test-run training locally. A test run executes one step of training followed by one step of evaluation.
python train.py --data_path=${IMAGENET_DIR}train_blurred_$IMAGENET_RES/box/ --test_run
  5. Launch training on a SLURM cluster.
python submitit_train.py --data_path=${IMAGENET_DIR}train_blurred_$IMAGENET_RES/box/
  6. Evaluate the model using the --eval_only flag. The evaluation script generates snapshots under the /snapshots folder. Specify the --compute_fid flag to also compute the FID with respect to the training set. Pass your most recent checkpoint to --resume. The results are printed to log.txt.
python submitit_train.py --data_path=${IMAGENET_DIR}train_blurred_$IMAGENET_RES/box/ --resume=./output_dir/checkpoint-899.pth --compute_fid --eval_only
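
The evaluation step asks for the most recent checkpoint. As a minimal sketch, assuming checkpoints are written as checkpoint-<epoch>.pth (the pattern seen in the --resume example above), the following picks the file with the highest epoch number. The helper name latest_checkpoint is hypothetical and is not part of this repository.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: checkpoint files are named checkpoint-<epoch>.pth,
# matching the --resume example in the instructions above.
_CKPT_RE = re.compile(r"checkpoint-(\d+)\.pth")


def latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the checkpoint with the highest numeric suffix, or None."""
    best_path = None
    best_epoch = -1
    for path in Path(output_dir).glob("checkpoint-*.pth"):
        match = _CKPT_RE.fullmatch(path.name)
        if match is None:
            continue  # skip files such as checkpoint-best.pth
        epoch = int(match.group(1))
        # Compare as integers so that 899 ranks above 99.
        if epoch > best_epoch:
            best_epoch = epoch
            best_path = path
    return best_path
```

Calling latest_checkpoint("./output_dir") returns a path that can be passed to --resume, or None if the directory holds no matching files. Comparing the suffix as an integer avoids the lexicographic ordering that would rank checkpoint-99.pth above checkpoint-899.pth.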

Results

Data: CIFAR-10
Model type: Unconditional UNet
Epochs: 1800
FID: 2.07
Command:
python submitit_train.py \
--dataset=cifar10 \
--batch_size=64 \
--nodes=1 \
--accum_iter=1 \
--eval_frequency=100 \
--epochs=3000 \
--class_drop_prob=1.0 \
--cfg_scale=0.0 \
--compute_fid \
--ode_method heun2 \
--ode_options '{"nfe": 50}' \
--use_ema \
--edm_schedule \
--skewed_timesteps

Data: ImageNet32 (Blurred)
Model type: Class conditional UNet
Epochs: 900
FID: 1.14
Command:
export IMAGENET_RES=32
python submitit_train.py \
--data_path=${IMAGENET_DIR}train_blurred_$IMAGENET_RES/box/ \
--batch_size=32 \
--nodes=8 \
--accum_iter=1 \
--eval_frequency=100 \
--decay_lr \
--compute_fid \
--ode_method dopri5 \
--ode_options '{"atol": 1e-5, "rtol":1e-5}'

Data: ImageNet64 (Blurred)
Model type: Class conditional UNet
Epochs: 900
FID: 1.64
Command:
export IMAGENET_RES=64
python submitit_train.py \
--data_path=${IMAGENET_DIR}train_blurred_$IMAGENET_RES/box/ \
--batch_size=32 \
--nodes=8 \
--accum_iter=1 \
--eval_frequency=100 \
--decay_lr \
--compute_fid \
--ode_method dopri5 \
--ode_options '{"atol": 1e-5, "rtol":1e-5}'

Data: CIFAR-10 (Discrete Flow)
Model type: Unconditional UNet
Epochs: 2500
FID: 3.58
Command:
python submitit_train.py \
--dataset=cifar10 \
--nodes=1 \
--discrete_flow_matching \
--batch_size=32 \
--accum_iter=1 \
--cfg_scale=0.0 \
--use_ema \
--epochs=3000 \
--class_drop_prob=1.0 \
--compute_fid \
--sym_func
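
The --ode_options values in the commands above are JSON objects wrapped in single quotes. A minimal sketch of how such a string decodes in Python, assuming the training script reads the flag with json.loads (an assumption that this README does not confirm); parse_ode_options is a hypothetical helper used only for illustration.

```python
import json


def parse_ode_options(raw: str) -> dict:
    """Decode a JSON object string such as the --ode_options values above.

    Hypothetical helper: how train.py actually parses the flag is assumed here.
    """
    options = json.loads(raw)
    if not isinstance(options, dict):
        raise ValueError("ODE options must be a JSON object, e.g. '{\"nfe\": 50}'")
    return options


# Tolerances as passed alongside --ode_method dopri5.
adaptive = parse_ode_options('{"atol": 1e-5, "rtol":1e-5}')
# Option as passed alongside --ode_method heun2.
fixed = parse_ode_options('{"nfe": 50}')

print(adaptive)
print(fixed)
```

The single quotes matter on the shell side: they stop the shell from stripping the inner double quotes, so the program receives valid JSON. Values such as 1e-5 decode to floats and 50 decodes to an int.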

Acknowledgements

This example partially uses code from:

License

The majority of the code in this example is licensed under CC-BY-NC; however, portions of the project are available under separate license terms:

  • The UNet model is under the MIT license.
  • The distributed computing and grad scaler code is under the MIT license.

Citations

Deng, Jia, et al. "ImageNet: A large-scale hierarchical image database." 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009.

Karras, Tero, et al. "Elucidating the design space of diffusion-based generative models." Advances in Neural Information Processing Systems 35 (2022): 26565-26577.

Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional networks for biomedical image segmentation." Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III. Springer International Publishing, 2015.