## Time Cost and GPU Memory Usage

### Inference Time Cost and GPU Memory Usage
| `output_size` | latent size | `autoencoder_sliding_window_infer_size` | `autoencoder_tp_num_splits` | Peak Memory | VAE Time | DM Time (DDPM) | DM Time (RFlow) |
|---------------|:-----------:|:---------------------------------------:|:---------------------------:|:-----------:|:--------:|:--------------:|:---------------:|
| [256x256x128](./configs/config_infer_16g_256x256x128.json) | 4x64x64x32 | >=[64,64,32], not used | 2 | 15.0G | 1s | 57s | 2s |
| [256x256x256](./configs/config_infer_16g_256x256x256.json) | 4x64x64x64 | [48,48,64], 4 patches | 4 | 15.4G | 5s | 81s | 3s |
| [512x512x128](./configs/config_infer_16g_512x512x128.json) | 4x128x128x32 | [64,64,32], 9 patches | 2 | 15.7G | 8s | 138s | 5s |
| | | | | | | | |
| [256x256x256](./configs/config_infer_24g_256x256x256.json) | 4x64x64x64 | >=[64,64,64], not used | 4 | 22.7G | 2s | 81s | 3s |
| [512x512x128](./configs/config_infer_24g_512x512x128.json) | 4x128x128x32 | [80,80,32], 4 patches | 2 | 21.0G | 6s | 138s | 5s |
| [512x512x512](./configs/config_infer_24g_512x512x512.json) | 4x128x128x128 | [64,64,48], 36 patches | 2 | 22.8G | 29s | 569s | 19s |
| | | | | | | | |
| [512x512x512](./configs/config_infer_32g_512x512x512.json) | 4x128x128x128 | [80,80,48], 16 patches | 4 | 28.4G | 30s | 569s | 19s |
| | | | | | | | |
| [512x512x128](./configs/config_infer_80g_512x512x128.json) | 4x128x128x32 | >=[128,128,32], not used | 4 | 37.7G | 127s | 138s | 5s |
| [512x512x512](./configs/config_infer_80g_512x512x512.json) | 4x128x128x128 | [80,80,80], 8 patches | 2 | 45.3G | 32s | 569s | 19s |
| [512x512x768](./configs/config_infer_80g_512x512x768.json) | 4x128x128x192 | [80,80,112], 8 patches | 4 | 56.2G | 50s | 904s | 30s |

**Table 3:** Inference Time Cost and GPU Memory Usage. `DM Time` refers to the time required for diffusion model inference, reported separately for the DDPM and RFlow schedulers. `VAE Time` refers to the time required for VAE decoder inference. The total inference time is the sum of `DM Time` and `VAE Time`. The experiments were conducted on an A100 80GB GPU.

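The latent sizes in Table 3 follow directly from the output size: each spatial dimension shrinks by a factor of 4 and the latent has 4 channels. A minimal sketch of that relationship (the helper name, and the factor-of-4 and 4-channel constants, are inferred from the table rather than taken from the MAISI API):

```python
def latent_size(output_size, latent_channels=4, downsample_factor=4):
    """Map an image output size (H, W, D) to a VAE latent size (C, H/4, W/4, D/4).

    Illustrative helper, not part of MAISI: assumes the factor-of-4 spatial
    compression and 4 latent channels shown in Table 3.
    """
    return (latent_channels,) + tuple(s // downsample_factor for s in output_size)

# e.g. a 512x512x768 volume corresponds to a 4x128x128x192 latent
print(latent_size((512, 512, 768)))
```

This also explains why VAE decoding dominates only for the no-sliding-window rows: the full latent is decoded in one pass there, at the cost of higher peak memory.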
To run the inference script with MAISI DDPM, please set `"num_inference_steps": 1000` in `./configs/config_infer.json`, and run:
```bash
export MONAI_DATA_DIRECTORY=<dir_you_will_download_data>
python -m scripts.inference -c ./configs/config_maisi3d-ddpm.json -i ./configs/config_infer.json -e ./configs/environment_maisi3d-ddpm.json --random-seed 0 --version maisi3d-ddpm
```

To run the inference script with MAISI RFlow, please set `"num_inference_steps": 30` in `./configs/config_infer.json`, and run:
```bash
export MONAI_DATA_DIRECTORY=<dir_you_will_download_data>
python -m scripts.inference -c ./configs/config_maisi3d-rflow.json -i ./configs/config_infer.json -e ./configs/environment_maisi3d-rflow.json --random-seed 0 --version maisi3d-rflow
```
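Since the DDPM and RFlow runs share `./configs/config_infer.json` and differ in the `"num_inference_steps"` value, it can be convenient to patch that key programmatically before launching. A small sketch (the helper name is hypothetical; only the `"num_inference_steps"` key and its 1000/30 values come from the instructions above):

```python
import json

def set_num_inference_steps(config_path, steps):
    """Rewrite "num_inference_steps" in an inference config JSON in place.

    Illustrative helper: per the instructions above, use steps=1000 for
    MAISI DDPM and steps=30 for MAISI RFlow.
    """
    with open(config_path) as f:
        cfg = json.load(f)
    cfg["num_inference_steps"] = steps
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=4)
```

For example, call `set_num_inference_steps("./configs/config_infer.json", 30)` before the RFlow command.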

Please refer to [maisi_inference_tutorial.ipynb](maisi_inference_tutorial.ipynb) for a tutorial on MAISI model inference.