
Commit f43d3d5

grammar

Signed-off-by: Can-Zhao <[email protected]>

1 parent 62e688c

File tree

1 file changed: +6 -6 lines changed


generation/maisi/README.md

Lines changed: 6 additions & 6 deletions
@@ -74,14 +74,14 @@ We retrained several state-of-the-art diffusion model-based methods using our da
 | [512x512x512](./configs/config_infer_80g_512x512x512.json) |4x128x128x128| [80,80,80], 8 patches | 2 | 44G | 569s | 30s |
 | [512x512x768](./configs/config_infer_24g_512x512x768.json) |4x128x128x192| [80,80,112], 8 patches | 4 | 55G | 904s | 48s |

-**Table 3:** Inference Time Cost and GPU Memory Usage. `DM Time` refers to the time cost of diffusion model inference. `VAE Time` refers to the time cost of VAE decoder inference. The total inference time is the `DM Time` plus `VAE Time`. When `autoencoder_sliding_window_infer_size` is equal or larger than the latent feature size, sliding window will not be used,
-and the time and memory cost remain the same. The experiment was tested on A100 80G GPU.
+**Table 3:** Inference Time Cost and GPU Memory Usage. `DM Time` refers to the time required for diffusion model inference. `VAE Time` refers to the time required for VAE decoder inference. The total inference time is the sum of `DM Time` and `VAE Time`. The experiment was conducted on an A100 80G GPU.

+During inference, the peak GPU memory usage occurs during the autoencoder's decoding of latent features.
+To reduce GPU memory usage, we can either increase `autoencoder_tp_num_splits` or reduce `autoencoder_sliding_window_infer_size`.
+Increasing `autoencoder_tp_num_splits` has a smaller impact on the generated image quality, while reducing `autoencoder_sliding_window_infer_size` may introduce stitching artifacts and has a larger impact on the generated image quality.
+
+When `autoencoder_sliding_window_infer_size` is equal to or larger than the latent feature size, the sliding window will not be used, and the time and memory costs remain the same.

-During inference, the peak GPU memory usage happens during the autoencoder decoding latent features.
-To reduce GPU memory usage, we can either increasing `autoencoder_tp_num_splits` or reduce `autoencoder_sliding_window_infer_size`.
-Increasing `autoencoder_tp_num_splits` has smaller impact on the generated image quality.
-Yet reducing `autoencoder_sliding_window_infer_size` may introduce stitching artifact and has larger impact on the generated image quality.

 ### Training GPU Memory Usage
 VAE is trained on patches and thus can be trained with 16G GPU if patch size is set to be small like [64,64,64].
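
The added paragraphs describe two knobs for trading GPU memory against image quality: `autoencoder_tp_num_splits` and `autoencoder_sliding_window_infer_size`. A minimal sketch of adjusting them is shown below; the parameter names come from the README text, but the config file layout, output path, and the specific values are illustrative assumptions, not taken from the repo.

```python
import json

# Sketch only: lower peak GPU memory during VAE decoding by raising
# autoencoder_tp_num_splits and/or shrinking autoencoder_sliding_window_infer_size.
# File path, key placement, and values are assumptions for illustration.
with open("configs/config_infer_80g_512x512x512.json") as f:
    cfg = json.load(f)

# Larger tensor-parallel split count -> less memory, smaller effect on image quality.
cfg["autoencoder_tp_num_splits"] = 16

# Smaller sliding-window size -> less memory, but may introduce stitching artifacts.
cfg["autoencoder_sliding_window_infer_size"] = [64, 64, 64]

# Save a modified copy (hypothetical filename) to use for inference.
with open("configs/config_infer_custom.json", "w") as f:
    json.dump(cfg, f, indent=4)
```

Note that, per the added text, the sliding window only takes effect when its size is smaller than the latent feature size; otherwise time and memory costs are unchanged.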
