Inject multi-diffusion into your UNets/Transformers with two lines of code.
from multidiffusion import enable_2d_multidiffusion
enable_2d_multidiffusion(pipeline.unet) # or `pipeline.transformer`
This is a work in progress.
See multidiffusion.github.io for the original paper. For our purposes, it is:
- a way to reduce memory consumption, sometimes reduce runtime, and usually improve image composition when working with diffusion models at very large resolutions, and
- a way to generate images at arbitrary resolutions using diffusion models that natively support only fixed resolutions (the core idea is sketched below).
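At its core, multi-diffusion splits each denoising step into overlapping tiles, denoises each tile independently, and averages the predictions where tiles overlap. The following is only a rough sketch of that idea, not this library's actual implementation; `model`, `tile`, and `stride` are placeholders:

```python
import torch

def multidiffusion_step(model, x, t, tile=64, stride=48):
    # Run one denoising step over overlapping tiles and fuse by averaging.
    # Assumes H and W fit at least one tile and that (H - tile) and
    # (W - tile) are multiples of `stride`, so the tiles cover the latent.
    B, C, H, W = x.shape
    fused = torch.zeros_like(x)
    hits = torch.zeros_like(x)
    for y in range(0, H - tile + 1, stride):
        for x0 in range(0, W - tile + 1, stride):
            view = x[:, :, y:y + tile, x0:x0 + tile]
            pred = model(view, t)                       # denoise a single tile
            fused[:, :, y:y + tile, x0:x0 + tile] += pred
            hits[:, :, y:y + tile, x0:x0 + tile] += 1
    return fused / hits                                 # average overlapping predictions
```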
The `enable_2d_multidiffusion` method will work on any UNet or 2D Transformer that accepts 4-dimensional Tensor input (B×C×H×W) and returns the same, in either latent space or pixel space.
At present it will not work with FLUX, as that uses packed Tensor input/output.
See example-sdxl.py for a complete example generating the following collage using SDXL 1.0.
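If you want a quick starting point before opening that file, a minimal setup might look like the sketch below; the model ID, prompt, and resolution are illustrative and not taken from example-sdxl.py:

```python
import torch
from diffusers import StableDiffusionXLPipeline
from multidiffusion import enable_2d_multidiffusion

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Patch the denoiser so each step runs over overlapping tiles.
enable_2d_multidiffusion(pipeline.unet)

# Generate at a resolution well beyond SDXL's native 1024x1024.
image = pipeline(
    "a panoramic photo of a mountain lake at sunrise",
    width=2048,
    height=1024,
    num_inference_steps=30,
).images[0]
image.save("panorama.png")
```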
Planned features:

- Wrap-around (tiling images)
- Tile batching (sacrifice some memory savings for faster generation)
- FLUX compatibility
- Animation compatibility (`enable_3d_multidiffusion`)
- Include an example of using Multi-Diffusion for regional prompting