DEVAIEXP/mod-control-tile-upscaler-sdxl

MoD ControlNet Tile Upscaler for SDXL🤗

This project implements the 📜 MoD (Mixture-of-Diffusers) tiled diffusion technique and combines it with SDXL's ControlNet Tile process.

🚀 Controlnet Union Power! Check out the model: Controlnet Union

🎨 RealVisXL V5.0 for Stunning Visuals! Explore it here: RealVisXL

If you like the project, please give me a star! ⭐

About

The MoD ControlNet Tile Upscale SDXL is an advanced pipeline that leverages ControlNet Tile and Mixture-of-Diffusers techniques, integrating tile diffusion directly into the latent space denoising process. Designed to overcome the limitations of conventional pixel-space tile processing, this pipeline delivers Super Resolution (SR) upscaling for higher-quality images, reduced processing time, and greater adaptability.

By processing tiles in the latent space, the pipeline ensures smoother transitions, eliminates visible seams, and optimizes resource usage. Additionally, it supports Unet quantization in float8, significantly reducing GPU memory consumption while maintaining high performance. This makes it ideal for high-resolution image generation and upscaling tasks, even on hardware with limited resources. Whether you're working with large-scale images or complex visual data, the MoD ControlNet Tile Upscale SDXL provides a robust and efficient solution for achieving superior results.

Key Features

  • Latent Space Processing: Tiles are processed directly in the latent space, improving efficiency and quality.

  • Seamless Transitions: Advanced weighting methods (e.g., Gaussian or Cosine) ensure smooth blending between tiles.

  • Dynamic Overlap: Adaptive overlap calculations guarantee complete image coverage, even at high resolutions.

  • Unet Quantization in float8: Reduces GPU memory consumption without compromising performance.

  • Scalability: Designed to handle large-scale images (e.g., 8192x6144) with ease.

  • Flexibility: Compatible with various resolutions and aspect ratios, maintaining consistency across different use cases.
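
The tile weighting mentioned above can be illustrated with a minimal sketch. The function names and the exact formulas below are assumptions for illustration, not the pipeline's own code: each tile gets a weight profile that peaks at its center and falls off toward its edges, so overlapping tiles contribute less near their borders.

```python
import math

def gaussian_weights(size: int, sigma: float = 0.3) -> list[float]:
    """1D Gaussian profile over `size` positions; sigma is relative to the tile size."""
    center = (size - 1) / 2
    std = sigma * size
    return [math.exp(-((i - center) ** 2) / (2 * std ** 2)) for i in range(size)]

def cosine_weights(size: int) -> list[float]:
    """1D raised-cosine profile, shifted by half a sample so no weight is exactly zero."""
    return [0.5 * (1 - math.cos(2 * math.pi * (i + 0.5) / size)) for i in range(size)]

def weights_2d(profile_x: list[float], profile_y: list[float]) -> list[list[float]]:
    """Outer product of a column profile and a row profile gives per-position tile weights."""
    return [[wy * wx for wx in profile_x] for wy in profile_y]

# Hypothetical 8x8 tile: weights are highest in the middle and lowest at the border
tile_mask = weights_2d(cosine_weights(8), gaussian_weights(8, sigma=0.3))
```

When overlapping tiles are accumulated, each position is divided by the sum of the weights that touched it, so the result is a weighted average rather than a hard cut.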

Why Choose MoD ControlNet Tile Upscale SDXL?

  • Higher Quality: Latent space processing eliminates artifacts and improves detail preservation.

  • Faster Execution: Parallel tile processing reduces overall computation time.

  • Reduced Memory Usage: Unet quantization in float8 minimizes GPU memory requirements.

  • Adaptability: Works seamlessly across different resolutions and tile sizes.

  • Ease of Use: Intuitive integration with existing workflows and pipelines.

Method Comparison

To demonstrate the advantages of MoD ControlNet Tile Upscale SDXL, this section presents a visual comparison with alternative upscaling methods: ControlNet Tile and Lanczos Upscaler. We use a Ground Truth image (original high-resolution image) as a reference to evaluate the quality and fidelity of each upscaling method.

Common Issues with Conventional ControlNet Tile Upscaling

While ControlNet Tile is a useful technique for upscaling, especially when combined with ControlNets, the conventional pixel-space approach can suffer from several limitations that impact the final image quality. This section visually demonstrates some of these common issues.

The images below showcase common artifacts often observed in images upscaled using conventional ControlNet Tile methods:

Image panels: ControlNet Tile - Visible Seams | ControlNet Tile - Tile Inconsistency | Ours - Without Seams | Ours - Without Inconsistency

  • ControlNet Tile - Visible Seams: This example demonstrates the issue of visible seams or tile boundaries that can occur when tiles are processed and stitched together in pixel space. Observe the image, and focus on the areas highlighted by the red dots. These red dots pinpoint visible seams, which are evident as distinct lines or abrupt transitions in texture and tone, creating a noticeable grid-like artifact.
  • ControlNet Tile - Tile Inconsistency: This image highlights inconsistencies between tiles, where different tiles may exhibit slightly different styles, colors, or detail levels. Observe how the texture or color tone varies subtly across different tiles, leading to an uneven or patchwork appearance.
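
To make the seam issue concrete, here is a toy 1D sketch (not the pipeline's code; all names and values are illustrative assumptions). Two neighboring tiles come out with slightly different tones. Stitching them with a hard cut leaves an abrupt jump at the boundary, while cross-fading the overlapping region spreads the difference over several samples.

```python
import math

def hard_stitch(left: list[float], right: list[float], overlap: int) -> list[float]:
    """Concatenate two tiles, discarding the overlapping part of the right tile."""
    return left + right[overlap:]

def blend(left: list[float], right: list[float], overlap: int) -> list[float]:
    """Cross-fade the overlapping region with a cosine ramp (requires overlap > 0)."""
    out = left[:-overlap]
    for k in range(overlap):
        t = (k + 0.5) / overlap
        w_right = 0.5 * (1 - math.cos(math.pi * t))  # rises from ~0 to ~1 across the overlap
        out.append((1 - w_right) * left[len(left) - overlap + k] + w_right * right[k])
    return out + right[overlap:]

def max_jump(signal: list[float]) -> float:
    """Largest difference between adjacent samples; a large value indicates a visible seam."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))

left_tile = [0.40] * 8   # hypothetical tile with a darker tone
right_tile = [0.60] * 8  # hypothetical neighboring tile with a brighter tone
stitched = hard_stitch(left_tile, right_tile, overlap=4)
blended = blend(left_tile, right_tile, overlap=4)
print(max_jump(stitched), max_jump(blended))
```

The blended signal still moves from one tone to the other, but its largest step is much smaller than the single jump produced by the hard cut.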

Visual Comparison

Below, you can compare the upscaling results of the different methods side by side. For a more detailed analysis, click the title of each image and examine areas with fine details and textures.

Image panels: Ground Truth (Original) | Ours | ControlNet Tile | Lanczos Upscaler

  • Ground Truth (Original): Original high-resolution image, used as a quality and detail reference.
  • Ours: Upscaling result using the proposed method, which integrates ControlNet Tile and Mixture-of-Diffusers in the latent space.
  • ControlNet Tile: Upscaling result using the standard ControlNet Tile method.
  • Lanczos Upscaler: Upscaling result using the Lanczos algorithm, a traditional interpolation method.

Image comparisons: Input 1024px vs Upscaled 4x (ours) | Input 1024px vs Ground Truth
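
For readers unfamiliar with the Lanczos baseline, the sketch below shows the idea in one dimension using only the standard library. It is an illustrative implementation with assumed names and border handling (clamping), not the code used to produce the comparison images: each output sample is a normalized, windowed-sinc weighted average of nearby input samples.

```python
import math

def lanczos_kernel(x: float, a: int = 3) -> float:
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, zero elsewhere."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def lanczos_upscale_1d(samples: list[float], factor: int, a: int = 3) -> list[float]:
    """Upscale a 1D signal by an integer factor with Lanczos interpolation."""
    n = len(samples)
    out = []
    for j in range(n * factor):
        # Position of output sample j in input coordinates (sample centers aligned)
        x = (j + 0.5) / factor - 0.5
        first = math.floor(x) - a + 1
        acc = 0.0
        norm = 0.0
        for i in range(first, first + 2 * a):
            w = lanczos_kernel(x - i, a)
            acc += w * samples[min(max(i, 0), n - 1)]  # clamp indices at the borders
            norm += w
        out.append(acc / norm)  # normalize so flat regions stay flat
    return out

row = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # hypothetical row of pixel intensities
upscaled = lanczos_upscale_1d(row, factor=4)
```

Because this is pure interpolation, it can only smooth or slightly sharpen what is already in the input; it cannot synthesize new detail, which is the gap the diffusion-based methods aim to close. In practice, Pillow's Image.resize with the LANCZOS filter provides the 2D equivalent.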

Example Results

Example 2: 1024 -> 4x
Example 3: 512 -> 4x
Example 4: 1024 -> 8x
Example 5: 1024 -> 8x

Installation

Use Python 3.10.* and make sure the venv module is available so you can create a virtual environment.

Then run the following commands in the terminal:

Clone repository:

git clone https://github.com/DEVAIEXP/mod-control-tile-upscaler-sdxl.git
cd mod-control-tile-upscaler-sdxl

Prepare environment:

python -m venv venv
(for windows) .\venv\Scripts\activate
(for linux) source venv/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 --upgrade
pip install -r requirements.txt

How to Run

Gradio app: To launch the Gradio app on your local machine, execute the following command in your terminal:

python app.py

The following code 👇 comes from infer.py. Refer to that file for a quick inference example.

import torch
from diffusers import ControlNetUnionModel, AutoencoderKL, UNet2DConditionModel
from diffusers.utils import load_image
from pipeline.mod_controlnet_tile_sr_sdxl import StableDiffusionXLControlNetTileSRPipeline

from pipeline.util import (
    create_hdr_effect,
    progressive_upscale,
    quantize_8bit,
    select_scheduler,
)

device = "cuda"

# Initialize the models and pipeline
controlnet = ControlNetUnionModel.from_pretrained(
    "brad-twinkl/controlnet-union-sdxl-1.0-promax", torch_dtype=torch.float16
).to(device=device)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16).to(device=device)

model_id = "SG161222/RealVisXL_V5.0"
pipe = StableDiffusionXLControlNetTileSRPipeline.from_pretrained(
    model_id, controlnet=controlnet, vae=vae, torch_dtype=torch.float16, use_safetensors=True, variant="fp16"
).to(device)

To save VRAM, you can enable FP8 quantization on the UNet:

unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet", variant="fp16", use_safetensors=True)
quantize_8bit(unet)
pipe.unet = unet
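
As a rough back-of-the-envelope check of why 8-bit weights help, the sketch below compares weight storage at 2 bytes per parameter (float16) and 1 byte per parameter (float8). The parameter count is an assumption (the SDXL UNet is commonly cited at roughly 2.6 billion parameters), and the estimate covers weights only, not activations or how quantize_8bit stores them internally.

```python
def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

UNET_PARAMS = 2_600_000_000  # assumption: approximate SDXL UNet parameter count

fp16_gib = weight_memory_gib(UNET_PARAMS, 2)  # float16: 2 bytes per parameter
fp8_gib = weight_memory_gib(UNET_PARAMS, 1)   # float8: 1 byte per parameter
print(f"float16: {fp16_gib:.2f} GiB, float8: {fp8_gib:.2f} GiB")
```

Under these assumptions the weights take about half the space in float8; actual savings depend on which layers are quantized.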

To save VRAM, you can also enable CPU offloading, VAE tiling, and VAE slicing:

pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
pipe.enable_vae_slicing()

Set the scheduler. See the keys of the SAMPLERS variable in util.py for the available options.

# Set selected scheduler
scheduler = "UniPC"  # <-- set the key name here
pipe.scheduler = select_scheduler(pipe, scheduler)

....

# Load image
image = load_image("./examples/1.jpg")
original_height = image.height
original_width = image.width
print(f"Current resolution: H:{original_height} x W:{original_width}")

# Pre-upscale image for tiling
resolution = 4096
hdr = 0.5
tile_gaussian_sigma = 0.3
max_tile_size = 1024 # or 1280
control_image = create_hdr_effect(image, hdr)
image = progressive_upscale(image, resolution)
image = create_hdr_effect(image, hdr)

# Update target height and width
target_height = image.height
target_width = image.width
print(f"Target resolution: H:{target_height} x W:{target_width}")
print(f"Applied HDR effect: {hdr > 0}")

# Calculate overlap size
normal_tile_overlap, border_tile_overlap = pipe.calculate_overlap(target_width, target_height)

# Set other params
tile_weighting_method = pipe.TileWeightingMethod.COSINE.value
guidance_scale = 4
num_inference_steps = 35
denoising_strength = 0.65
controlnet_strength = 1.0
prompt = "high-quality, noise-free edges, high quality, 4k, hd, 8k"
negative_prompt = "blurry, pixelated, noisy, low resolution, artifacts, poor details"

# Image generation
image = pipe(
    image=control_image,
    control_image=image,
    control_mode=[6],
    controlnet_conditioning_scale=float(controlnet_strength),
    prompt=prompt,
    negative_prompt=negative_prompt,
    normal_tile_overlap=normal_tile_overlap,
    border_tile_overlap=border_tile_overlap,
    height=target_height,
    width=target_width,
    original_size=(original_width, original_height),
    target_size=(target_width, target_height),
    guidance_scale=guidance_scale,
    strength=float(denoising_strength),
    tile_weighting_method=tile_weighting_method,
    max_tile_size=max_tile_size,
    tile_gaussian_sigma=float(tile_gaussian_sigma),
    num_inference_steps=num_inference_steps,
)["images"][0]

image.save("result.png")
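
The script above relies on pipe.calculate_overlap and max_tile_size to lay out the tiles; their implementation is not shown here. The sketch below is a hypothetical illustration of the general idea (the function name, parameters, and spacing rule are assumptions, not the pipeline's code): given an image side length, a tile size, and a minimum overlap, it picks evenly spaced tile start offsets so the last tile ends exactly at the image edge.

```python
import math

def tile_starts(length: int, tile: int, min_overlap: int) -> list[int]:
    """Start offsets of tiles of size `tile` that cover [0, length).

    Adjacent tiles overlap by at least `min_overlap` (which must be < tile).
    """
    if length <= tile:
        return [0]
    stride = tile - min_overlap
    count = math.ceil((length - tile) / stride) + 1
    # Spread the tiles evenly so the last one ends exactly at `length`
    step = (length - tile) / (count - 1)
    return [round(i * step) for i in range(count)]

# Hypothetical 4096-pixel side, 1024-pixel tiles, at least 128 pixels of overlap
starts = tile_starts(4096, tile=1024, min_overlap=128)
print(starts)  # [0, 768, 1536, 2304, 3072]
```

Spreading the tiles evenly increases the effective overlap (256 pixels in this example) instead of leaving one narrow leftover tile at the border, which is one way to guarantee full coverage at any resolution.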

Acknowledgements

  • Our work is highly inspired by Mixture-of-Diffusers and the ControlNetUnion model. Thanks for their great work!
  • We borrowed some ideas, such as adaptive tile sizes, progressive_upscale, and the HDR effect, from TileUpscalerV2. Thanks for your work!
  • Thanks to Andrew Svk for allowing us to use his image of Socotra Island in our demo image no. 5.
  • Thanks to the HuggingFace team for the ControlNet Union pipeline used here!
  • Thanks to the Gradio team for their support with the new Sidebar component!

Other DEVAIEXP works

  • Mixture-of-Diffusers for SDXL Tiling Pipeline - SDXL Text-to-Image pipeline for image composition generation by using several diffusion processes in parallel, each configured with a specific prompt and settings, and focused on a particular region of the image.
  • Image Interrogator - Tool for image captioning with support for large models like LLaVA, CogVLM and others.

License

This project is released under the Apache 2.0 license.

Contact

If you have any questions, please contact: [email protected]
