Stable Diffusion-based image manipulation method with a sketch and reference image

kangyeolk/Paint-by-Sketch

Paint-by-Sketch

Kangyeol Kim, Sunghyun Park, Junsoo Lee and Jaegul Choo.

Teaser

[Figure: Teaser]

Multi-backgrounds

[Figure: Multi-backgrounds]

Multi-references

[Figure: Multi-references]

Abstract

Recent remarkable improvements in large-scale text-to-image generative models have shown promising results in generating high-fidelity images. To further enhance editability and enable fine-grained generation, we introduce a multi-input-conditioned image composition model that incorporates a sketch as a novel modality, alongside a reference image. Thanks to the edge-level controllability that sketches provide, our method enables a user to edit or complete an image sub-part with a desired structure (i.e., sketch) and content (i.e., reference image). Our framework fine-tunes a pre-trained diffusion model to complete missing regions using the reference image while maintaining sketch guidance. Albeit simple, this opens wide opportunities to meet user needs for obtaining in-demand images. Through extensive experiments, we demonstrate that our proposed method offers unique use cases for image manipulation, enabling user-driven modifications of arbitrary scenes.

Environment & Pre-trained models

Dependencies

$ conda env create -f environment.yaml
$ conda activate paint_sketch
$ pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
$ pip install opencv-python==4.6.0.66 opencv-python-headless==4.6.0.66 matplotlib==3.2.2 streamlit==1.14.1 streamlit-drawable-canvas==0.9.2
$ pip install git+https://github.com/openai/CLIP.git

Download checkpoints

Paint-by-Sketch
    pretrained_models/
        model-modified-12channel.ckpt        
    models/
        Cartoon_v1_aesthetic/
            ...
    ...
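
Before training or running the demo, it may help to confirm that the checkpoint files sit where the tree above shows them. The following is a minimal sketch, assuming the script is run against the repository root; the helper name `missing_checkpoints` is hypothetical and not part of the repository, and only the two paths spelled out in the tree are checked.

```python
from pathlib import Path

# Paths copied from the directory tree above, relative to the repository root.
EXPECTED = [
    Path("pretrained_models") / "model-modified-12channel.ckpt",
    Path("models") / "Cartoon_v1_aesthetic",
]


def missing_checkpoints(repo_root="."):
    """Return the expected paths that do not exist under repo_root.

    Hypothetical helper for illustration; not part of the repository.
    """
    root = Path(repo_root)
    return [str(p) for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected checkpoint paths found.")
```

An empty list means both paths exist; the contents of `Cartoon_v1_aesthetic/` are not inspected.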

Data preparation

  • Sketch extraction
bash preprocess_dataset/run_preprocess.sh <path/to/image_root> <gpu_id>
# e.g., 
# bash preprocess_dataset/run_preprocess.sh /home/nas2_userF/kangyeol/Project/webtoon2022/Paint-by-Sketch/samples 7
  • Result
IMAGE_ROOT
    images/
        000000.png
        000001.png
        ...
    sketch_bin/
        000000.png
        000001.png
        ...
    sketch/  (not used)
        ... 
    ...
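
After preprocessing, a quick consistency check on the layout above can catch images that ended up without a binarized sketch. This is a sketch under two assumptions that the tree suggests but does not state outright: every file in `images/` has a same-named counterpart in `sketch_bin/`, and all files are `.png`. The function name `unpaired_images` is hypothetical.

```python
from pathlib import Path


def unpaired_images(image_root):
    """List file names in images/ that have no same-named file in sketch_bin/.

    Hypothetical helper; assumes .png files and matching names in both folders.
    """
    root = Path(image_root)
    images = {p.name for p in (root / "images").glob("*.png")}
    sketches = {p.name for p in (root / "sketch_bin").glob("*.png")}
    # Set difference: images that the preprocessing step did not cover.
    return sorted(images - sketches)
```

For example, `unpaired_images("<path/to/image_root>")` returns an empty list when every image has a matching sketch.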

Training

bash cartoon_train.sh <gpu_ids> <path/to/logdir> <path/to/config>

# e.g.,
# bash cartoon_train.sh 0,1 models/test configs/v1_aesthetic_sketch_image.yaml
  • The number of GPU IDs passed as <gpu_ids> must match the number of gpus set in lightning.trainer in the config YAML file (e.g., gpu_ids=2,3 pairs with '0,1' in the config file).
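
The counting rule in the note above can be sketched as follows. This is only an illustration: `count_gpu_ids` and `gpu_counts_match` are hypothetical helpers, not part of the repository, and both strings are passed in by hand instead of being read from the config file.

```python
def count_gpu_ids(spec):
    """Count the comma-separated entries in a GPU string such as '0,1'.

    Empty entries (for example from a trailing comma) are ignored.
    """
    return len([part for part in spec.split(",") if part.strip()])


def gpu_counts_match(gpu_ids, config_gpus):
    """Return True when both strings list the same number of GPUs."""
    return count_gpu_ids(gpu_ids) == count_gpu_ids(config_gpus)


# The README's own example: physical GPUs 2 and 3, config file set to '0,1'.
print(gpu_counts_match("2,3", "0,1"))  # True
```

Only the counts are compared, since the example in the note pairs different IDs (2,3) with the '0,1' entry in the config file.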

Demo

  1. Run the Streamlit server.
streamlit run demo/app.py --server.port=8507 --server.fileWatcherType none
  2. Upload the source image.

  3. Draw the mask and the sketch separately.

  • The 1st and 2nd canvases are panels where you can draw masks and sketches.
  • In the 3rd canvas, you can view the drawn mask and sketch overlaid together.
  4. Upload a reference image.

  • Select an image in the left panel.
  • Click the Read Exemplar button.
  • Crop part of the image with a bounding box.
  5. Run inference and export.

  • Perform inference with the drawn mask, sketch, and the cropped image as conditions.
  • You can adjust the scale and sketch strength in the left panel.
  • You can save images in grid format with the Export button.

Issues

  • If the screen is too small and the canvas is resized, the drawn mask and sketch become misaligned.

Citation

@misc{kim2023referencebased,
    title={Reference-based Image Composition with Sketch via Structure-aware Diffusion Model},
    author={Kangyeol Kim and Sunghyun Park and Junsoo Lee and Jaegul Choo},
    year={2023},
    eprint={2304.09748},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

License

The code in this repository is released under the MIT License.

Acknowledgments

This code borrows heavily from Stable Diffusion and Paint-by-Example.
