PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage

Denis Zavadski* · Damjan Kalšan* · Carsten Rother

Computer Vision and Learning Lab,
IWR, Heidelberg University

*equal contribution

ACCV 2024

PrimeDepth is a diffusion-based monocular depth estimator that leverages the rich representation of the visual world stored within Stable Diffusion. The representation, termed preimage, is extracted in a single diffusion step from frozen Stable Diffusion 2.1 and adjusted towards depth prediction. PrimeDepth yields detailed predictions while simultaneously being fast at inference time due to the single-step approach.

Introduction

This is an inference codebase for PrimeDepth based on Stable Diffusion 2.1. Further details and visual examples can be found on the project page.

Installation

  1. Create and activate a virtual environment:

    conda create -n PrimeDepth python=3.9
    conda activate PrimeDepth
    
  2. Install dependencies:

    pip3 install -r requirements.txt
    
  3. Download the weights

  4. Adjust the attribute ckpt_path in configs/inference.yaml to point to the downloaded weights from the previous step
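
Before running inference, it can help to confirm that the edited ckpt_path really points to a file on disk. The following is a minimal sketch, not part of the repository: the helper names find_ckpt_paths and missing_ckpt_paths are hypothetical, and the code assumes that ckpt_path appears in the YAML file as a plain `ckpt_path: <value>` line. It scans the text with a regular expression instead of parsing YAML, so it does not depend on how the config is nested.

```python
import re
from pathlib import Path

# Matches lines such as `ckpt_path: /path/to/weights.ckpt` at any indentation,
# ignoring a trailing `# comment`. Assumption: the key is written on one line.
_CKPT_LINE = re.compile(
    r"^[ \t]*ckpt_path[ \t]*:[ \t]*(?P<value>.*?)[ \t]*(?:#.*)?$",
    re.MULTILINE,
)


def find_ckpt_paths(config_text):
    """Return every non-empty ckpt_path value found in the config text."""
    paths = []
    for match in _CKPT_LINE.finditer(config_text):
        # Drop optional surrounding quotes around the value.
        value = match.group("value").strip().strip("'\"")
        if value:
            paths.append(value)
    return paths


def missing_ckpt_paths(config_file):
    """Return the ckpt_path entries in config_file that are not existing files."""
    text = Path(config_file).read_text(encoding="utf-8")
    return [p for p in find_ckpt_paths(text) if not Path(p).expanduser().is_file()]
```

Calling missing_ckpt_paths("./configs/inference.yaml") returns an empty list when every ckpt_path entry found this way exists; any returned entry is a path that still needs to be corrected.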

Usage

from scripts.utils import InferenceEngine


config_path = "./configs/inference.yaml"
image_path = "./images/comparisons/vertical_resized/goodBoy.png"

ie = InferenceEngine(pd_config_path=config_path, device="cuda")

depth_ssi, depth_color = ie.predict(image_path)

PrimeDepth predicts in inverse space. The raw model predictions are stored in depth_ssi, while a colorized prediction depth_color is precomputed for visualization convenience:

depth_color.save("goodBoy_primedepth.png")
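
The single-image call above extends naturally to a whole folder. The sketch below is an illustration under stated assumptions, not code from the repository: predict_folder and IMAGE_SUFFIXES are hypothetical names, and the only behavior relied on is what the usage example shows, namely that predict accepts an image path and returns a pair whose second element has a save method.

```python
from pathlib import Path

# Assumption: these are the image types of interest; adjust as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def predict_folder(engine, input_dir, output_dir):
    """Run engine.predict on every image in input_dir and save the colorized maps.

    Returns a dict mapping each input file name to its raw prediction.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    raw_predictions = {}
    for image_path in sorted(Path(input_dir).iterdir()):
        # Skip anything that is not a recognized image file.
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        depth_ssi, depth_color = engine.predict(str(image_path))
        depth_color.save(str(output_dir / f"{image_path.stem}_primedepth.png"))
        raw_predictions[image_path.name] = depth_ssi
    return raw_predictions
```

With the engine from the usage example, predict_folder(ie, "./images", "./outputs") would write one colorized PNG per input image. Keeping all raw predictions in memory is a simplification; for large folders it may be preferable to write each depth_ssi to disk inside the loop.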

Citation

@misc{zavadski2024primedepth,
    title={PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage}, 
    author={Denis Zavadski and Damjan Kalšan and Carsten Rother},
    year={2024},
    eprint={2409.09144},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2409.09144}, 
}