Output of grayscale depthmaps #65

Open
herrartist opened this issue Nov 5, 2024 · 2 comments

Comments

@herrartist

How can I output grayscale depthmaps (and nothing else)?

@Dhruva-Storz

Not sure if Depth Pro has an option for this, but the model's output is a metric depth array. To get a traditional 8-bit grayscale image, normalize the depth values to the 0-255 range (and possibly invert them, if you want near objects to appear brighter), then save the result as a PNG or whatever format you prefer.
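The normalization described above could look roughly like the sketch below. It is not part of Depth Pro; the helper name `depth_to_gray8` and the `invert` flag are made up for illustration, and it assumes the depth map is a 2-D NumPy array of distances with at least one finite value.

```python
import numpy as np

def depth_to_gray8(depth, invert=True):
    """Min-max normalize a metric depth map to an 8-bit grayscale image.

    Hypothetical helper, not a Depth Pro API.
    With invert=True the nearest pixel maps to 255 and the farthest to 0.
    """
    depth = np.asarray(depth, dtype=np.float32)
    finite = np.isfinite(depth)
    if not finite.any():
        raise ValueError("depth map contains no finite values")

    d_min = depth[finite].min()
    d_max = depth[finite].max()
    span = d_max - d_min

    if span == 0:
        # Flat depth map: avoid dividing by zero.
        norm = np.zeros_like(depth)
    else:
        norm = (depth - d_min) / span

    # Treat NaN and +inf as "farthest", -inf as "nearest".
    norm = np.nan_to_num(norm, nan=1.0, posinf=1.0, neginf=0.0)
    norm = np.clip(norm, 0.0, 1.0)

    if invert:
        norm = 1.0 - norm

    return np.round(norm * 255.0).astype(np.uint8)

# Synthetic depth values in meters, standing in for the model output.
example = np.array([[1.0, 2.0], [3.0, 5.0]], dtype=np.float32)
gray = depth_to_gray8(example)
```

The resulting `uint8` array can then be written with something like `cv2.imwrite("depth.png", gray)`. Note that min-max scaling is per image, so the gray levels of two different images are not comparable as absolute distances.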

@hradec

hradec commented Nov 12, 2024

You can run depth_pro and save a floating point tif file using something like this snippet:

        import depth_pro
        import torch
        import cv2

        def get_torch_device() -> torch.device:
            """Pick a Torch device: CUDA, then MPS, falling back to CPU."""
            device = torch.device("cpu")
            if torch.cuda.is_available():
                # Use the GPU only if it has more than 7 GB of total memory
                # (this checks total memory, not free memory).
                if torch.cuda.get_device_properties(0).total_memory / 1024**3 > 7:
                    device = torch.device("cuda:0")
            elif torch.backends.mps.is_available():
                device = torch.device("mps")
            return device

        # Load model and preprocessing transform.
        model, transform = depth_pro.create_model_and_transforms(
            device=get_torch_device(),
            precision=torch.half,
        )
        model.eval()

        # Load and preprocess an image.
        image, _, f_px = depth_pro.load_rgb("your_rgb_image.jpg")
        image = transform(image)

        # Run inference.
        prediction = model.infer(image, f_px=f_px)
        # Depth in meters. Cast to float32 before saving, since the model
        # runs in half precision and the TIFF should hold 32-bit floats.
        depth = prediction["depth"].detach().float().cpu().numpy().squeeze()
        cv2.imwrite("your_rgb_image_depthPro.tif", depth)

A floating-point TIFF gives you the full range and precision of the calculated depth, including values above 1 meter.

You can also use the depth variable directly, just like a normal array returned by cv2.imread(). You don't have to save to TIFF and read it back with cv2.imread().
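As a rough illustration of working with the array directly, here is a small sketch that reads the depth range and the value at one pixel. The helper name `describe_depth` is hypothetical, and a tiny synthetic array stands in for the real `depth` output.

```python
import numpy as np

def describe_depth(depth, x, y):
    """Return (min, max, value at pixel (x, y)) for a 2-D depth array.

    Hypothetical helper for illustration only.
    """
    depth = np.asarray(depth, dtype=np.float32)
    if depth.ndim != 2:
        raise ValueError("expected a 2-D (height, width) depth array")
    # NumPy/OpenCV images are indexed [row, column], i.e. [y, x].
    return float(depth.min()), float(depth.max()), float(depth[y, x])

# Synthetic stand-in for the model output: distances in meters.
fake_depth = np.array([[0.5, 1.5, 2.5],
                       [3.5, 4.5, 5.5]], dtype=np.float32)
near, far, at_pixel = describe_depth(fake_depth, x=2, y=1)
```

If you do save and reload the TIFF, passing `cv2.IMREAD_UNCHANGED` to `cv2.imread()` should keep the floating-point values instead of converting them to 8-bit.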

But be careful when loading the TIFF file in image editors like Photoshop or GIMP, since all of them will apply some sort of color conversion instead of displaying the actual floating-point values.

You can use this script to display the tif images properly, and sample the values using the mouse:
https://gist.github.com/hradec/adb03ea9fd02ee2a8cde00a69fb26f62
