11785 Deep Learning Course Project: End-to-End 2D to 3D Video Conversion

aditya5558/2D-to-3D

2D-to-3D

11785 Deep Learning Course Project

To access the driving stereo data (77 GB unpacked):

  • Launch an AWS instance
  • aws s3 cp s3://idl-proj-3d/driving_stereo.tar.gz ./
  • tar -xvzf driving_stereo.tar.gz

(Note: do not download this archive from S3 to a machine outside AWS, as that will incur large data egress costs.)

The archive contains a 'train' directory with 174,437 image pairs and a 'test' directory with 7,751 image pairs.

Both 'train' and 'test' have 'left' and 'right' folders that contain the images, so the path to either the 'train' or the 'test' directory can be passed to the dataset class "class Inria(data.Dataset)".
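The directory layout described above can be checked before training with a short script. This is only a sketch using the Python standard library: the helper name `list_stereo_pairs` is hypothetical and not part of this repository, and it assumes that the two images of a pair share the same file name in 'left' and 'right', which the README does not state.

```python
import os


def list_stereo_pairs(split_dir):
    """Return sorted (left_path, right_path) tuples for a 'train' or 'test' directory."""
    left_dir = os.path.join(split_dir, "left")
    right_dir = os.path.join(split_dir, "right")

    left_names = set(os.listdir(left_dir))
    right_names = set(os.listdir(right_dir))

    # Assumption: a stereo pair is identified by an identical file name
    # in both folders. Files present on only one side are skipped.
    unmatched = left_names ^ right_names
    if unmatched:
        print(f"warning: {len(unmatched)} file(s) have no counterpart")

    return [
        (os.path.join(left_dir, name), os.path.join(right_dir, name))
        for name in sorted(left_names & right_names)
    ]
```

If the naming assumption holds, `len(list_stereo_pairs("train"))` should match the pair count quoted above for the 'train' directory, which gives a quick check that the archive unpacked completely.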

Results

Dataset: Inria

| Model | Train MAE | Test MAE |
|---|---|---|
| Deep-3D | 6.36 | 7.58 |
| Deep-3D + Monocular Depth Estimation + Mask-RCNN (Early Fusion) | 6.50 | 7.29 |
| Deep-3D + Monocular Depth Estimation + Mask-RCNN (Late Fusion) | 6.31 | 6.84 |
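The README does not spell out how MAE is computed. A common reading, assumed here, is the mean absolute per-pixel difference between the predicted right view and the ground-truth right view. The sketch below illustrates that definition on plain nested lists; the function name and the input format are hypothetical and are not taken from the project code.

```python
def mean_absolute_error(predicted, target):
    """Mean absolute difference between two images given as lists of pixel rows."""
    if len(predicted) != len(target):
        raise ValueError("images must have the same number of rows")

    total = 0.0
    count = 0
    for pred_row, target_row in zip(predicted, target):
        if len(pred_row) != len(target_row):
            raise ValueError("rows must have the same length")
        for p, t in zip(pred_row, target_row):
            total += abs(p - t)
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


# Tiny 2x2 example: absolute differences are 2, 0, 3 and 4.
predicted = [[10, 20], [30, 40]]
target = [[12, 20], [27, 44]]
print(mean_absolute_error(predicted, target))  # 2.25
```

Under this reading, lower values mean the generated right view is closer to the real one, so the Late Fusion row has the best test score in the table.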

Qualitative Examples: Inputs and Outputs as GIFs

Below are seven examples from the Inria dataset, shown as GIFs. Each example is a stereo pair consisting of a left and a right view of a scene. The left view is the input image from the Inria dataset. The right view is the output produced by our reimplementation of the Deep3D model. Each GIF alternates rapidly between the two views, which produces a sensation of depth.

[GIFs: Image 0 through Image 6]
