# 11785 Deep Learning Course Project
To access the driving stereo data (unpacked size will be 77 GB):

- Open an AWS instance
- `aws s3 cp s3://idl-proj-3d/driving_stereo.tar.gz ./`
- `tar -xvzf driving_stereo.tar.gz`

(Note: do not download this from S3 to machines outside AWS, as that will incur large data-egress costs.)
This contains a 'train' directory with 174,437 image pairs and a 'test' directory with 7,751 image pairs.
Both 'train' and 'test' have 'left' and 'right' folders containing the images, so either the 'train' or 'test' directory path can be passed to `class Inria(data.Dataset)`.
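For reference, a minimal sketch of what such a dataset class might look like, assuming the directory layout above (a root with `left/` and `right/` subfolders holding identically named image files); the actual `Inria` class in this repo may differ in transforms and return format:

```python
import os
from PIL import Image
from torch.utils import data

class Inria(data.Dataset):
    """Hypothetical sketch: loads (left, right) stereo image pairs from a
    root directory containing 'left' and 'right' subfolders."""
    def __init__(self, root, transform=None):
        self.left_dir = os.path.join(root, "left")
        self.right_dir = os.path.join(root, "right")
        # Assumes matching filenames in both subfolders.
        self.names = sorted(os.listdir(self.left_dir))
        self.transform = transform

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        left = Image.open(os.path.join(self.left_dir, name)).convert("RGB")
        right = Image.open(os.path.join(self.right_dir, name)).convert("RGB")
        if self.transform is not None:
            left, right = self.transform(left), self.transform(right)
        return left, right
```

With this layout, `Inria('/path/to/train')` and `Inria('/path/to/test')` would each yield `(left, right)` PIL image pairs that a `DataLoader` can batch after a tensor transform.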
| Model | Train MAE | Test MAE |
|---|---|---|
| Deep-3D | 6.36 | 7.58 |
| Deep-3D + Monocular Depth Estimation + Mask-RCNN (Early Fusion) | 6.50 | 7.29 |
| Deep-3D + Monocular Depth Estimation + Mask-RCNN (Late Fusion) | 6.31 | 6.84 |
Below are 7 examples from the Inria dataset, showing input and output images as GIFs. Each example is a stereo pair of left and right views of a scene: the left view is the input image from the Inria dataset, and the right view is the output produced by our reimplementation of the Deep3D model. Alternating rapidly between the two views produces a sensation of depth.
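Such "wiggle" GIFs can be produced by saving the two views as alternating frames. A minimal sketch using Pillow (the function name and frame duration are illustrative, not the script used for the examples above):

```python
from PIL import Image

def make_wiggle_gif(left_path, right_path, out_path, duration_ms=120):
    """Hypothetical helper: alternate a stereo pair as a 2-frame looping GIF.
    Rapid alternation between the views gives the depth sensation."""
    left = Image.open(left_path).convert("RGB")
    right = Image.open(right_path).convert("RGB")
    # save_all=True with append_images writes a multi-frame GIF;
    # loop=0 makes it repeat forever.
    left.save(out_path, save_all=True, append_images=[right],
              duration=duration_ms, loop=0)
```

A shorter `duration_ms` makes the flicker faster, which tends to strengthen the depth effect at the cost of visual comfort.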