# [Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras](https://arxiv.org/abs/1904.04998)

_July 2019_

tl;dr: Estimate the camera intrinsics, in addition to the extrinsics, from any video.

#### Overall impression

This work eliminates the assumption that the camera intrinsics are available. This opens up a whole lot of possibilities to learn from a wide range of videos.

The network regresses depth, ego-motion, object motion, and camera intrinsics from monocular videos; a sketch of the intrinsics part follows.
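
As a concrete picture of the intrinsics part, here is a minimal PyTorch-style sketch (my own, not the paper's code); the feature dimension, the softplus/sigmoid parameterization, and placing the head on the ego-motion bottleneck are all assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntrinsicsHead(nn.Module):
    """Hypothetical sketch: regress a pinhole intrinsic matrix K from the
    ego-motion network's bottleneck feature. Sizes and parameterizations
    are assumptions, not the paper's exact design."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.fc = nn.Linear(feat_dim, 4)  # fx, fy, cx, cy in normalized units

    def forward(self, feat: torch.Tensor, width: int, height: int) -> torch.Tensor:
        fx, fy, cx, cy = self.fc(feat).unbind(dim=-1)
        fx = F.softplus(fx) * width       # softplus keeps focal lengths positive
        fy = F.softplus(fy) * height
        cx = torch.sigmoid(cx) * width    # principal point constrained to the image
        cy = torch.sigmoid(cy) * height
        K = feat.new_zeros(feat.shape[0], 3, 3)
        K[:, 0, 0], K[:, 1, 1] = fx, fy
        K[:, 0, 2], K[:, 1, 2] = cx, cy
        K[:, 2, 2] = 1.0
        return K
```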

#### Key ideas

- Estimate each of the camera intrinsics.
- Occlusion-aware loss: compute the photometric loss only on the most foreground pixels (smallest depth), so that occluded regions are not penalized (sketched after this list).
- Foreground mask to mask out possibly moving objects.
- Use randomized layer normalization (this is quite weird); a sketch follows as well.
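
My reading of the occlusion-aware loss as a minimal PyTorch-style sketch (tensor shapes, names, and the exact comparison are assumptions, not the paper's code):

```python
import torch

def occlusion_aware_photometric_loss(rgb_tgt, rgb_warp, depth_tgt, depth_warp):
    """Hypothetical sketch: penalize photometric error only where the source
    frame warped into the target view is at least as foreground as the
    target's own depth, i.e. keep the most foreground pixels.
    Shapes: rgb (B, 3, H, W), depth (B, 1, H, W)."""
    visible = (depth_warp <= depth_tgt).float()                # foreground-pixel mask
    l1 = (rgb_tgt - rgb_warp).abs().mean(dim=1, keepdim=True)  # per-pixel L1 error
    return (l1 * visible).sum() / visible.sum().clamp(min=1.0)
```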

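And a minimal sketch of what randomized layer normalization could look like (the noise strength `sigma` and where the noise is applied are assumptions):

```python
import torch
import torch.nn as nn

class RandomizedLayerNorm(nn.Module):
    """Hypothetical sketch: standard layer normalization, but at training
    time the per-sample statistics are perturbed by multiplicative Gaussian
    noise. The noise strength sigma is an assumption."""
    def __init__(self, num_features: int, sigma: float = 0.5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))
        self.sigma = sigma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W); normalize over (C, H, W) per sample.
        mean = x.mean(dim=(1, 2, 3), keepdim=True)
        var = x.var(dim=(1, 2, 3), keepdim=True)
        if self.training:  # randomize the statistics only during training
            mean = mean * (1 + self.sigma * torch.randn_like(mean))
            var = var * (1 + self.sigma * torch.randn_like(var))
        x = (x - mean) / (var + 1e-5).sqrt()
        return x * self.gamma.view(1, -1, 1, 1) + self.beta.view(1, -1, 1, 1)
```
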
#### Technical details

- Sometimes one overall supervision signal is applied to two tightly coupled parameters, and it is not enough to get an accurate estimate of both (cf. Deep3Dbox); see the illustration below.
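
A hedged illustration of such coupling in this paper's setting (my notation, not taken from the paper): under a pinhole model, warping a pixel $p$ (homogeneous coordinates) with depth $z$ by rotation $R$ and translation $t$ gives

$$
z' p' = K R K^{-1} z \, p + K t,
$$

so the photometric loss sees the intrinsics $K$ and the ego-motion $(R, t)$ only through the products $K R K^{-1}$ and $K t$. An error in $K$ can therefore be partially absorbed by the estimated ego-motion, which is roughly why a single photometric signal may not pin down both.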

#### Notes

- In detail, how was the lens correction regressed?
- See the interview with the CEO of isee on this paper.
- Q: Can we project the intermediate representation (3D points) to BEV instead of back to the camera plane for the loss calculation? This would eliminate the need for the occlusion-aware loss.