coord_conv.md

August 2019

tl;dr: Predicting coordinate transformations (predicting x and y directly from an image, and vice versa) with ConvNets is hard. Adding a mesh grid to the input image helps this task significantly.

Overall impression

The paper's results are very convincing, and the technique is very cheap: essentially, it only concatenates a two-channel meshgrid (one channel for x, one for y) to the original input.
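A minimal NumPy sketch of the concatenation described above. The function name `add_coord_channels`, the `(N, C, H, W)` layout, and the scaling of coordinates to `[-1, 1]` are assumptions made for illustration, not details taken from these notes.

```python
import numpy as np

def add_coord_channels(images):
    """Append x and y coordinate channels to a batch of images.

    images: array of shape (N, C, H, W).
    Returns an array of shape (N, C + 2, H, W).
    """
    images = np.asarray(images, dtype=np.float32)
    n, _, h, w = images.shape
    # Assumption: coordinates are scaled to [-1, 1] along each axis.
    ys = np.linspace(-1.0, 1.0, h, dtype=np.float32)
    xs = np.linspace(-1.0, 1.0, w, dtype=np.float32)
    # indexing="ij" gives arrays of shape (H, W): yy varies by row, xx by column.
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    coords = np.stack([xx, yy], axis=0)             # (2, H, W)
    coords = np.broadcast_to(coords, (n, 2, h, w))  # same grid for every image
    return np.concatenate([images, coords], axis=1)

batch = np.zeros((2, 3, 4, 5), dtype=np.float32)
out = add_coord_channels(batch)
print(out.shape)
```

The output would then be fed to an ordinary convolution layer, so the only extra cost is two input channels.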

RoI10D cited this paper.

Key ideas

  • Other coordinates work as well, such as radius and theta.
  • The idea can be useful for other tasks such as object detection, GANs, and deep reinforcement learning (DRL), but not so much for classification.
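To illustrate the radius and theta variant from the list above, here is a small NumPy sketch that builds polar coordinate channels measured from the image center. The function name `polar_coord_channels` and the `[-1, 1]` normalization are assumptions for illustration only.

```python
import numpy as np

def polar_coord_channels(h, w):
    """Return a (2, H, W) array holding radius and theta channels."""
    # Assumption: Cartesian coordinates are scaled to [-1, 1], origin at the center.
    ys = np.linspace(-1.0, 1.0, h, dtype=np.float32)
    xs = np.linspace(-1.0, 1.0, w, dtype=np.float32)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    radius = np.sqrt(xx ** 2 + yy ** 2)  # distance from the image center
    theta = np.arctan2(yy, xx)           # angle in radians, in [-pi, pi]
    return np.stack([radius, theta], axis=0)

polar = polar_coord_channels(3, 3)
print(polar.shape)
```

These channels can be concatenated to the input in the same way as the x and y channels, either in place of them or alongside them.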

Technical details

  • Summary of technical details

Notes