Last Updated: 07/04/2019
The paper reading notes below are organized by subfield.
- Bag of Tricks for Image Classification with Convolutional Neural Networks [Notes] CLS
- Bag of Freebies for Training Object Detection Neural Networks [Notes]
- mixup: Beyond Empirical Risk Minimization [Notes] ICLR 2018
- Aggregated Residual Transformations for Deep Neural Networks (ResNeXt) [Notes] CVPR 2017
- Group Normalization [Notes] ECCV 2018
- Spatial Transformer Networks [Notes] NIPS 2015
- Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite [Notes] CVPR 2012
- Vision meets Robotics: The KITTI Dataset [Notes] IJRR 2013
- A New Performance Measure and Evaluation Benchmark for Road Detection Algorithms [Notes] ITSC 2013
- What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? [Notes] NIPS 2017
- Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding [Notes] BMVC 2017
- Human-level control through deep reinforcement learning (Nature DQN paper) [Notes] DRL
- Deep Reinforcement Learning for Vessel Centerline Tracing in Multi-modality 3D Volumes [Notes] DRL MI
- Deep Reinforcement Learning for Flappy Bird [Notes] DRL
- Optimizing the Trade-off between Single-Stage and Two-Stage Object Detectors using Image Difficulty Prediction (very nice illustration of one-stage and two-stage object detection)
- Light-Head R-CNN: In Defense of Two-Stage Object Detector (Megvii) [Notes]
- Object Detection based on Region Decomposition and Assembly [Notes] AAAI 2019
- M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network [Notes] AAAI 2019
- Review of anchor-free methods (Zhihu blog): "Object Detection: The Anchor-Free Era", "Anchor-Free Deep Learning Methods for Object Detection"; My Slides on CSP
- DenseBox: Unifying Landmark Localization with End to End Object Detection
- CornerNet: Detecting Objects as Paired Keypoints [Notes] ECCV 2018
- ExtremeNet: Bottom-up Object Detection by Grouping Extreme and Center Points [Notes] CVPR 2019
- CSP: High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection (center and scale prediction) [Notes] CVPR 2019
- FSAF: Feature Selective Anchor-Free Module for Single-Shot Object Detection [Notes] CVPR 2019
- FoveaBox: Beyond Anchor-based Object Detector (anchor-free) [Notes]
- CenterNet: Objects as points (from ExtremeNet authors) [Notes]
- CenterNet: Object Detection with Keypoint Triplets [Notes]
- FCOS: Fully Convolutional One-Stage Object Detection [Notes]
- Panoptic Segmentation [Notes] PanSeg
- Panoptic Feature Pyramid Networks [Notes] PanSeg
- Attention-guided Unified Network for Panoptic Segmentation [Notes] PanSeg
- Path Aggregation Network for Instance Segmentation [Notes] CVPR 2018
- PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation CVPR 2017 [Notes]
- PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space NIPS 2017 [Notes]
- Dynamic Graph CNN for Learning on Point Clouds [Notes]
- VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition (VoxNet) [Notes]
- Multi-view Convolutional Neural Networks for 3D Shape Recognition (MVCNN) [Notes] ICCV 2015
- 3D ShapeNets: A Deep Representation for Volumetric Shapes [Notes] CVPR 2015
- Volumetric and Multi-View CNNs for Object Classification on 3D Data [Notes] CVPR 2016
- Beyond the pixel plane: sensing and learning in 3D (blog, Chinese version)
- Review of geometric deep learning: "Frontiers of Geometric Deep Learning" (from Zhihu) (up to CVPR 2018)
- Frustum PointNets for 3D Object Detection from RGB-D Data (F-PointNet) [Notes] CVPR 2018
- PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud (SOTA for 3D object detection) [Notes] CVPR 2019
- VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection CVPR 2018 (Apple, first end-to-end point cloud encoding to grid)
- SECOND: Sparsely Embedded Convolutional Detection Sensors 2018 (builds on VoxelNet)
- PointPillars: Fast Encoders for Object Detection from Point Clouds [Notes] CVPR 2019 (builds on SECOND)
- Multi-View 3D Object Detection Network for Autonomous Driving (MV3D) [Notes] CVPR 2017 (Baidu, sensor fusion, BV proposal)
- Joint 3D Proposal Generation and Object Detection from View Aggregation (AVOD) [Notes] IROS 2018 (sensor fusion, multiview proposal)
- Multi-Task Multi-Sensor Fusion for 3D Object Detection [Notes] CVPR 2019 (@Uber, sensor fusion)
- Multi-Level Fusion based 3D Object Detection from Monocular Images [Notes] CVPR 2018 (precursor to pseudo-lidar)
- Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving [Notes] CVPR 2019
- Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving [Notes]
- Pseudo lidar-e2e: Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud [Notes] ICCV 2019
- pseudo lidar color: Accurate Monocular 3D Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving [Notes] ICCV 2019
- ForeSeE: Task-Aware Monocular Depth Estimation for 3D Object Detection [Notes] (successor to pseudo-lidar) (mono 3DOD SOTA)
- BEV-IPM: Deep Learning based Vehicle Position and Orientation Estimation via Inverse Perspective Mapping Image [Notes] IV 2019
- OFT: Orthographic Feature Transform for Monocular 3D Object Detection [Notes] (Convert camera to BEV, Alex Kendall) BMVC 2019
- BirdGAN: Learning 2D to 3D Lifting for Object Detection in 3D for Autonomous Vehicles [Notes] IROS 2019
- Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image [Notes] CVPR 2017
- 3D-RCNN: Instance-level 3D Object Reconstruction via Render-and-Compare [Notes] CVPR 2018
- ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape [Notes] CVPR 2019
- MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization [Notes] AAAI 2019
- MonoGRNet 2: Monocular 3D Object Detection via Geometric Reasoning on Keypoints [Notes] (estimates depth from keypoints)
- MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation [Notes] ICCV 2019
- GPP: Ground Plane Polling for 6DoF Pose Estimation of Objects on the Road [Notes] (UCSD, mono 3DOD)
- Deep3dBox: 3D Bounding Box Estimation Using Deep Learning and Geometry [Notes] (from Zoox) CVPR 2017
- Shift R-CNN: Deep Monocular 3D Object Detection with Closed-Form Geometric Constraints [Notes] IEEE ICIP 2019
- FQNet: Deep Fitting Degree Scoring Network for Monocular 3D Object Detection [Notes] CVPR 2019 (Mono 3DOD, Jiwen Lu)
- Mono3D++: Monocular 3D Vehicle Detection with Two-Scale 3D Hypotheses and Task Priors [Notes] (Stefano Soatto) AAAI 2019
- GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving [Notes] CVPR 2019 (2d bbox height)
- MonoPSR: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction [Notes] CVPR 2019 (2d bbox height)
- CasGeo: 3D Bounding Box Estimation for Autonomous Vehicles by Cascaded Geometric Constraints and Depurated 2D Detections Using 3D Results [Notes]
- MVRA: Multi-View Reprojection Architecture for Orientation Estimation [Notes] ICCV 2019
- Mono3D: Monocular 3D Object Detection for Autonomous Driving [Notes] CVPR 2016 (lots of 3D anchors)
- TLNet: Triangulation Learning Network: from Monocular to Stereo 3D Object Detection [Notes] CVPR 2019 (mono baseline with 3D anchors)
- SS3D: Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss [Notes] (regresses distance from images, CenterNet-like)
- M3D-RPN: Monocular 3D Region Proposal Network for Object Detection [Notes] ICCV 2019 (Xiaoming Liu)
- MonoDIS: Disentangling Monocular 3D Object Detection [Notes] ICCV 2019
- Joint Monocular 3D Vehicle Detection and Tracking [Notes] ICCV 2019 (directly regress 1/d, Berkeley DeepDrive)
- CenterNet: Objects as points (from ExtremeNet authors) [Notes] (directly regress 1/d)
- Joint Monocular 3D Vehicle Detection and Tracking [Notes] ICCV 2019 (Berkeley DeepDrive)
- Deep Depth Completion of a Single RGB-D Image [Notes] CVPR 2018 (indoor)
- DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene from Sparse LiDAR Data and Single Color Image [Notes] CVPR 2019 (outdoor)
- SfMLearner: Unsupervised Learning of Depth and Ego-Motion from Video [Notes] CVPR 2017
- Monodepth2: Digging Into Self-Supervised Monocular Depth Estimation [Notes] (@Niantic)
- MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving [Notes]
- TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents [Notes] AAAI 2019 (oral)
- Semantic Segmentation on Radar Point Clouds [Notes] (from Daimler AG) FUSION 2018
- DeepSignals: Predicting Intent of Drivers Through Visual Signals [Notes] ICRA 2019 (@Uber, turn signal detection)
- Deep Radar Detector [Notes] RadarCon 2019
- MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications (MobileNets) [Notes]
- MobileNetV2: Inverted Residuals and Linear Bottlenecks (MobileNets v2) [Notes] CVPR 2018
- MobileNetV3: Searching for MobileNetV3 [Notes]
- MnasNet: Platform-Aware Neural Architecture Search for Mobile [Notes] CVPR 2019
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks [Notes] ICML 2019
- Pruning Filters for Efficient ConvNets [Notes] ICLR 2017
- Channel Pruning for Accelerating Very Deep Neural Networks ICCV 2017 (Face++, Yihui He) [Notes]
- Layer-compensated Pruning for Resource-constrained Convolutional Neural Networks [Notes] NIPS 2018 Talk
- LeGR: Filter Pruning via Learned Global Ranking [Notes]
- Rethinking the Value of Network Pruning ICLR 2019
- The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks [Notes] ICLR 2019
- NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection [Notes] CVPR 2019
- AutoAugment: Learning Augmentation Policies from Data [Notes] CVPR 2019
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices ECCV 2018 (Song Han, Yihui He)
- Long-Term Feature Banks for Detailed Video Understanding [Notes] Video
- Non-local Neural Networks [Notes] Video CVPR 2018
- Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset (I3D) [Notes] Video CVPR 2017
- Initialization Strategies of Spatio-Temporal Convolutional Neural Networks [Notes] Video
- Detect-and-Track: Efficient Pose Estimation in Videos [Notes] ICCV 2017 Video
- SlowFast Networks for Video Recognition [Notes] Video