We implement FreeAnchor for 3D detection and provide its first results with PointPillars on the nuScenes dataset.
With the implemented `FreeAnchor3DHead`, a PointPillars detector with a large backbone (e.g., RegNetX-3.2GF) achieves the strongest results reported here on the nuScenes benchmark.

    @inproceedings{zhang2019freeanchor,
      title     = {{FreeAnchor}: Learning to Match Anchors for Visual Object Detection},
      author    = {Zhang, Xiaosong and Wan, Fang and Liu, Chang and Ji, Rongrong and Ye, Qixiang},
      booktitle = {Neural Information Processing Systems},
      year      = {2019}
    }

As in the baseline config, we only need to replace the head of an existing one-stage detector to use the FreeAnchor head.
Because the config inherits from a common detector head, `_delete_=True` is necessary to avoid conflicts between the inherited keys and the new ones.
The hyperparameters are tuned according to the original paper.

    _base_ = [
        '../_base_/models/hv_pointpillars_fpn_nus.py',
        '../_base_/datasets/nus-3d.py', '../_base_/schedules/schedule_2x.py',
        '../_base_/default_runtime.py'
    ]
    model = dict(
        pts_bbox_head=dict(
            _delete_=True,
            type='FreeAnchor3DHead',
            num_classes=10,
            in_channels=256,
            feat_channels=256,
            use_direction_classifier=True,
            pre_anchor_topk=25,
            bbox_thr=0.5,
            gamma=2.0,
            alpha=0.5,
            anchor_generator=dict(
                type='AlignedAnchor3DRangeGenerator',
                ranges=[[-50, -50, -1.8, 50, 50, -1.8]],
                scales=[1, 2, 4],
                sizes=[
                    [0.8660, 2.5981, 1.],  # 1.5/sqrt(3)
                    [0.5774, 1.7321, 1.],  # 1/sqrt(3)
                    [1., 1., 1.],
                    [0.4, 0.4, 1],
                ],
                custom_values=[0, 0],
                rotations=[0, 1.57],
                reshape_out=True),
            assigner_per_size=False,
            diff_rad_by_sin=True,
            dir_offset=0.7854,  # pi/4
            dir_limit_offset=0,
            bbox_coder=dict(type='DeltaXYZWLHRBBoxCoder', code_size=9),
            loss_cls=dict(
                type='FocalLoss',
                use_sigmoid=True,
                gamma=2.0,
                alpha=0.25,
                loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=0.8),
            loss_dir=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.2)),
        # model training and testing settings
        train_cfg=dict(
            pts=dict(code_weight=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.25, 0.25])))

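
To illustrate why `_delete_=True` matters, the following is a minimal sketch of inheritance-style config merging. It is a simplified stand-in written for this example, not the config system's actual implementation: `merge_config` is a hypothetical helper, and the base head with its `legacy_option` key is invented for illustration.

```python
import copy

DELETE_KEY = "_delete_"


def merge_config(child, base):
    """Recursively merge `child` into a deep copy of `base`.

    Simplified illustration only: a dict in `child` that carries
    `_delete_=True` replaces the inherited dict instead of being merged.
    """
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)  # shallow copy so the caller's dict is untouched
            delete = value.pop(DELETE_KEY, False)
            if not delete and isinstance(merged.get(key), dict):
                # Default behaviour: inherited keys survive, child keys override.
                merged[key] = merge_config(value, merged[key])
            else:
                # `_delete_=True` (or nothing to inherit): start from scratch.
                merged[key] = merge_config(value, {})
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base config; `legacy_option` stands for any inherited key
# that the replacement head should not receive.
base = dict(
    pts_bbox_head=dict(type="Anchor3DHead", num_classes=10, legacy_option=True))

child = dict(
    pts_bbox_head=dict(
        _delete_=True, type="FreeAnchor3DHead", num_classes=10,
        pre_anchor_topk=25))
child_no_delete = dict(
    pts_bbox_head=dict(
        type="FreeAnchor3DHead", num_classes=10, pre_anchor_topk=25))

with_delete = merge_config(child, base)
without_delete = merge_config(child_no_delete, base)

print(with_delete["pts_bbox_head"])
print(without_delete["pts_bbox_head"])
```

In this sketch, the merge without `_delete_` still carries `legacy_option` into the new head's arguments, while the merge with `_delete_=True` contains only the keys written in the child config.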
| Backbone | FreeAnchor | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
|---|---|---|---|---|---|---|---|
| FPN | ✗ | 2x | 17.1 | | 40.0 | 53.3 | model \| log |
| FPN | ✓ | 2x | 16.2 | | 43.7 | 55.3 | model \| log |
| RegNetX-400MF-FPN | ✗ | 2x | 17.3 | | 44.8 | 56.4 | model \| log |
| RegNetX-400MF-FPN | ✓ | 2x | 17.7 | | 47.9 | 58.6 | model \| log |
| RegNetX-1.6GF-FPN | ✓ | 2x | 24.3 | | 51.2 | 60.8 | model \| log |
| RegNetX-1.6GF-FPN* | ✓ | 3x | 24.3 | | 53.0 | 62.2 | model \| log |
| RegNetX-3.2GF-FPN | ✓ | 2x | 29.5 | | 52.2 | 62.0 | model \| log |
| RegNetX-3.2GF-FPN* | ✓ | 3x | 29.5 | | 55.09 | 63.5 | model \| log |

Note: Models marked with * are trained with stronger augmentation: vertical flip under bird's-eye view, global translation, and a larger range of global rotation.
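
As a small worked example, the gain from FreeAnchor can be read off the two backbones that have both a baseline row and a FreeAnchor row under the 2x schedule. The sketch below assumes the last two numeric values in each row are mAP and NDS; the `results` dictionary and the `gains` helper are names introduced here for illustration.

```python
# (mAP, NDS) pairs copied from the table: 2x schedule, without / with FreeAnchor.
results = {
    "FPN": {"baseline": (40.0, 53.3), "freeanchor": (43.7, 55.3)},
    "RegNetX-400MF-FPN": {"baseline": (44.8, 56.4), "freeanchor": (47.9, 58.6)},
}


def gains(entry):
    """Return the (mAP, NDS) improvement of FreeAnchor over the baseline."""
    (map_base, nds_base), (map_fa, nds_fa) = entry["baseline"], entry["freeanchor"]
    # Round to one decimal to hide floating-point noise in the subtraction.
    return round(map_fa - map_base, 1), round(nds_fa - nds_base, 1)


for backbone, entry in results.items():
    d_map, d_nds = gains(entry)
    print(f"{backbone}: mAP +{d_map}, NDS +{d_nds}")
```

Under that reading, FreeAnchor adds 3.7 mAP and 2.0 NDS with the FPN backbone, and 3.1 mAP and 2.2 NDS with RegNetX-400MF-FPN.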