For details see MetaFormer is Actually What You Need for Vision (CVPR 2022 Oral).
Please note that we simply follow the hyper-parameters of PVT, which may not be optimal for PoolFormer. Feel free to tune the hyper-parameters to get better performance.
Install MMDetection v2.19.0 from source code,
or
pip install mmdet==2.19.0 --user
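For the from-source route, the usual MMDetection procedure looks like the following sketch (the repository URL and tag are the standard open-mmlab ones; adjust paths to your setup):

```shell
# Clone MMDetection, check out the v2.19.0 tag, and install in editable mode
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
git checkout v2.19.0
pip install -e . --user
```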
Apex (optional):
git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install --cpp_ext --cuda_ext --user
If you would like to disable Apex, change the runner type to EpochBasedRunner
and comment out the following code block in the configuration files:
fp16 = None
optimizer_config = dict(
    type="DistOptimizerHook",
    update_interval=1,
    grad_clip=None,
    coalesce=True,
    bucket_size_mb=-1,
    use_fp16=True,
)
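With Apex disabled, the corresponding part of the config might instead look like this sketch (it assumes MMCV's built-in EpochBasedRunner and plain OptimizerHook; `max_epochs=12` matches the 1x schedule):

```python
# Sketch of the config with Apex/fp16 disabled (assumes MMCV's standard hooks)
runner = dict(type="EpochBasedRunner", max_epochs=12)
optimizer_config = dict(grad_clip=None)  # plain OptimizerHook, no fp16
```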
Note: the PoolFormer backbone code for detection and segmentation is kept in a single file, so both MMDetection v2.19.0 and MMSegmentation v0.19.0 are required. Please install MMSegmentation as well, or modify the backbone code.
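If you go the install route, a pip install of the matching MMSegmentation version should suffice (a sketch; the version is the one named above):

```shell
pip install mmsegmentation==0.19.0 --user
```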
Dockerfile_mmdetseg is the Dockerfile we use to set up the environment for detection and segmentation; you can also refer to it.
Prepare COCO according to the guidelines in MMDetection v2.19.0.
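By default, MMDetection expects the dataset under `data/coco`; the layout typically looks like the sketch below (directory names follow MMDetection's conventions):

```
data/
└── coco/
    ├── annotations/   # instances_train2017.json, instances_val2017.json
    ├── train2017/
    └── val2017/
```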
Method | Backbone | Pretrain | Lr schd | Aug | box AP | mask AP | Config | Download |
---|---|---|---|---|---|---|---|---|
RetinaNet | PoolFormer-S12 | ImageNet-1K | 1x | No | 36.2 | - | config | log & model |
RetinaNet | PoolFormer-S24 | ImageNet-1K | 1x | No | 38.9 | - | config | log & model |
RetinaNet | PoolFormer-S36 | ImageNet-1K | 1x | No | 39.5 | - | config | log & model |
Mask R-CNN | PoolFormer-S12 | ImageNet-1K | 1x | No | 37.3 | 34.6 | config | log & model |
Mask R-CNN | PoolFormer-S24 | ImageNet-1K | 1x | No | 40.1 | 37.0 | config | log & model |
Mask R-CNN | PoolFormer-S36 | ImageNet-1K | 1x | No | 41.0 | 37.7 | config | log & model |
All the models can also be downloaded by BaiDu Yun (password: esac).
To evaluate PoolFormer-S12 + RetinaNet on COCO val2017 on a single node with 8 GPUs, run:
FORK_LAST3=1 dist_test.sh configs/retinanet_poolformer_s12_fpn_1x_coco.py /path/to/checkpoint_file 8 --out results.pkl --eval bbox
To evaluate PoolFormer-S12 + Mask R-CNN on COCO val2017, run:
dist_test.sh configs/mask_rcnn_poolformer_s12_fpn_1x_coco.py /path/to/checkpoint_file 8 --out results.pkl --eval bbox segm
To train PoolFormer-S12 + RetinaNet on COCO train2017 on a single node with 8 GPUs for 12 epochs, run:
FORK_LAST3=1 dist_train.sh configs/retinanet_poolformer_s12_fpn_1x_coco.py 8
To train PoolFormer-S12 + Mask R-CNN on COCO train2017:
dist_train.sh configs/mask_rcnn_poolformer_s12_fpn_1x_coco.py 8
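If fewer GPUs are available, MMDetection's single-GPU entry point can be used instead of the distributed launcher (a sketch; `tools/train.py` is MMDetection's standard training script, and learning rate may need rescaling for a smaller batch):

```shell
python tools/train.py configs/mask_rcnn_poolformer_s12_fpn_1x_coco.py
```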
@article{yu2021metaformer,
title={MetaFormer is Actually What You Need for Vision},
author={Yu, Weihao and Luo, Mi and Zhou, Pan and Si, Chenyang and Zhou, Yichen and Wang, Xinchao and Feng, Jiashi and Yan, Shuicheng},
journal={arXiv preprint arXiv:2111.11418},
year={2021}
}
Our implementation is mainly based on the following codebases. We gratefully thank the authors for their wonderful works.