WiderPerson: A Diverse Dataset for Dense Pedestrian Detection in the Wild

July 2020

tl;dr: A dataset for crowded/dense human detection with 8k training images.

Overall impression

Not very impressive overall. It fails to cite CrowdHuman, a closely related dataset, and its ablation study of the issue is also less extensive.

Key ideas

30 persons per image.
Annotate the top of the head and the middle of the feet (similar to CityPersons). The bbox is then generated automatically with an aspect ratio of 0.41. This is
Difficulty: > 100 pixels (easy), > 50 pixels (medium), > 20 pixels (hard). Similar to WiderFace.
NMS is a problem in crowded scenes, but it is not handled in this paper. Maybe try Visibility Guided NMS.
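
A minimal sketch of the two-point annotation and the difficulty split described above. It is not the authors' code. Assumptions: 0.41 is width/height (as in CityPersons), the box is centred horizontally on the midpoint of the two clicked points, and the pixel thresholds apply to box height. The function names are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]               # (x, y) in pixels
Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)

ASPECT_RATIO = 0.41  # assumed to mean width / height


def bbox_from_keypoints(head_top: Point, feet_mid: Point,
                        aspect_ratio: float = ASPECT_RATIO) -> Box:
    """Build a full-body box from the head-top and feet-middle points."""
    (hx, hy), (fx, fy) = head_top, feet_mid
    height = abs(fy - hy)
    width = aspect_ratio * height
    # Assumption: centre the box on the midpoint of the two annotated points.
    cx = (hx + fx) / 2.0
    y_min = min(hy, fy)
    return (cx - width / 2.0, y_min, cx + width / 2.0, y_min + height)


def difficulty(height_px: float) -> Optional[str]:
    """Return the easiest level whose threshold the box height exceeds."""
    if height_px > 100:
        return "easy"
    if height_px > 50:
        return "medium"
    if height_px > 20:
        return "hard"
    return None  # 20 pixels or smaller: outside all three levels


if __name__ == "__main__":
    box = bbox_from_keypoints((100.0, 50.0), (100.0, 250.0))
    print(box, difficulty(box[3] - box[1]))
```

A person 200 pixels tall gets a box about 82 pixels wide. If the three levels are cumulative (every "easy" box also counts for "medium" and "hard"), `difficulty` returns only the strictest level that applies.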

Technical details

Use pHash to avoid duplication of images.
Annotation tool with examples in the GUI.
Evaluation metric: MR (miss rate)
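
A rough sketch of how pHash-style de-duplication can work. It is not the paper's pipeline. Assumptions: each image is already resized to a small square grayscale grid (32x32 here), the hash keeps the 8x8 lowest-frequency DCT coefficients thresholded at the median of the non-DC terms, and two images are duplicates when the Hamming distance is at most 5. Real pHash implementations differ in these details, and all names below are hypothetical.

```python
import math
from typing import List, Sequence

HASH_SIDE = 8  # 8x8 low-frequency block -> 64-bit hash

Image = Sequence[Sequence[float]]  # square grayscale grid, e.g. 32x32


def dct_low_freq(pixels: Image) -> List[List[float]]:
    """Top-left HASH_SIDE x HASH_SIDE block of an unnormalised 2D DCT-II."""
    n = len(pixels)
    if n < HASH_SIDE or any(len(row) != n for row in pixels):
        raise ValueError("expected a square image at least 8 pixels wide")
    cos = [[math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
           for k in range(HASH_SIDE)]
    # The 2D DCT is separable: transform along x first, then along y.
    rows = [[sum(pixels[y][x] * cos[u][x] for x in range(n))
             for u in range(HASH_SIDE)] for y in range(n)]
    return [[sum(rows[y][u] * cos[v][y] for y in range(n))
             for u in range(HASH_SIDE)] for v in range(HASH_SIDE)]


def phash(pixels: Image) -> int:
    """64-bit hash: one bit per low-frequency coefficient above the median."""
    coeffs = [c for row in dct_low_freq(pixels) for c in row]
    ac = sorted(coeffs[1:])          # skip the DC term when picking the median
    median = ac[len(ac) // 2]
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | (1 if c > median else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def deduplicate(images: Sequence[Image], max_distance: int = 5) -> List[int]:
    """Indices of images kept after dropping near-duplicates of earlier ones."""
    kept_indices: List[int] = []
    kept_hashes: List[int] = []
    for i, img in enumerate(images):
        h = phash(img)
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept_indices.append(i)
            kept_hashes.append(h)
    return kept_indices


if __name__ == "__main__":
    img = [[(x * 7 + y * 13 + x * y) % 256 for x in range(32)] for y in range(32)]
    brighter = [[p + 3 for p in row] for row in img]  # near-duplicate
    print(deduplicate([img, brighter]))
```

The pairwise comparison is quadratic in the number of kept images, which is fine for a sketch but would need bucketing or an index for a web-scale crawl.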

Notes

Tsinghua-Daimler datasets for cyclists
- Bounding Box based labels are provided for the classes: ("pedestrian", "cyclist", "motorcyclist", "tricyclist", "wheelchairuser", "mopedrider").
The EuroCity Persons Dataset: A Novel Benchmark for Object Detection, T-PAMI 2019