SegTrackDetect

A framework for ROI-based Tiny Object Detection at full resolution.

(architecture diagram)

SegTrackDetect is a modular framework designed for accurate small object detection using a combination of segmentation and tracking techniques. It performs detection within selected Regions of Interest (ROIs), providing a highly efficient solution for scenarios where detecting tiny objects with precision is critical. The framework's modularity empowers users to easily customize key components, including the ROI Estimation Module, the ROI Prediction Module, and the Object Detector. It also features our Overlapping Box Suppression algorithm, which efficiently combines objects detected in multiple sub-windows and filters them to overcome the limitations of window-based detection methods. See the sections below for more details on the framework, its components, and customization options.
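To give an intuition for Overlapping Box Suppression, here is a minimal sketch of the idea: detections gathered from overlapping sub-windows are filtered so that a box heavily overlapped by a higher-confidence box is dropped. The function names and the exact suppression criterion are illustrative assumptions, not the framework's actual implementation.

```python
# Illustrative sketch of Overlapping Box Suppression (OBS). The greedy
# criterion below is an assumption for illustration only.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def overlapping_box_suppression(boxes, scores, iou_th=0.7):
    """Greedily keep the highest-scoring box of each overlapping group;
    returns the kept indices in ascending order."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not heavily overlap an already-kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_th for j in keep):
            keep.append(i)
    return sorted(keep)
```

Lowering the IoU threshold makes the filtering more aggressive, which is why window-based video detection (e.g. the SeaDronesSee example below, run with `--obs_iou_th 0.1`) can use a much stricter setting than single-image detection.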

To get started with the framework right away, head to the Getting Started section.

Getting Started

Dependencies

We provide a Dockerfile that handles all the dependencies for you. Simply install the Docker Engine and, if you plan to run detection on a GPU, the NVIDIA Container Toolkit.

To download all the trained models described in Model ZOO and build a Docker image, simply run:

```shell
./build_and_run.sh
```

We currently support four datasets, and we provide scripts that download them and convert them into the supported format. To download and convert all of them, run:

```shell
./download_and_convert.sh
```

You can also download selected datasets by running the corresponding scripts in the scripts directory.

Examples

The SegTrackDetect framework supports tiny object detection on consecutive frames (video detection), as well as detection on independent windows.

To run detection on video data using one of the supported datasets, e.g. SeaDronesSee:

```shell
python inference_vid.py \
  --roi_model 'SDS_large' --det_model 'SDS' --tracker 'sort' \
  --ds 'SeaDronesSee' --split 'val' \
  --bbox_type 'sorted' --allow_resize --obs_iou_th 0.1 \
  --out_dir 'results/SDS/val' --debug
```

To run the detection on independent windows, e.g. MTSD, use:

```shell
python inference_img.py \
  --roi_model 'MTSD' --det_model 'MTSD' \
  --ds 'MTSD' --split 'val' \
  --bbox_type 'sorted' --allow_resize --obs_iou_th 0.7 \
  --out_dir 'results/MTSD/val' --debug
```

| Argument | Type | Description |
|---|---|---|
| `--roi_model` | `str` | Specifies the ROI model to use (e.g., `SDS_large`). All available ROI models are defined here. |
| `--det_model` | `str` | Specifies the detection model to use (e.g., `SDS`). All available detectors are defined here. |
| `--tracker` | `str` | Specifies the tracker to use (e.g., `sort`). All available trackers are defined here. |
| `--ds` | `str` | Dataset to use for inference (e.g., `SeaDronesSee`). Available datasets. |
| `--split` | `str` | Data split to use (e.g., `val` for validation). If present, the script saves the detections using the COCO image ids from `val.json`. |
| `--flist` | `str` | An alternative way of providing an image list: a path to a file with absolute paths to images. |
| `--name` | `str` | A name for the provided `flist`; COCO annotations `name.json` will be generated and saved in the dataset root directory. |
| `--bbox_type` | `str` | Type of the detection window filtering algorithm (`all` - no filtering, `naive`, `sorted`). |
| `--allow_resize` | flag | Enables resizing of cropped detection windows. Otherwise, a sliding window within large ROIs is used. |
| `--obs_iou_th` | `float` | Sets the IoU threshold for Overlapping Box Suppression (default 0.7). |
| `--cpu` | flag | Use the CPU for computations; if not set, CUDA is used. |
| `--out_dir` | `str` | Directory to save output results (e.g., `results/SDS/val`). |
| `--debug` | flag | Enables saving visualisations in `out_dir`. |
| `--vis_conf_th` | `float` | Confidence threshold for detections in visualisations (default 0.3). |
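When `--allow_resize` is not set, large ROIs are covered with a sliding detection window. The following is a minimal sketch of how such window offsets can be computed; the function names and the edge-clamping strategy are illustrative assumptions, not necessarily how the framework tiles its ROIs.

```python
# Hypothetical sketch of tiling a large ROI with fixed-size detection windows.

def axis_offsets(length, win, overlap=0):
    """Window offsets along one axis; the last window is shifted inward so it
    stays inside the ROI instead of spilling past the edge."""
    step = max(win - overlap, 1)
    offs = list(range(0, max(length - win, 0) + 1, step))
    if length > win and offs[-1] != length - win:
        offs.append(length - win)
    return offs

def sliding_windows(roi_w, roi_h, win_w, win_h, overlap=0):
    """Top-left (x, y) offsets of windows covering a roi_w x roi_h region."""
    return [(x, y)
            for y in axis_offsets(roi_h, win_h, overlap)
            for x in axis_offsets(roi_w, win_w, overlap)]
```

Because windows at the right and bottom edges are clamped rather than padded, adjacent windows can overlap, which is exactly the situation the Overlapping Box Suppression step resolves.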

All available models can be found in Model ZOO. Currently, we provide trained models for 4 detection tasks.

Customization

Existing Models

New Models

New Datasets

Metrics

We convert all datasets to the COCO format, and we provide a script for metrics computation.
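As an illustration of COCO-style evaluation, here is a simplified sketch of matching detections to ground truth at a single IoU threshold. This is an assumption-laden toy version for intuition only; the actual metrics script presumably relies on standard COCO tooling, and the function names below are hypothetical.

```python
# Toy COCO-style greedy matching at one IoU threshold (illustrative only).

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def match_detections(gts, dets, iou_th=0.5):
    """Greedily match detections (box, score), highest score first, to
    unmatched ground-truth boxes; returns (tp, fp, fn)."""
    tp, matched = 0, set()
    for box, _ in sorted(dets, key=lambda d: d[1], reverse=True):
        best, best_iou = None, iou_th
        for gi, g in enumerate(gts):
            v = iou(box, g)
            if gi not in matched and v >= best_iou:
                best, best_iou = gi, v
        if best is not None:
            matched.add(best)
            tp += 1
    return tp, len(dets) - tp, len(gts) - tp
```

Real COCO metrics average precision over many IoU thresholds and object sizes, which matters particularly for tiny objects, where small localisation errors change the IoU substantially.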

Architecture

ROI Fusion Module

ROI Prediction with Object Trackers

ROI Estimation with Segmentation

Object Detection

Detection Aggregation and Filtering

Model ZOO

All models we use are in TorchScript format.

Region of Interest Estimation

| Model | Objects of Interest | Dataset | Model name | Input size | Weights |
|---|---|---|---|---|---|
| u2netp | traffic signs | MTSD | MTSD | 576x576 | here |
| unet | fish | ZebraFish | ZeF20 | 160x256 | here |
| unet | people | DroneCrowd | DC_tiny | 96x160 | here |
| unet | people | DroneCrowd | DC_small | 192x320 | here |
| unet | people | DroneCrowd | DC_medium | 384x640 | here |
| unet | people, boats | SeaDronesSee | SDS_tiny | 64x96 | here |
| unet | people, boats | SeaDronesSee | SDS_small | 128x192 | here |
| unet | people, boats | SeaDronesSee | SDS_medium | 224x384 | here |
| unet | people, boats | SeaDronesSee | SDS_large | 448x768 | here |

Object Detectors

| Model | Objects of Interest | Dataset | Model name | Input size | Weights |
|---|---|---|---|---|---|
| yolov4 | traffic signs | MTSD | MTSD | 960x960 | here |
| yolov7 tiny | fish | ZebraFish | ZeF20 | 160x256 | here |
| yolov7 tiny | people | DroneCrowd | DC | 320x512 | here |
| yolov7 tiny | people, boats | SeaDronesSee | SDS | 320x512 | here |

Datasets

Mapillary Traffic Sign Dataset

ZebraFish

DroneCrowd

SeaDronesSee

Licence

Acknowledgements
