Object Detector Analysis

This repository contains code, data, and analysis for a comparative study of modern object detection algorithms: Faster R-CNN, YOLO-World, and GroundingDINO. The project evaluates models on their bounding box accuracy and prompt-based detection capabilities, using metrics like IoU, Precision, Recall, and F1 Score.

📄 View full project report (PDF)

🔍 Objective

To analyze the effectiveness of prompt-based and traditional object detectors across:

Standard object detection (e.g., apples)
Prompt-based open-vocabulary detection
Robustness under diverse, real-world conditions

Models Analyzed

1. Faster R-CNN (ResNet-50)

Closed-vocabulary detector pretrained on COCO
Reliable for fixed-label detection
Not suitable for prompt-based or novel object recognition

2. YOLO-World

Single-stage detector with CLIP-based text embedding
Supports open-vocabulary detection using prompts
Highly sensitive to threshold tuning and prompt wording
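
The threshold sensitivity can be illustrated with a small sweep over confidence cut-offs. This is a minimal sketch: the detection scores are invented for illustration rather than taken from the notebooks, and `filter_by_confidence` and `sweep` are hypothetical helpers, not part of any YOLO-World API.

```python
def filter_by_confidence(detections, threshold):
    """Keep detections whose confidence score is at least `threshold`."""
    return [d for d in detections if d["score"] >= threshold]


def sweep(detections, thresholds):
    """Count how many detections survive each candidate threshold."""
    return {t: len(filter_by_confidence(detections, t)) for t in thresholds}


# Hypothetical scores for one image (assumption: open-vocabulary detectors
# often assign low confidences to correct boxes).
detections = [
    {"label": "apple", "score": 0.62},
    {"label": "apple", "score": 0.21},
    {"label": "apple", "score": 0.09},
    {"label": "apple", "score": 0.06},
]

counts = sweep(detections, [0.05, 0.08, 0.25, 0.50])
for threshold, kept in counts.items():
    print(f"threshold={threshold:.2f} -> {kept} detections kept")
```

With scores clustered near the low end, a small change in the cut-off changes how many boxes survive, which is one plausible reason a low box threshold can matter for this model.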

3. GroundingDINO

Transformer-based architecture using a BERT text encoder for text grounding
Achieved highest IoU and F1 scores
Robust to prompt rewording and image complexity

 
   ObjectDetectorAnalysis/ 
   ├── apl/ # Apple image dataset for evaluation 
   ├── data/ # Dataset for robustness testing 
   ├── ResultIMGS/ # Model prediction output images 
   ├── FinalModelComp.ipynb # Comparison of YOLO-World, RCNN, GroundingDINO 
   ├── WeirdImgData.ipynb # Robustness testing on challenging images 
   ├── README.md # Project overview and usage instructions 
   └── .DS_Store # (System file — safe to delete)

Datasets

Apple Detection Dataset

Used to benchmark box prediction accuracy across models.

Robustness Evaluation Dataset

Curated to test:

Complex scenes
Small and overlapping objects
Occlusions
Unusual prompts
Blurry / noisy images
Symbol and scene detection

Metrics

Intersection over Union (IoU)
Precision / Recall
F1 Score
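
These metrics can be sketched in a few lines of Python. The sketch assumes boxes in `(x1, y1, x2, y2)` format, greedy one-to-one matching between predictions and ground truth, and a match threshold of IoU >= 0.5. None of these choices is confirmed by the notebooks, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate_image(pred_boxes, gt_boxes, iou_thresh=0.5):
    """Return (precision, recall, f1, mean IoU of matched boxes) for one image."""
    unmatched_gt = list(range(len(gt_boxes)))
    matched_ious = []
    for pred in pred_boxes:
        # Greedily pick the best still-unmatched ground-truth box.
        best_idx, best_iou = None, 0.0
        for idx in unmatched_gt:
            score = iou(pred, gt_boxes[idx])
            if score > best_iou:
                best_idx, best_iou = idx, score
        if best_idx is not None and best_iou >= iou_thresh:
            matched_ious.append(best_iou)
            unmatched_gt.remove(best_idx)

    tp = len(matched_ious)
    fp = len(pred_boxes) - tp
    fn = len(gt_boxes) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    mean_iou = sum(matched_ious) / tp if tp else 0.0
    return precision, recall, f1, mean_iou
```

For example, `evaluate_image([(0, 0, 10, 10), (50, 50, 60, 60)], [(0, 0, 10, 10)])` counts one true positive and one false positive, so recall is 1.0 while precision drops to 0.5.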

Key Findings:

GroundingDINO performed best in almost all categories.

YOLO-World requires precise threshold tuning and prompt phrasing.

Faster R-CNN lacks prompt support and is useful only for known COCO classes.

| Model | Box Thresh | Text Thresh | Precision | Recall | F1 Score | Mean IoU |
|---|---|---|---|---|---|---|
| GroundingDINO | 0.35 | 0.2 | 0.690 | 0.825 | 0.712 | 0.830 |
| YOLO-World | 0.08 | - | 0.673 | 0.910 | 0.732 | 0.897 |
| Faster R-CNN | - | - | ✓ (COCO only) | ✗ (no prompts) | ✗ | ✗ |
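
When reading the table, note that the F1 column is lower than the harmonic mean of the listed precision and recall (for GroundingDINO, 2 · 0.690 · 0.825 / (0.690 + 0.825) ≈ 0.751 against the reported 0.712). That pattern would be consistent with F1 being computed per image and then averaged, though this is an assumption and not something the README states. The sketch below uses made-up per-image values to show how the two calculations can diverge.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean(values):
    return sum(values) / len(values)


# Hypothetical per-image (precision, recall) pairs, for illustration only.
per_image = [(1.0, 0.5), (0.5, 1.0), (0.6, 0.9)]

mean_precision = mean([p for p, _ in per_image])
mean_recall = mean([r for _, r in per_image])

# Option A: F1 computed once from the averaged precision and recall.
f1_of_means = f1_score(mean_precision, mean_recall)
# Option B: F1 computed per image, then averaged.
mean_of_f1 = mean([f1_score(p, r) for p, r in per_image])

print(f"F1 of mean P/R:        {f1_of_means:.3f}")
print(f"Mean of per-image F1:  {mean_of_f1:.3f}")
```

In this example the per-image average comes out lower than the F1 of the averaged precision and recall, the same direction as the gap in the table.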
