wokaikaixinxin/GSDet

GSDet (IJCAI 2025)

GSDet: Gaussian Splatting for Oriented Object Detection

The baseline of GSDet is available at ai4rs.

Abstract

Oriented object detection has advanced with the development of convolutional neural networks (CNNs) and transformers. However, modern detectors still rely on predefined object candidates, such as anchors in CNN-based methods or queries in transformer-based methods, which struggle to capture spatial information effectively. To address these limitations, we propose GSDet, a novel framework that formulates oriented object detection as Gaussian splatting. Specifically, our approach performs detection within a 3D feature space constructed from image features, where 3D Gaussians are employed to represent oriented objects. These 3D Gaussians are projected onto the image plane to form 2D Gaussians, which are then transformed into oriented boxes. Furthermore, we optimize the mean, anisotropic covariance, and confidence scores of these randomly initialized 3D Gaussians, using a decoder that incorporates 3D Gaussian sampling. Moreover, our method exhibits flexibility, enabling adaptive control and a dynamic number of Gaussians during inference. Experiments on three datasets indicate that, when evaluated with adaptive control, GSDet achieves AP50 gains of 0.7% on DIOR-R, 0.3% on DOTA-v1.0, and 0.55% on DOTA-v1.5 and outperforms mainstream detectors.
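To make the projection step in the abstract concrete, here is a minimal sketch of how a 3D Gaussian maps to a 2D Gaussian under a linear projection. It relies only on the standard property that a linear map `P` sends `N(mean, cov)` to `N(P mean, P cov P^T)`. The orthographic matrix `ORTHO`, the helper names, and the sample numbers are assumptions for illustration; the projection actually used in GSDet may differ.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*m)]


def project_gaussian(mean3d, cov3d, proj):
    """Map a 3D Gaussian through a linear 2x3 projection.

    For x ~ N(mean, cov), the image y = P x is N(P mean, P cov P^T).
    """
    mean2d = [sum(p * m for p, m in zip(row, mean3d)) for row in proj]
    cov2d = matmul(matmul(proj, cov3d), transpose(proj))
    return mean2d, cov2d


# Hypothetical orthographic projection: keep x and y, drop the third axis.
ORTHO = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]

# Illustrative 3D Gaussian (made-up numbers, not taken from the paper).
mean2d, cov2d = project_gaussian(
    [10.0, 20.0, 5.0],
    [[4.0, 1.0, 0.5],
     [1.0, 9.0, 0.2],
     [0.5, 0.2, 2.0]],
    ORTHO,
)
```

With this particular choice of `ORTHO`, the projection amounts to marginalizing out the third axis: the 2D mean is the first two components of the 3D mean, and the 2D covariance is the upper-left 2x2 block of the 3D covariance.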

Insight

  • The model can learn targets from random inputs. If random inputs work, other types of inputs are likely feasible as well.
  • Oriented boxes can be represented as probability distributions, such as Gaussian distributions. Everything can be characterized by different probability distributions.
  • Diffusion models, Gaussian splatting, etc., all involve probability distributions and randomness.
  • The architecture is decoder-only: a stack of multiple decoder layers. The code uses the two-stage detector class from OpenMMLab as the parent class, though a one-stage or Transformer-based detector class would also work as the parent.
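The idea that an oriented box can be represented as a Gaussian distribution can be sketched as follows. This uses a convention common in Gaussian-based rotated detection work: the mean is the box center and the covariance is `R diag((w/2)^2, (h/2)^2) R^T`, where `R` is the rotation by the box angle. The function names and the scaling convention are assumptions for illustration and are not taken from the GSDet code.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta):
    """Oriented box (center, size, angle in radians) -> 2D mean and covariance."""
    c, s = math.cos(theta), math.sin(theta)
    a, b = (w / 2.0) ** 2, (h / 2.0) ** 2  # variances along the box axes
    cov = [[a * c * c + b * s * s, (a - b) * c * s],
           [(a - b) * c * s, a * s * s + b * c * c]]
    return (cx, cy), cov


def gaussian_to_box(mean, cov):
    """2D mean and covariance -> oriented box, with w as the longer side."""
    sxx, sxy, syy = cov[0][0], cov[0][1], cov[1][1]
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    lam_major = half_trace + spread              # larger eigenvalue
    lam_minor = max(half_trace - spread, 0.0)    # smaller eigenvalue
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # angle of the major axis
    return (mean[0], mean[1],
            2.0 * math.sqrt(lam_major), 2.0 * math.sqrt(lam_minor), theta)


# Round trip on an illustrative box: 20 x 10, rotated by 30 degrees.
mean, cov = box_to_gaussian(50.0, 40.0, 20.0, 10.0, math.pi / 6)
recovered = gaussian_to_box(mean, cov)
```

Two limitations of this representation are worth keeping in mind. A square box yields an isotropic covariance, so its angle cannot be recovered, and the angle is only defined modulo π because a Gaussian is symmetric about its center.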

Citation

@inproceedings{ijcai2025p101,
  title     = {GSDet: Gaussian Splatting for Oriented Object Detection},
  author    = {Ding, Zeyu and Zhao, Jiaqi and Zhou, Yong and Du, Wen-liang and Zhu, Hancheng and Yao, Rui},
  booktitle = {Proceedings of the Thirty-Fourth International Joint Conference on
               Artificial Intelligence, {IJCAI-25}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {James Kwok},
  pages     = {900--908},
  year      = {2025},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2025/101},
  url       = {https://doi.org/10.24963/ijcai.2025/101},
}
