CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation
Zhuoyan Luo*, Yinghao Wu*, Tianheng Cheng, Yong Liu, Yicheng Xiao, Hongfa Wang, Xiao-Ping Zhang, Yujiu Yang
Tsinghua University and Tencent
- [2025/08/17] 🔥🔥🔥 The training code and checkpoints are released.
- [2025/06/28] 🔥🔥🔥 CoHD is accepted by ICCV 2025.
The newly proposed Generalized Referring Expression Segmentation (GRES) extends the formulation of classic RES to cover complex multiple-target and non-target scenarios. Recent approaches address GRES by directly extending well-adopted RES frameworks with object-existence identification. However, these approaches tend to encode multi-granularity object information into a single representation, which makes it difficult to precisely represent comprehensive objects of different granularity. Moreover, simple binary object-existence identification across all referent scenarios fails to capture their inherent differences, incurring ambiguity in object understanding. To tackle these issues, we propose a Counting-Aware Hierarchical Decoding framework (CoHD) for GRES. By decoupling the intricate referring semantics into different granularities with a visual-linguistic hierarchy, and dynamically aggregating them with intra- and inter-selection, CoHD boosts multi-granularity comprehension with the reciprocal benefit of the hierarchical nature. Furthermore, we incorporate counting ability by embodying multiple/single/non-target scenarios into count- and category-level supervision, facilitating comprehensive object perception. Experimental results on the gRefCOCO, Ref-ZOM, R-RefCOCO, and RefCOCO benchmarks demonstrate the effectiveness and rationality of CoHD, which outperforms state-of-the-art GRES methods by a remarkable margin.
Env: The code is trained using CUDA 11.3 with
- torch 1.12.1
- torchvision 0.13.1
- Python 3.8.8

(other versions may also work)
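The pins above can be collected into a requirements-style fragment (a sketch of the tested environment only; the repo's own `requirements.txt` is authoritative for the full dependency list):

```
# Tested environment from this README (other versions may also work)
# Python 3.8.8, CUDA 11.3
torch==1.12.1
torchvision==0.13.1
```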
Dependencies:
- Install Detectron2
- Run `sh make.sh` under `gres_model/modeling/pixel_decoder/ops`
- Install other required packages:
pip install -r requirements.txt
Dataset:
Please download the annotations from Dataset
```
dataset
├── grefcoco
│   ├── grefs(unc).json
│   ├── instances.json
│   ├── cateid2coco.json
│   └── cocoidtosuper.json
└── images
    └── train2014
        ├── COCO_train2014_xxxxxxxxxxxx.jpg
        ├── COCO_train2014_xxxxxxxxxxxx.jpg
        └── ...
```
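A quick way to sanity-check the layout above before training is to probe for the expected entries (a minimal sketch; `missing_entries` and the path list are taken from the tree in this README, and `root` is assumed to be the `dataset/` directory):

```python
from pathlib import Path

# Expected gRefCOCO entries, relative to the dataset root (per the tree above).
EXPECTED = [
    "grefcoco/grefs(unc).json",
    "grefcoco/instances.json",
    "grefcoco/cateid2coco.json",
    "grefcoco/cocoidtosuper.json",
    "images/train2014",
]

def missing_entries(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]
```

Running `missing_entries("dataset")` should return an empty list once the annotations and COCO train2014 images are in place.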
Note: prepare the Swin-Base and Swin-Tiny pretrained models according to ReLA.
🚀 Training Scripts
- Swin-Base
bash scripts/grefcoco/train_base.sh
- Swin-Tiny
bash scripts/grefcoco/train_tiny.sh
🚀 Evaluation Scripts
- Swin-Base
bash scripts/grefcoco/eval_base.sh
- Swin-Tiny
bash scripts/grefcoco/eval_tiny.sh
- gRefCOCO Validation Set

| Method | Backbone | gIoU | cIoU | N-acc. | Checkpoint |
|---|---|---|---|---|---|
| CoHD | Swin-T | 65.89 | 62.95 | 60.96 | Model |
| CoHD | Swin-B | 68.42 | 65.17 | 63.38 | Model |
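For reference, the gIoU and cIoU metrics in the table can be illustrated on toy masks as below (a sketch only, using flat 0/1 Python lists; it assumes the common GRES convention that a no-target sample scores 1.0 in gIoU when the prediction is also empty, and the benchmark's official evaluation code should be used for real numbers):

```python
def sample_iou(pred, gt):
    """IoU of one sample; both masks are flat 0/1 lists of equal length."""
    inter = sum(p & g for p, g in zip(pred, gt))
    union = sum(p | g for p, g in zip(pred, gt))
    if union == 0:  # no-target sample with an empty prediction: counted correct
        return 1.0
    return inter / union

def g_ciou(preds, gts):
    """gIoU: mean of per-sample IoUs. cIoU: pooled intersection / pooled union."""
    gious = [sample_iou(p, g) for p, g in zip(preds, gts)]
    inter = sum(p & g for pr, gt in zip(preds, gts) for p, g in zip(pr, gt))
    union = sum(p | g for pr, gt in zip(preds, gts) for p, g in zip(pr, gt))
    giou = sum(gious) / len(gious)
    ciou = inter / union if union else 1.0
    return giou, ciou
```

Because cIoU pools pixels over the whole set, large objects dominate it, while gIoU weights every sample equally; this is why the two columns in the table can diverge.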
- Release Ref-ZOM training and evaluation script
- Release R-RefCOCO training and evaluation script
- Release RefCOCO training and evaluation script
Code in this repository is built upon several public repositories. Thanks for the wonderful work ReLA!

If you find it helpful, please cite:
```bibtex
@article{luo2024cohd,
  title={CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation},
  author={Luo, Zhuoyan and Wu, Yinghao and Cheng, Tianheng and Liu, Yong and Xiao, Yicheng and Wang, Hongfa and Zhang, Xiao-Ping and Yang, Yujiu},
  journal={arXiv preprint arXiv:2405.15658},
  year={2024}
}
```