
Dataset & Code for ACM Multimedia 2023 paper. "SemanticRT: A Large-Scale Dataset and Method for Robust Semantic Segmentation in Multispectral Images".


This repository provides the SemanticRT dataset and ECM code for multispectral semantic segmentation (MSS). The repository is structured as follows.

  1. Task Introduction
  2. SemanticRT Dataset
  3. ECM Source Code

📖 Task Introduction


Introduction Figure. Visual illustration of the advantages of employing multispectral (RGB-Thermal) images for semantic segmentation. The complementary nature of RGB and thermal images is highlighted using yellow and green boxes, respectively. The RGB-only method, DeepLabV3+, is susceptible to incorrect segmentation or even missing target objects entirely. In contrast, multispectral segmentation methods, e.g., EGFNet and our ECM method, which incorporate thermal infrared information, effectively identify the target segments in context. In particular, our results are visually closer to the ground truth than those of the state-of-the-art EGFNet.


📔 SemanticRT Dataset

The SemanticRT dataset, the largest MSS dataset to date, comprises 11,371 high-quality, pixel-level annotated RGB-thermal image pairs. It covers a wide range of challenging scenarios under adverse lighting conditions, such as low-light and pitch-black, as displayed in the figure below.


Getting Started

  • Dataset Access

Download the SemanticRT dataset (Google Drive), which is structured as follows:

SemanticRT_dataset/
├─ train.txt
├─ val.txt
├─ test.txt
├─ test_day.txt
├─ test_night.txt
├─ test_mo.txt
├─ test_xxx.txt
│ ···
├─ rgb/
│  ├─ ···
├─ thermal/
│  ├─ ···
├─ labels/
│  ├─ ···
···

Training/validation/testing splits are defined in train.txt, val.txt, and the test_xxx.txt files.
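The split files above can be turned into per-sample file paths with a few lines of Python. This is a minimal sketch, assuming each line of a split file holds one sample name shared across the rgb/, thermal/, and labels/ folders and that images are stored as .png; adjust the suffixes to match the actual files.

```python
from pathlib import Path

def load_split(root, split):
    """Read a split file (e.g. train.txt) and build per-sample file paths.

    Assumes one sample name per line, reused across rgb/, thermal/,
    and labels/ -- a convention sketch, not the repo's loader.
    """
    root = Path(root)
    lines = (root / f"{split}.txt").read_text().splitlines()
    names = [ln.strip() for ln in lines if ln.strip()]
    return [
        {
            "rgb": root / "rgb" / f"{name}.png",
            "thermal": root / "thermal" / f"{name}.png",
            "label": root / "labels" / f"{name}.png",
        }
        for name in names
    ]
```

Such a list of path dictionaries can then back a standard PyTorch Dataset.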

  • Dataset ColorMap

Below is the reference colormap for visualizing SemanticRT labels.

[
    (0, 0, 0),          # 0: background (unlabeled)
    (72, 61, 39),       # 1: car stop
    (0, 0, 255),        # 2: bike
    (148, 0, 211),      # 3: bicyclist
    (128, 128, 0),      # 4: motorcycle
    (64, 64, 128),      # 5: motorcyclist
    (0, 139, 139),      # 6: car
    (131, 139, 139),    # 7: tricycle
    (192, 64, 0),       # 8: traffic light
    (126, 192, 238),    # 9: box
    (244, 164, 96),     # 10: pole
    (211, 211, 211),    # 11: curve
    (205, 155, 155),    # 12: person
]
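The colormap above can be applied to a predicted label map with NumPy fancy indexing. A minimal sketch, assuming labels are stored as (H, W) arrays of class ids in [0, 12]:

```python
import numpy as np

# The 13-entry SemanticRT palette, indexed by class id (0 = background).
PALETTE = np.array([
    (0, 0, 0), (72, 61, 39), (0, 0, 255), (148, 0, 211),
    (128, 128, 0), (64, 64, 128), (0, 139, 139), (131, 139, 139),
    (192, 64, 0), (126, 192, 238), (244, 164, 96), (211, 211, 211),
    (205, 155, 155),
], dtype=np.uint8)

def colorize(label_map):
    """Map an (H, W) array of class ids to an (H, W, 3) RGB image."""
    return PALETTE[label_map]
```

The resulting array can be saved directly with any image library (e.g. Pillow's `Image.fromarray`).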
  • Dataset Acknowledgement

Our SemanticRT dataset is mainly based on LLVIP as well as other RGBT sources (OSU and INO). They are annotated and adjusted to better fit the MSS task. All data and annotations provided are strictly intended for non-commercial research purposes only. If you use our SemanticRT dataset, we sincerely appreciate your citation of our work and strongly encourage you to also cite the source datasets mentioned above.


📗 ECM Source Code

Installation

The code requires python>=3.7, as well as pytorch>=1.9 and torchvision>=0.11. Please follow the instructions here to install both PyTorch and TorchVision dependencies. Installing both PyTorch and TorchVision with CUDA support is strongly recommended. We also provide the environment rgbt.yaml used in this work for reference.

In this repo, we provide the ECM code for three benchmark datasets: MFNet, PST900, and SemanticRT. In the following, we take ECM on the SemanticRT dataset as an example.

Getting Started

  • Clone this repo.

    $ git clone https://github.com/jiwei0921/SemanticRT.git
    $ cd SemanticRT-main/ECM_SemanticRT
  • Model Training

First download the SemanticRT dataset. Then training can be started with just a few adaptations:

  1. Set your SemanticRT dataset path in ./configs/ECM.json
  2. Perform training, with python train_semanticRT.py
  • Model Inference

Alternatively, segmentation maps can be generated by loading our pre-trained model checkpoint:

  1. Set your SemanticRT dataset path in ./run/ECM.json
  2. Put the pre-trained ckpt into ./run
  3. Perform inference, with python test_semanticRT.py

Here, we provide the per-class IoU results of ECM on MFNet. For a more comprehensive evaluation of our ECM, refer to the Model Inference section above to reproduce its results.

Class    Car   Person  Bike  Curve  Car Stop  Guardrail  Color Cone  Bump
IoU (%)  87.5  73.4    61.7  46.8   37.5      9.1        51.1        56.9
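Per-class IoU, as reported above, is the intersection over union between the predicted and ground-truth masks of each class. A minimal NumPy sketch (a hypothetical helper, not the repo's evaluation code):

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Compute IoU per class from flat or 2-D integer label arrays.

    Returns NaN for classes absent from both prediction and ground truth.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union else float("nan"))
    return ious
```

Mean IoU (mIoU) is then the average of these values over the valid classes.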
  • Code Acknowledgement

This code repository was originally built from EGFNet. It was modified and extended to support our network design and dataset setup.


Citation

@InProceedings{ji2023semanticrt,
      title     = {SemanticRT: A Large-Scale Dataset and Method for Robust Semantic Segmentation in Multispectral Images},
      author    = {Ji, Wei and Li, Jingjing and Bian, Cheng and Zhang, Zhicheng and Cheng, Li},
      booktitle = {Proceedings of the 31st ACM International Conference on Multimedia},
      year      = {2023},
      pages     = {3307–3316}
}

If you have any further questions, please email us at [email protected].
