This repository provides the SemanticRT dataset and ECM code for multispectral semantic segmentation (MSS). The repository is structured as follows.
Introduction Figure. Visual illustration of the advantages of employing multispectral (RGB-thermal) images for semantic segmentation. The complementary cues of the RGB and thermal images are highlighted using yellow and green boxes, respectively. The RGB-only method, DeepLabV3+, is prone to incorrect segmentation or even misses target objects entirely. In contrast, multispectral segmentation methods such as EGFNet and our ECM, which incorporate thermal infrared information, effectively identify the segments within the context. In particular, our results are visually closer to the ground truth than those of the state-of-the-art EGFNet.
The SemanticRT dataset, the largest MSS dataset to date, comprises 11,371 high-quality, pixel-level annotated RGB-thermal image pairs. It covers a wide range of challenging scenarios under adverse lighting conditions, such as low light and pitch black, as displayed in the figure below.
- Dataset Access
Download the SemanticRT dataset (Google Drive), which is structured as follows:
SemanticRT_dataset/
├─ train.txt
├─ val.txt
├─ test.txt
├─ test_day.txt
├─ test_night.txt
├─ test_mo.txt
├─ test_xxx.txt
│ ···
├─ rgb/
│ ├─ ···
├─ thermal/
│ ├─ ···
├─ labels/
│ ├─ ···
···
Training/validation/testing splits are listed in `train.txt`, `val.txt`, and the `test_xxx.txt` files.
- Dataset ColorMap
Below is the reference colormap for visualizing SemanticRT labels.
[
(0, 0, 0), # 0: background (unlabeled)
(72, 61, 39), # 1: car stop
(0, 0, 255), # 2: bike
(148, 0, 211), # 3: bicyclist
(128, 128, 0), # 4: motorcycle
(64, 64, 128), # 5: motorcyclist
(0, 139, 139), # 6: car
(131, 139, 139), # 7: tricycle
(192, 64, 0), # 8: traffic light
(126, 192, 238), # 9: box
(244, 164, 96), # 10: pole
(211, 211, 211), # 11: curve
(205, 155, 155), # 12: person
]
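A label mask can be colorized with this colormap as follows. This is a minimal sketch; the names `SEMANTICRT_COLORMAP` and `colorize` are ours, not from the repo.

```python
# Sketch: colorizing an integer label mask with the SemanticRT colormap.
import numpy as np

SEMANTICRT_COLORMAP = [
    (0, 0, 0), (72, 61, 39), (0, 0, 255), (148, 0, 211),
    (128, 128, 0), (64, 64, 128), (0, 139, 139), (131, 139, 139),
    (192, 64, 0), (126, 192, 238), (244, 164, 96), (211, 211, 211),
    (205, 155, 155),
]

def colorize(label):
    """Map an (H, W) int label mask to an (H, W, 3) uint8 RGB image."""
    palette = np.array(SEMANTICRT_COLORMAP, dtype=np.uint8)
    return palette[label]  # NumPy integer-array indexing does the lookup
```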
- Dataset Acknowledgement
Our SemanticRT dataset is mainly built upon LLVIP, together with other RGB-T sources (OSU and INO). The images were annotated and adjusted to better fit the MSS task. All data and annotations are provided strictly for non-commercial research purposes. If you are interested in our SemanticRT dataset, we sincerely appreciate a citation of our work and strongly encourage you to also cite the source datasets mentioned above.
The code requires `python>=3.7`, `pytorch>=1.9`, and `torchvision>=0.11`. Please follow the instructions here to install the PyTorch and TorchVision dependencies; installing both with CUDA support is strongly recommended. We also provide the environment file `rgbt.yaml` used in this work for reference.
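A quick sanity check against these version floors can be done as below. The `meets` helper is ours, not part of the repo, and the commented lines assume PyTorch and TorchVision are already installed.

```python
# Sketch: verifying the interpreter and library version floors stated above.
import sys

def meets(version_str, floor):
    """Compare dotted version strings numerically, e.g. '1.10' >= '1.9'."""
    parse = lambda s: tuple(int(p) for p in s.split(".")[:2])
    return parse(version_str) >= parse(floor)

assert sys.version_info >= (3, 7), "python>=3.7 required"
# Once PyTorch/TorchVision are installed, the same check applies:
# import torch, torchvision
# assert meets(torch.__version__, "1.9")
# assert meets(torchvision.__version__, "0.11")
```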
In this repo, we provide the ECM code for three benchmark datasets: MFNet, PST900, and SemanticRT. In the following, we take ECM on the SemanticRT dataset as an example.
- Clone this repo.
$ git clone https://github.com/jiwei0921/SemanticRT.git
$ cd SemanticRT-main/ECM_SemanticRT
- Model Training
First, download the SemanticRT dataset. Training can then be started with just a few adaptations:
- Set your SemanticRT dataset path in `./configs/ECM.json`
- Perform training with `python train_semanticRT.py`
- Model Inference
Alternatively, segmentation maps can be generated by loading our pre-trained model checkpoint:
- Set your SemanticRT dataset path in `./run/ECM.json`
- Put the pre-trained checkpoint into `./run`
- Perform inference with `python test_semanticRT.py`
Here we provide the per-class IoU results on MFNet. For a more comprehensive evaluation of our ECM, refer to the model inference section above to reproduce ECM's results.
| Class | Car | Person | Bike | Curve | Car Stop | Guardrail | Color Cone | Bump |
|---|---|---|---|---|---|---|---|---|
| IoU (%) | 87.5 | 73.4 | 61.7 | 46.8 | 37.5 | 9.1 | 51.1 | 56.9 |
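For reference, per-class IoU can be computed from a predicted mask and a ground-truth mask as in this minimal sketch. This is not the repo's evaluation script; the function name is ours.

```python
# Sketch: per-class intersection-over-union between two label masks.
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Return a list of IoU values, one per class id.

    Classes absent from both masks get NaN so they can be excluded
    from a mean-IoU average.
    """
    ious = []
    for c in range(num_classes):
        p, g = (pred == c), (gt == c)
        union = np.logical_or(p, g).sum()
        inter = np.logical_and(p, g).sum()
        ious.append(float("nan") if union == 0 else inter / union)
    return ious
```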
- Code Acknowledgement
This code repository was originally built from EGFNet. It was modified and extended to support our network design and dataset setup.
@InProceedings{ji2023semanticrt,
title = {SemanticRT: A Large-Scale Dataset and Method for Robust Semantic Segmentation in Multispectral Images},
author = {Ji, Wei and Li, Jingjing and Bian, Cheng and Zhang, Zhicheng and Cheng, Li},
booktitle = {Proceedings of the 31st ACM International Conference on Multimedia},
year = {2023},
pages = {3307--3316}
}
If you have any further questions, please email us at [email protected].