Bingyu Li, Da Zhang, Zhiyuan Zhao, Junyu Gao, Xuelong Li
This is the official implementation of our paper "StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation".
Multimodal semantic segmentation shows significant potential for enhancing segmentation accuracy in complex scenes. However, current methods often incorporate specialized feature fusion modules tailored to specific modalities, thereby restricting input flexibility and increasing the number of training parameters. To address these challenges, we propose StitchFusion, a straightforward yet effective modal fusion framework that integrates large-scale pre-trained models directly as encoders and feature fusers. This approach facilitates comprehensive multi-modal and multi-scale feature fusion, accommodating any visual modal inputs. Specifically, our framework achieves modal integration during encoding by sharing multi-modal visual information. To enhance information exchange across modalities, we introduce a multi-directional adapter module (MultiAdapter) that enables cross-modal information transfer during encoding. By leveraging MultiAdapter to propagate multi-scale information across the pre-trained encoders, StitchFusion integrates multi-modal visual information as it is encoded. Extensive comparative experiments demonstrate that our model achieves state-of-the-art performance on four multi-modal segmentation datasets with minimal additional parameters. Furthermore, the experimental integration of MultiAdapter with existing Feature Fusion Modules (FFMs) highlights their complementary nature.
- 2024/9/20: A researcher has inquired about the reproducible .pth files, and we are currently organizing them. However, as the necessary permissions have not yet been granted to interns, we may need to wait for some time. We will post an update as soon as there is any news. If you have any questions, please contact the author's email: [email protected]
- 2024/9/24: The most direct way to contact me is [email protected]; messages sent there go straight to my phone.
- If you find this repo useful, please STAR it to encourage the authors.
- stitchfusion_with_tips_you_can_copy.py
- I have updated the reproducible files and made additional versions of StitchFusion available in stitchfusion_with_tips_you_can_copy.py. You can simply copy these files and run the experiments. To use any of these versions, just copy the path of the corresponding .pth file into the EVAL/MODEL_PATH field of your chosen config.yaml file.
- I have released the reproducible files for the DELIVER dataset. However, during replication I observed that the results differed slightly from the reported values, with variations of around a few tenths of a point (some higher, some lower). Nevertheless, these differences do not affect the overall performance comparison of our model.
- stitchfusion_with_tips_you_can_copy.py is all you need to reproduce the results.
- 2024/7/27: init repository.
- 2024/7/27: release the code for StitchFusion.
- 2024/8/02: upload the paper for StitchFusion.
- 2024/11/6: upload some checkpoint files for StitchFusion.
- 2024/11/12: release the reproducible files for DELIVER dataset.
Figure: Comparison of different model fusion paradigms.
Figure: MultiAdapter Module for the StitchFusion Framework at Different Density Levels.
First, create and activate the environment using the following commands:
conda env create -f environment.yaml
conda activate StitchFusion
Download the dataset:
- MCubeS, for multimodal material segmentation with RGB-A-D-N modalities.
- FMB, for the FMB dataset with RGB-Infrared modalities.
- PST, for the PST900 dataset with RGB-Thermal modalities.
- DELIVER, for the DELIVER dataset with RGB-D-E-L modalities.
- MFNet, for the MFNet dataset with RGB-T modalities.
Then, put the datasets under the data directory as follows:
data/
├── MCubeS
│   ├── polL_color
│   ├── polL_aolp_sin
│   ├── polL_aolp_cos
│   ├── polL_dolp
│   ├── NIR_warped
│   ├── NIR_warped_mask
│   ├── GT
│   ├── SSGT4MS
│   ├── list_folder
│   └── SS
├── FMB
│   ├── test
│   │   ├── color
│   │   ├── Infrared
│   │   ├── Label
│   │   └── Visible
│   └── train
│       ├── color
│       ├── Infrared
│       ├── Label
│       └── Visible
├── PST
│   ├── test
│   │   ├── rgb
│   │   ├── thermal
│   │   └── labels
│   └── train
│       ├── rgb
│       ├── thermal
│       └── labels
├── DELIVER
│   ├── depth
│   │   ├── cloud
│   │   │   ├── test
│   │   │   │   ├── MAP_10_point102
│   │   │   │   │   ├── 045050_depth_front.png
│   │   │   │   │   └── ...
│   │   │   ├── train
│   │   │   └── val
│   │   ├── fog
│   │   ├── night
│   │   ├── rain
│   │   └── sun
│   ├── event
│   ├── hha
│   ├── img
│   ├── lidar
│   └── semantic
└── MFNet
    ├── img
    └── ther
All .pth files will be released later.
Model-Modal | mIoU | weight |
---|---|---|
StitchFusion-RGB-T | 85.35 | GoogleDrive |
All .pth files will be released later.
Model-Modal | mIoU | weight |
---|---|---|
StitchFusion-RGB-T | 64.85 | GoogleDrive |
All .pth files will be released later.
Model-Modal | mIoU | weight |
---|---|---|
StitchFusion-RGB-T | 57.91 | GoogleDrive |
StitchFusion-RGB-T | 57.80 | GoogleDrive |
StitchFusion-RGB-T | 58.13 | GoogleDrive |
All .pth files will be released later.
Model-Modal | mIoU | weight |
---|---|---|
StitchFusion-RGB-D | 65.75 | GoogleDrive |
StitchFusion-RGB-E | 57.31 | GoogleDrive |
StitchFusion-RGB-L | 58.03 | GoogleDrive |
StitchFusion-RGB-DE | 66.03 | GoogleDrive |
StitchFusion-RGB-DL | 67.06 | GoogleDrive |
StitchFusion-RGB-DEL | 68.18 | GoogleDrive |
Figure: Main Results: Comparison with SOTA Models.
Figure: Main Results: Per-Class Comparison under Different Modality Combination Configurations and with SOTA Models.
Before training, please download the pre-trained SegFormer weights and put them in the correct directory following this structure:
checkpoints/pretrained/segformer
โโโ mit_b0.pth
โโโ mit_b1.pth
โโโ mit_b2.pth
โโโ mit_b3.pth
โโโ mit_b4.pth
To train a StitchFusion model, please update the appropriate configuration file in configs/ with the correct paths and hyper-parameters.
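As a rough guide to what usually needs editing, here is a hedged sketch of a training config excerpt. The key names and values are illustrative assumptions rather than this repo's confirmed YAML schema, so follow the structure of the existing file in configs/ and only adjust the paths and hyper-parameters it already defines:

```yaml
# Illustrative excerpt only; key names are assumptions, not the confirmed schema of this repo.
MODEL:
  BACKBONE: MiT-B2                                          # assumed backbone identifier
  PRETRAINED: checkpoints/pretrained/segformer/mit_b2.pth   # pre-trained SegFormer weights (see the checkpoint layout above)
DATASET:
  ROOT: data/MCubeS                                         # dataset root following the data/ layout above
TRAIN:
  BATCH_SIZE: 4                                             # illustrative hyper-parameter
  LR: 0.00006                                               # illustrative hyper-parameter
```

After updating the config, run: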
cd path/to/StitchFusion
conda activate StitchFusion
python -m tools.train_mm --cfg configs/mcubes_rgbadn.yaml
python -m tools.train_mm --cfg configs/fmb_rgbt.yaml
python -m tools.train_mm --cfg configs/pst_rgbt.yaml
To evaluate StitchFusion models, please download the respective model weights (GoogleDrive) and save them under any folder you like. Then, update the EVAL section of the appropriate configuration file in configs/.
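The only field that must point at your downloaded weights is EVAL/MODEL_PATH (as noted in the news above). A minimal sketch, assuming the DELIVER config and leaving all other keys in the file untouched:

```yaml
# configs/deliver.yaml (excerpt): only MODEL_PATH needs to change
EVAL:
  MODEL_PATH: path/to/downloaded/stitchfusion_checkpoint.pth   # the .pth downloaded from GoogleDrive
```

Then run: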
cd path/to/StitchFusion
conda activate StitchFusion
python -m tools.val_mm --cfg configs/mcubes_rgbadn.yaml
python -m tools.val_mm --cfg configs/fmb_rgbt.yaml
python -m tools.val_mm --cfg configs/pst_rgbt.yaml
python -m tools.val_mm --cfg configs/deliver.yaml
python -m tools.val_mm --cfg configs/mfnet_rgbt.yaml
Figure: Visualization of StitchFusion on the DELIVER Dataset. Figure: Visualization of StitchFusion on the MCubeS Dataset.
This repository is under the Apache-2.0 license. For commercial use, please contact the authors.
@article{li2024stitchfusion,
title={StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation},
author={Li, Bingyu and Zhang, Da and Zhao, Zhiyuan and Gao, Junyu and Li, Xuelong},
journal={arXiv preprint arXiv:2408.01343},
year={2024}
}
Our codebase is built on the following public GitHub repositories; we thank their authors:
Note: This is a research-level repository and might contain issues or bugs. Please contact the authors with any queries.