MobileUNETR

A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation: [ECCV 2024 -- BioImage Computing] (ORAL) https://arxiv.org/abs/2409.03062

Architecture

Parameter Distribution and Computational Complexity

🚀 News

Repository Construction Complete 06/09/2024
We will continue to update the GitHub repository with new experiments with a wide range of datasets, so be sure to check back regularly.
In the meantime -> Checkout our other projects: https://github.com/OSUPCVLab/SegFormer3D

Overview:

* Segmentation approaches broadly fall into 2 categories. 
    1. End to End CNN Based Segmentation Methods
    2. Transformer Based Encoder with a CNN Based Decoder. 
* Many Transformer based segmentation approaches rely primarily on CNN based decoders overlooking the benefits of the Transformer architecture within the decoder. 
* We address the need for an efficient/lightweight segmentation architecture by introducing MobileUNETR, which aims to overcome the performance constraints associated with both CNNs and Transformers while minimizing model size, presenting a promising stride towards efficient image segmentation. 
* MobileUNETR has 3 main features. 
  1.  MobileUNETR comprises of a lightweight hybrid CNN-Transformer encoder to help balance local and global contextual feature extraction in an efficient manner.
  2.  A novel hybrid decoder that simultaneously utilizes low-level and global features at different resolutions within the decoding stage for accurate mask generation.
  3. surpassing large and complex architectures, MobileUNETR achieves superior performance with 3 million parameters and a computational complexity of 1.3 GFLOPs.

Stand Alone Model [Please Read]

To help improve ease of use of the MobileUNETR architecture, the model is constructed as a single stand alone file. If you want to use the model outside of the provided code base simply grab the mobileunetr.py file from architectures folder and insert it into your own project.

Example:

# import from mobileunetr.py file
from mobileunetr import build_mobileunetr_s, build_mobileunetr_xs, build_mobileunetr_xxs
import torch

# create model
mobileunetr_s = build_mobileunetr_s(num_classes= 1, image_size=512)

mobileunetr_xs = build_mobileunetr_xs(num_classes=1, image_size=512)

mobileunetr_xxs = build_mobileunetr_xxs(num_classes=1, image_size= 512)

# forward pass
data = torch.randn((4, 3, 512, 512))
out = mobileunetr_xxs.forward(data)
print(f"input tensor: {data.shape}")
print(f"output tensor: {out.shape}")

Data and Data Processing

ISIC Data -- https://challenge.isic-archive.com/data/
PH2 Data -- https://www.fc.up.pt/addi/ph2%20database.html
Data Preprocessing
For each dataset ISIC 2016, ISIC 2017 ... etc, Simply create a csv file with N rows 2 columns. Where N is the number of items in the dataset and 2 columns ["image", "mask"] are paths to input image and the path to target mask.
Once you have a train.csv and a test.csv (lets assume for ISIC 2016), inside experiments/isic_2016/exp_2_dice_b8_a2/config.yaml update the data path for the train and test csv files. And Follow the steps below to run the experiment.

Run Your Experiment

In order to run an experiment, we provide a template folder placed under MobileUNETR_HOME_PATH/experiments/isic_2016/experiment_folder that you can use to setup your experiment. While inside the "experiment_folder" run your experiment on a single GPU with:

cd MobileUNETR
cd experiments/isic_2016/exp_2_dice_b8_a2/
# the default gpu device is set to cuda:0 (you can change it)
CUDA_VISIBLE_DEVICES="0" accelerate launch run_experiment.py

You might want to change the hyperparameters (batch size, learning rate, weight decay etc.) of your experiment. For that you need to edit the config.yaml file inside your experiment folder.

As the experiment is running, the logs (train loss, vlaidation loss and dice score) will be written to the terminal. You can log your experiment on wandb (you need to setup an account there) if you set mode: "online" in the wandb_parameters section of the config.yaml. The default value is mode: "offline". If you want to log the result to your wandb account, put your wandb info into the wandb_parameters section of the config.yaml and your entire experiment will be logged under your wandb entity (e.g. pcvlab) page.

ISIC 2016 Performance

ISIC 2017 Performance

ISIC 2018 Performance

ISIC PH2 Performance

Advanced Architectures and Training Methods

Experiments: Extending to Complex Real World Scenes (Cityscapes, Potsdamn and Vaihigen)

Cityscapes Results

Potsdam and Vaihigen Results (GT [Left], Prediction Overlay [Right])

Potsdam Left Table and Vaihigen Right Table

Citation

If you liked our paper, please consider citing it [will update TBD sections soon]

@inproceedings{perera2024mobileunetr,
  title={MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation},
  author={Perera, Shehan, Erzurumlu, Yunus, Gulati, Deepak and Yilmaz, Alper},
  booktitle={Proceedings of the IEEE/CVF European Conference on Computer Vision (ECCV)},
  pages={TBD},
  year={2024}
}

@article{perera2024mobileunetr,
  title={MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation},
  author={Perera, Shehan, Erzurumlu, Yunus, Gulati, Deepak and Yilmaz, Alper},
  journal={https://arxiv.org/abs/2409.03062},
  year={2024}
}

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
architectures		architectures
augmentations		augmentations
dataloaders		dataloaders
example_configs		example_configs
experiments_medical		experiments_medical
losses		losses
misc		misc
optimizers		optimizers
resources		resources
train_scripts		train_scripts
.gitignore		.gitignore
README.md		README.md
__init__.py		__init__.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MobileUNETR

A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation: [ECCV 2024 -- BioImage Computing] (ORAL) https://arxiv.org/abs/2409.03062

Architecture

Parameter Distribution and Computational Complexity

🚀 News

Overview:

Stand Alone Model [Please Read]

Data and Data Processing

Run Your Experiment

ISIC 2016 Performance

ISIC 2017 Performance

ISIC 2018 Performance

ISIC PH2 Performance

Advanced Architectures and Training Methods

Experiments: Extending to Complex Real World Scenes (Cityscapes, Potsdamn and Vaihigen)

Cityscapes Results

Potsdam and Vaihigen Results (GT [Left], Prediction Overlay [Right])

Potsdam Left Table and Vaihigen Right Table

Citation

About

Releases

Packages

Languages

OSUPCVLab/MobileUNETR

Folders and files

Latest commit

History

Repository files navigation

MobileUNETR

A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation: [ECCV 2024 -- BioImage Computing] (ORAL) https://arxiv.org/abs/2409.03062

Architecture

Parameter Distribution and Computational Complexity

🚀 News

Overview:

Stand Alone Model [Please Read]

Data and Data Processing

Run Your Experiment

ISIC 2016 Performance

ISIC 2017 Performance

ISIC 2018 Performance

ISIC PH2 Performance

Advanced Architectures and Training Methods

Experiments: Extending to Complex Real World Scenes (Cityscapes, Potsdamn and Vaihigen)

Cityscapes Results

Potsdam and Vaihigen Results (GT [Left], Prediction Overlay [Right])

Potsdam Left Table and Vaihigen Right Table

Citation

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages