YoussefAboelwafa/Retina_Blood_Vessel_Segmentation

Retina Blood Vessel Segmentation

Description

  • The project uses a dataset of 100 high-resolution retinal fundus images for blood vessel segmentation, to aid in the early detection of retinal pathologies.
  • The U-Net architecture, known for its efficiency in semantic segmentation with limited data, is implemented from scratch.
  • Hyperparameter tuning is parallelized across multiple GPUs to handle the computational demands efficiently.
  • The final model achieved an IoU score of 86% on the validation set and 70% on the test set.
  • Data augmentation was applied to the training set to improve robustness and generalization by artificially expanding the diversity of training samples.
  • The codebase is modular, split into separate Python scripts for reproducibility, and built on PyTorch Lightning for a simplified training loop, scalability, and advanced features.

Dataset

  • This dataset contains a comprehensive collection of retinal fundus images, meticulously annotated for blood vessel segmentation. Accurate segmentation of blood vessels is a critical task in ophthalmology, as it aids in the early detection and management of various retinal pathologies such as diabetic retinopathy and glaucoma. The images were captured using state-of-the-art equipment, and each comes with pixel-level ground-truth annotations indicating the exact location of blood vessels. These annotations facilitate the development and evaluation of advanced segmentation algorithms.
  • The dataset comprises 100 retinal fundus images, divided into 80 train images and 20 test images.
  • The 80 train images are further split into 60 for training and 20 for validation.
  • Dataset link on Kaggle
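The 80/20 and 60/20 splits above can be sketched with a deterministic shuffle (illustrative only; the filenames and seed here are hypothetical, and the repository may split differently):

```python
import random

# Hypothetical filenames standing in for the 100 fundus images
images = [f"img_{i:03d}.png" for i in range(100)]

rng = random.Random(42)      # fixed seed for a reproducible split
rng.shuffle(images)

test_set = images[:20]       # 20 held-out test images
train_val = images[20:]      # remaining 80 images
val_set = train_val[:20]     # 20 validation images
train_set = train_val[20:]   # 60 training images
```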


U-Net Architecture

U-Net is widely used in semantic segmentation because it excels at capturing fine-grained details and spatial context, thanks to its encoder-decoder architecture with skip connections. This design enables precise boundary delineation and efficient training even with a limited amount of labeled data. Moreover, U-Net's ability to preserve spatial information throughout the network significantly improves segmentation accuracy.

(Figure: U-Net architecture diagram)

Main Components:

  1. Encoder (contracting path)
  2. Bottleneck
  3. Decoder (expansive path)
  4. Skip Connections

Encoder:

  • Extracts features from the input image.
  • Repeated 3x3 convolutions (valid, i.e. unpadded) + ReLU layers.
  • 2x2 max pooling to downsample (halve the spatial dimensions).
  • The number of channels doubles after each max pooling.
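One encoder stage as described above can be sketched as follows (a minimal sketch with assumed names, not the repository's code): two unpadded 3x3 convolutions with ReLU, then 2x2 max pooling that halves H and W, while the stage doubles the channel count.

```python
import torch
import torch.nn as nn

class EncoderStage(nn.Module):
    """One U-Net contracting-path stage: double conv + downsample."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.double_conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3),   # valid conv: no padding
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x):
        features = self.double_conv(x)    # kept for the skip connection
        downsampled = self.pool(features)
        return features, downsampled

stage = EncoderStage(1, 64)
skip, down = stage(torch.randn(1, 1, 572, 572))
# each valid 3x3 conv trims 2 pixels: 572 -> 570 -> 568; pooling: 568 -> 284
```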

Bottleneck:

  • Plays a pivotal role in bridging the encoder and decoder.
  • Captures the most abstract, high-level features of the input image.
  • Serves as a feature-rich layer that condenses the spatial dimensions while preserving semantic information.
  • Enables the decoder to reconstruct the output with high fidelity.
  • The large number of channels in the bottleneck balances the loss of spatial information from downsampling by enriching the feature space.

Decoder:

  • Upsamples using 2x2 transpose convolutions.
  • The number of channels halves after each transpose convolution.
  • Repeated 3x3 convolutions (valid) + ReLU layers refine the upsampled features.
  • Successive decoder blocks perform gradual upsampling and refinement, which helps generate a high-quality segmentation map with accurate boundaries.
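One decoder stage under the same assumptions can be sketched like this (illustrative names, not the repository's code): a 2x2 transpose convolution doubles H and W and halves the channels, the cropped skip tensor is concatenated, and two valid 3x3 convolutions with ReLU refine the result.

```python
import torch
import torch.nn as nn

class DecoderStage(nn.Module):
    """One U-Net expanding-path stage: upsample, concat skip, double conv."""

    def __init__(self, in_ch: int):
        super().__init__()
        # transpose conv doubles spatial size and halves channels
        self.up = nn.ConvTranspose2d(in_ch, in_ch // 2, kernel_size=2, stride=2)
        self.double_conv = nn.Sequential(
            nn.Conv2d(in_ch, in_ch // 2, kernel_size=3),  # in_ch again after concat
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch // 2, in_ch // 2, kernel_size=3),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)                    # e.g. 28x28 -> 56x56, 1024 -> 512 channels
        # center-crop the skip tensor to x's spatial size before concatenating
        dh = (skip.shape[2] - x.shape[2]) // 2
        dw = (skip.shape[3] - x.shape[3]) // 2
        skip = skip[:, :, dh:dh + x.shape[2], dw:dw + x.shape[3]]
        x = torch.cat([skip, x], dim=1)   # channels: 512 + 512 = 1024
        return self.double_conv(x)

stage = DecoderStage(1024)
out = stage(torch.randn(1, 1024, 28, 28), torch.randn(1, 512, 64, 64))
# 28 -> 56 (upsample) -> 54 -> 52 (two valid convs), 512 output channels
```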

Skip Connections:

  • Preserve spatial information that would otherwise be lost during downsampling.
  • Combine low-level and high-level features.
  • Improve gradient flow.
  • Enable better localization.
  • Cropping is used in U-Net skip connections primarily for two reasons:
    • Size mismatch: valid convolutions shrink the encoder maps relative to the decoder maps, so cropping makes the sizes compatible for concatenation.
    • Aligning the central regions, which contain more reliable information.
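The cropping step can be sketched as a small helper (hypothetical function, not from the repository) that keeps the central region of the encoder feature map so its spatial size matches the upsampled decoder map:

```python
import torch

def center_crop(skip: torch.Tensor, target_h: int, target_w: int) -> torch.Tensor:
    """Keep the central target_h x target_w region of an (N, C, H, W) tensor."""
    _, _, h, w = skip.shape
    top = (h - target_h) // 2
    left = (w - target_w) // 2
    return skip[:, :, top:top + target_h, left:left + target_w]

skip = torch.randn(1, 64, 568, 568)    # encoder output
cropped = center_crop(skip, 392, 392)  # matches the corresponding decoder size
# cropped is then concatenated with the decoder features along the channel dim
```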

Output:

  • The final layer of the U-Net decoder typically has a number of filters equal to the number of classes, producing one output feature map per class.
  • This final layer can be a 1x1 convolution that maps the feature maps to the desired number of output classes for segmentation.
  • With C classes, the output has shape (H × W × C).
  • Interpolation (e.g. bilinear or nearest-neighbor) can be applied at the final layer to resize the output to match the input, ensuring that each pixel in the input image has a corresponding label in the output segmentation map.
  • The softmax function is applied at each pixel location across the C channels.
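The output head can be sketched as follows (an assumed configuration, not the repository's exact layer sizes): a 1x1 convolution maps the final decoder features to C class maps, and softmax across the channel dimension turns them into per-pixel class probabilities.

```python
import torch
import torch.nn as nn

head = nn.Conv2d(64, 2, kernel_size=1)       # C = 2 classes: vessel / background
logits = head(torch.randn(1, 64, 388, 388))  # (N, C, H, W)
probs = torch.softmax(logits, dim=1)         # per-pixel distribution over classes
# the probabilities at every pixel sum to 1 across the C channels
```

For a binary task like this one, an equivalent head uses a single output channel with a sigmoid instead of a two-channel softmax.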

Model Evaluation

Loss Function:

  • The choice of loss function is crucial for training a U-Net model for blood vessel segmentation.
  • The Binary Cross-Entropy (BCE) loss is commonly used for binary segmentation tasks, such as blood vessel segmentation.
  • BCE loss is well-suited for pixel-wise classification problems where each pixel is classified as either a blood vessel or background.
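A minimal illustration of pixel-wise BCE for binary vessel segmentation (an assumed setup, not the repository's exact training code): logits of shape (N, 1, H, W) are compared against a binary ground-truth mask.

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()  # numerically stable sigmoid + BCE in one op

logits = torch.zeros(1, 1, 4, 4)    # sigmoid(0) = 0.5 at every pixel
target = torch.ones(1, 1, 4, 4)     # all pixels labelled "vessel"
loss = criterion(logits, target)    # -log(0.5) per pixel, averaged over pixels
```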

(Figure: Binary Cross-Entropy loss)

Evaluation Metric:

  • IoU (Intersection over Union):
    • Measures the overlap between the predicted segmentation and the ground truth.
    • IoU is calculated as the ratio of the intersection area to the union area of the predicted and ground truth segmentation masks.
    • A higher IoU indicates better segmentation accuracy.
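IoU on binary masks follows directly from that definition; the sketch below uses illustrative names and a small epsilon to avoid division by zero:

```python
import torch

def iou(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> float:
    """Intersection over union for binary (0/1) masks."""
    pred, target = pred.bool(), target.bool()
    intersection = (pred & target).sum().item()
    union = (pred | target).sum().item()
    return intersection / (union + eps)

pred = torch.tensor([[1, 1, 0, 0]])
gt = torch.tensor([[1, 0, 1, 0]])
score = iou(pred, gt)   # intersection = 1, union = 3 -> 1/3
```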

(Figure: IoU illustration)

Hyperparameters:

Hyperparameters are passed to the training script via a JSON file.
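Reading such a config might look like this (hypothetical keys mirroring the table below; the repository's actual schema may differ):

```python
import json

# stand-in for json.load(open("config.json")) in the training script
config_text = '{"epochs": 1000, "batch_size": 4, "learning_rate": 1e-4}'
params = json.loads(config_text)

epochs = params["epochs"]
lr = params["learning_rate"]
```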

| Job id | Epochs | Batch size | Learning rate | Val IoU | Val loss | Test IoU | Test loss |
|--------|--------|------------|---------------|---------|----------|----------|-----------|
| 17855  | 100    | 4          | 1e-04         | 0.6163  | 0.1475   | -        | -         |
| 17857  | 100    | 16         | 1e-04         | 0.4087  | 0.2136   | -        | -         |
| 17931  | 200    | 4          | 1e-04         | 0.6783  | 0.1251   | 0.6779   | 0.125     |
| 17932  | 200    | 4          | 5e-05         | 0.6466  | 0.1361   | -        | -         |
| 17939  | 200    | 4          | 1e-04         | 0.6204  | 0.1457   | -        | -         |
| 17941  | 300    | 4          | 1e-04         | 0.6126  | 0.1551   | -        | -         |
| 17942  | 300    | 4          | 5e-05         | 0.5701  | 0.1618   | -        | -         |
| 18049  | 400    | 4          | 1e-04         | 0.7242  | 0.0961   | 0.6827   | 0.1307    |
| 18808  | 1000   | 4          | 1e-04         | 0.8609  | 0.0454   | 0.701    | 0.1881    |

Experiments link on Comet

Training was run on 4 GPUs.

The best hyperparameters found after multiple experiments are:

  • Learning Rate: 0.0001
  • Optimizer: Adam
  • Batch Size: 4
  • Epochs: 1000

At epoch 992 the model achieved its best performance:

  • IoU score = 0.8609
  • Validation loss = 0.0454

The model is saved to disk for future use.

Inference:

(Figures: sample predictions on test images)
