We used A Curated List of Image Deblurring Datasets from Kaggle, which is "a list of popular image deblurring datasets". We experimented with only a subset of those datasets, due to the enormous number of images in some of them. Nonetheless, aside from the smaller amount of training data, our model most likely did not suffer much from this, because almost all of those datasets used the kernels proposed by Shent et al. (2018).
Example pairs of sharp and blurred images, respectively:
The image deblurring problem is a fundamental challenge in computer vision and image processing, involving the restoration of sharp images from their blurred counterparts. This task is complicated by the highly divergent and unpredictable nature of blur, which can stem from a variety of sources, such as motion, camera shake, or defocused lenses.
Blurring is typically modeled as the convolution of a sharp image with a blur kernel, which encapsulates the nature and extent of the degradation. However, in real-world scenarios, the blur kernels often vary significantly across different images—or even within the same image—making the deblurring process inherently ill-posed. Motion blur, for example, introduces additional complexity as it depends on the trajectory and speed of movement, leading to non-uniform and dynamic degradation patterns.
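The convolutional model described above is commonly written as (a standard formulation, not taken from our code):

$$ b = k * s + n, $$

where $b$ is the blurred image, $s$ the sharp image, $k$ the blur kernel, $*$ denotes convolution, and $n$ is additive noise. The ill-posedness comes from the fact that neither $k$ nor $s$ is known.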
To address this problem, deblurring approaches commonly leverage pairs of sharp and blurred images. These pairs provide crucial information for learning the intricate relationships between the two states, enabling the development of models capable of predicting sharp reconstructions. Despite advancements, the diversity of blur kernels and the unpredictable characteristics of real-world blur remain significant hurdles, pushing the boundaries of research in this field.
This architecture is designed for the challenging task of image deblurring, combining key features from two influential models: U-Net and ResNet. It leverages an encoder-decoder structure inspired by U-Net and incorporates residual blocks for efficient gradient flow and feature refinement.
- Encoder-Decoder Structure: The architecture uses a hierarchical feature learning approach with downsampling (via max pooling) and upsampling (via transposed convolutions).
- Transposed Convolutions: These layers are used during the decoding phase to restore the spatial resolution of the image, making the model adept at reconstructing sharp images from blurred inputs.
- Deep Feature Learning: Residual blocks form the core of the architecture, allowing the model to learn complex features without gradient vanishing issues.
- Feature Reuse: Skip connections in residual blocks enable efficient feature propagation, critical for recovering fine image details.
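To make the structure above concrete, here is a minimal Keras sketch of such a hybrid encoder-decoder with residual blocks. The filter counts and exact wiring are illustrative assumptions, not a faithful reproduction of our trained model:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def residual_block(x, filters):
    """ResNet-style block: two 3x3 convolutions plus a skip connection."""
    shortcut = x
    if x.shape[-1] != filters:
        # 1x1 projection so the shortcut matches the channel count.
        shortcut = layers.Conv2D(filters, 1, padding="same")(x)
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.Activation("relu")(layers.Add()([y, shortcut]))

def build_deblurrer(input_shape=(64, 64, 3)):
    inputs = layers.Input(shape=input_shape)
    # Encoder: convolutions with max-pooling downsampling.
    x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    # Bottleneck: residual blocks for deep feature learning.
    x = residual_block(x, 128)
    x = residual_block(x, 128)
    # Decoder: transposed convolutions restore spatial resolution.
    x = layers.Conv2DTranspose(128, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)
    return Model(inputs, outputs)

model = build_deblurrer((64, 64, 3))
```

Two pooling steps halve the resolution twice, and two stride-2 transposed convolutions restore it, so the output has the same spatial size as the input.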
| Name | Type | # Parameters | Purpose |
|---|---|---|---|
| conv2d | Conv2D | 448 | Initial feature extraction |
| conv2d_1 | Conv2D | 11,008 | Deeper feature extraction |
| max_pooling2d | MaxPooling2D | 0 | Downsampling for hierarchical learning |
| conv2d_2 | Conv2D | 92,288 | Learning higher-level features |
| max_pooling2d_1 | MaxPooling2D | 0 | Further downsampling |
| residual_block | ResidualBlock | 333,184 | Complex feature learning (ResNet-inspired) |
| residual_block_1 | ResidualBlock | 312,704 | Additional feature refinement |
| conv2d_transpose | Conv2DTranspose | 147,584 | Upsampling (U-Net-inspired) |
| conv2d_transpose_1 | Conv2DTranspose | 73,792 | Final upsampling |
| conv2d_9 | Conv2D | 1,731 | Output refinement |
- U-Net Architecture: The encoder-decoder framework with transposed convolutions enables effective deblurring by processing features hierarchically and reconstructing images with high fidelity.
- Residual Blocks: Inspired by ResNet, these blocks ensure robust learning of both local details and global context, critical for handling diverse and unpredictable blur patterns.
This hybrid architecture effectively combines the strengths of U-Net and ResNet, making it highly suited for the intricate task of image deblurring.
Size in memory: 14.1 MB
Number of parameters: 972,739
To run the training, the following command is required (with values supplied for each argument):

python run.py --model_path <model_path> --dataset_dir <dataset_dir> --dataset_name <dataset_name>
As for the metrics used, we selected a number of loss functions and optimizers that we found reasonable and compared every combination of the two via grid search. The best results were achieved by the RMSprop optimizer with the MAE loss function; among the other metrics, MSE also looked quite promising. Ultimately we evaluated the final model with both of those metrics, achieving satisfactory results with each.
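The grid search described above can be sketched as follows; the candidate sets, the tiny network, and the synthetic data are placeholders, not our actual configuration:

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for (blurred, sharp) image pairs.
x = np.random.rand(8, 16, 16, 3).astype("float32")
y = np.random.rand(8, 16, 16, 3).astype("float32")

results = {}
for opt_name in ["rmsprop", "adam"]:      # candidate optimizers
    for loss_name in ["mae", "mse"]:      # candidate loss functions
        inputs = tf.keras.Input(shape=(16, 16, 3))
        h = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu")(inputs)
        outputs = tf.keras.layers.Conv2D(3, 3, padding="same")(h)
        model = tf.keras.Model(inputs, outputs)
        model.compile(optimizer=opt_name, loss=loss_name)
        hist = model.fit(x, y, epochs=1, verbose=0)
        results[(opt_name, loss_name)] = hist.history["loss"][-1]

best = min(results, key=results.get)  # combination with the lowest loss
```

In our experiments this kind of sweep singled out the RMSprop/MAE pair.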
Furthermore, in the course of iterative development, we introduced more advanced loss functions, merging them at the end. A quick description:
- Sobel Loss: we apply the Sobel filter along both the x and y dimensions and take the absolute difference between the filtered $y$ and $\hat{y}$,
- Fourier Loss: in a similar fashion to its predecessor, we compute the difference between the Fourier transforms of $y$ and $\hat{y}$.
The aforementioned losses provide an intuitive way to counteract the blurring effect. The Sobel filter puts emphasis on edges, which is crucial when reconstructing fuzzy images, motion blur in particular. Fourier loss works in a similar spirit, although not identically: by analyzing the frequency domain we can assess not only edges but also the overall distribution of frequencies in the ground-truth image.
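A minimal NumPy/SciPy sketch of these two losses, written for single-channel images (our actual implementation operates on batched tensors inside the training loop):

```python
import numpy as np
from scipy.ndimage import sobel

def sobel_loss(y_true, y_pred):
    """L1 difference between Sobel edge maps along both image axes."""
    total = 0.0
    for axis in (0, 1):  # vertical and horizontal derivatives
        total += np.mean(np.abs(sobel(y_true, axis=axis) - sobel(y_pred, axis=axis)))
    return total

def fourier_loss(y_true, y_pred):
    """Mean absolute difference between the 2-D Fourier transforms."""
    return np.mean(np.abs(np.fft.fft2(y_true) - np.fft.fft2(y_pred)))
```

Identical images yield zero under both losses, while a blurred prediction is penalized for its missing edges and attenuated high frequencies.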
- Perceptual Loss:
Perceptual loss functions are designed to capture perceptual differences between images, such as content and style discrepancies, which are not always evident at the pixel level. They are often employed in tasks where the goal is to generate images that are visually pleasing to humans, such as in neural style transfer, super-resolution, and image synthesis.
The core idea behind perceptual loss is to use the feature maps from various layers of a CNN, which has been pre-trained on a large dataset like ImageNet. By extracting these feature maps from both the target image and the generated image, we can compute the difference in the high-level features that the network has learned to detect, such as edges, textures, and patterns.
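The idea above can be sketched in Keras using feature maps from an intermediate VGG16 layer. Here `weights=None` keeps the example self-contained and offline; in practice one would load `weights="imagenet"`, and the chosen layer (`block3_conv3`) is an illustrative assumption rather than the layer used in the cited approaches:

```python
import tensorflow as tf

# Feature extractor from an intermediate VGG16 layer.
# weights=None keeps this sketch offline; use weights="imagenet" in practice.
vgg = tf.keras.applications.VGG16(include_top=False, weights=None,
                                  input_shape=(64, 64, 3))
feature_extractor = tf.keras.Model(
    vgg.input, vgg.get_layer("block3_conv3").output)
feature_extractor.trainable = False

def perceptual_loss(y_true, y_pred):
    """MSE between high-level feature maps rather than raw pixels."""
    return tf.reduce_mean(tf.square(
        feature_extractor(y_true) - feature_extractor(y_pred)))
```

Because the comparison happens in feature space, small pixel-level misalignments that a human would not notice are penalized far less than missing textures or edges.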
Source: Perceptual Loss Functions by DeepAI
We monitored our model's training process via Weights & Biases. Comprehensive plots and other insightful metrics are contained in the following report.
Hyperparameters were chosen experimentally, by comparing results across combinations of learning rates and rho values (the RMSprop optimizer's parameter). The best combination turned out to be as follows:
Learning Rate: 0.001
Rho: 0.9
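In Keras terms, the selected configuration corresponds to (a sketch, not a line from our codebase):

```python
import tensorflow as tf

# RMSprop with the best hyperparameters found experimentally.
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9)
```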
The list of libraries and tools is available in the requirements.txt file.
Runtime environment: Kaggle Notebook
GPU count: 2
GPU type: Tesla T4
Training duration: 2h 13m 17s
Inference time: 28 ms
Perceptual Losses for Real-Time Style Transfer and Super-Resolution
Fourier Transform-Based U-Shaped Network for Single Image Motion Deblurring