srm-mic/PaletteNet-Implementation

 
 

PaletteNet-Implementation

This project was built as part of the RESCON event hosted by SRM Machine Intelligence Community, in which we were asked to implement a research paper. Our team, Phoneix, chose "PaletteNet: Image Recolorization with Given Color Palette" by Junho Cho, Sangdoo Yun, Kyoungmu Lee and Jin Young Choi.

You can download the paper from here

Quick trial

https://colab.research.google.com/drive/1DqwgiGBmf14kGhsdCTcOl_FzvYilagoD#scrollTo=7JflO10d9TWZ

*  Go to the link and click Connect at the top right.
*  Press the run button next to the first code block to run it. It will download the model files and weights.
*  You can also run the selected block with CTRL + ENTER.
*  Run all the blocks in sequence. The last one will show you the recolored image.

PaletteNet is a deep neural network that recolors an image according to a given target color palette. It takes two inputs: a source image to be recolored and a target palette. Human experts using commercial software take 18 minutes on average to recolor an image, while PaletteNet automatically produces plausible results in less than a second.

How we implemented it

We created our own dataset, since this task requires both a source image and its corresponding palette. We scraped 1043 high-quality images from "https://www.design-seeds.com". This produced a raw dataset of 1.1 GB, which was stored on AWS S3.

Pre-processing

  1. Cropping: Since the source image and its palette are attached to each other in the scraped files, we first separate them and then resize the image to 384x286.
  2. Hue-Shift: To train the model, we created 20 variants of each image using the algorithm shown below.
RGB -> LAB and cache L
RGB -> HSV --hue shift--> H*SV -> L*A*B*
Final hue-shifted image: LA*B* (the cached L combined with the shifted A* and B*)
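
The hue-shift steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code from the notebooks: the function names `rgb_to_lab` and `hue_shift` are hypothetical, the input is assumed to be a float sRGB image in [0, 1] with a D65 white point, and the shift is assumed to be a fraction of the full hue circle. The per-pixel `colorsys` conversion is slow and is used only to keep the example dependency-free.

```python
import colorsys
import numpy as np

# sRGB (D65) -> XYZ matrix and reference white (assumed color space).
_RGB2XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                     [0.2126729, 0.7151522, 0.0721750],
                     [0.0193339, 0.1191920, 0.9503041]])
_WHITE = np.array([0.95047, 1.0, 1.08883])

def rgb_to_lab(rgb):
    """Convert a float sRGB array of shape (..., 3) in [0, 1] to CIE LAB."""
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    xyz = linear @ _RGB2XYZ.T / _WHITE
    eps, kappa = 216 / 24389, 24389 / 27
    f = np.where(xyz > eps, np.cbrt(xyz), (kappa * xyz + 16) / 116)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

# Element-wise wrappers around the standard-library conversions.
_to_hsv = np.vectorize(colorsys.rgb_to_hsv, otypes=[float, float, float])
_to_rgb = np.vectorize(colorsys.hsv_to_rgb, otypes=[float, float, float])

def hue_shift(rgb, shift):
    """Return the LA*B* image: original L, with A* and B* from the hue-shifted image."""
    original_lab = rgb_to_lab(rgb)                      # RGB -> LAB, cache L
    h, s, v = _to_hsv(rgb[..., 0], rgb[..., 1], rgb[..., 2])
    h = (h + shift) % 1.0                               # H -> H*
    r, g, b = _to_rgb(h, s, v)
    shifted_lab = rgb_to_lab(np.stack([r, g, b], axis=-1))  # H*SV -> L*A*B*
    shifted_lab[..., 0] = original_lab[..., 0]          # restore the cached L
    return shifted_lab

image = np.array([[[1.0, 0.0, 0.0], [0.2, 0.4, 0.6]]])  # tiny 1x2 test image
variant = hue_shift(image, 0.25)
```

How the 20 shift amounts were chosen is not stated here; evenly spaced values such as `np.linspace(0, 1, 20, endpoint=False)` would be one possible choice.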

After pre-processing, the dataset grew to 20540 images and occupied 3.1 GB of space.

Model

image

  • Our model is divided into three parts: Feature Encoder, Recoloring Decoder and Discriminator.
  • Feature Encoder is made up of ResNet blocks, which extract the content features into a tensor of size [512x25x16].
  • Recoloring Decoder takes the target palette and content features as the input and outputs the final recolored image.
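
As a rough illustration of how a decoder can take both inputs, the sketch below tiles a flat palette vector over the spatial grid of the content features and concatenates the two along the channel axis. This is an assumption about the conditioning mechanism, not a description of the code in this repository: the helper `condition_on_palette` is hypothetical, and the palette size of 6 RGB colors is assumed for the example.

```python
import numpy as np

def condition_on_palette(content_features, palette):
    """Tile the palette over the feature grid and concatenate along channels.

    content_features: array of shape (C, H, W)
    palette: array of shape (num_colors, 3)
    """
    _, height, width = content_features.shape
    flat = palette.reshape(-1)                          # (num_colors * 3,)
    tiled = np.broadcast_to(flat[:, None, None], (flat.size, height, width))
    return np.concatenate([content_features, tiled], axis=0)

features = np.zeros((512, 25, 16), dtype=np.float32)   # feature size from the text
palette = np.random.default_rng(0).random((6, 3)).astype(np.float32)  # assumed 6 colors
decoder_input = condition_on_palette(features, palette)
print(decoder_input.shape)  # (530, 25, 16)
```

With this scheme every spatial location sees the same 18 palette values next to its own 512 content features.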

Training:

There are two phases of training:

  • In the first phase, we train FE+RD (Feature Encoder + Recoloring Decoder) using Euclidean loss.
  • In the second phase, we use a discriminator to differentiate between original images (no hue-shift) and generated images.
  • We use the Adam optimizer with β1=0.5, lr=0.0002 and batch-size=12.
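
To make the first-phase objective and the optimizer settings concrete, here is a small NumPy sketch of a Euclidean loss and a single Adam update. It is illustrative only: the loss normalization (mean over pixels of the squared distance), the function names, and the values of β2 and eps are assumptions, since only β1 and lr are given above. In practice a deep learning framework's built-in Adam optimizer would be used instead.

```python
import numpy as np

def euclidean_loss(pred, target):
    """Mean over pixels of the squared Euclidean distance between channel vectors."""
    return float(np.mean(np.sum((pred - target) ** 2, axis=-1)))

def adam_step(param, grad, state, lr=0.0002, beta1=0.5, beta2=0.999, eps=1e-8):
    """One Adam update. lr and beta1 follow the text; beta2 and eps are assumed defaults."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])      # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])      # bias-corrected second moment
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy problem: move a constant prediction towards a constant target.
target = np.full((4, 4, 2), 10.0)
pred = np.zeros((4, 4, 2))
state = {"t": 0, "m": np.zeros_like(pred), "v": np.zeros_like(pred)}
loss_before = euclidean_loss(pred, target)
for _ in range(100):
    grad = 2 * (pred - target) / (pred.shape[0] * pred.shape[1])  # d(loss)/d(pred)
    pred = adam_step(pred, grad, state)
loss_after = euclidean_loss(pred, target)
```

With such a small learning rate each step moves a parameter by roughly 0.0002, so the loss decreases slowly but steadily.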

Testing:

We visually compare the generated and expected 'a' and 'b' components.
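
One simple way to set up such a comparison is to map the 'a' and 'b' channels to grayscale and place generated and expected versions side by side. The helper below is a hypothetical sketch, not the visualization tool from this repository; it assumes LAB arrays of shape (H, W, 3) with 'a' and 'b' roughly in [-128, 127].

```python
import numpy as np

def ab_comparison_panel(generated_lab, expected_lab):
    """Build a 2x2 grayscale panel: top row 'a', bottom row 'b';
    left column generated, right column expected."""
    def to_gray(channel):
        # Shift the assumed [-128, 127] range into [0, 255] for display.
        return np.clip(channel + 128.0, 0, 255).astype(np.uint8)

    rows = []
    for idx in (1, 2):  # channel 1 is 'a', channel 2 is 'b'
        rows.append(np.hstack([to_gray(generated_lab[..., idx]),
                               to_gray(expected_lab[..., idx])]))
    return np.vstack(rows)

generated = np.zeros((4, 5, 3))
expected = np.zeros((4, 5, 3))
expected[..., 1] = 10.0    # slightly more red than the generated image
expected[..., 2] = -20.0   # slightly more blue than the generated image
panel = ab_comparison_panel(generated, expected)
```

The resulting array can be passed to any image viewer or plotting function that accepts 8-bit grayscale data.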

image

Problems we faced:

  • We were not provided with any sort of dataset.
  • The paper did not describe the data preparation pipeline, only the algorithm.
  • The training process was quite daunting because of the two separate training loops and the visualization tools we had to build.

Results:

image1

image3

image2

image3
