By Naman Sharma, An Lijun, Zhang Miaolin
This work aims to solve the problem of image synthesis from text descriptions. The trained GAN model accepts a text description of an object and tries to generate an image matching that description. This work builds on the work of Scott Reed in his paper Generative Adversarial Text-to-Image Synthesis and its implementation in his GitHub repository. His implementation is referred to as the "original" from now on.
Given that a text description can contain details that are useful at different layers of the network, we propose injecting repeated copies of the text embedding throughout the network. For example, colour may be useful in the lower layers, while texture may be useful in the higher layers. The generator and discriminator architectures are described below.
Discriminator architecture: same as the original architecture.

Generator architecture: text embeddings injected at different layers.

With this architecture, we achieve a lower generator loss because of the injected text embeddings.
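As a rough illustration of the idea, the PyTorch sketch below shows a DCGAN-style generator that re-injects a projected copy of the text embedding at several upsampling stages instead of only at the input. The layer sizes, projection dimension, and stage count are illustrative assumptions, not the exact values used in this repository.

```python
# Minimal sketch (not the repository's exact code) of a generator that
# re-injects the text embedding at multiple layers.
import torch
import torch.nn as nn


class TextInjectedGenerator(nn.Module):
    def __init__(self, noise_dim=100, embed_dim=1024, proj_dim=128, ngf=64):
        super().__init__()
        # Project the sentence embedding to a smaller vector before injection.
        self.projection = nn.Sequential(
            nn.Linear(embed_dim, proj_dim),
            nn.LeakyReLU(0.2, inplace=True),
        )
        # Stage 1: noise + projected text -> 4x4 feature map
        self.stage1 = nn.Sequential(
            nn.ConvTranspose2d(noise_dim + proj_dim, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
        )
        # Stage 2: features + re-injected text -> 8x8
        self.stage2 = nn.Sequential(
            nn.ConvTranspose2d(ngf * 8 + proj_dim, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
        )
        # Stage 3: features + re-injected text -> 16x16
        self.stage3 = nn.Sequential(
            nn.ConvTranspose2d(ngf * 4 + proj_dim, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
        )
        # Remaining upsampling to a 64x64 RGB image.
        self.tail = nn.Sequential(
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def _inject(self, features, text):
        # Tile the projected text vector over the spatial grid and concatenate
        # it with the feature maps along the channel dimension.
        b, _, h, w = features.shape
        text = text.view(b, -1, 1, 1).expand(-1, -1, h, w)
        return torch.cat([features, text], dim=1)

    def forward(self, noise, text_embedding):
        t = self.projection(text_embedding)
        x = torch.cat([noise, t], dim=1).view(noise.size(0), -1, 1, 1)
        x = self.stage1(x)
        x = self.stage2(self._inject(x, t))
        x = self.stage3(self._inject(x, t))
        return self.tail(x)
```

For example, `TextInjectedGenerator()(torch.randn(4, 100), torch.randn(4, 1024))` returns a batch of four 3x64x64 images in the range [-1, 1].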
We also show how the quality of the generated images changes with the number of epochs for which the network is trained.
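One simple way to make this comparison is to generate samples from a fixed noise/text batch at the end of every epoch, as in the sketch below (an illustrative helper, not part of the repository; the output layout under `./results` is an assumption).

```python
# Sketch: save a grid of samples per epoch so quality can be compared visually.
import os
import torch
from torchvision.utils import save_image


def save_epoch_samples(generator, fixed_noise, fixed_text, epoch, out_dir="./results"):
    os.makedirs(out_dir, exist_ok=True)
    generator.eval()
    with torch.no_grad():
        fake = generator(fixed_noise, fixed_text)
    # De-normalize from [-1, 1] (Tanh output) to [0, 1] before saving.
    save_image((fake + 1) / 2, f"{out_dir}/epoch_{epoch:03d}.png", nrow=8)
    generator.train()
```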
Follow these steps to get the datasets and train the network:
- Download the captions for the birds, flowers, and COCO datasets
- Download the images for the birds, flowers, and COCO datasets
- Download the text encoders for the birds, flowers, and COCO descriptions
- Put the downloaded datasets into the `./data` folder
- Run `trainer.py`. Once training is complete, the results can be found in the `./results` folder (see the sketch after this list)
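The snippet below is a small, hypothetical launch helper that checks for the `./data` folder before running `trainer.py`; it only wraps the steps above and assumes no particular command-line flags.

```python
# Sketch: verify the data folder exists, then launch training.
import os
import subprocess


def main():
    if not os.path.isdir("./data"):
        raise SystemExit("Missing ./data: download the captions, images, and "
                         "text encoders first (see the steps above).")
    # Run the trainer; results are written to ./results once training finishes.
    subprocess.run(["python", "trainer.py"], check=True)
    print("Done. Inspect the generated images under ./results")


if __name__ == "__main__":
    main()
```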