[12] ViTGAN: Training GANs with Vision Transformers #12

Open

hyoseok1223 opened this issue Dec 23, 2022 · 1 comment

Assignees

Labels

GAN ICLR '22 Transformer

Contributor

hyoseok1223 commented Dec 23, 2022 •

edited by omocomo

Links

Paper : https://arxiv.org/abs/2107.04589
Github : -

One-line summary

Applies ViT to GANs and achieves performance comparable to CNN-based GANs.

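Reason for selection

I picked this paper because it is one of the first to build a GAN on ViT rather than on a CNN, which made it interesting to read.
It also shows how to address the training instability that appears when ViT is applied to GANs.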

hyoseok1223 assigned omocomo

omocomo added ICLR '22 GAN Transformer labels

omocomo mentioned this issue

[17] StyleSwin: Transformer-based GAN for High-resolution Image Generation #17

Open

Contributor

omocomo commented Jan 4, 2023

[Review Link]
https://omocomo.tistory.com/entry/GAN-ViTGAN-Training-GANs-with-Vision-Transformers

[Summary]

Applies ViT to GANs and proposes a Transformer-based GAN.
Proposes new techniques that stabilize ViTGAN training and help it converge.
Discriminator → enforcing Lipschitzness in self-attention, improved spectral normalization, and overlapping image patches
Generator → self-modulated layer normalization and an implicit neural representation for patch generation
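
To make three of the techniques above concrete, here is a minimal NumPy sketch. It is not the authors' code. The function names (`l2_self_attention`, `self_modulated_layer_norm`, `overlapping_patches`), the weight-matrix arguments, and the use of a single attention head are all assumptions made for illustration. The attention score uses the negative squared L2 distance with shared query/key weights, which is my reading of the Lipschitz self-attention formulation and should be checked against the paper. Improved spectral normalization and the implicit neural representation are left out.

```python
import numpy as np


def l2_self_attention(x, w_qk, w_v):
    """Single-head attention scored by negative squared L2 distance.

    x: (L, D) tokens; w_qk: (D, d_h) projection shared by queries and keys
    (tied weights, an assumption of this sketch); w_v: (D, d_h).
    """
    q = x @ w_qk
    k = q  # queries and keys share one projection
    d_h = q.shape[-1]
    # Pairwise squared distances between every query and every key: (L, L).
    sq_dist = ((q[:, None, :] - k[None, :, :]) ** 2).sum(axis=-1)
    logits = -sq_dist / np.sqrt(d_h)
    logits = logits - logits.max(axis=-1, keepdims=True)  # stable softmax
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ (x @ w_v), weights


def self_modulated_layer_norm(h, w, w_gamma, w_beta, eps=1e-5):
    """Layer norm whose scale and shift are linear functions of a latent w.

    h: (L, D) tokens; w: (Dw,) latent; w_gamma, w_beta: (Dw, D).
    """
    mu = h.mean(axis=-1, keepdims=True)
    sigma = h.std(axis=-1, keepdims=True)
    gamma = w @ w_gamma  # (D,) scale computed from the latent
    beta = w @ w_beta    # (D,) shift computed from the latent
    return gamma * (h - mu) / (sigma + eps) + beta


def overlapping_patches(img, patch, overlap):
    """Cut an (H, W, C) image into flattened overlapping patches.

    Each patch covers (patch + 2 * overlap) pixels per side and the window
    moves with stride `patch`; H and W are assumed divisible by `patch`.
    """
    height, width, _ = img.shape
    padded = np.pad(img, ((overlap, overlap), (overlap, overlap), (0, 0)))
    size = patch + 2 * overlap
    rows = []
    for i in range(0, height, patch):
        for j in range(0, width, patch):
            rows.append(padded[i:i + size, j:j + size].reshape(-1))
    return np.stack(rows)


# Toy usage with random data; all shapes here are arbitrary.
rng = np.random.default_rng(0)
tokens = overlapping_patches(rng.normal(size=(32, 32, 3)), patch=4, overlap=2)
attended, attn = l2_self_attention(
    tokens, rng.normal(size=(192, 16)) * 0.05, rng.normal(size=(192, 16)) * 0.05
)
latent = rng.normal(size=(8,))
normed = self_modulated_layer_norm(
    attended, latent, rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
)
```

Because the query and key projections are tied, each token's distance to itself is zero, so its own attention weight is never smaller than the weight it gives to any other token.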

[Contribution]

One of the first papers to apply ViT to GANs, and it showed performance comparable to CNN-based GANs.
Future research has room to develop Transformer-based GANs further.

[Comment]

It has not yet fully surpassed CNN-based GANs.
Also, the reported results only cover very small images, at 32x32 and 64x64.