AttentionX/Multimodal_Generation_Papers

Multimodal_Generation_Papers

Recommended papers for Introduction to Multimodal Generative Models

Survey Papers

| Year | Title | Venue | Paper | Code |
| --- | --- | --- | --- | --- |
| 2023 | Multimodal Image Synthesis and Editing: The Generative AI Era | TPAMI 2023 | https://arxiv.org/abs/2112.13592 | |
| 2023 | Text-to-image Diffusion Models in Generative AI: A Survey | | https://arxiv.org/abs/2303.07909 | |
| 2023 | Vision + Language Applications: A Survey | GCV@CVPR2023 | https://arxiv.org/abs/2305.14598 | |

Diffusion

| Title | Venue | Code link | Paper link | Year |
| --- | --- | --- | --- | --- |
| High Resolution Image Synthesis with Latent Diffusion Models | CVPR | https://github.com/CompVis/latent-diffusion | https://arxiv.org/abs/2112.10752 | 2022 |
| BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing | | https://github.com/salesforce/LAVIS/blob/59273f651b9bffb193d1b12a235e909e9f826dda/projects/blip-diffusion/README.md | https://arxiv.org/abs/2305.14720 | 2023 |
| 3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models | | | https://arxiv.org/abs/2212.00842 | 2022 |
| HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models | | https://github.com/JiauZhang/hyperdreambooth | https://arxiv.org/abs/2307.06949 | 2023 |
| Magic3D: High-Resolution Text-to-3D Content Creation | CVPR | | https://arxiv.org/abs/2211.10440v2 | 2023 |
| DreamBooth3D: Subject-Driven Text-to-3D Generation | ICCV | | https://arxiv.org/abs/2303.13508 | 2023 |
| Conditional Text Image Generation With Diffusion Models | | | https://arxiv.org/abs/2306.10804 | 2023 |
| DCFace: Synthetic Face Generation with Dual Condition Diffusion Model | CVPR | https://github.com/mk-minchul/dcface | https://arxiv.org/abs/2304.07060 | 2023 |
| 3D Neural Field Generation using Triplane Diffusion | CVPR | https://github.com/JRyanShue/NFD | https://arxiv.org/abs/2211.16677v1 | 2023 |
| DiffCollage: Parallel Generation of Large Content With Diffusion Models | CVPR | https://github.com/sbyebss/DiffCollage | https://arxiv.org/abs/2303.17076v1 | 2023 |
| Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models | CVPR | | https://arxiv.org/abs/2212.14704 | 2023 |
| DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | CVPR | https://github.com/google/dreambooth | https://arxiv.org/abs/2208.12242 | 2023 |
| LayoutDiffusion: Controllable Diffusion Model for Layout-to-Image Generation | CVPR | https://github.com/zgctroy/layoutdiffusion | https://arxiv.org/abs/2303.17189 | 2023 |
| LayoutDM: Transformer-Based Diffusion Model for Layout Generation | CVPR | | https://arxiv.org/abs/2305.02567 | 2023 |
| NeuralField-LDM: Scene Generation With Hierarchical Latent Diffusion Models | CVPR | | https://arxiv.org/abs/2304.09787 | 2023 |
| Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation | CVPR | https://github.com/pals-ttic/sjc/ | https://arxiv.org/abs/2212.00774 | 2023 |
| SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation | CVPR | https://github.com/yccyenchicheng/SDFusion | https://arxiv.org/abs/2212.04493 | 2023 |
| Shifted Diffusion for Text-to-Image Generation | CVPR | https://github.com/drboog/Shifted_Diffusion | https://arxiv.org/abs/2211.15388 | 2023 |
| SINE: SINgle Image Editing with Text-to-Image Diffusion Models | CVPR | https://github.com/zhang-zx/sine | https://arxiv.org/abs/2212.04489 | 2023 |
| SparseFusion: Distilling View-Conditioned Diffusion for 3D Reconstruction | ICCV | https://github.com/yichen928/sparsefusion | https://arxiv.org/abs/2304.14340 | 2023 |
| Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models | CVPR | https://github.com/ucsb-nlp-chang/diffusiondisentanglement | https://arxiv.org/abs/2212.08698 | 2023 |
| VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation | CVPR | | https://arxiv.org/abs/2303.08320 | 2023 |
| Multi-Concept Customization of Text-to-Image Diffusion | CVPR | https://github.com/adobe-research/custom-diffusion | https://arxiv.org/abs/2212.04488 | 2023 |
| RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation | CVPR | https://github.com/Anciukevicius/RenderDiffusion | https://arxiv.org/abs/2211.09869 | 2023 |

NeRF

| Year | Title | Venue | Paper | Code |
| --- | --- | --- | --- | --- |
| 2021 | FENeRF: Face Editing in Neural Radiance Fields | CVPR | https://arxiv.org/abs/2111.15490 | https://github.com/MrTornado24/FENeRF |
| 2021 | StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis | ICLR | https://arxiv.org/abs/2110.08985 | https://github.com/facebookresearch/StyleNeRF |
| 2022 | 3DMM-RF: Convolutional Radiance Fields for 3D Face Modeling | | https://arxiv.org/abs/2209.07366 | |
| 2021 | MoFaNeRF: Morphable Facial Neural Radiance Field | ECCV | https://arxiv.org/abs/2112.02308 | https://github.com/zhuhao-nju/mofanerf |
| 2023 | ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field | CVPR | https://arxiv.org/abs/2303.13817 | https://github.com/TangZJ/able-nerf |
| 2022 | CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields | CVPR | https://arxiv.org/abs/2112.05139 | https://github.com/cassiePython/CLIPNeRF |
| 2023 | Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields | ICCV | https://arxiv.org/abs/2306.12760 | https://github.com/orig333/Blended-NeRF |
| 2023 | 3D-Aware Multi-Class Image-to-Image Translation with NeRFs | CVPR | https://arxiv.org/abs/2303.15012 | https://github.com/sen-mao/3di2i-translation |

Text-to-NeRF

| Year | Title | Venue | arXiv link | GitHub link |
| --- | --- | --- | --- | --- |
| 2022 | Zero-Shot Text-Guided Object Generation with Dream Fields | CVPR | https://arxiv.org/abs/2112.01455 | https://github.com/ashawkey/dreamfields-torch |
| 2022 | CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields | CVPR | https://arxiv.org/abs/2112.05139 | https://github.com/cassiePython/CLIPNeRF |
| 2022 | DreamFusion: Text-to-3D using 2D Diffusion | arXiv | https://arxiv.org/abs/2209.14988 | |
| 2023 | NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360° Views | CVPR | https://arxiv.org/abs/2211.16431 | https://github.com/VITA-Group/NeuralLift-360 |
| 2023 | Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures | CVPR | https://arxiv.org/abs/2211.07600 | https://github.com/eladrich/latent-nerf |
| 2023 | SKED: Sketch-guided Text-based 3D Editing | ICCV | https://arxiv.org/abs/2303.10735 | https://github.com/aryanmikaeili/SKED |
| 2023 | 3D-CLFusion: Fast Text-to-3D Rendering with Contrastive Latent Diffusion | arXiv | https://arxiv.org/abs/2303.11938 | |
| 2023 | Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions | ICCV | https://arxiv.org/abs/2303.12789 | https://github.com/ayaanzhaque/instruct-nerf2nerf |
| 2023 | CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout | arXiv | https://arxiv.org/abs/2303.13843 | |
| 2023 | DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models | arXiv | https://arxiv.org/abs/2304.00916 | |
| 2023 | DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model | arXiv | https://arxiv.org/abs/2304.02827 | https://github.com/janeyeon/ditto-nerf-code |
| 2023 | Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields | arXiv | https://arxiv.org/abs/2305.11588 | https://github.com/eckertzhang/Text2NeRF |
| 2023 | Towards Language-guided Interactive 3D Generation: LLMs as Layout Interpreter with Generative Feedback | arXiv | https://arxiv.org/abs/2305.15808 | |
