Implement text decoding submodule for ME2-VD #16

Brain Diffuser decodes both the CLIP text vectors for the image captions and the CLIP image vectors for the images themselves. Because Brain Diffuser is currently the best performer on imagery, one potential avenue for improving imagery reconstructions is to incorporate text decoding for dual guidance in Versatile Diffusion. This issue tracks implementing a decoding submodule for the caption information, potentially with its own diffusion prior if GPU memory allows.
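As a starting point for discussion, here is a minimal sketch of what a caption-decoding head could look like. It is not the ME2-VD implementation: it assumes a plain ridge regression from voxel features to flattened CLIP text embeddings as a lightweight stand-in for a learned submodule or diffusion prior. The function names (`fit_ridge`, `predict_text_embedding`), the toy dimensions, and the synthetic data are all hypothetical and only illustrate the shapes involved.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def predict_text_embedding(X, W, n_tokens, dim):
    """Project voxels to text-embedding space and restore (tokens, dim) layout."""
    return (X @ W).reshape(-1, n_tokens, dim)


# Synthetic stand-ins (assumptions): real inputs would be fMRI voxel
# features and CLIP text embeddings of the image captions.
rng = np.random.default_rng(0)
n_samples, n_voxels = 200, 64
n_tokens, dim = 77, 8  # toy embedding width, kept small for the example

voxels = rng.standard_normal((n_samples, n_voxels))
true_W = rng.standard_normal((n_voxels, n_tokens * dim))
noise = 0.01 * rng.standard_normal((n_samples, n_tokens * dim))
text_emb = voxels @ true_W + noise

# Fit the voxel-to-text mapping and decode a few samples.
W = fit_ridge(voxels, text_emb, alpha=1e-3)
pred = predict_text_embedding(voxels[:5], W, n_tokens, dim)
print(pred.shape)
```

The decoded text embeddings would then be passed alongside the decoded image embeddings as the second conditioning input for dual guidance. A linear head like this costs very little GPU memory, so it could serve as a baseline before committing memory to a separate diffusion prior for the text branch.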
