This repository contains the academic website for our CVPR 2025 paper on uncertainty quantification in text-to-image (T2I) generative models. We introduce PUNC (Prompt-based UNCertainty Estimation), a novel method that leverages Large Vision-Language Models (LVLMs) to better address uncertainties arising from the semantics of prompts and generated images.
- First work to quantify and evaluate uncertainty of T2I models with respect to the prompt
- Novel PUNC method using LVLMs for semantic uncertainty estimation in text space
- Disentanglement of aleatoric and epistemic uncertainty via precision and recall
- Comprehensive dataset of text prompts and generation pairs for further research
- Practical applications in bias detection, copyright protection, and OOD detection
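
To make the precision and recall idea above concrete, here is a minimal sketch that compares a prompt with a caption of the generated image in text space. It is an illustration under stated assumptions, not the PUNC implementation: the caption is a hard-coded, hypothetical stand-in for LVLM output, the helper names `tokens` and `precision_recall` are invented for this example, and plain token overlap replaces whatever text-similarity metric the paper uses. How each score relates to aleatoric or epistemic uncertainty is defined in the paper and is not reproduced here.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens (a crude stand-in for a real metric)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def precision_recall(prompt, caption):
    """Token-overlap precision and recall of a caption measured against the prompt.

    precision: fraction of caption tokens that also appear in the prompt
    recall:    fraction of prompt tokens that also appear in the caption
    """
    prompt_tokens = tokens(prompt)
    caption_tokens = tokens(caption)
    overlap = len(prompt_tokens & caption_tokens)
    precision = overlap / len(caption_tokens) if caption_tokens else 0.0
    recall = overlap / len(prompt_tokens) if prompt_tokens else 0.0
    return precision, recall


# Hypothetical example: the caption would normally come from an LVLM
# describing the image that the T2I model generated for the prompt.
prompt = "a red car parked near a tree"
caption = "a red car on a street"

precision, recall = precision_recall(prompt, caption)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the caption drops some prompt concepts ("parked", "near", "tree") and adds others ("on", "street"), so both scores fall below 1. A real pipeline would average such scores over several generations per prompt.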
- Clone the repository
git clone https://github.com/ENSTA-U2IS-AI/Uncertainty_diffusion.git
cd Uncertainty_diffusion
- Gianni Franchi - ENSTA Paris
- Nacim Belkhir - mirai
- Dat Trong NGUYEN - ENSTA Paris
- Guoxuan Xia - Imperial College London
- Andrea Pilzer - NVIDIA
@inproceedings{franchi2024uncertainty,
title={Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation},
author={Franchi, Gianni and Belkhir, Nacim and Nguyen, Dat Trong and Xia, Guoxuan and Pilzer, Andrea},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}
This work was performed using HPC resources from:
- GENCI-IDRIS (Grant 2023-[AD011011970R3])
- EuroHPC Development access to LEONARDO, hosted by CINECA
- Design & Development: Nacim Belkhir
This project is open source. Please cite our work if you use this code or website template.
- 📄 Paper: arXiv:2412.03178
- 💻 Code: GitHub Repository
- 🎓 Conference: CVPR 2025
For questions about the research, please contact the authors. For website technical issues, contact Nacim Belkhir.