This project is a beta release that supports running the new Stability AI model, Stable Diffusion 3 Medium. Suggestions for improvements are welcome!
We implemented optimizations for local execution, such as CPU offloading and quantization of the MMDiT transformer model and the T5-XXL text encoder.
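As a rough illustration of why quantization helps on consumer GPUs, the sketch below estimates weight-only memory footprints at each precision. The parameter counts are assumptions for illustration (roughly 2B for the SD3 Medium MMDiT and 4.7B for T5-XXL), not figures measured from this project:

```python
# Rough weight-memory estimate per precision level.
# Parameter counts are assumptions, not measured from this repo:
# SD3 Medium MMDiT ~2e9 params, T5-XXL ~4.7e9 params.
PARAMS = {"mmdit": 2_000_000_000, "t5_xxl": 4_700_000_000}
BYTES_PER_PARAM = {"16bit": 2.0, "8bit": 1.0, "4bit": 0.5}

def weight_gib(model: str, precision: str) -> float:
    """Approximate weight memory in GiB (weights only, no activations/KV)."""
    return PARAMS[model] * BYTES_PER_PARAM[precision] / 1024**3

for model in PARAMS:
    for precision in BYTES_PER_PARAM:
        print(f"{model} @ {precision}: ~{weight_gib(model, precision):.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is why 8-bit or 4-bit loading can make the difference between fitting in VRAM and not.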
To install, run the following commands at the command prompt:
git clone https://github.com/DEVAIEXP/SD3.git
cd SD3
# create and activate a virtual environment
python -m venv venv
# on Linux:
source venv/bin/activate
# on Windows:
.\venv\Scripts\activate.bat
# install the requirements
pip install torch==2.2.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 --upgrade
pip install bitsandbytes==0.43.1 --upgrade
pip install -r requirements.txt
# normal run: full 16-bit precision, no quantization
python app.py
# to enable CPU offloading, pass the --lowvram parameter. Note: do not combine this parameter with the quantization parameters below.
python app.py --lowvram
# to quantize the MMDiT model, pass the --mmdit_load_mode parameter with the value 8bit or 4bit
python app.py --mmdit_load_mode=8bit
# to quantize the T5-XXL text encoder, pass the --t5_load_mode parameter with the value 8bit or 4bit
python app.py --t5_load_mode=8bit
# or pass both parameters to quantize both models
python app.py --mmdit_load_mode=8bit --t5_load_mode=8bit
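The command-line interface above could be wired up with `argparse` roughly as follows. This is a sketch of the flags as documented here, not the actual parsing code in app.py, which may differ:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Sketch of the CLI flags described above; app.py's real parser may differ.
    parser = argparse.ArgumentParser(description="SD3 Medium demo")
    parser.add_argument("--lowvram", action="store_true",
                        help="enable CPU offloading (do not combine with the quantization flags)")
    parser.add_argument("--mmdit_load_mode", choices=["8bit", "4bit"], default=None,
                        help="quantize the MMDiT model")
    parser.add_argument("--t5_load_mode", choices=["8bit", "4bit"], default=None,
                        help="quantize the T5-XXL text encoder")
    return parser

# Example: parse the flags from the combined-quantization invocation above.
args = build_parser().parse_args(["--mmdit_load_mode=8bit", "--t5_load_mode=8bit"])
print(args.lowvram, args.mmdit_load_mode, args.t5_load_mode)  # prints: False 8bit 8bit
```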
Below is a matrix comparing the output images across quantization levels of the MMDiT model and the T5-XXL text encoder:
| MMDiT \ T5-XXL precision | 16bit | 8bit | 4bit |
|---|---|---|---|
| 16bit | | | |
| 8bit | | | |
| 4bit | | | |
This project is released under the Apache License 2.0.
If you have any questions, please contact: [email protected]