
Enabling transformer and T5 to be quantized with different types #173

Open

omidsakhi wants to merge 3 commits into main

Conversation

omidsakhi

Enables the transformer and T5 to be quantized with different types, including qint8, qfloat8_e4m3fn (qfloat8 is an alias for it), and qfloat8_e5m2, by specifying quantization_type_transformer or quantization_type_t5 in the model section of a config YAML file.

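For illustration, a minimal sketch of a `model` section using the new keys. The two `quantization_type_*` keys are the ones this PR adds; the surrounding keys are assumptions based on a typical ai-toolkit Flux config and may differ in your setup:

```yaml
model:
  name_or_path: "black-forest-labs/FLUX.1-dev"  # assumed model; not part of this PR
  quantize: true
  # Keys added by this PR; each accepts qint8,
  # qfloat8_e4m3fn (alias: qfloat8), or qfloat8_e5m2
  quantization_type_transformer: "qfloat8_e4m3fn"
  quantization_type_t5: "qint8"
```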
@omidsakhi (Author)

A bit of backstory on why this PR exists. I have the following setup:

Windows
RTX 4090
48GB RAM
CUDA 12.4
Python 3.12.6
PyTorch 2.4.1+cu124
transformers 4.44.2
diffusers 0.31.0.dev0
optimum-quanto 0.2.4

I found that ai-toolkit is unable to generate pre-training samples or to train because of qfloat8 quantization. Generation produces black (blank) images for me because it encounters invalid values; training encounters inf values. The workaround I have found so far is to switch the quantization from qfloat8 to qint8 for both the transformer and T5. At this point it is not clear which of the modules above is causing the qfloat8 quantization to fail.
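A hedged sketch of that workaround using the keys this PR adds, pinning both modules to qint8 (key names are from this PR; the rest of the config is assumed):

```yaml
model:
  quantize: true
  quantization_type_transformer: "qint8"  # avoids the black-image/inf failures seen with qfloat8
  quantization_type_t5: "qint8"
```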
