I have also tried the script with the following dependencies to bisect the issue:
Torch: 2.2.1
TensorRT: 8.6.1
torch_tensorrt: 2.2.0
Python: 3.11
CUDA: 12.1
With these dependencies, the script also works as expected (good results).
TS INT8 degradation in later versions
Hi all, I get a degradation in results after INT8 quantization with TorchScript, after updating my torch, tensorrt, and torch_tensorrt versions. I have listed the dependencies for both cases below; is this expected?
Earlier Version (Works Well):
Torch: 2.0.1
CUDA: 11.8
torch_tensorrt: 1.4.0
TensorRT: 8.5.3.1
GPU: A100
Python: 3.9
Later Version (Degradation in Results):
Torch: 2.4.0
CUDA: 12.1
torch_tensorrt: 2.4.0
TensorRT: 10.1.0
GPU: A100
Python: 3.11
Script (Approximately, as I can't submit the model):
Note: in the later version, the import needs to change from import torch_tensorrt.ptq to import torch_tensorrt.ts.ptq; the rest of the script is identical.
While the earlier versions work well (the quantized model produces results close enough to the original model), the later version produces garbage outputs. Something seems to be wrong with the calibration: the output tensor values always fall within a small range (0.18-0.21), whereas they should take any value between -1 and 1. I'm describing the quantization script approximately; unfortunately, I cannot post the model details, as the model is proprietary.
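Since the original script can't be shared, here is a minimal sketch of the only change between versions per the note above, the PTQ import path. The version-probing helper below is a hypothetical convenience, not part of the original script; the torch_tensorrt module paths themselves are the real ones from 1.4.x and 2.x.

```python
import importlib


def load_ptq():
    """Return the torch_tensorrt PTQ module, accounting for the path move
    from torch_tensorrt.ptq (1.4.x) to torch_tensorrt.ts.ptq (2.x)."""
    for name in ("torch_tensorrt.ts.ptq", "torch_tensorrt.ptq"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("no torch_tensorrt PTQ module found; is torch_tensorrt installed?")


# Typical use (sketch): ptq = load_ptq(); build a ptq.DataLoaderCalibrator
# from a calibration DataLoader, then pass it as the calibrator to
# torch_tensorrt.compile(..., enabled_precisions={torch.int8}).
```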
I would appreciate any help :) and would also be happy to submit a fix for the underlying issue (if one is present).
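As a quick diagnostic for the symptom described above (outputs pinned to a narrow band instead of spanning [-1, 1]), a small hypothetical helper, not part of the original script, can flag a collapsed output range after calibration:

```python
def output_range_collapsed(values, expected_span=2.0, threshold=0.1):
    """Return True when the observed output span is suspiciously small.

    values: flat sequence of model output values that should roughly span
    [-1, 1] (expected_span = 2.0). A span below threshold * expected_span
    (default 10%) suggests broken INT8 calibration, e.g. everything
    landing in [0.18, 0.21].
    """
    span = max(values) - min(values)
    return span < threshold * expected_span


print(output_range_collapsed([0.18, 0.19, 0.20, 0.21]))  # True: collapsed
print(output_range_collapsed([-0.95, -0.3, 0.4, 0.98]))  # False: healthy
```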