Unfortunately, no. Quantized ONNX models can only be run on CPU, and onnxruntime-gpu does not support quantization.
If you want more details on this question, you should create an issue in the onnxruntime repository.
@mblank5 No. This library uses onnxruntime, and to support GPUs you need to have onnxruntime-gpu installed.
However, you can uninstall onnxruntime after the fastt5 library is installed, install onnxruntime-gpu, and try running the model, but I'm not sure you'll get a speedup. For more info, refer to this issue.
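If you try the onnxruntime-gpu swap, it may help to check which execution providers the installed build exposes before loading the model. The following is only a rough sketch, not part of fastt5: `pick_providers`, `open_session`, and the preference order are hypothetical names and choices made for illustration, and the model path is whatever ONNX file you exported.

```python
from typing import List, Sequence

# Assumption for this sketch: prefer CUDA when present, otherwise use CPU.
PREFERRED = ("CUDAExecutionProvider", "CPUExecutionProvider")


def pick_providers(available: Sequence[str]) -> List[str]:
    """Return the preferred providers that are actually available, in order."""
    chosen = [p for p in PREFERRED if p in available]
    # Fall back to CPU if nothing from the preferred list is reported.
    return chosen or ["CPUExecutionProvider"]


def open_session(model_path: str):
    """Create an InferenceSession with the providers this install offers."""
    # Imported lazily so pick_providers() can be used without onnxruntime.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=providers)
    # get_providers() reports what the session ended up using.
    return session, session.get_providers()
```

Calling `open_session` on one of your exported model files and printing the second return value shows whether the session is running on CUDA or has fallen back to CPU. It does not tell you whether a quantized model will be any faster there.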
With onnxruntime, you'll get a speedup if you are using a modern CPU with more cores. Refer to the benchmark section of the README.
Just wondering whether this quantized ONNX T5 model can run on GPU.