Can I serve my model with Triton Inference Server? #761
Comments
after torch-tensorrt compile |
We are currently working on integration with Triton's LibTorch backend so that this workflow is supported out of the box. cc: @borisfom |
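Until that integration is available, one workaround (a minimal sketch, not an official recipe; the model, input shape, and output path below are placeholder assumptions) is to compile with Torch-TensorRT, keep the result as TorchScript, and serve it through Triton's PyTorch (LibTorch) backend:

import torch
import torch_tensorrt

# Placeholder model and input; substitute your own network and shape.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval().cuda()
example_input = torch.randn(1, 3, 224, 224).cuda()

# torch_tensorrt.compile returns a TorchScript module with embedded TRT engines.
trt_module = torch_tensorrt.compile(
    model,
    inputs=[example_input],
    enabled_precisions={torch.half},
)

# Save it where Triton's PyTorch backend expects it:
# model_repository/<model_name>/1/model.pt
torch.jit.save(trt_module, "model.pt")

With the saved model.pt in a model repository and a config.pbtxt that sets platform: "pytorch_libtorch", Triton should be able to load it as a regular TorchScript model, provided the PyTorch and Triton containers ship compatible TensorRT/libtorch builds.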
Can we serve it with the tensorrt backend in Triton? I tried converting to a TRT engine using the following code:

trt_ts_engine = torch_tensorrt.ts.convert_method_to_trt_engine(
    traced_script_module,
    method_name='forward',
    inputs=inputs,
    truncate_long_and_double=True,
)
with open(f"{OUT_PATH}/model.plan", 'wb') as f:
    f.write(trt_ts_engine)

But I get a "Version tag does not match" error in Triton. |
Using: |
It's working for me for fp32/fp16 with nvcr.io/nvidia/pytorch:21.11-py3 and nvcr.io/nvidia/tritonserver:21.11-py3 images. |
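For reference, a sketch of the model repository layout the tensorrt backend expects (the repository path and model name are placeholders; input/output sections are omitted, since Triton can usually auto-complete them from a TensorRT plan, but they can also be spelled out explicitly):

import pathlib
import shutil

OUT_PATH = "."  # same placeholder as in the conversion snippet above
repo = pathlib.Path("model_repository/my_model")
(repo / "1").mkdir(parents=True, exist_ok=True)

# Copy the serialized engine into version directory 1.
shutil.copy(f"{OUT_PATH}/model.plan", repo / "1" / "model.plan")

# Minimal config: "tensorrt_plan" is the platform for serialized TensorRT engines.
(repo / "config.pbtxt").write_text(
    'name: "my_model"\n'
    'platform: "tensorrt_plan"\n'
)

The key constraint, as noted above, is that the TensorRT version in the PyTorch container used to build model.plan matches the one in the Triton image (e.g. both 21.11).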
This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days. |
Any updates? Is this working yet on Triton server? |