This directory contains documents related to the HTTP/REST and GRPC protocols used by Triton. Triton uses the KServe community standard inference protocols plus several extensions that are defined in the following documents:
- Binary tensor data extension
- Classification extension
- Schedule policy extension
- Sequence extension
- Shared-memory extension
- Model configuration extension
- Model repository extension
- Statistics extension
- Trace extension
- Logging extension
- Parameters extension
Note that some extensions introduce new fields onto the inference protocols, while others define new protocols that Triton follows. Please refer to the extension documents for details.
For the GRPC protocol, the protobuf specification is also available. In addition, you can find the GRPC health checking protocol protobuf specification here.
You can configure the Triton endpoints, which implement the protocols, to restrict access to some protocols and to control network settings. Please refer to the protocol customization guide for details.
Assuming your host or docker config supports IPv6 connections, tritonserver can be configured to use IPv6 HTTP endpoints as follows:
$ tritonserver ... --http-address ipv6:[::1]&
...
I0215 21:04:11.572305 571 grpc_server.cc:4868] Started GRPCInferenceService at 0.0.0.0:8001
I0215 21:04:11.572528 571 http_server.cc:3477] Started HTTPService at ipv6:[::1]:8000
I0215 21:04:11.614167 571 http_server.cc:184] Started Metrics Service at ipv6:[::1]:8002
This can be confirmed via netstat, for example:
$ netstat -tulpn | grep tritonserver
tcp6 0 0 :::8000 :::* LISTEN 571/tritonserver
tcp6 0 0 :::8001 :::* LISTEN 571/tritonserver
tcp6 0 0 :::8002 :::* LISTEN 571/tritonserver
The endpoint can also be tested via curl, for example:
$ curl -6 --verbose "http://[::1]:8000/v2/health/ready"
* Trying ::1:8000...
* TCP_NODELAY set
* Connected to ::1 (::1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: [::1]:8000
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain
<
* Connection #0 to host ::1 left intact
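The same readiness check can be scripted. The following is a minimal sketch using only the Python standard library, assuming the server listens on `[::1]:8000` as in the example above. The helper names `ready_url` and `is_ready` are hypothetical and are not part of any Triton client library.

```python
import ipaddress
import urllib.error
import urllib.request


def ready_url(host: str, port: int = 8000) -> str:
    """Build the readiness URL, bracketing IPv6 literals as URLs require."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # not an IP literal, e.g. a hostname
    netloc = f"[{host}]" if is_v6 else host
    return f"http://{netloc}:{port}/v2/health/ready"


def is_ready(host: str = "::1", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if GET /v2/health/ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(ready_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False
```

Calling `is_ready()` with the defaults corresponds to the curl command above; it returns `False` rather than raising if the server is unreachable or answers with an error status.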
This table maps various Triton Server error codes to their corresponding HTTP status codes. It can be used as a reference guide for understanding how Triton Server errors are handled in HTTP responses.
| Triton Server Error Code | HTTP Status Code | Description |
|---|---|---|
| TRITONSERVER_ERROR_INTERNAL | 500 | Internal Server Error |
| TRITONSERVER_ERROR_NOT_FOUND | 404 | Not Found |
| TRITONSERVER_ERROR_UNAVAILABLE | 503 | Service Unavailable |
| TRITONSERVER_ERROR_UNSUPPORTED | 501 | Not Implemented |
| TRITONSERVER_ERROR_UNKNOWN, TRITONSERVER_ERROR_INVALID_ARG, TRITONSERVER_ERROR_ALREADY_EXISTS, TRITONSERVER_ERROR_CANCELLED | 400 | Bad Request (default for other errors) |
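For client code or tests that need to reason about this mapping, the table can be expressed as a small lookup. This is an illustrative sketch transcribed from the table above, not Triton's own implementation; the names `HTTP_STATUS_BY_ERROR` and `http_status_for` are hypothetical.

```python
# Mapping transcribed from the table above; not taken from Triton's source.
HTTP_STATUS_BY_ERROR = {
    "TRITONSERVER_ERROR_INTERNAL": 500,     # Internal Server Error
    "TRITONSERVER_ERROR_NOT_FOUND": 404,    # Not Found
    "TRITONSERVER_ERROR_UNAVAILABLE": 503,  # Service Unavailable
    "TRITONSERVER_ERROR_UNSUPPORTED": 501,  # Not Implemented
}

# UNKNOWN, INVALID_ARG, ALREADY_EXISTS, CANCELLED and any other error
# fall back to 400 Bad Request, per the last row of the table.
DEFAULT_STATUS = 400


def http_status_for(error_code: str) -> int:
    """Return the HTTP status the table assigns to a Triton error code."""
    return HTTP_STATUS_BY_ERROR.get(error_code, DEFAULT_STATUS)
```

Using a dictionary with a default keeps the "400 for everything else" rule in one place instead of listing each remaining error code explicitly.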