Triton server crash on hitting inference endpoint #258

Open
vaibhavjainwiz opened this issue Oct 30, 2024 · 3 comments
vaibhavjainwiz commented Oct 30, 2024

The Triton Inference Server restarts every time I hit the /infer endpoint. I am using KServe to deploy the model on K8s.

Input :

curl --location 'https://<url>/v2/models/dali/infer' \ --header 'Content-Type: application/json' \ --data '{ "inputs": [ { "name": "DALI_INPUT_0", "shape": [ 1699 ], "datatype": "UINT8", "data": [ 255, 216, 255, 224, 0, 16, 74, 70, 73, 70, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 255, 226, 1, 216, 73, 67, 67, 95, 80, 82, 79, 70, 73, 76, 69, 0, 1, 1, 0, 0, 1, 200, 0, 0, 0, 0, 4, 48, 0, 0, 109, 110, 116, 114, 82, 71, 66, 32, 88, 89, 90, 32, 7, 224, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 97, 99, 115, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 246, 214, 0, 1, 0, 0, 0, 0, 211, 45, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 100, 101, 115, 99, 0, 0, 0, 240, 0, 0, 0, 36, 114, 88, 89, 90, 0, 0, 1, 20, 0, 0, 0, 20, 103, 88, 89, 90, 0, 0, 1, 40, 0, 0, 0, 20, 98, 88, 89, 90, 0, 0, 1, 60, 0, 0, 0, 20, 119, 116, 112, 116, 0, 0, 1, 80, 0, 0, 0, 20, 114, 84, 82, 67, 0, 0, 1, 100, 0, 0, 0, 40, 103, 84, 82, 67, 0, 0, 1, 100, 0, 0, 0, 40, 98, 84, 82, 67, 0, 0, 1, 100, 0, 0, 0, 40, 99, 112, 114, 116, 0, 0, 1, 140, 0, 0, 0, 60, 109, 108, 117, 99, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 12, 101, 110, 85, 83, 0, 0, 0, 8, 0, 0, 0, 28, 0, 115, 0, 82, 0, 71, 0, 66, 88, 89, 90, 32, 0, 0, 0, 0, 0, 0, 111, 162, 0, 0, 56, 245, 0, 0, 3, 144, 88, 89, 90, 32, 0, 0, 0, 0, 0, 0, 98, 153, 0, 0, 183, 133, 0, 0, 24, 218, 88, 89, 90, 32, 0, 0, 0, 0, 0, 0, 36, 160, 0, 0, 15, 132, 0, 0, 182, 207, 88, 89, 90, 32, 0, 0, 0, 0, 0, 0, 246, 214, 0, 1, 0, 0, 0, 0, 211, 45, 112, 97, 114, 97, 0, 0, 0, 0, 0, 4, 0, 0, 0, 2, 102, 102, 0, 0, 242, 167, 0, 0, 13, 89, 0, 0, 19, 208, 0, 0, 10, 91, 0, 0, 0, 0, 0, 0, 0, 0, 109, 108, 117, 99, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 12, 101, 110, 85, 83, 0, 0, 0, 32, 0, 0, 0, 28, 0, 71, 0, 111, 0, 111, 0, 103, 0, 108, 0, 101, 0, 32, 0, 73, 0, 110, 0, 99, 0, 46, 0, 32, 0, 50, 0, 48, 0, 49, 0, 54, 255, 219, 0, 67, 0, 255, 255, 255, 255, 255, 255, 
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 219, 0, 67, 1, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 192, 0, 17, 8, 0, 214, 0, 236, 3, 1, 34, 0, 2, 17, 1, 3, 17, 1, 255, 196, 0, 23, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 255, 196, 0, 39, 16, 1, 1, 1, 0, 1, 2, 5, 4, 3, 1, 1, 0, 0, 0, 0, 0, 0, 1, 17, 33, 49, 81, 2, 18, 65, 97, 240, 113, 129, 145, 177, 161, 209, 225, 193, 241, 255, 196, 0, 21, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 255, 196, 0, 20, 17, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 218, 0, 12, 3, 1, 0, 2, 17, 3, 17, 0, 63, 0, 136, 168, 0, 101, 92, 160, 138, 101, 48, 1, 115, 58, 241, 251, 103, 96, 42, 179, 230, 250, 38, 208, 108, 99, 111, 122, 104, 54, 48, 3, 99, 26, 186, 13, 33, 166, 128, 47, 11, 192, 50, 47, 6, 125, 193, 17, 64, 64, 5, 13, 64, 70, 181, 53, 0, 86, 227, 42, 43, 106, 198, 174, 136, 169, 178, 51, 118, 160, 22, 109, 217, 207, 237, 156, 106, 55, 192, 56, 142, 185, 62, 72, 153, 62, 111, 246, 14, 99, 121, 19, 39, 127, 159, 128, 100, 107, 39, 115, 39, 112, 64, 227, 184, 2, 160, 10, 7, 32, 44, 185, 253, 50, 3, 83, 149, 73, 209, 81, 81, 26, 1, 129, 172, 76, 84, 65, 64, 105, 89, 211, 81, 90, 19, 77, 5, 42, 106, 1, 27, 98, 54, 168, 140, 180, 153, 111, 32, 202, 53, 229, 167, 148, 25, 26, 242, 251, 158, 95, 127, 224, 25, 23, 12, 4, 84, 192, 20, 50, 153, 64, 69, 64, 15, 186, 128, 178, 247, 93, 97, 65, 68, 68, 85, 64, 84, 0, 0, 0, 22, 34, 192, 88, 211, 49, 160, 42, 91, 101, 
146, 127, 234, 180, 2, 42, 2, 111, 207, 182, 155, 252, 159, 63, 140, 78, 62, 117, 5, 212, 211, 100, 77, 128, 186, 168, 160, 51, 183, 113, 165, 69, 115, 69, 168, 168, 42, 0, 0, 10, 138, 128, 0, 0, 0, 0, 2, 192, 128, 173, 50, 208, 13, 185, 182, 2, 43, 55, 248, 128, 127, 17, 157, 237, 18, 221, 73, 160, 187, 126, 72, 125, 191, 226, 85, 244, 160, 125, 63, 13, 75, 172, 47, 191, 172, 6, 213, 34, 162, 185, 223, 84, 90, 138, 128, 214, 24, 12, 128, 10, 141, 78, 139, 130, 176, 55, 137, 130, 53, 229, 135, 150, 42, 160, 207, 150, 47, 150, 40, 42, 100, 49, 80, 24, 84, 85, 65, 191, 72, 195, 115, 164, 250, 2, 86, 47, 73, 239, 203, 126, 46, 149, 206, 243, 103, 210, 1, 47, 114, 251, 25, 103, 43, 61, 61, 253, 125, 251, 118, 128, 153, 126, 83, 203, 90, 253, 231, 29, 247, 250, 250, 150, 241, 123, 231, 40, 51, 48, 151, 148, 51, 20, 111, 195, 235, 59, 86, 153, 157, 111, 217, 164, 87, 52, 84, 84, 116, 157, 4, 84, 87, 49, 81, 81, 185, 208, 39, 65, 20, 0, 26, 0, 0, 0, 0, 24, 84, 85, 65, 169, 210, 50, 212, 232, 5, 233, 92, 239, 165, 246, 253, 58, 49, 103, 167, 222, 127, 64, 158, 107, 211, 78, 139, 47, 135, 213, 60, 87, 111, 0, 187, 61, 254, 210, 67, 102, 100, 137, 101, 157, 125, 83, 211, 64, 206, 53, 103, 54, 36, 189, 86, 113, 61, 239, 79, 167, 112, 106, 122, 222, 245, 111, 74, 78, 15, 23, 68, 86, 17, 81, 81, 208, 189, 4, 168, 172, 163, 117, 133, 70, 224, 8, 160, 32, 52, 168, 170, 128, 0, 0, 131, 0, 40, 55, 56, 156, 176, 185, 178, 115, 152, 13, 37, 154, 112, 3, 23, 241, 127, 100, 201, 121, 149, 171, 202, 101, 244, 191, 144, 75, 118, 158, 153, 12, 189, 162, 243, 223, 62, 128, 153, 157, 127, 31, 219, 82, 122, 222, 169, 38, 40, 42, 91, 58, 46, 198, 120, 230, 162, 162, 4, 84, 116, 61, 69, 130, 167, 137, 136, 215, 137, 32, 141, 9, 201, 202, 40, 28, 128, 162, 128, 156, 156, 170, 130, 10, 3, 152, 81, 80, 51, 231, 229, 103, 88, 80, 103, 231, 232, 34, 241, 220, 19, 162, 234, 31, 168, 11, 169, 162, 2, 234, 40, 8, 190, 91, 223, 246, 53, 230, 153, 254, 127, 160, 205, 224, 133, 230, 147, 168, 58, 39, 70, 
186, 57, 248, 188, 91, 196, 4, 188, 183, 38, 70, 124, 49, 180, 4, 84, 20, 69, 1, 68, 216, 108, 84, 81, 55, 218, 156, 246, 69, 85, 103, 105, 200, 51, 66, 138, 139, 58, 194, 164, 185, 77, 4, 158, 191, 61, 3, 160, 4, 59, 253, 149, 59, 130, 0, 13, 79, 95, 163, 45, 32, 13, 100, 206, 159, 63, 44, 155, 126, 96, 23, 173, 17, 96, 38, 219, 214, 172, 154, 212, 146, 40, 40, 136, 138, 162, 0, 168, 32, 52, 2, 130, 234, 8, 46, 154, 138, 12, 80, 162, 162, 44, 244, 69, 128, 120, 186, 172, 147, 25, 187, 243, 253, 38, 207, 144, 11, 197, 16, 6, 178, 38, 46, 254, 153, 239, 238, 11, 122, 247, 232, 221, 207, 245, 206, 242, 0, 0, 11, 25, 80, 104, 77, 52, 20, 64, 20, 79, 194, 128, 6, 10, 208, 125, 142, 80, 3, 41, 158, 224, 11, 147, 220, 200, 12, 81, 108, 69, 68, 89, 213, 0, 106, 244, 162, 105, 160, 78, 135, 116, 0, 166, 116, 231, 175, 177, 83, 122, 123, 2, 231, 227, 23, 61, 183, 163, 59, 122, 27, 238, 131, 92, 51, 122, 136, 160, 10, 2, 162, 128, 0, 40, 0, 0, 13, 128, 138, 40, 0, 0, 35, 56, 210, 3, 3, 105, 145, 70, 81, 172, 48, 70, 69, 202, 101, 4, 12, 166, 80, 17, 113, 112, 25, 26, 192, 16, 80, 0, 80, 65, 64, 69, 0, 0, 21, 176, 16, 0, 0, 0, 68, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 20, 0, 64, 0, 31, 255, 217 ] } ], "outputs": [ { "name": "DALI_OUTPUT_0" } ] }'
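
The request body above follows the KServe v2 inference protocol: the encoded JPEG file is sent as a flat list of byte values, and `shape` holds the number of bytes. A minimal Python sketch of how such a body could be generated from an image file is shown here. `build_infer_payload` is a hypothetical helper name, and the tensor names are the ones used in this issue.

```python
import json


def build_infer_payload(encoded_image: bytes) -> dict:
    """Build a KServe v2 JSON request body for one encoded image."""
    return {
        "inputs": [
            {
                "name": "DALI_INPUT_0",
                # 1-D tensor: one element per byte of the encoded file.
                "shape": [len(encoded_image)],
                "datatype": "UINT8",
                "data": list(encoded_image),
            }
        ],
        "outputs": [{"name": "DALI_OUTPUT_0"}],
    }


if __name__ == "__main__":
    # Placeholder bytes only; in practice read a real file with open(path, "rb").read().
    sample = bytes([255, 216, 255, 217])
    print(json.dumps(build_infer_payload(sample)))
```

The resulting JSON can be passed to curl with `--data @file.json` instead of pasting the byte list inline.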

config.pbtxt

name: "dali"
backend: "dali"
max_batch_size: 0
input [
{
   name: "DALI_INPUT_0"
   data_type: TYPE_UINT8
   dims: [ -1 ]
}
]

output [
{
   name: "DALI_OUTPUT_0"
   data_type: TYPE_UINT8
   dims: [ 224, 224, 3 ]
}
]

instance_group [
{
  kind: KIND_GPU
}
]

dali.py

import nvidia.dali as dali
from nvidia.dali.plugin.triton import autoserialize

@autoserialize 
@dali.pipeline_def(batch_size=256, num_threads=4, device_id=0) 
def pipe():
    images = dali.fn.external_source(device="cpu", name="DALI_INPUT_0")
    images = dali.fn.decoders.image(images, device="mixed")
    images = dali.fn.resize(images, resize_x=224, resize_y=224)
    return images   

szalpal (Member) commented Oct 30, 2024

Hi @vaibhavjainwiz,

Let me help with your problem. I'd like to understand what happens first, so could you clarify? The comments in the DALI pipeline code you sent say that decoding and resizing happen on the CPU, but the code actually runs them on the GPU:

    images = dali.fn.decoders.image(images, device="mixed")  # Decode on CPU
    images = dali.fn.resize(images, resize_x=224, resize_y=224)  # Resize on CPU

With device='mixed', the decoding is performed on the GPU, and the resize infers the GPU device from the preceding operation. What precisely was your intention here?

vaibhavjainwiz (Author) commented

Sorry for the confusion. I was trying out this pipeline on both CPU and GPU, and these comments were left over; please ignore them. I am removing them from the issue description to avoid further confusion.

szalpal (Member) commented Oct 30, 2024

Thank you for the clarification. I ran your model with the provided sample as a stand-alone DALI pipeline and everything worked fine. However, I did not run it under K8s or Triton. Do you happen to have a Triton stack trace that might help narrow down the issue?
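
One way to rule out a malformed request before looking at Triton itself is to sanity-check the payload on the client side. The sketch below uses only the Python standard library and verifies that the declared `shape` matches the number of values and that the bytes begin and end with the JPEG start-of-image and end-of-image markers. `check_jpeg_payload` is a hypothetical helper, not part of DALI or Triton. The payload in this issue starts with 255, 216 and ends with 255, 217, which is consistent with a JPEG file.

```python
def check_jpeg_payload(shape: list, data: list) -> list:
    """Return a list of problems found in a 1-D UINT8 JPEG payload (empty if none)."""
    problems = []
    # The declared shape must be 1-D and equal to the number of values sent.
    if len(shape) != 1 or shape[0] != len(data):
        problems.append(f"shape {shape} does not match {len(data)} data values")
    # Every value must fit into an unsigned 8-bit integer.
    if any(not isinstance(v, int) or not 0 <= v <= 255 for v in data):
        problems.append("data contains values outside the UINT8 range")
    # A JPEG stream starts with 0xFF 0xD8 and ends with 0xFF 0xD9.
    if data[:2] != [255, 216]:
        problems.append("missing JPEG start-of-image marker (0xFF 0xD8)")
    if data[-2:] != [255, 217]:
        problems.append("missing JPEG end-of-image marker (0xFF 0xD9)")
    return problems
```

An empty result only means the request looks well-formed; it says nothing about whether the server-side configuration is correct.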

Also, one thing stands out in the configuration: you set max_batch_size=0 in config.pbtxt, while the DALI pipeline sets batch_size=256. The max_batch_size=0 option in Triton is generally used for models that do not support batching. Could you check whether setting these two parameters to the same value (e.g. max_batch_size=256) helps?
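
If `max_batch_size` is raised above 0, Triton treats the `dims` in config.pbtxt as per-sample dimensions and expects the request `shape` to carry a leading batch dimension. A sketch of the adjusted request body under that assumption follows. `build_batched_payload` is a hypothetical helper, and whether this change resolves the crash is not confirmed in this thread.

```python
def build_batched_payload(encoded_image: bytes) -> dict:
    """Request body for a single encoded image sent as a batch of one."""
    return {
        "inputs": [
            {
                "name": "DALI_INPUT_0",
                # Leading 1 is the batch dimension; the second entry matches dims: [ -1 ].
                "shape": [1, len(encoded_image)],
                "datatype": "UINT8",
                # Flattened row-major data for the [1, N] tensor.
                "data": list(encoded_image),
            }
        ],
        "outputs": [{"name": "DALI_OUTPUT_0"}],
    }
```

Compared with the original request, only the `shape` field changes, from `[ 1699 ]` to `[ 1, 1699 ]`; the byte list itself stays the same.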
