How to send binary data (audio file) in perf_analyzer? #145
Comments
@matthewkotila, by any chance would you happen to know the solution for this issue?
CC: @matthewkotila
I'm experiencing the same issue: I can't profile my model with the native tools. @dyastremsky Any ideas where this could be answered?
The team working on Tools (including @matthewkotila, who would know more) is quite occupied at the moment, so there will be a delay in responding. I am not familiar with the specific requirements of PA input files, especially in an audio context, but I did see this unofficial solution that may be helpful in the meantime. Instructions for running it are here. This solution may also provide some direction, though note that it targets older versions of Triton.
Thanks for the information! It looks like the library examples use JSON to send WAV PCM data instead of the more efficient raw binary WAV format. That's not ideal, since it requires changing Triton model signatures, but it could work as a temporary fix if there aren't better options right now.
Thanks for responding. Some more information for this use case here as well: triton-inference-server/server#3206 |
I have the same issue for images: I usually send images as encoded bytes to Triton, and I would like to be able to use the perf analyzer to benchmark my pipelines.
There is a solution for a single file, though after that you may get a shape error. But I still don't understand how to get this to work with multiple files.
Could you elaborate? If your model has multiple inputs that you want to supply binary data for, you should be able to include one file per input in the input data you provide.
@matthewkotila, it's not about multiple inputs; it's about multiple requests.
Unfortunately, we don't support supplying binary files for more than one request, but you should be able to convert the binary data into a b64 representation and include that in an input data JSON supplied to PA. That will allow you to supply more than one request's worth of input data. I agree that what you've requested would be good to have. I've noted the feature request but don't have a timeline for when we would be able to work on or deliver it.
@matthewkotila I am doing this for encoded images for benchmarking, but in production I send bytes directly. The cost of decoding b64 is not that big, so the benchmark should not be too far off.
The decoding of the b64 data happens inside Perf Analyzer (the client) before sending to the server, so you wouldn't have to change anything about how you set up your Triton service. But yes, it is client-side computational time that could theoretically impact PA's ability to maintain concurrency or a desired request rate (though that's unlikely, as the person above mentioned), and it could be lessened with the feature request you made.
@matthewkotila hello, do you have an example of how to convert a wav file to a b64 JSON supported by PA? I've tried different ways but received errors like
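For anyone looking for a concrete starting point, here is a minimal sketch of building a perf_analyzer input-data JSON where each entry in the "data" array is one request carrying a base64-encoded wav payload. The input name `INPUT0`, the model name `my_model`, and the dummy payloads are placeholders (assumptions, not from this thread); substitute your model config's BYTES input name and read real files with `open(path, "rb").read()`.

```python
import base64
import json

# Stand-ins for real wav file contents; in practice read each file with
# open(path, "rb").read(). These dummy payloads are illustrative only.
wav_payloads = [b"RIFF....WAVEfmt ", b"RIFF....WAVEdata"]

# One dict per request; the {"b64": ...} form tells PA the value is
# base64-encoded binary content for a BYTES input.
entries = [
    {"INPUT0": {"b64": base64.b64encode(raw).decode("ascii")}}
    for raw in wav_payloads
]

with open("input_data.json", "w") as f:
    json.dump({"data": entries}, f, indent=2)

# Then (model name assumed):
#   perf_analyzer -m my_model --input-data input_data.json
```

PA decodes the b64 client-side before sending, so the server still receives raw bytes and the model signature does not change.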
Description
(same issue as triton-inference-server/server#3206)
I have a Triton model that accepts a binary string. I want to send a wav file. If I do it through the client, everything works; through the perf analyzer, it does not.
Triton Information
Triton:
nvcr.io/nvidia/tritonserver:23.01-py3
Triton SDK for perf analyzer:
nvcr.io/nvidia/tritonserver:23.07-py3-sdk
To Reproduce
config.pbtxt
If I try to send a `wav` file:

If I try to send a binary string of a `wav` file, generated as follows:
The string is forwarded, but after `in_0.as_numpy()[0]` it looks like `b'RIFFx\\x15\\x00\\x00WAVEfmt \\x10\\x00\\x00\\x00\\x01\\x00\\x01\\x00@\\x1f...'`. It should look like `b'RIFFx\x15\x00\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00@\x1f...'`.
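The mismatch above is a double-escaping symptom: if the bytes are converted to text with `str()`/`repr()` before being placed in the JSON, the server receives the literal characters `\x15` rather than the byte `0x15`. The original generation script is not shown here, so this is a standalone sketch of the failure mode and of base64 as the lossless alternative (the `wav_bytes` value is a made-up fragment):

```python
import base64

# Illustrative wav header fragment, not a real file.
wav_bytes = b"RIFFx\x15\x00\x00WAVEfmt "

# str() on bytes produces a Python repr: non-printable bytes become
# literal backslash-escape text, which is what the server then sees.
as_repr = str(wav_bytes)
assert "\\x15" in as_repr  # the escape sequence is now plain text

# Base64 round-trips the bytes losslessly instead.
encoded = base64.b64encode(wav_bytes).decode("ascii")
assert base64.b64decode(encoded) == wav_bytes
```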
`client.py` is working.