Replies: 9 comments
-
Can you use a debugger and see what the parameters to the CalcMemSizeForArrayWithAlignment function are at the point the exception is thrown? The nmemb, size, and alignment parameters (out is not needed).
-
@RyanUnderhill I ran into the same error as @shakibakh when trying to substitute my ONNX model for the squeezenet demo model.
-
@skottmckay Hey Scott, since you added these new allocation functions, do you have any ideas?
-
You need to provide the actual dimensions for the amount of data you want to allocate, i.e. you need to replace the -1 with the actual batch size before allocating the data.
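A minimal sketch of that fix, assuming the shape is read back from the model (the helper name here is hypothetical):

```cpp
#include <onnxruntime_cxx_api.h>
#include <vector>

// Hypothetical helper: read the model's declared input shape and pin the
// dynamic batch dimension, which GetShape() reports as -1, to a concrete
// value before using the shape to create the input tensor.
std::vector<int64_t> ConcreteInputShape(Ort::Session& session, int64_t batch_size) {
  std::vector<int64_t> dims =
      session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape();
  if (!dims.empty() && dims[0] == -1) {
    dims[0] = batch_size;  // replace the -1 with the actual batch size
  }
  return dims;
}
```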
-
@skottmckay I see, that makes sense from a memory allocation standpoint. How can I accomplish multiple batch sizes in a single script without having to export a separate ONNX model for each allowable batch size? For instance, say I want to support batches of 10, 20, 30, ... 60. From your description, it sounds like I would need six different ONNX models with varying sizes for the batch dimension. Or would I export a single ONNX model with a dynamic batch size (-1), and then allocate the allowable sizes using the C API like the sketch below? I am not familiar with the ONNX internals for dynamic axes, so my confusion is whether the batch sizes must be hard-coded in the loaded model, or whether ONNX Runtime can communicate them to a dynamic model based on the size of the memory allocated.
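For example, something like this sketch (assuming Ort::Value::CreateTensor and one buffer sized for the largest batch; the 3x224x224 image dims are placeholders):

```cpp
#include <onnxruntime_cxx_api.h>
#include <vector>

// Sketch only: one session, several batch sizes, one reusable buffer sized
// for the largest batch (60).
void RunAllBatchSizes(Ort::Session& session) {
  std::vector<float> buffer(60 * 3 * 224 * 224);
  Ort::MemoryInfo mem_info =
      Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);

  for (int64_t batch : {10, 20, 30, 40, 50, 60}) {
    std::vector<int64_t> shape{batch, 3, 224, 224};
    size_t element_count = static_cast<size_t>(batch) * 3 * 224 * 224;
    Ort::Value input = Ort::Value::CreateTensor<float>(
        mem_info, buffer.data(), element_count, shape.data(), shape.size());
    // ... session.Run(...) with this input ...
  }
}
```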
-
A single model can be used with dynamic batch sizes. When you create the tensor you provide the shape (and that is kept in the OrtValue with the data), so the batch size is known from that. Not sure how you're exporting your model, but note that -1 is not a valid dim_value for a dynamic batch size. In the ONNX model the dimension can have a dim_param (a string name such as 'N' or 'batch' would be typical), or neither a dim_param nor a dim_value, to be treated as dynamic.
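You can see that from the API side too. A sketch of inspecting the input's shape metadata (dynamic dims come back as -1 from GetShape, with the dim_param name available via GetSymbolicDimensions):

```cpp
#include <onnxruntime_cxx_api.h>
#include <cstdio>
#include <vector>

// Sketch: print each dimension of the first input. A dim_param such as
// 'batch' shows up as -1 in GetShape(), with its name available from
// GetSymbolicDimensions() (fixed dims get an empty string).
void PrintInputDims(Ort::Session& session) {
  Ort::TypeInfo type_info = session.GetInputTypeInfo(0);
  auto tensor_info = type_info.GetTensorTypeAndShapeInfo();
  std::vector<int64_t> shape = tensor_info.GetShape();
  std::vector<const char*> names(shape.size());
  tensor_info.GetSymbolicDimensions(names.data(), names.size());
  for (size_t i = 0; i < shape.size(); ++i) {
    printf("dim %zu: %lld '%s'\n", i, static_cast<long long>(shape[i]), names[i]);
  }
}
```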
-
I exported from PyTorch following their tutorial. It looks like the exported model uses 'batch_size' as the name of the first input dimension.
Is there something I should change to avoid the -1 dimension? I used this model with the ORT Python API and didn't have any issues running inference for different-sized batches.
-
Yes, the 'batch_size' is a dim_param for the first dimension of the model input. Using that example where you have dimensions of {'batch_size', 3, 244, 244}, you need to create some input to run the model, and you decide the batch size of that input. If you just want to send in a single image, the 'shape' argument in the call to CreateTensorWithDataAsOrtValue would be {1, 3, 244, 244}, as the batch size would be 1. You control the values in that shape, so if you're getting a -1 from somewhere you need to replace it with the correct batch size for the data.

Possibly the confusion comes from that call also taking a value for the length of the data. Currently that's just used to ensure the amount of data required for the provided shape does not exceed the buffer size. Technically the buffer length could be used to deduce what value a -1 should be replaced with; however, that may not always be the correct thing to do (e.g. the user has a larger-than-required buffer used across multiple requests).
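Concretely, via the C++ wrapper (Ort::Value::CreateTensor ends up calling CreateTensorWithDataAsOrtValue; the buffer and values here are illustrative):

```cpp
#include <onnxruntime_cxx_api.h>
#include <vector>

// Sketch: batch size 1, so the shape is {1, 3, 244, 244} and the buffer
// must hold at least 1*3*244*244 floats.
std::vector<int64_t> shape{1, 3, 244, 244};
std::vector<float> data(1 * 3 * 244 * 244);

Ort::MemoryInfo mem_info =
    Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
Ort::Value input = Ort::Value::CreateTensor<float>(
    mem_info, data.data(), data.size(), shape.data(), shape.size());
```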
-
@addisonklinke When using my own network, I just added a line before the input tensor's memory allocation to fix the batch size. The example code now works:
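Roughly like this, in the sample's terms (variable names are assumed from the squeezenet C++ sample; the added line is the one setting the batch dim):

```cpp
// input_node_dims was filled from tensor_info.GetShape() earlier in the
// sample, so its first entry is -1 for a dynamic batch dimension.
input_node_dims[0] = 1;  // added line: fix the batch size before allocating

Ort::Value input_tensor = Ort::Value::CreateTensor<float>(
    memory_info, input_tensor_values.data(), input_tensor_size,
    input_node_dims.data(), input_node_dims.size());
```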
-
Hello,
I am new to onnxruntime. I am running a simple example; my environment is as follows:
Windows 10
Visual Studio 2017
CUDA 10.1
cuDNN 7.6.5
ONNX Runtime from NuGet
Running in Release mode
At line 108, when creating the tensor (Ort::Value::CreateTensor), I get the following error:
[E:onnxruntime:test, allocator.cc:28 onnxruntime::IAllocator::CalcMemSizeForArrayWithAlignment] D:\1\s\onnxruntime\core/common/safeint.h:17 SafeIntExceptionHandler::SafeIntOnOverflow Integer overflow
size overflow
(I have tried different examples and get the same error.)
Does anybody know what I am doing wrong?
Thank you