Is there any limit on the size of tensor of double type that can be created in onnxruntime? #4614
-
Replies: 10 comments
-
AFAIK there's no hardcoded limit. If there's sufficient memory, you can create it.
-
I tried to run the model on a tensor of size 22*15000, but it gives a memory error, although the tensor itself is created successfully. The intra-op thread count is set to 8. My system has 32 GB RAM.
-
If there's a memory error, it's probably running out of memory. How much memory is required depends on the model. There will be copies of data at different points if the input buffer to an operator cannot be reused for the output. Operators may also decrease or increase the size of the data (e.g. a Resize can make the data smaller or larger). Memory additionally needs to be allocated for the model outputs and for any weights in the model.
-
I think I have a similar issue to the following. My approach for large data is to process it in batches.
Going by the above issue, I understand that we need to create a new session to avoid a memory leak, but loading a large model again and again will lead to performance degradation. Is there a better approach to handle large data in batches without regularly creating a new session?
-
If your model supports batches (i.e. the input shapes have a symbolic or unknown dimension value for the batch size), you should not need to create a new session every time. I don't believe there's a memory leak. It's more that each session has its own copy of the model, and potentially its own memory arenas if those are enabled, so having lots of sessions incurs duplicated memory usage and overhead. Are the batches the same size? How many times can you call Run with a single session before it runs out of memory? Are you freeing the memory returned as the Run output each time?
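A minimal sketch of the batching pattern described above: split the data into contiguous chunks and feed every chunk to the same session, so the model is loaded once. The chunking helper below is pure C++; the actual per-chunk call to the session's Run method is only indicated in a comment, since it depends on your model's input names and shapes.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Split n_rows into contiguous [start, end) ranges of at most batch_size
// rows. Each range would be copied into an input Ort::Value and passed to
// the SAME Ort::Session's Run() call, so only one model copy and one
// batch's worth of input/output buffers are live at a time.
std::vector<std::pair<std::size_t, std::size_t>>
make_batches(std::size_t n_rows, std::size_t batch_size) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t start = 0; start < n_rows; start += batch_size) {
        std::size_t end =
            (start + batch_size < n_rows) ? start + batch_size : n_rows;
        ranges.emplace_back(start, end);
        // for each range: build input tensor over rows [start, end),
        // then call session.Run(...) and let the output Ort::Value
        // go out of scope before the next iteration.
    }
    return ranges;
}
```

With a symbolic batch dimension in the model, the final (smaller) chunk can be run through the same session as the full-size chunks.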
-
I am not freeing the memory. Is there any method to free the memory created in the arena? Do we also need to free the memory returned by the CreateTensor() method?
-
If you're using the C++ API, the memory for the model inputs and outputs is released automatically when the Ort::Value goes out of scope. You could try disabling the memory pattern planner first.
-
@skottmckay
-
Yes. Look at the Ort::SessionOptions class in the C++ API. It has DisableMemPattern and DisableCpuMemArena methods. The CreateTensor API doesn't use the arena.
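Putting the suggestions in this thread together, a configuration sketch might look like the following. This assumes the onnxruntime C++ headers are available; the environment name, thread count, and "model.onnx" path are illustrative, not part of the thread.

```cpp
#include <onnxruntime_cxx_api.h>

// Sketch: create a session with the memory pattern planner and the
// CPU memory arena disabled, as discussed above.
Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "demo");  // name is illustrative
Ort::SessionOptions opts;
opts.DisableMemPattern();    // don't pre-plan/cache memory patterns
opts.DisableCpuMemArena();   // allocate directly instead of via the arena
opts.SetIntraOpNumThreads(8);
Ort::Session session(env, "model.onnx", opts);    // path is illustrative
```

Disabling the arena trades some allocation speed for memory being returned to the system promptly, which can help when running many batches through one long-lived session.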