-
ORT allocates memory on the native heap rather than the Java heap, since the Java API is a thin pass-through to the native ORT code. A 10x expansion seems high to me, though it is model dependent: ORT likes to keep buffers around for the computational results of each layer, and the size of those buffers depends on the model structure and the input size. How that happens can be controlled somewhat through the session options, but the ORT developers would have a better idea of how the native code manages memory.
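To make the session-options point concrete, here is a minimal sketch of creating a session with some of the memory-related options turned down. The `"model.onnx"` path is a placeholder, and whether disabling the arena or memory-pattern optimization actually reduces peak usage (often at a speed cost) depends on the model; this is a configuration sketch, not a recommendation.

```java
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

public final class LowMemorySessionExample {
    public static void main(String[] args) throws OrtException {
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        try (OrtSession.SessionOptions opts = new OrtSession.SessionOptions()) {
            // Don't pre-plan and pre-allocate activation buffers for the whole graph.
            opts.setMemoryPatternOptimization(false);
            // Don't use the CPU arena allocator, which holds on to freed memory
            // for reuse instead of returning it to the OS.
            opts.setCPUArenaAllocator(false);
            try (OrtSession session = env.createSession("model.onnx", opts)) {
                // ... run inference with session.run(...) here ...
            }
        }
    }
}
```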
-
I am using the onnxruntime Java API in a database environment and want to confirm that the memory usage I am seeing is expected. Native machine memory increases by about 10-12x the size of the input model: for instance, when I load my 1GB model, native memory goes up by 11.75GB. I see similar increases with my 10MB and 100MB models. Is this expected for this library?

The other libraries we run in this environment are Java libraries, which we can track and store in a Java cache, and we need to adhere to certain memory limits on the machine. Since onnxruntime doesn't allocate its space in the JVM heap, I'm trying to figure out how to limit its memory usage. Is the 10-12x factor a good formula, or is there a better way to track how much memory the library is using?
Thanks.
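For what it's worth, since ORT's allocations bypass the JVM heap, one way to measure what the library actually claims is to sample the process's resident set size before and after loading the model. Below is a sketch of such a probe (the class and method names are mine, not part of onnxruntime); it reads `/proc/self/status`, so it only works on Linux and returns -1 elsewhere.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for sampling the process's resident set size (RSS)
// on Linux. Call currentRssKb() before and after creating an OrtSession
// and diff the two values to estimate ORT's native footprint.
public final class NativeMemoryProbe {

    // Parse the kB value out of a /proc/self/status line such as
    // "VmRSS:\t  123456 kB"; returns -1 if the line is malformed.
    static long parseKb(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 3 || !parts[2].equals("kB")) return -1L;
        try {
            return Long.parseLong(parts[1]);
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    // Current RSS in kB, or -1 if /proc is unavailable (macOS, Windows).
    static long currentRssKb() {
        try {
            for (String line : Files.readAllLines(Paths.get("/proc/self/status"))) {
                if (line.startsWith("VmRSS:")) return parseKb(line);
            }
        } catch (IOException e) {
            // fall through
        }
        return -1L;
    }
}
```

Note that RSS includes everything in the process, so sample it with the rest of the system quiet; it gives you an upper bound on ORT's usage rather than an exact figure.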