This repository has been archived by the owner on Jan 3, 2023. It is now read-only.
Hi, I have a question: I know that nGraph supports dynamic input shapes, but is there any performance improvement for such models (especially NLP models like BERT)?
I think of dynamic shapes as being like C++ vectors: there is allocated space, and something separate that indicates how much of that space you are actually using. You might cache particular compiled combinations of maximum input sizes.

If you are running a server, dynamic batch size can help with latency, since you only need to compute for as many samples as you have ready. You could also imagine kernels that use the actual sample lengths to reduce the computation in those transformer GEMMs (though some hardware skips zero arithmetic, and some hardware does a full tile of a GEMM in the same time as a partial tile, so it wouldn't matter there).

Whatever Intel did with nGraph along those lines would be in the OpenVINO repo.
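To make the "cache compiled combinations of max sizes" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `BUCKETS`, `bucket_for`, and `compiled_for` are illustrative names, and the string returned by `compiled_for` stands in for whatever shape-specialized executable a real compiler (nGraph, OpenVINO, or any other) would produce.

```python
# Hypothetical sketch: round each dynamic sequence length up to a fixed
# "bucket" size, compile once per bucket, and cache the result -- the
# capacity-vs-size idea from the C++ vector analogy above.

from functools import lru_cache

BUCKETS = [32, 64, 128, 256, 512]  # allowed maximum sequence lengths

def bucket_for(length):
    """Round a dynamic sequence length up to the nearest compiled bucket."""
    for b in BUCKETS:
        if length <= b:
            return b
    raise ValueError(f"sequence length {length} exceeds largest bucket")

@lru_cache(maxsize=None)
def compiled_for(max_len):
    # Placeholder for an expensive shape-specialized compilation step;
    # lru_cache ensures each bucket is compiled only once.
    return f"executable<max_len={max_len}>"

def run(tokens):
    """Pad the input to its bucket size and dispatch to the cached executable."""
    max_len = bucket_for(len(tokens))
    exe = compiled_for(max_len)            # cache hit after the first call
    padded = tokens + [0] * (max_len - len(tokens))
    return exe, padded
```

The trade-off is the usual one: fewer buckets mean fewer compilations but more wasted padding compute, while more buckets mean less padding but more compilation time and cached executables.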