FP16 slower than FP32 #9282
soundarthiaga started this conversation in Show & Tell
Replies: 1 comment
-
Hi @soundarthiaga, please open a PR so that the change can be understood and reviewed.
-
I noticed that the FP16 model runs slower than the FP32 model.
The issue is in graph optimization.
Graph optimization is done first and the typecasting is done afterwards, so no optimization takes place on the FP16 graph.
If the code is modified to typecast first and optimize afterwards, the FP16 graph gets optimized.
I have a code fix for this in inference_session.cc.
Can you help me push it into the repo?
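
To make the ordering problem concrete, here is a small toy model of it in Python. It is only a sketch and not the code from inference_session.cc: the `Node` class, the two passes, and the `FusedMatMulAdd` op name are all hypothetical, and it assumes the fusion rule only matches float32 nodes, which is my guess at why the FP16 graph is skipped.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    """Hypothetical graph node: an op name and the dtype it runs in."""
    op: str
    dtype: str


def cast_to_fp32(graph):
    """Toy typecasting pass: rewrite every float16 node as float32."""
    return [replace(n, dtype="float32") if n.dtype == "float16" else n
            for n in graph]


def fuse_matmul_add(graph):
    """Toy optimization pass: fuse MatMul followed by Add.

    Assumption: the rule only matches float32 nodes, so it leaves a
    float16 graph untouched.
    """
    out = []
    i = 0
    while i < len(graph):
        a = graph[i]
        b = graph[i + 1] if i + 1 < len(graph) else None
        if (b is not None and a.op == "MatMul" and b.op == "Add"
                and a.dtype == b.dtype == "float32"):
            out.append(Node("FusedMatMulAdd", "float32"))
            i += 2  # both nodes were consumed by the fusion
        else:
            out.append(a)
            i += 1
    return out


def run_passes(graph, passes):
    """Apply the passes in the given order and return the final graph."""
    for p in passes:
        graph = p(graph)
    return graph


fp16_graph = [Node("MatMul", "float16"),
              Node("Add", "float16"),
              Node("Relu", "float16")]

# Ordering described in the post: optimize first, typecast afterwards.
optimize_then_cast = run_passes(fp16_graph, [fuse_matmul_add, cast_to_fp32])

# Proposed ordering: typecast first, then optimize.
cast_then_optimize = run_passes(fp16_graph, [cast_to_fp32, fuse_matmul_add])

print([n.op for n in optimize_then_cast])  # ['MatMul', 'Add', 'Relu']
print([n.op for n in cast_then_optimize])  # ['FusedMatMulAdd', 'Relu']
```

With the first ordering the fusion never fires because the nodes are still float16 when the optimizer sees them. With the second ordering the same rule matches and the graph ends up one node shorter.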