FP16 slower than FP32 #9282

soundarthiaga · 2021-10-06T00:02:21Z

I noticed that the FP16 model runs slower than the FP32 model.
The issue is in graph optimization.
Graph optimization is done first and the typecasting is done afterwards, so for an FP16 graph no optimization takes place.
If we modify the code to typecast first and then optimize, the FP16 graph does get optimized.
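
To illustrate why the order of the two passes matters, here is a toy sketch. It is not the code in inference_session.cc and does not use any ONNX Runtime API: the node representation and the functions `fuse_matmul_add` and `cast_to_float32` are hypothetical, and it assumes (as one plausible reading of the behaviour described above) that the optimizer only matches float32 patterns.

```python
from typing import List, Tuple

# Toy model only: a graph is a list of (op_type, dtype) pairs.
Node = Tuple[str, str]


def fuse_matmul_add(graph: List[Node]) -> List[Node]:
    """Hypothetical optimizer pass: fuse float32 MatMul + Add into Gemm.

    Assumption: the pass only recognises float32 nodes, so float16
    nodes are passed through untouched.
    """
    out: List[Node] = []
    i = 0
    while i < len(graph):
        if (
            i + 1 < len(graph)
            and graph[i] == ("MatMul", "float32")
            and graph[i + 1] == ("Add", "float32")
        ):
            out.append(("Gemm", "float32"))
            i += 2  # both nodes were consumed by the fusion
        else:
            out.append(graph[i])
            i += 1
    return out


def cast_to_float32(graph: List[Node]) -> List[Node]:
    """Hypothetical typecasting pass: make every node compute in float32."""
    return [(op, "float32") for op, _ in graph]


fp16_graph: List[Node] = [
    ("MatMul", "float16"),
    ("Add", "float16"),
    ("Relu", "float16"),
]

# Order described as the current behaviour: optimize, then typecast.
optimize_then_cast = cast_to_float32(fuse_matmul_add(fp16_graph))

# Proposed order: typecast, then optimize.
cast_then_optimize = fuse_matmul_add(cast_to_float32(fp16_graph))

print("optimize then cast:", optimize_then_cast)
print("cast then optimize:", cast_then_optimize)
```

In this toy model the first ordering leaves the MatMul and Add nodes unfused, because the optimizer ran before the nodes had a type it recognises, while the second ordering produces a single Gemm node.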

I have a code fix for this in inference_session.cc.
Can you help me push it into the repo?

ashbhandare · 2021-10-11T17:09:20Z

Hi @soundarthiaga, please make a PR so that the change can be understood and reviewed.
