Replies: 7 comments
-
It's nice that you are using the MKL-DNN backend :) Winograd outperforms direct convolution for small filter sizes, and we plan to enable it as a next step.
-
Thanks, let me keep this thread open to track progress.
-
@masahi Thanks for asking. Since Winograd may impact numerical stability and model accuracy, I don't think it can be chosen by the library implicitly. Besides, Winograd doesn't always outperform direct convolution in MKL-DNN; it depends on the implementation and the kernel size/shape. To make this feature request clearer, could you also explain what your model looks like and what you expect from MKL-DNN Winograd?
-
@TaoLv Good point on the numerical accuracy of Winograd. The performance/accuracy trade-off is an interesting topic. Since you already have int8 convolution (which trades accuracy for performance far more than Winograd does), it also makes sense to make Winograd available so users can opt in. I want to test the performance of MKL-DNN's AVX-512 Winograd against my homegrown Winograd implementation written in TVM. I can't talk about my architecture, but you can imagine a network where almost all convolutions are 3 x 3, similar to VGG. I tested my network on both MXNet + MKL-DNN and TVM + NNVM. At the moment, the TVM-compiled model with Winograd convolution is much faster than MXNet + MKL-DNN with AVX-512 direct convolution. I expect a big speedup in MXNet from enabling MKL-DNN Winograd.
-
I meant that MKL-DNN shouldn't "choose the best algo automatically" because of the accuracy issue. At least there should be an environment variable to enable it at the framework level. INT8 is also enabled by the user explicitly, and it needs to be tuned to exclude some layers for accuracy. If we agree that Winograd cannot be applied to all convolution layers, this kind of tuning should also be applied at the model level.
Did you get a chance to compare TVM and MKL-DNN Winograd at the kernel level? Maybe you can try benchdnn for that.
-
Yes, something like an environment variable to enable Winograd is also what I had in mind. Thanks for suggesting benchdnn. I'll try it.
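As an editorial aside, here is a minimal sketch of the kind of framework-level switch discussed above. It is not MXNet code: the environment variable name MXNET_MKLDNN_WINOGRAD and the helper PickConvAlgorithm are hypothetical, and an MKL-DNN 0.x-style C++ API is assumed.

```cpp
#include <cstdlib>
#include <string>
#include "mkldnn.hpp"  // MKL-DNN C++ API (now oneDNN)

// Hypothetical opt-in: read an environment variable to decide whether the
// framework should request Winograd instead of hard-coding convolution_direct.
// The variable name MXNET_MKLDNN_WINOGRAD is made up for illustration.
inline mkldnn::algorithm PickConvAlgorithm() {
    const char *flag = std::getenv("MXNET_MKLDNN_WINOGRAD");
    if (flag != nullptr && std::string(flag) == "1")
        return mkldnn::algorithm::convolution_winograd;
    return mkldnn::algorithm::convolution_direct;
}
```

A per-layer tuning pass, like the one described for INT8 above, could then override this default for layers where Winograd hurts accuracy.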
-
@mxnet-label-bot [Question]
-
Hi, I am trying MXNet + MKL-DNN on an AVX-512-capable machine.
Looking at the code in mkldnn_convolution.cc, I'm assuming that MXNet always chooses the direct algorithm for MKL-DNN convolution (by specifying mkldnn::algorithm::convolution_direct). But MKL-DNN implements a Winograd algorithm for AVX-512, and it should be much faster than the direct algorithm when applicable.
Why not enable it when AVX-512 support is present? The complications are that we need to check the CPU features and that the Winograd algorithm is applicable only to certain filter shapes. Ideally, MKL-DNN should choose the best algorithm automatically so that MXNet doesn't need to specify which algorithm to use. @pengzhao-intel @TaoLv
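For reference, a rough sketch of how such a selection could work on top of the MKL-DNN C++ API referenced above: try to create a Winograd primitive descriptor first and fall back to direct convolution when the library reports the combination as unsupported. This is not MXNet's actual code; signatures follow the MKL-DNN 0.x-style API and vary across versions, and the helper name CreateConvPd is made up here.

```cpp
#include <stdexcept>
#include "mkldnn.hpp"

using namespace mkldnn;

// Sketch only: prefer Winograd, fall back to direct convolution when MKL-DNN
// has no Winograd implementation for this shape/ISA (primitive descriptor
// creation then throws mkldnn::error).
convolution_forward::primitive_desc CreateConvPd(
        const engine &eng, const memory::desc &src_md,
        const memory::desc &weights_md, const memory::desc &dst_md,
        const memory::dims &strides, const memory::dims &padding) {
    for (auto alg : {algorithm::convolution_winograd,
                     algorithm::convolution_direct}) {
        try {
            auto desc = convolution_forward::desc(
                    prop_kind::forward_inference, alg, src_md, weights_md,
                    dst_md, strides, padding, padding, padding_kind::zero);
            return convolution_forward::primitive_desc(desc, eng);
        } catch (const mkldnn::error &) {
            // This algorithm is not applicable here; try the next one.
        }
    }
    throw std::runtime_error("no suitable convolution implementation");
}
```

Later MKL-DNN/oneDNN releases also expose algorithm::convolution_auto, which lets the library pick the algorithm itself; whether that is acceptable given the accuracy concerns raised in this thread is a separate question.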