Use Intel VNNI for int dot product #3512

Open · amitdo opened this issue Aug 3, 2021 · 17 comments

amitdo (Collaborator) commented Aug 3, 2021

There are two variants:

  • AVX512_VNNI (Tiger Lake, Rocket Lake) - 512-bit/256-bit/128-bit
  • AVX_VNNI (upcoming Alder Lake) - 256-bit/128-bit

VNNI replaces a sequence of three SIMD instructions (vpmaddubsw, vpmaddwd, vpaddd) with a single instruction.

It seems that we can use it inside MultiplyGroup() (a sketch follows the link below).

https://software.intel.com/content/www/us/en/develop/articles/intel-advanced-vector-extensions-512-intel-avx-512-new-vector-neural-network-instruction.html
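
For illustration, a minimal sketch (not Tesseract's actual code; the helper names are hypothetical, the intrinsics are real) of the three-instruction AVX2 sequence versus the single VNNI instruction:

```cpp
#include <immintrin.h>

// AVX2: multiply unsigned 8-bit inputs by signed 8-bit weights and
// accumulate into 32-bit sums - three instructions.
static inline __m256i MultiplyAccumulateAVX2(__m256i acc, __m256i u8, __m256i s8) {
  __m256i pairs16 = _mm256_maddubs_epi16(u8, s8);                     // vpmaddubsw
  __m256i sums32 = _mm256_madd_epi16(pairs16, _mm256_set1_epi16(1));  // vpmaddwd
  return _mm256_add_epi32(acc, sums32);                               // vpaddd
}

// VNNI (AVX512-VNNI with AVX512VL, or AVX-VNNI): the same operation
// in a single instruction.
static inline __m256i MultiplyAccumulateVNNI(__m256i acc, __m256i u8, __m256i s8) {
  return _mm256_dpbusd_epi32(acc, u8, s8);                            // vpdpbusd
}
```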

stweil (Contributor) commented Sep 5, 2021

That requires access to test machines which support those instructions. So far I don't have any server with AVX512.

amitdo (Collaborator, Author) commented Sep 5, 2021

> AVX_VNNI (upcoming Alder Lake) - 256-bit/128-bit

This will be available soon. Maybe we can ask users to help test this feature in the months after launch.

amitdo added this to the 6.0.0 milestone on Oct 28, 2021
amitdo (Collaborator, Author) commented Aug 2, 2022

@stweil, can I add AVX_VNNI (256-bit) detection as a first step?

stweil (Contributor) commented Aug 3, 2022

Sure, but who has such hardware to test it?

amitdo (Collaborator, Author) commented Aug 3, 2022

There are tens of millions of people with Intel's Alder Lake, and some of them are Tesseract users. We can ask in the forum for help testing the detection (and later the intdotproductvnni). Hopefully, we will find at least one person who has this CPU and is willing to help.

stweil (Contributor) commented Aug 5, 2022

@amitdo, I just noticed that the notebook which I used for AVX512F also has AVX512VNNI. :-)
Do you already have code for the detection? If not, I can add it (I just have to find which bit in cpuid is used; Wikipedia has the required documentation).
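
For reference, a minimal sketch of such a detection (assuming GCC/Clang's __get_cpuid_count from <cpuid.h>; the function name is hypothetical, the bit positions are the documented CPUID feature bits):

```cpp
#include <cpuid.h>

// AVX512_VNNI: CPUID.(EAX=7,ECX=0):ECX bit 11.
// AVX_VNNI:    CPUID.(EAX=7,ECX=1):EAX bit 4.
static bool DetectVNNI(bool &avx512_vnni, bool &avx_vnni) {
  unsigned eax, ebx, ecx, edx;
  if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
    return false;  // CPUID leaf 7 is not available
  }
  avx512_vnni = (ecx >> 11) & 1;
  avx_vnni = false;
  if (__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) {
    avx_vnni = (eax >> 4) & 1;
  }
  return true;
}
```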

amitdo (Collaborator, Author) commented Aug 6, 2022

Go ahead!

Could you do the AVX_VNNI detection too? You are more familiar with the detection code than I am.

Since most of our files already have:

(C) Copyright <year>, Google Inc.

I think we can look at other Google projects with the same license as ours and use parts of their code if we need to.

google/cpu_features#263

https://github.com/tensorflow/tensorflow/blob/18d203b1ef84b1e2d4de9eb249194ab1386bdd7b/tensorflow/core/platform/cpu_info.cc

stweil (Contributor) commented Aug 6, 2022

Detection is now implemented by commit 0daf18c.

amitdo (Collaborator, Author) commented Aug 6, 2022

I see that you check that AVX/AVX2 is supported by the OS. Do you also check somewhere that AVX512 is supported by the OS?

stweil (Contributor) commented Aug 6, 2022

No, currently only the hardware capabilities are checked for AVX512. So far nobody has complained, so maybe AVX512F was only used on operating systems which support it. I'll add a check for OS support. Thank you for the hint!
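
For reference, a sketch of what such an OS check looks like (assuming the _xgetbv intrinsic is available; a complete check would first verify CPUID.1:ECX.OSXSAVE, bit 27, before executing XGETBV):

```cpp
#include <immintrin.h>

// The OS signals that it saves/restores AVX-512 state by setting
// XCR0 bits 1,2 (SSE/AVX) and 5,6,7 (opmask, ZMM_Hi256, Hi16_ZMM).
static bool OsSupportsAVX512() {
  unsigned long long xcr0 = _xgetbv(0);  // read XCR0
  return (xcr0 & 0xE6) == 0xE6;          // bits 1,2,5,6,7 all set
}
```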

amitdo (Collaborator, Author) commented Aug 6, 2022

I will try to implement intsimdmatrixavx512vnni.cpp.

stweil (Contributor) commented Aug 6, 2022

Great. Maybe you can use https://github.com/stweil/tesseract/tree/avx512-vnni (which adds the framework but simply copies the existing AVX2 code) as a starting point.

amitdo (Collaborator, Author) commented Aug 6, 2022

Yes, thank you. Please open a draft PR with that code. I'll push the needed changes to your PR.

stweil (Contributor) commented Aug 6, 2022

See PR #3894.

amitdo (Collaborator, Author) commented Aug 7, 2022

Stefan,

There are two ways to implement intsimdmatrixavx512vnni.cpp:

  1. The 'right and complete way', which is also the 'complex way':
    a) First convert intsimdmatrixavx2.cpp to intsimdmatrixavx512.cpp.
    b) Then convert intsimdmatrixavx512.cpp to intsimdmatrixavx512vnni.cpp.
  2. The 'simple way', which is incomplete but still expected to work fine and to be much faster than intsimdmatrixavx2.cpp:

    AVX512-VNNI (together with AVX512VL) supports 256-bit vector operations, not just 512-bit ones. Since AVX2 uses 256-bit vectors, I believe only a few changes are needed to convert intsimdmatrixavx2.cpp into an intsimdmatrixavx512vnni.cpp that uses 256-bit vectors instead of 512-bit ones (a sketch follows this comment).

I want to implement the second way in PR #3894. We can still implement the first way later.

What do you think about my suggestion?
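
A sketch of what the 'simple way' could look like (hypothetical helper, compiled with -mavx512vnni -mavx512vl so the macros below are defined): the kernel keeps its 256-bit AVX2 structure and only the multiply-accumulate step changes:

```cpp
#include <immintrin.h>

// Multiply unsigned 8-bit inputs by signed 8-bit weights and
// accumulate into 32-bit sums; use VNNI when it is compiled in.
static inline __m256i MultiplyAccumulate(__m256i acc, __m256i u8, __m256i s8) {
#if defined(__AVX512VNNI__) && defined(__AVX512VL__)
  return _mm256_dpbusd_epi32(acc, u8, s8);  // one vpdpbusd
#else
  __m256i pairs16 = _mm256_maddubs_epi16(u8, s8);
  __m256i sums32 = _mm256_madd_epi16(pairs16, _mm256_set1_epi16(1));
  return _mm256_add_epi32(acc, sums32);     // three-instruction AVX2 path
#endif
}
```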

stweil (Contributor) commented Aug 8, 2022

> intsimdmatrixavx512vnni.cpp?

amitdo (Collaborator, Author) commented Aug 8, 2022

> intsimdmatrixavx512vnni.cpp?

Fixed :-)
