From 2452284c9e0664ecf99b95678b9cd643dd4db828 Mon Sep 17 00:00:00 2001
From: Yermalayeu Ihar
Date: Tue, 24 Dec 2024 14:04:28 +0300
Subject: [PATCH] +add AMX-BF16 kernel DepthwiseConvolution_k3p1d1s1w6 for class SynetMergedConvolution16b.

---
 docs/2024.html | 1 -
 docs/2025.html | 2 +
 ...6SynetMergedConvolution16bDepthwise3x3.cpp | 114 +++++++++++++++++-
 src/Test/TestSynetMergedConvolution16b.cpp | 4 +-
 4 files changed, 117 insertions(+), 4 deletions(-)

diff --git a/docs/2024.html b/docs/2024.html
index d48966e67b..96cd25799f 100644
--- a/docs/2024.html
+++ b/docs/2024.html
@@ -71,7 +71,6 @@
New features
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w6 for framework SynetMergedConvolution32f.
  • AVX-512BW kernel Convolution32fNhwcDepthwise_k7p3d1s1w8 for framework SynetMergedConvolution32f.
  • AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w8 for class SynetMergedConvolution16b.
- • AMX-BF16 kernel DepthwiseConvolution_k3p1d1s1w8 for class SynetMergedConvolution16b.
  • Base implementation of function Yuv444pToRgbaV2.
  • Improving
diff --git a/docs/2025.html b/docs/2025.html
index 53bf7f9b42..de2a5344d5 100644
--- a/docs/2025.html
+++ b/docs/2025.html
@@ -44,6 +44,8 @@
New features
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetTiledScale2D32f.
  • AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w6 for class SynetMergedConvolution16b.
  • AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w4 for class SynetMergedConvolution16b.
+ • AMX-BF16 kernel DepthwiseConvolution_k3p1d1s1w8 for class SynetMergedConvolution16b.
+ • AMX-BF16 kernel DepthwiseConvolution_k3p1d1s1w6 for class SynetMergedConvolution16b.
  • Improving