Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile bug: macOS Vulkan build fails #10923

Open
soerenkampschroer opened this issue Dec 20, 2024 · 6 comments · May be fixed by #10927
Open

Compile bug: macOS Vulkan build fails #10923

soerenkampschroer opened this issue Dec 20, 2024 · 6 comments · May be fixed by #10927
Labels
bug Something isn't working build Compilation issues Vulkan Issues specific to the Vulkan backend

Comments

@soerenkampschroer
Copy link

soerenkampschroer commented Dec 20, 2024

Git commit

eb5c3dc

Operating systems

Mac

GGML backends

Vulkan

Problem description & steps to reproduce

The build process fails with errors in ggml_vk_host_get.

First Bad Commit

No response

Relevant log output

❯ cmake -B build -DGGML_METAL=OFF -DGGML_VULKAN=1
-- The C compiler identification is AppleClang 16.0.0.16000026
-- The CXX compiler identification is AppleClang 16.0.0.16000026
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/local/bin/git (found version "2.47.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- Including CPU backend
-- Accelerate framework found
-- Could NOT find OpenMP_C (missing: OpenMP_C_FLAGS OpenMP_C_LIB_NAMES)
-- Could NOT find OpenMP_CXX (missing: OpenMP_CXX_FLAGS OpenMP_CXX_LIB_NAMES)
-- Could NOT find OpenMP (missing: OpenMP_C_FOUND OpenMP_CXX_FOUND)
CMake Warning at ggml/src/ggml-cpu/CMakeLists.txt:53 (message):
  OpenMP not found
Call Stack (most recent call first):
  ggml/src/CMakeLists.txt:298 (ggml_add_cpu_backend_variant_impl)


-- x86 detected
-- Adding CPU backend variant ggml-cpu: -march=native
-- Looking for dgemm_
-- Looking for dgemm_ - found
-- Found BLAS: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.2.sdk/System/Library/Frameworks/Accelerate.framework
-- BLAS found, Libraries: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.2.sdk/System/Library/Frameworks/Accelerate.framework
-- BLAS found, Includes:
-- Including BLAS backend
-- Found Vulkan: /usr/local/lib/libvulkan.dylib (found version "1.4.303") found components: glslc glslangValidator
-- Vulkan found
-- GL_NV_cooperative_matrix2 supported by glslc
-- Including Vulkan backend
-- Configuring done (2.4s)
-- Generating done (0.7s)
-- Build files have been written to: /Users/soeren/Documents/Projects/llama.cpp/build


❯ cmake --build build --config Release
[  1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
[  1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
[  2%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
[  2%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
[  3%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
[  3%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
[  4%] Linking CXX shared library libggml-base.dylib
[  4%] Built target ggml-base
[  4%] Building CXX object ggml/src/ggml-vulkan/vulkan-shaders/CMakeFiles/vulkan-shaders-gen.dir/vulkan-shaders-gen.cpp.o
[  5%] Linking CXX executable ../../../../bin/vulkan-shaders-gen
[  5%] Built target vulkan-shaders-gen
[  6%] Generate vulkan shaders
ggml_vulkan: Generating and compiling shaders to SPIR-V
[  6%] Building CXX object ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/ggml-vulkan.cpp.o
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:1368:2: warning: extra ';' outside of a function is incompatible with C++98 [-Wc++98-compat-extra-semi]
 1368 | };
      |  ^
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4777:9: error: no matching function for call to 'ggml_vk_host_get'
 4777 |         ggml_vk_host_get(ctx->device, q->data, d_Q, q_buf_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4778:9: error: no matching function for call to 'ggml_vk_host_get'
 4778 |         ggml_vk_host_get(ctx->device, k->data, d_K, q_buf_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4779:9: error: no matching function for call to 'ggml_vk_host_get'
 4779 |         ggml_vk_host_get(ctx->device, v->data, d_V, q_buf_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4780:9: error: no matching function for call to 'ggml_vk_host_get'
 4780 |         ggml_vk_host_get(ctx->device, dst->data, d_D, q_buf_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4786:13: error: no matching function for call to 'ggml_vk_host_get'
 4786 |             ggml_vk_host_get(ctx->device, mask->data, d_M, q_buf_offset);
      |             ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5482:9: error: no matching function for call to 'ggml_vk_host_get'
 5482 |         ggml_vk_host_get(ctx->device, k->data, d_K, k_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5483:9: error: no matching function for call to 'ggml_vk_host_get'
 5483 |         ggml_vk_host_get(ctx->device, v->data, d_V, v_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5484:9: error: no matching function for call to 'ggml_vk_host_get'
 5484 |         ggml_vk_host_get(ctx->device, r->data, d_R, r_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5485:9: error: no matching function for call to 'ggml_vk_host_get'
 5485 |         ggml_vk_host_get(ctx->device, tf->data, d_TF, tf_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5486:9: error: no matching function for call to 'ggml_vk_host_get'
 5486 |         ggml_vk_host_get(ctx->device, td->data, d_TD, td_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5487:9: error: no matching function for call to 'ggml_vk_host_get'
 5487 |         ggml_vk_host_get(ctx->device, state->data, d_State, state_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5488:9: error: no matching function for call to 'ggml_vk_host_get'
 5488 |         ggml_vk_host_get(ctx->device, dst->data, d_D, dst_offset);
      |         ^~~~~~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:3095:13: note: candidate function not viable: no known conversion from 'uint64_t' (aka 'unsigned long long') to 'size_t &' (aka 'unsigned long &') for 4th argument
 3095 | static void ggml_vk_host_get(vk_device& device, const void * ptr, vk_buffer& buf, size_t& buf_offset) {
      |             ^                                                                     ~~~~~~~~~~~~~~~~~~
1 warning and 12 errors generated.
make[2]: *** [ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/ggml-vulkan.cpp.o] Error 1
make[1]: *** [ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/all] Error 2
make: *** [all] Error 2
@slaren slaren added bug Something isn't working build Compilation issues Vulkan Issues specific to the Vulkan backend and removed bug-unconfirmed labels Dec 20, 2024
@soerenkampschroer
Copy link
Author

I'm able to compile tags up until b4273. The errors above start there up until the latest tag.

Using a build with tag b4272 it's just generating @@@ symbols.

./llama-cli -m ~/Downloads/Llama-3.2-3B-Instruct-f16.gguf -p "Hey how are you?" -n 128 --n-gpu-layers 40
❯ ./llama-cli -m ~/Downloads/Llama-3.2-3B-Instruct-f16.gguf -p "Hey how are you?" -n 128 --n-gpu-layers 40
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6800 (MoltenVK) | uma: 0 | fp16: 1 | warp size: 64
build: 1 (6fe6247) with Apple clang version 16.0.0 (clang-1600.0.26.6) for x86_64-apple-darwin24.1.0
main: llama backend init
main: load the model and apply lora adapter, if any
llama_load_model_from_file: using device Vulkan0 (AMD Radeon RX 6800) - 16368 MiB free
llama_model_loader: loaded meta data with 31 key-value pairs and 255 tensors from /Users/soeren/Downloads/Llama-3.2-3B-Instruct-f16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Llama 3.2 3B Instruct
llama_model_loader: - kv   3:                           general.finetune str              = Instruct
llama_model_loader: - kv   4:                           general.basename str              = Llama-3.2
llama_model_loader: - kv   5:                         general.size_label str              = 3B
llama_model_loader: - kv   6:                            general.license str              = llama3.2
llama_model_loader: - kv   7:                               general.tags arr[str,6]       = ["facebook", "meta", "pytorch", "llam...
llama_model_loader: - kv   8:                          general.languages arr[str,8]       = ["en", "de", "fr", "it", "pt", "hi", ...
llama_model_loader: - kv   9:                          llama.block_count u32              = 28
llama_model_loader: - kv  10:                       llama.context_length u32              = 131072
llama_model_loader: - kv  11:                     llama.embedding_length u32              = 3072
llama_model_loader: - kv  12:                  llama.feed_forward_length u32              = 8192
llama_model_loader: - kv  13:                 llama.attention.head_count u32              = 24
llama_model_loader: - kv  14:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv  15:                       llama.rope.freq_base f32              = 500000.000000
llama_model_loader: - kv  16:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  17:                 llama.attention.key_length u32              = 128
llama_model_loader: - kv  18:               llama.attention.value_length u32              = 128
llama_model_loader: - kv  19:                          general.file_type u32              = 1
llama_model_loader: - kv  20:                           llama.vocab_size u32              = 128256
llama_model_loader: - kv  21:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  22:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  23:                         tokenizer.ggml.pre str              = llama-bpe
llama_model_loader: - kv  24:                      tokenizer.ggml.tokens arr[str,128256]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  25:                  tokenizer.ggml.token_type arr[i32,128256]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  26:                      tokenizer.ggml.merges arr[str,280147]  = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv  27:                tokenizer.ggml.bos_token_id u32              = 128000
llama_model_loader: - kv  28:                tokenizer.ggml.eos_token_id u32              = 128009
llama_model_loader: - kv  29:                    tokenizer.chat_template str              = {{- bos_token }}\n{%- if custom_tools ...
llama_model_loader: - kv  30:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   58 tensors
llama_model_loader: - type  f16:  197 tensors
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.7999 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 128256
llm_load_print_meta: n_merges         = 280147
llm_load_print_meta: vocab_only       = 0
llm_load_print_meta: n_ctx_train      = 131072
llm_load_print_meta: n_embd           = 3072
llm_load_print_meta: n_layer          = 28
llm_load_print_meta: n_head           = 24
llm_load_print_meta: n_head_kv        = 8
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_swa            = 0
llm_load_print_meta: n_embd_head_k    = 128
llm_load_print_meta: n_embd_head_v    = 128
llm_load_print_meta: n_gqa            = 3
llm_load_print_meta: n_embd_k_gqa     = 1024
llm_load_print_meta: n_embd_v_gqa     = 1024
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 8192
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 1
llm_load_print_meta: pooling type     = 0
llm_load_print_meta: rope type        = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn  = 131072
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: ssm_dt_b_c_rms   = 0
llm_load_print_meta: model type       = 3B
llm_load_print_meta: model ftype      = F16
llm_load_print_meta: model params     = 3.21 B
llm_load_print_meta: model size       = 5.98 GiB (16.00 BPW)
llm_load_print_meta: general.name     = Llama 3.2 3B Instruct
llm_load_print_meta: BOS token        = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token        = 128009 '<|eot_id|>'
llm_load_print_meta: EOT token        = 128009 '<|eot_id|>'
llm_load_print_meta: EOM token        = 128008 '<|eom_id|>'
llm_load_print_meta: LF token         = 128 'Ä'
llm_load_print_meta: EOG token        = 128008 '<|eom_id|>'
llm_load_print_meta: EOG token        = 128009 '<|eot_id|>'
llm_load_print_meta: max token length = 256
ggml_vulkan: Compiling shaders..............................Done!
llm_load_tensors: offloading 28 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 29/29 layers to GPU
llm_load_tensors:   CPU_Mapped model buffer size =   751.50 MiB
llm_load_tensors:      Vulkan0 model buffer size =  6128.17 MiB
.................................................................................
llama_new_context_with_model: n_seq_max     = 1
llama_new_context_with_model: n_ctx         = 4096
llama_new_context_with_model: n_ctx_per_seq = 4096
llama_new_context_with_model: n_batch       = 2048
llama_new_context_with_model: n_ubatch      = 512
llama_new_context_with_model: flash_attn    = 0
llama_new_context_with_model: freq_base     = 500000.0
llama_new_context_with_model: freq_scale    = 1
llama_new_context_with_model: n_ctx_per_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
llama_kv_cache_init:    Vulkan0 KV buffer size =   448.00 MiB
llama_new_context_with_model: KV self size  =  448.00 MiB, K (f16):  224.00 MiB, V (f16):  224.00 MiB
llama_new_context_with_model: Vulkan_Host  output buffer size =     0.49 MiB
llama_new_context_with_model:    Vulkan0 compute buffer size =   256.50 MiB
llama_new_context_with_model: Vulkan_Host compute buffer size =    14.01 MiB
llama_new_context_with_model: graph nodes  = 902
llama_new_context_with_model: graph splits = 2
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
main: llama threadpool init, n_threads = 6

system_info: n_threads = 6 (n_threads_batch = 6) / 12 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | ACCELERATE = 1 | AARCH64_REPACK = 1 |

sampler seed: 3701522956
sampler params:
	repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
	dry_multiplier = 0.000, dry_base = 1.750, dry_allowed_length = 2, dry_penalty_last_n = -1
	top_k = 40, top_p = 0.950, min_p = 0.050, xtc_probability = 0.000, xtc_threshold = 0.100, typical_p = 1.000, temp = 0.800
	mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> dry -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist
generate: n_ctx = 4096, n_batch = 2048, n_predict = 128, n_keep = 1

Hey how are you? JustertiF,@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

llama_perf_sampler_print:    sampling time =      11.51 ms /   134 runs   (    0.09 ms per token, 11646.10 tokens per second)
llama_perf_context_print:        load time =   14322.05 ms
llama_perf_context_print: prompt eval time =     282.04 ms /     6 tokens (   47.01 ms per token,    21.27 tokens per second)
llama_perf_context_print:        eval time =    1589.11 ms /   127 runs   (   12.51 ms per token,    79.92 tokens per second)
llama_perf_context_print:       total time =    1902.63 ms /   133 tokens

jeffbolznv added a commit to jeffbolznv/llama.cpp that referenced this issue Dec 21, 2024
@jeffbolznv jeffbolznv linked a pull request Dec 21, 2024 that will close this issue
@jeffbolznv
Copy link
Collaborator

@soerenkampschroer please try #10927.

@soerenkampschroer
Copy link
Author

soerenkampschroer commented Dec 21, 2024

@jeffbolznv Thank you, that works!

Compilation Log
❯ cmake -B build -DGGML_METAL=OFF -DGGML_VULKAN=1
-- The C compiler identification is AppleClang 16.0.0.16000026
-- The CXX compiler identification is AppleClang 16.0.0.16000026
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/local/bin/git (found version "2.47.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- Including CPU backend
-- Accelerate framework found
-- Could NOT find OpenMP_C (missing: OpenMP_C_FLAGS OpenMP_C_LIB_NAMES)
-- Could NOT find OpenMP_CXX (missing: OpenMP_CXX_FLAGS OpenMP_CXX_LIB_NAMES)
-- Could NOT find OpenMP (missing: OpenMP_C_FOUND OpenMP_CXX_FOUND)
CMake Warning at ggml/src/ggml-cpu/CMakeLists.txt:53 (message):
  OpenMP not found
Call Stack (most recent call first):
  ggml/src/CMakeLists.txt:298 (ggml_add_cpu_backend_variant_impl)


-- x86 detected
-- Adding CPU backend variant ggml-cpu: -march=native
-- Looking for dgemm_
-- Looking for dgemm_ - found
-- Found BLAS: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.2.sdk/System/Library/Frameworks/Accelerate.framework
-- BLAS found, Libraries: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.2.sdk/System/Library/Frameworks/Accelerate.framework
-- BLAS found, Includes:
-- Including BLAS backend
-- Found Vulkan: /usr/local/lib/libvulkan.dylib (found version "1.4.303") found components: glslc glslangValidator
-- Vulkan found
-- GL_NV_cooperative_matrix2 supported by glslc
-- Including Vulkan backend
-- Configuring done (2.6s)
-- Generating done (0.7s)
-- Build files have been written to: /Users/soeren/Documents/Projects/llama.cpp/build
❯ cmake --build build --config Release
[  1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
[  1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
[  2%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
[  2%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
[  3%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
[  3%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
[  4%] Linking CXX shared library libggml-base.dylib
[  4%] Built target ggml-base
[  4%] Building CXX object ggml/src/ggml-vulkan/vulkan-shaders/CMakeFiles/vulkan-shaders-gen.dir/vulkan-shaders-gen.cpp.o
[  5%] Linking CXX executable ../../../../bin/vulkan-shaders-gen
[  5%] Built target vulkan-shaders-gen
[  6%] Generate vulkan shaders
ggml_vulkan: Generating and compiling shaders to SPIR-V
[  6%] Building CXX object ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/ggml-vulkan.cpp.o
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:1368:2: warning: extra ';' outside of a function is incompatible with C++98 [-Wc++98-compat-extra-semi]
 1368 | };
      |  ^
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4801:9: warning: variable 'k_buf_offset' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
 4801 |     if (!K_uma) {
      |         ^~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4828:55: note: uninitialized use occurs here
 4828 |                                     vk_subbuffer{d_K, k_buf_offset, VK_WHOLE_SIZE},
      |                                                       ^~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4801:5: note: remove the 'if' if its condition is always true
 4801 |     if (!K_uma) {
      |     ^~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4772:38: note: initialize the variable 'k_buf_offset' to silence this warning
 4772 |     size_t q_buf_offset, k_buf_offset, v_buf_offset, d_buf_offset, m_buf_offset;
      |                                      ^
      |                                       = 0
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4805:9: warning: variable 'v_buf_offset' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
 4805 |     if (!V_uma) {
      |         ^~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4829:55: note: uninitialized use occurs here
 4829 |                                     vk_subbuffer{d_V, v_buf_offset, VK_WHOLE_SIZE},
      |                                                       ^~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4805:5: note: remove the 'if' if its condition is always true
 4805 |     if (!V_uma) {
      |     ^~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4772:52: note: initialize the variable 'v_buf_offset' to silence this warning
 4772 |     size_t q_buf_offset, k_buf_offset, v_buf_offset, d_buf_offset, m_buf_offset;
      |                                                    ^
      |                                                     = 0
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4814:9: warning: variable 'm_buf_offset' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
 4814 |     if (!M_uma) {
      |         ^~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4830:55: note: uninitialized use occurs here
 4830 |                                     vk_subbuffer{d_M, m_buf_offset, VK_WHOLE_SIZE},
      |                                                       ^~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4814:5: note: remove the 'if' if its condition is always true
 4814 |     if (!M_uma) {
      |     ^~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4772:80: note: initialize the variable 'm_buf_offset' to silence this warning
 4772 |     size_t q_buf_offset, k_buf_offset, v_buf_offset, d_buf_offset, m_buf_offset;
      |                                                                                ^
      |                                                                                 = 0
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4809:9: warning: variable 'd_buf_offset' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
 4809 |     if (!D_uma) {
      |         ^~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4831:55: note: uninitialized use occurs here
 4831 |                                     vk_subbuffer{d_D, d_buf_offset, VK_WHOLE_SIZE},
      |                                                       ^~~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4809:5: note: remove the 'if' if its condition is always true
 4809 |     if (!D_uma) {
      |     ^~~~~~~~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4772:66: note: initialize the variable 'd_buf_offset' to silence this warning
 4772 |     size_t q_buf_offset, k_buf_offset, v_buf_offset, d_buf_offset, m_buf_offset;
      |                                                                  ^
      |                                                                   = 0
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:6723:16: warning: 'return' will never be executed [-Wunreachable-code-return]
 6723 |         return false;
      |                ^~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:7837:15: warning: 'break' will never be executed [-Wunreachable-code-break]
 7837 |             } break;
      |               ^~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:7820:15: warning: 'break' will never be executed [-Wunreachable-code-break]
 7820 |             } break;
      |               ^~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:7753:15: warning: 'break' will never be executed [-Wunreachable-code-break]
 7753 |             } break;
      |               ^~~~~
/Users/soeren/Documents/Projects/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:7707:13: warning: 'break' will never be executed [-Wunreachable-code-break]
 7707 |             break;
      |             ^~~~~
10 warnings generated.
[  7%] Building CXX object ggml/src/ggml-vulkan/CMakeFiles/ggml-vulkan.dir/ggml-vulkan-shaders.cpp.o
[  7%] Linking CXX shared library libggml-vulkan.dylib
[  7%] Built target ggml-vulkan
[  7%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.c.o
[  8%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.cpp.o
[  8%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu-aarch64.cpp.o
[  9%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu-hbm.cpp.o
[  9%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu-quants.c.o
[ 10%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu-traits.cpp.o
[ 10%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/amx.cpp.o
[ 11%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/mmq.cpp.o
[ 11%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/llamafile/sgemm.cpp.o
[ 12%] Linking CXX shared library libggml-cpu.dylib
[ 12%] Built target ggml-cpu
[ 12%] Building CXX object ggml/src/ggml-blas/CMakeFiles/ggml-blas.dir/ggml-blas.cpp.o
[ 13%] Linking CXX shared library libggml-blas.dylib
[ 13%] Built target ggml-blas
[ 14%] Building CXX object ggml/src/CMakeFiles/ggml.dir/ggml-backend-reg.cpp.o
[ 14%] Linking CXX shared library libggml.dylib
[ 14%] Built target ggml
[ 15%] Building CXX object src/CMakeFiles/llama.dir/llama.cpp.o
[ 15%] Building CXX object src/CMakeFiles/llama.dir/llama-vocab.cpp.o
[ 16%] Building CXX object src/CMakeFiles/llama.dir/llama-grammar.cpp.o
[ 16%] Building CXX object src/CMakeFiles/llama.dir/llama-sampling.cpp.o
[ 17%] Building CXX object src/CMakeFiles/llama.dir/unicode.cpp.o
[ 17%] Building CXX object src/CMakeFiles/llama.dir/unicode-data.cpp.o
[ 18%] Linking CXX shared library libllama.dylib
[ 18%] Built target llama
[ 18%] Generating build details from Git
-- Found Git: /usr/local/bin/git (found version "2.47.1")
[ 19%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
[ 19%] Built target build_info
[ 19%] Building CXX object common/CMakeFiles/common.dir/arg.cpp.o
[ 20%] Building CXX object common/CMakeFiles/common.dir/common.cpp.o
[ 20%] Building CXX object common/CMakeFiles/common.dir/console.cpp.o
[ 21%] Building CXX object common/CMakeFiles/common.dir/json-schema-to-grammar.cpp.o
[ 21%] Building CXX object common/CMakeFiles/common.dir/log.cpp.o
[ 22%] Building CXX object common/CMakeFiles/common.dir/ngram-cache.cpp.o
[ 22%] Building CXX object common/CMakeFiles/common.dir/sampling.cpp.o
[ 23%] Building CXX object common/CMakeFiles/common.dir/speculative.cpp.o
[ 23%] Linking CXX static library libcommon.a
[ 23%] Built target common
[ 23%] Building CXX object tests/CMakeFiles/test-tokenizer-0.dir/test-tokenizer-0.cpp.o
[ 24%] Linking CXX executable ../bin/test-tokenizer-0
[ 24%] Built target test-tokenizer-0
[ 25%] Building CXX object tests/CMakeFiles/test-sampling.dir/test-sampling.cpp.o
[ 25%] Building CXX object tests/CMakeFiles/test-sampling.dir/get-model.cpp.o
[ 26%] Linking CXX executable ../bin/test-sampling
[ 26%] Built target test-sampling
[ 26%] Building CXX object tests/CMakeFiles/test-grammar-parser.dir/test-grammar-parser.cpp.o
[ 27%] Building CXX object tests/CMakeFiles/test-grammar-parser.dir/get-model.cpp.o
[ 27%] Linking CXX executable ../bin/test-grammar-parser
[ 27%] Built target test-grammar-parser
[ 28%] Building CXX object tests/CMakeFiles/test-grammar-integration.dir/test-grammar-integration.cpp.o
[ 28%] Building CXX object tests/CMakeFiles/test-grammar-integration.dir/get-model.cpp.o
[ 29%] Linking CXX executable ../bin/test-grammar-integration
[ 29%] Built target test-grammar-integration
[ 30%] Building CXX object tests/CMakeFiles/test-llama-grammar.dir/test-llama-grammar.cpp.o
[ 30%] Building CXX object tests/CMakeFiles/test-llama-grammar.dir/get-model.cpp.o
[ 31%] Linking CXX executable ../bin/test-llama-grammar
[ 31%] Built target test-llama-grammar
[ 32%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/test-json-schema-to-grammar.cpp.o
[ 33%] Building CXX object tests/CMakeFiles/test-json-schema-to-grammar.dir/get-model.cpp.o
[ 33%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 33%] Built target test-json-schema-to-grammar
[ 33%] Building CXX object tests/CMakeFiles/test-tokenizer-1-bpe.dir/test-tokenizer-1-bpe.cpp.o
[ 34%] Linking CXX executable ../bin/test-tokenizer-1-bpe
[ 34%] Built target test-tokenizer-1-bpe
[ 34%] Building CXX object tests/CMakeFiles/test-tokenizer-1-spm.dir/test-tokenizer-1-spm.cpp.o
[ 35%] Linking CXX executable ../bin/test-tokenizer-1-spm
[ 35%] Built target test-tokenizer-1-spm
[ 35%] Building CXX object tests/CMakeFiles/test-log.dir/test-log.cpp.o
[ 36%] Building CXX object tests/CMakeFiles/test-log.dir/get-model.cpp.o
[ 36%] Linking CXX executable ../bin/test-log
[ 36%] Built target test-log
[ 37%] Building CXX object tests/CMakeFiles/test-arg-parser.dir/test-arg-parser.cpp.o
[ 37%] Building CXX object tests/CMakeFiles/test-arg-parser.dir/get-model.cpp.o
[ 38%] Linking CXX executable ../bin/test-arg-parser
[ 38%] Built target test-arg-parser
[ 39%] Building CXX object tests/CMakeFiles/test-chat-template.dir/test-chat-template.cpp.o
[ 39%] Building CXX object tests/CMakeFiles/test-chat-template.dir/get-model.cpp.o
[ 40%] Linking CXX executable ../bin/test-chat-template
[ 40%] Built target test-chat-template
[ 40%] Building CXX object tests/CMakeFiles/test-gguf.dir/test-gguf.cpp.o
[ 41%] Building CXX object tests/CMakeFiles/test-gguf.dir/get-model.cpp.o
[ 41%] Linking CXX executable ../bin/test-gguf
[ 41%] Built target test-gguf
[ 42%] Building CXX object tests/CMakeFiles/test-backend-ops.dir/test-backend-ops.cpp.o
[ 42%] Building CXX object tests/CMakeFiles/test-backend-ops.dir/get-model.cpp.o
[ 43%] Linking CXX executable ../bin/test-backend-ops
[ 43%] Built target test-backend-ops
[ 44%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/test-model-load-cancel.cpp.o
[ 44%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/get-model.cpp.o
[ 45%] Linking CXX executable ../bin/test-model-load-cancel
[ 45%] Built target test-model-load-cancel
[ 45%] Building CXX object tests/CMakeFiles/test-autorelease.dir/test-autorelease.cpp.o
[ 46%] Building CXX object tests/CMakeFiles/test-autorelease.dir/get-model.cpp.o
[ 46%] Linking CXX executable ../bin/test-autorelease
[ 46%] Built target test-autorelease
[ 46%] Building CXX object tests/CMakeFiles/test-barrier.dir/test-barrier.cpp.o
[ 47%] Building CXX object tests/CMakeFiles/test-barrier.dir/get-model.cpp.o
[ 47%] Linking CXX executable ../bin/test-barrier
[ 47%] Built target test-barrier
[ 47%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/test-quantize-fns.cpp.o
[ 48%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/get-model.cpp.o
[ 48%] Linking CXX executable ../bin/test-quantize-fns
[ 48%] Built target test-quantize-fns
[ 49%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/test-quantize-perf.cpp.o
[ 49%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/get-model.cpp.o
[ 50%] Linking CXX executable ../bin/test-quantize-perf
[ 50%] Built target test-quantize-perf
[ 50%] Building CXX object tests/CMakeFiles/test-rope.dir/test-rope.cpp.o
[ 51%] Building CXX object tests/CMakeFiles/test-rope.dir/get-model.cpp.o
[ 51%] Linking CXX executable ../bin/test-rope
[ 51%] Built target test-rope
[ 52%] Building C object tests/CMakeFiles/test-c.dir/test-c.c.o
[ 52%] Linking C executable ../bin/test-c
[ 52%] Built target test-c
[ 52%] Building CXX object examples/batched-bench/CMakeFiles/llama-batched-bench.dir/batched-bench.cpp.o
[ 53%] Linking CXX executable ../../bin/llama-batched-bench
[ 53%] Built target llama-batched-bench
[ 53%] Building CXX object examples/batched/CMakeFiles/llama-batched.dir/batched.cpp.o
[ 54%] Linking CXX executable ../../bin/llama-batched
[ 54%] Built target llama-batched
[ 54%] Building CXX object examples/embedding/CMakeFiles/llama-embedding.dir/embedding.cpp.o
[ 55%] Linking CXX executable ../../bin/llama-embedding
[ 55%] Built target llama-embedding
[ 55%] Building CXX object examples/eval-callback/CMakeFiles/llama-eval-callback.dir/eval-callback.cpp.o
[ 56%] Linking CXX executable ../../bin/llama-eval-callback
[ 56%] Built target llama-eval-callback
[ 56%] Building CXX object examples/gbnf-validator/CMakeFiles/llama-gbnf-validator.dir/gbnf-validator.cpp.o
[ 57%] Linking CXX executable ../../bin/llama-gbnf-validator
[ 57%] Built target llama-gbnf-validator
[ 57%] Building C object examples/gguf-hash/CMakeFiles/sha256.dir/deps/sha256/sha256.c.o
[ 57%] Built target sha256
[ 58%] Building C object examples/gguf-hash/CMakeFiles/xxhash.dir/deps/xxhash/xxhash.c.o
[ 58%] Built target xxhash
[ 59%] Building C object examples/gguf-hash/CMakeFiles/sha1.dir/deps/sha1/sha1.c.o
[ 59%] Built target sha1
[ 60%] Building CXX object examples/gguf-hash/CMakeFiles/llama-gguf-hash.dir/gguf-hash.cpp.o
[ 60%] Linking CXX executable ../../bin/llama-gguf-hash
[ 60%] Built target llama-gguf-hash
[ 61%] Building CXX object examples/gguf-split/CMakeFiles/llama-gguf-split.dir/gguf-split.cpp.o
[ 61%] Linking CXX executable ../../bin/llama-gguf-split
[ 61%] Built target llama-gguf-split
[ 62%] Building CXX object examples/gguf/CMakeFiles/llama-gguf.dir/gguf.cpp.o
[ 62%] Linking CXX executable ../../bin/llama-gguf
[ 62%] Built target llama-gguf
[ 63%] Building CXX object examples/gritlm/CMakeFiles/llama-gritlm.dir/gritlm.cpp.o
[ 63%] Linking CXX executable ../../bin/llama-gritlm
[ 63%] Built target llama-gritlm
[ 64%] Building CXX object examples/imatrix/CMakeFiles/llama-imatrix.dir/imatrix.cpp.o
[ 64%] Linking CXX executable ../../bin/llama-imatrix
[ 64%] Built target llama-imatrix
[ 65%] Building CXX object examples/infill/CMakeFiles/llama-infill.dir/infill.cpp.o
[ 65%] Linking CXX executable ../../bin/llama-infill
[ 65%] Built target llama-infill
[ 65%] Building CXX object examples/llama-bench/CMakeFiles/llama-bench.dir/llama-bench.cpp.o
[ 66%] Linking CXX executable ../../bin/llama-bench
[ 66%] Built target llama-bench
[ 67%] Building CXX object examples/lookahead/CMakeFiles/llama-lookahead.dir/lookahead.cpp.o
[ 67%] Linking CXX executable ../../bin/llama-lookahead
[ 67%] Built target llama-lookahead
[ 68%] Building CXX object examples/lookup/CMakeFiles/llama-lookup.dir/lookup.cpp.o
[ 68%] Linking CXX executable ../../bin/llama-lookup
[ 68%] Built target llama-lookup
[ 69%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-create.dir/lookup-create.cpp.o
[ 69%] Linking CXX executable ../../bin/llama-lookup-create
[ 69%] Built target llama-lookup-create
[ 70%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-merge.dir/lookup-merge.cpp.o
[ 70%] Linking CXX executable ../../bin/llama-lookup-merge
[ 70%] Built target llama-lookup-merge
[ 71%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-stats.dir/lookup-stats.cpp.o
[ 71%] Linking CXX executable ../../bin/llama-lookup-stats
[ 71%] Built target llama-lookup-stats
[ 71%] Building CXX object examples/main/CMakeFiles/llama-cli.dir/main.cpp.o
[ 72%] Linking CXX executable ../../bin/llama-cli
[ 72%] Built target llama-cli
[ 73%] Building CXX object examples/parallel/CMakeFiles/llama-parallel.dir/parallel.cpp.o
[ 73%] Linking CXX executable ../../bin/llama-parallel
[ 73%] Built target llama-parallel
[ 74%] Building CXX object examples/passkey/CMakeFiles/llama-passkey.dir/passkey.cpp.o
[ 74%] Linking CXX executable ../../bin/llama-passkey
[ 74%] Built target llama-passkey
[ 75%] Building CXX object examples/perplexity/CMakeFiles/llama-perplexity.dir/perplexity.cpp.o
[ 76%] Linking CXX executable ../../bin/llama-perplexity
[ 76%] Built target llama-perplexity
[ 76%] Building CXX object examples/quantize/CMakeFiles/llama-quantize.dir/quantize.cpp.o
[ 77%] Linking CXX executable ../../bin/llama-quantize
[ 77%] Built target llama-quantize
[ 77%] Building CXX object examples/retrieval/CMakeFiles/llama-retrieval.dir/retrieval.cpp.o
[ 78%] Linking CXX executable ../../bin/llama-retrieval
[ 78%] Built target llama-retrieval
[ 79%] Generating loading.html.hpp
[ 79%] Generating index.html.gz.hpp
[ 79%] Building CXX object examples/server/CMakeFiles/llama-server.dir/server.cpp.o
[ 80%] Linking CXX executable ../../bin/llama-server
[ 80%] Built target llama-server
[ 80%] Building CXX object examples/save-load-state/CMakeFiles/llama-save-load-state.dir/save-load-state.cpp.o
[ 81%] Linking CXX executable ../../bin/llama-save-load-state
[ 81%] Built target llama-save-load-state
[ 81%] Building CXX object examples/run/CMakeFiles/llama-run.dir/run.cpp.o
[ 82%] Linking CXX executable ../../bin/llama-run
[ 82%] Built target llama-run
[ 82%] Building CXX object examples/simple/CMakeFiles/llama-simple.dir/simple.cpp.o
[ 83%] Linking CXX executable ../../bin/llama-simple
[ 83%] Built target llama-simple
[ 83%] Building CXX object examples/simple-chat/CMakeFiles/llama-simple-chat.dir/simple-chat.cpp.o
[ 84%] Linking CXX executable ../../bin/llama-simple-chat
[ 84%] Built target llama-simple-chat
[ 84%] Building CXX object examples/speculative/CMakeFiles/llama-speculative.dir/speculative.cpp.o
[ 85%] Linking CXX executable ../../bin/llama-speculative
[ 85%] Built target llama-speculative
[ 85%] Building CXX object examples/speculative-simple/CMakeFiles/llama-speculative-simple.dir/speculative-simple.cpp.o
[ 86%] Linking CXX executable ../../bin/llama-speculative-simple
[ 86%] Built target llama-speculative-simple
[ 86%] Building CXX object examples/tokenize/CMakeFiles/llama-tokenize.dir/tokenize.cpp.o
[ 87%] Linking CXX executable ../../bin/llama-tokenize
[ 87%] Built target llama-tokenize
[ 87%] Building CXX object examples/tts/CMakeFiles/llama-tts.dir/tts.cpp.o
[ 88%] Linking CXX executable ../../bin/llama-tts
[ 88%] Built target llama-tts
[ 88%] Building CXX object examples/gen-docs/CMakeFiles/llama-gen-docs.dir/gen-docs.cpp.o
[ 89%] Linking CXX executable ../../bin/llama-gen-docs
[ 89%] Built target llama-gen-docs
[ 89%] Building CXX object examples/convert-llama2c-to-ggml/CMakeFiles/llama-convert-llama2c-to-ggml.dir/convert-llama2c-to-ggml.cpp.o
[ 90%] Linking CXX executable ../../bin/llama-convert-llama2c-to-ggml
[ 90%] Built target llama-convert-llama2c-to-ggml
[ 90%] Building CXX object examples/cvector-generator/CMakeFiles/llama-cvector-generator.dir/cvector-generator.cpp.o
[ 91%] Linking CXX executable ../../bin/llama-cvector-generator
[ 91%] Built target llama-cvector-generator
[ 91%] Building CXX object examples/export-lora/CMakeFiles/llama-export-lora.dir/export-lora.cpp.o
[ 92%] Linking CXX executable ../../bin/llama-export-lora
[ 92%] Built target llama-export-lora
[ 92%] Building CXX object examples/quantize-stats/CMakeFiles/llama-quantize-stats.dir/quantize-stats.cpp.o
[ 93%] Linking CXX executable ../../bin/llama-quantize-stats
[ 93%] Built target llama-quantize-stats
[ 94%] Building CXX object examples/llava/CMakeFiles/llava.dir/llava.cpp.o
[ 94%] Building CXX object examples/llava/CMakeFiles/llava.dir/clip.cpp.o
[ 94%] Built target llava
[ 94%] Linking CXX static library libllava_static.a
[ 94%] Built target llava_static
[ 95%] Linking CXX shared library libllava_shared.dylib
[ 95%] Built target llava_shared
[ 96%] Building CXX object examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o
[ 96%] Linking CXX executable ../../bin/llama-llava-cli
[ 96%] Built target llama-llava-cli
[ 97%] Building CXX object examples/llava/CMakeFiles/llama-minicpmv-cli.dir/minicpmv-cli.cpp.o
[ 97%] Linking CXX executable ../../bin/llama-minicpmv-cli
[ 97%] Built target llama-minicpmv-cli
[ 97%] Building CXX object examples/llava/CMakeFiles/llama-qwen2vl-cli.dir/qwen2vl-cli.cpp.o
[ 98%] Linking CXX executable ../../bin/llama-qwen2vl-cli
[ 98%] Built target llama-qwen2vl-cli
[ 98%] Building CXX object pocs/vdot/CMakeFiles/llama-vdot.dir/vdot.cpp.o
[ 99%] Linking CXX executable ../../bin/llama-vdot
[ 99%] Built target llama-vdot
[ 99%] Building CXX object pocs/vdot/CMakeFiles/llama-q8dot.dir/q8dot.cpp.o
[100%] Linking CXX executable ../../bin/llama-q8dot
[100%] Built target llama-q8dot

I do still have the issue of models only generating "@@@@@" as mentioned above when running on the GPU but that may just be my environment.

BTW: I was able to get my AMD GPU working with Metal when building the repo yesterday. The version from brew didn't work on my GPU so I thought I'd share just in case it's a side effect you're not aware of. Performance isn't great at about 5 tokens/s compared to 10 tokens/s when running on the CPU but it's something 👍

@jeffbolznv
Copy link
Collaborator

Thanks for checking. I don't know what would be causing the corruption. Is this using MoltenVK? Can you run test-backend-ops and see if any are failing?

@soerenkampschroer
Copy link
Author

No problem, happy to help where I can.

I'm using MoltenVK, I installed it through brew (I ran brew install cmake libomp vulkan-headers glslang molten-vk shaderc before building).

I ran test-backend-ops and Vulkan is indeed failig. It's very verbose and I'm not sure where to begin.

Output test-backend-ops
❯ ./test-backend-ops
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6800 (MoltenVK) | uma: 0 | fp16: 1 | warp size: 64 | matrix cores: none
Testing 3 devices

Backend 1/3: Vulkan0
ggml_vulkan: Compiling shaders..........................Done!
Device description: AMD Radeon RX 6800
Device memory: 16368 MB (16368 MB free)

ABS(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
ABS(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
SGN(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
SGN(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
NEG(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
NEG(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
STEP(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
STEP(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
TANH(type=f32,ne_a=[128,2,2,2],v=0): OK
TANH(type=f32,ne_a=[5,7,11,13],v=0): OK
ELU(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
ELU(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
RELU(type=f32,ne_a=[128,2,2,2],v=0): OK
RELU(type=f32,ne_a=[5,7,11,13],v=0): OK
SIGMOID(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
SIGMOID(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
GELU(type=f32,ne_a=[128,2,2,2],v=0): OK
GELU(type=f32,ne_a=[5,7,11,13],v=0): OK
GELU_QUICK(type=f32,ne_a=[128,2,2,2],v=0): OK
GELU_QUICK(type=f32,ne_a=[5,7,11,13],v=0): OK
SILU(type=f32,ne_a=[128,2,2,2],v=0): OK
SILU(type=f32,ne_a=[5,7,11,13],v=0): OK
HARDSWISH(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
HARDSWISH(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
HARDSIGMOID(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
HARDSIGMOID(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
EXP(type=f32,ne_a=[128,2,2,2],v=0): not supported [Vulkan0]
EXP(type=f32,ne_a=[5,7,11,13],v=0): not supported [Vulkan0]
ABS(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
ABS(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
SGN(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
SGN(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
NEG(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
NEG(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
STEP(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
STEP(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
TANH(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
TANH(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
ELU(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
ELU(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
RELU(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
RELU(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
SIGMOID(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
SIGMOID(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
GELU(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
GELU(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
GELU_QUICK(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
GELU_QUICK(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
SILU(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
SILU(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
HARDSWISH(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
HARDSWISH(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
HARDSIGMOID(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
HARDSIGMOID(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
EXP(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
EXP(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]
GET_ROWS(type=f32,n=1,m=8,r=2,b=1,v=0): OK
GET_ROWS(type=f32,n=256,m=5,r=4,b=1,v=0): OK
GET_ROWS(type=f32,n=256,m=5,r=4,b=1,v=1): OK
GET_ROWS(type=f32,n=256,m=5,r=4,b=7,v=0): OK
GET_ROWS(type=f32,n=256,m=5,r=4,b=7,v=1): OK
GET_ROWS(type=f16,n=256,m=5,r=4,b=1,v=0): OK
GET_ROWS(type=f16,n=256,m=5,r=4,b=1,v=1): OK
GET_ROWS(type=f16,n=256,m=5,r=4,b=7,v=0): OK
GET_ROWS(type=f16,n=256,m=5,r=4,b=7,v=1): OK
GET_ROWS(type=bf16,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=bf16,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=bf16,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=bf16,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=q4_0,n=256,m=5,r=4,b=1,v=0): OK
GET_ROWS(type=q4_0,n=256,m=5,r=4,b=1,v=1): OK
GET_ROWS(type=q4_0,n=256,m=5,r=4,b=7,v=0): OK
GET_ROWS(type=q4_0,n=256,m=5,r=4,b=7,v=1): OK
GET_ROWS(type=q4_1,n=256,m=5,r=4,b=1,v=0): OK
GET_ROWS(type=q4_1,n=256,m=5,r=4,b=1,v=1): OK
GET_ROWS(type=q4_1,n=256,m=5,r=4,b=7,v=0): OK
GET_ROWS(type=q4_1,n=256,m=5,r=4,b=7,v=1): OK
GET_ROWS(type=q5_0,n=256,m=5,r=4,b=1,v=0): OK
GET_ROWS(type=q5_0,n=256,m=5,r=4,b=1,v=1): OK
GET_ROWS(type=q5_0,n=256,m=5,r=4,b=7,v=0): OK
GET_ROWS(type=q5_0,n=256,m=5,r=4,b=7,v=1): OK
GET_ROWS(type=q5_1,n=256,m=5,r=4,b=1,v=0): OK
GET_ROWS(type=q5_1,n=256,m=5,r=4,b=1,v=1): OK
GET_ROWS(type=q5_1,n=256,m=5,r=4,b=7,v=0): OK
GET_ROWS(type=q5_1,n=256,m=5,r=4,b=7,v=1): OK
GET_ROWS(type=q8_0,n=256,m=5,r=4,b=1,v=0): OK
GET_ROWS(type=q8_0,n=256,m=5,r=4,b=1,v=1): OK
GET_ROWS(type=q8_0,n=256,m=5,r=4,b=7,v=0): OK
GET_ROWS(type=q8_0,n=256,m=5,r=4,b=7,v=1): OK
GET_ROWS(type=q2_K,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=q2_K,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=q2_K,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=q2_K,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=q3_K,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=q3_K,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=q3_K,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=q3_K,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=q4_K,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=q4_K,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=q4_K,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=q4_K,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=q5_K,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=q5_K,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=q5_K,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=q5_K,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=q6_K,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=q6_K,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=q6_K,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=q6_K,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=iq2_xxs,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=iq2_xxs,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=iq2_xxs,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=iq2_xxs,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=iq2_xs,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=iq2_xs,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=iq2_xs,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=iq2_xs,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=iq2_s,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=iq2_s,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=iq2_s,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=iq2_s,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=iq3_xxs,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=iq3_xxs,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=iq3_xxs,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=iq3_xxs,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=iq1_s,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=iq1_s,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=iq1_s,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=iq1_s,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=iq1_m,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=iq1_m,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=iq1_m,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=iq1_m,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=iq4_nl,n=256,m=5,r=4,b=1,v=0): OK
GET_ROWS(type=iq4_nl,n=256,m=5,r=4,b=1,v=1): OK
GET_ROWS(type=iq4_nl,n=256,m=5,r=4,b=7,v=0): OK
GET_ROWS(type=iq4_nl,n=256,m=5,r=4,b=7,v=1): OK
GET_ROWS(type=iq3_s,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=iq3_s,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=iq3_s,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=iq3_s,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=iq4_xs,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=iq4_xs,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=iq4_xs,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=iq4_xs,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
GET_ROWS(type=i32,n=256,m=5,r=4,b=1,v=0): not supported [Vulkan0]
GET_ROWS(type=i32,n=256,m=5,r=4,b=1,v=1): not supported [Vulkan0]
GET_ROWS(type=i32,n=256,m=5,r=4,b=7,v=0): not supported [Vulkan0]
GET_ROWS(type=i32,n=256,m=5,r=4,b=7,v=1): not supported [Vulkan0]
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=avg,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=1,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=1,s0=2,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=1,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=1,k1=3,s0=2,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=1,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=1,s0=2,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=1,s1=2,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=1,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=1,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=1,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=1,p0=1,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=2,p0=0,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=2,p0=0,p1=1): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=2,p0=1,p1=0): OK
POOL_2D(pool_type=max,type_input=f32,ne_input=[10,10,3,1],k0=3,k1=3,s0=2,s1=2,p0=1,p1=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[3000,128,1,1],ne_kernel=[3,128,1280,1],s0=1,s1=0,p0=1,p1=0,d0=1,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f32,ne_input=[3000,128,1,1],ne_kernel=[3,128,1280,1],s0=1,s1=0,p0=1,p1=0,d0=1,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[3000,128,1,1],ne_kernel=[3,128,1280,1],s0=1,s1=0,p0=1,p1=0,d0=1,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,2,2,1],ne_kernel=[3,2,2,1],s0=1,s1=0,p0=0,p1=0,d0=1,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,2,2,1],ne_kernel=[3,2,2,1],s0=1,s1=0,p0=0,p1=0,d0=3,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,2,2,1],ne_kernel=[3,2,2,1],s0=1,s1=0,p0=3,p1=0,d0=1,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,2,2,1],ne_kernel=[3,2,2,1],s0=1,s1=0,p0=3,p1=0,d0=3,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,2,2,1],ne_kernel=[3,2,2,1],s0=3,s1=0,p0=0,p1=0,d0=1,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,2,2,1],ne_kernel=[3,2,2,1],s0=3,s1=0,p0=0,p1=0,d0=3,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,2,2,1],ne_kernel=[3,2,2,1],s0=3,s1=0,p0=3,p1=0,d0=1,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,2,2,1],ne_kernel=[3,2,2,1],s0=3,s1=0,p0=3,p1=0,d0=3,d1=0,is_2D=0): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[10,10,3,1],ne_kernel=[3,3,3,1],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f32,ne_input=[10,10,3,1],ne_kernel=[3,3,3,1],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[10,10,3,1],ne_kernel=[3,3,3,1],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=0,p1=0,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=0,p1=0,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=0,p1=0,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=0,p1=0,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=0,p1=3,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=0,p1=3,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=0,p1=3,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=0,p1=3,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=3,p1=0,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=3,p1=0,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=3,p1=0,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=3,p1=0,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=3,p1=3,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=3,p1=3,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=3,p1=3,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=1,p0=3,p1=3,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=0,p1=0,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=0,p1=0,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=0,p1=0,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=0,p1=0,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=0,p1=3,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=0,p1=3,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=0,p1=3,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=0,p1=3,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=3,p1=0,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=3,p1=0,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=3,p1=0,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=3,p1=0,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=3,p1=3,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=3,p1=3,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=3,p1=3,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=1,s1=3,p0=3,p1=3,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=0,p1=0,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=0,p1=0,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=0,p1=0,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=0,p1=0,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=0,p1=3,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=0,p1=3,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=0,p1=3,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=0,p1=3,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=3,p1=0,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=3,p1=0,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=3,p1=0,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=3,p1=0,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=3,p1=3,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=3,p1=3,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=3,p1=3,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=1,p0=3,p1=3,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=0,p1=0,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=0,p1=0,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=0,p1=0,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=0,p1=0,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=0,p1=3,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=0,p1=3,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=0,p1=3,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=0,p1=3,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=3,p1=0,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=3,p1=0,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=3,p1=0,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=3,p1=0,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=3,p1=3,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=3,p1=3,d0=1,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=3,p1=3,d0=3,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f32,dst_type=f32,ne_input=[20,20,2,2],ne_kernel=[3,3,2,2],s0=3,s1=3,p0=3,p1=3,d0=3,d1=3,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[12,12,1,32],ne_kernel=[3,3,1,32],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[12,12,2,32],ne_kernel=[3,3,2,32],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[12,12,1,1024],ne_kernel=[3,3,1,1024],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[12,12,2,1024],ne_kernel=[3,3,2,1024],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[12,12,1,2048],ne_kernel=[3,3,1,2048],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[12,12,2,2048],ne_kernel=[3,3,2,2048],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[12,12,1,2560],ne_kernel=[3,3,1,2560],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
IM2COL(type_input=f32,type_kernel=f16,dst_type=f16,ne_input=[12,12,2,2560],ne_kernel=[3,3,2,2560],s0=1,s1=1,p0=1,p1=1,d0=1,d1=1,is_2D=1): OK
CONV_TRANSPOSE_1D(ne_input=[197,32,1,1],ne_kernel=[16,32,32,1],s0=1,p0=0,d0=1): not supported [Vulkan0]
CONV_TRANSPOSE_1D(ne_input=[3,2,1,1],ne_kernel=[2,3,2,1],s0=3,p0=0,d0=1): not supported [Vulkan0]
CONV_TRANSPOSE_1D(ne_input=[3,2,1,1],ne_kernel=[2,3,2,1],s0=2,p0=0,d0=1): not supported [Vulkan0]
CONV_TRANSPOSE_1D(ne_input=[3,2,1,1],ne_kernel=[2,3,2,1],s0=1,p0=0,d0=1): not supported [Vulkan0]
CONV_TRANSPOSE_1D(ne_input=[3,2,1,1],ne_kernel=[3,2,2,1],s0=2,p0=0,d0=1): not supported [Vulkan0]
CONV_TRANSPOSE_1D(ne_input=[3,2,1,1],ne_kernel=[3,2,2,1],s0=1,p0=0,d0=1): not supported [Vulkan0]
CONV_TRANSPOSE_1D(ne_input=[3,2,1,1],ne_kernel=[3,1,2,1],s0=1,p0=0,d0=1): not supported [Vulkan0]
CONV_TRANSPOSE_1D(ne_input=[2,1,1,1],ne_kernel=[3,1,1,1],s0=1,p0=0,d0=1): not supported [Vulkan0]
COUNT_EQUAL(type=f32,ne=[4,500,1,1]): not supported [Vulkan0]
ARGMAX(type=f32,ne=[32,1,1,1]): not supported [Vulkan0]
ARGMAX(type=f32,ne=[100,10,1,1]): not supported [Vulkan0]
ARGMAX(type=f32,ne=[1024,10,1,1]): not supported [Vulkan0]
ARGMAX(type=f32,ne=[1024,12,1,1]): not supported [Vulkan0]
ARGMAX(type=f32,ne=[2000,10,1,1]): not supported [Vulkan0]
ARGMAX(type=f32,ne=[5438,3,1,1]): not supported [Vulkan0]
REPEAT(type=f32,ne=[10,5,4,1],nr=[1,1,1,1]): OK
REPEAT(type=f32,ne=[10,5,4,1],nr=[2,1,1,1]): OK
REPEAT(type=f32,ne=[10,5,4,1],nr=[1,2,1,1]): OK
REPEAT(type=f32,ne=[10,5,4,1],nr=[1,1,2,1]): OK
REPEAT(type=f32,ne=[10,5,4,1],nr=[1,1,1,2]): OK
REPEAT(type=i32,ne=[10,5,4,1],nr=[2,1,1,1]): OK
REPEAT(type=i16,ne=[10,5,4,1],nr=[1,1,1,2]): not supported [Vulkan0]
REPEAT(type=f32,ne=[10,5,4,3],nr=[1,1,1,1]): OK
REPEAT(type=f32,ne=[10,5,4,3],nr=[2,1,1,1]): OK
REPEAT(type=f32,ne=[10,5,4,3],nr=[1,2,1,1]): OK
REPEAT(type=f32,ne=[10,5,4,3],nr=[1,1,2,1]): OK
REPEAT(type=f32,ne=[10,5,4,3],nr=[1,1,1,2]): OK
REPEAT(type=i32,ne=[10,5,4,3],nr=[2,1,1,1]): OK
REPEAT(type=i16,ne=[10,5,4,3],nr=[1,1,1,2]): not supported [Vulkan0]
DUP(type=f32,ne=[10,10,20,1]): OK
DUP(type=f16,ne=[10,10,20,1]): OK
DUP(type=i32,ne=[10,10,20,1]): not supported [Vulkan0]
DUP(type=i16,ne=[10,10,20,1]): not supported [Vulkan0]
DUP(type=f32,ne=[10,10,5,1],permute=[0,2,1,3]): OK
DUP(type=f16,ne=[10,10,5,1],permute=[0,2,1,3]): OK
DUP(type=f32,ne=[10,10,5,1],permute=[1,0,2,3]): OK
DUP(type=f16,ne=[10,10,5,1],permute=[1,0,2,3]): OK
DUP(type=i16,ne=[10,8,3,1],permute=[0,2,1,3]): not supported [Vulkan0]
DUP(type=i16,ne=[10,8,3,1],permute=[1,2,0,3]): not supported [Vulkan0]
SET(type_src=f32,type_dst=f32,ne=[6,5,4,3],dim=1): not supported [Vulkan0]
SET(type_src=f32,type_dst=f32,ne=[6,5,4,3],dim=2): not supported [Vulkan0]
SET(type_src=f32,type_dst=f32,ne=[6,5,4,3],dim=3): not supported [Vulkan0]
SET(type_src=i32,type_dst=i32,ne=[6,5,4,3],dim=1): not supported [Vulkan0]
SET(type_src=i32,type_dst=i32,ne=[6,5,4,3],dim=2): not supported [Vulkan0]
SET(type_src=i32,type_dst=i32,ne=[6,5,4,3],dim=3): not supported [Vulkan0]
CPY(type_src=f16,type_dst=f32,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=f32,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=f16,ne=[256,4,4,4],permute=[0,0,0,0]): OK
CPY(type_src=f16,type_dst=f16,ne=[256,2,3,4],permute=[0,2,1,3]): OK
CPY(type_src=f16,type_dst=bf16,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=bf16,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q4_0,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q4_0,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q4_1,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q4_1,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q5_0,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q5_0,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q5_1,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q5_1,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q8_0,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q8_0,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q2_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q2_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q3_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q3_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q4_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q4_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q5_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q5_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q6_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=q6_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=iq2_xxs,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq2_xxs,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq2_xs,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq2_xs,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq2_s,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq2_s,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq3_xxs,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq3_xxs,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq1_s,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq1_s,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq1_m,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq1_m,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq4_nl,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=iq4_nl,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=iq3_s,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq3_s,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f16,type_dst=iq4_xs,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=iq4_xs,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=f32,ne=[256,4,4,4],permute=[0,0,0,0]): OK
CPY(type_src=f32,type_dst=f32,ne=[256,2,3,4],permute=[0,2,1,3]): OK
CPY(type_src=f32,type_dst=f16,ne=[256,4,4,4],permute=[0,0,0,0]): OK
CPY(type_src=f32,type_dst=f16,ne=[256,2,3,4],permute=[0,2,1,3]): OK
CPY(type_src=f32,type_dst=bf16,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=bf16,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q4_0,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q4_0,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q4_1,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q4_1,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q5_0,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q5_0,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q5_1,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q5_1,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q8_0,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q8_0,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q2_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q2_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q3_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q3_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q4_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q4_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q5_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q5_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q6_K,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=q6_K,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=iq2_xxs,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq2_xxs,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq2_xs,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq2_xs,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq2_s,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq2_s,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq3_xxs,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq3_xxs,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq1_s,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq1_s,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq1_m,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq1_m,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq4_nl,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=iq4_nl,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=iq3_s,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq3_s,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
CPY(type_src=f32,type_dst=iq4_xs,ne=[256,4,4,4],permute=[0,0,0,0]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=iq4_xs,ne=[256,2,3,4],permute=[0,2,1,3]): not supported [Vulkan0]
CPY(type_src=f16,type_dst=f16,ne=[256,2,3,4],permute=[1,0,2,3]): OK
CPY(type_src=f16,type_dst=f32,ne=[256,2,3,4],permute=[1,0,2,3]): not supported [Vulkan0]
CPY(type_src=f32,type_dst=f16,ne=[256,2,3,4],permute=[1,0,2,3]): OK
CPY(type_src=f32,type_dst=f32,ne=[256,2,3,4],permute=[1,0,2,3]): OK
CONT(type=f32,ne=[10,10,10,1]): OK
CONT(type=f32,ne=[2,1,1,1]): OK
CONT(type=f32,ne=[2,1,3,5]): OK
CONT(type=f32,ne=[2,3,5,7]): OK
CONT(type=f16,ne=[2,1,1,1]): OK
CONT(type=f16,ne=[2,1,3,5]): OK
CONT(type=f16,ne=[2,3,5,7]): OK
CONT(type=bf16,ne=[2,1,1,1]): not supported [Vulkan0]
CONT(type=bf16,ne=[2,1,3,5]): not supported [Vulkan0]
CONT(type=bf16,ne=[2,3,5,7]): not supported [Vulkan0]
ADD(type=f32,ne=[1,1,8,1],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[1,1,8,1],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[1,1,8,1],nr=[1,1,1,1]): OK
ADD(type=f32,ne=[1,1,1,1],nr=[32,1,1,1]): OK
MUL(type=f32,ne=[1,1,1,1],nr=[32,1,1,1]): OK
DIV(type=f32,ne=[1,1,1,1],nr=[32,1,1,1]): OK
ADD(type=f32,ne=[1,1,320,320],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[1,1,320,320],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[1,1,320,320],nr=[1,1,1,1]): OK
ADD(type=f32,ne=[10,5,1,1],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[10,5,1,1],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[10,5,1,1],nr=[1,1,1,1]): OK
ADD(type=f32,ne=[10,5,4,1],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[10,5,4,1],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[10,5,4,1],nr=[1,1,1,1]): OK
ADD(type=f32,ne=[10,5,4,3],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[10,5,4,3],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[10,5,4,3],nr=[1,1,1,1]): OK
ADD(type=f32,ne=[10,5,4,3],nr=[2,1,1,1]): OK
MUL(type=f32,ne=[10,5,4,3],nr=[2,1,1,1]): OK
DIV(type=f32,ne=[10,5,4,3],nr=[2,1,1,1]): OK
ADD(type=f32,ne=[10,5,4,3],nr=[1,2,1,1]): OK
MUL(type=f32,ne=[10,5,4,3],nr=[1,2,1,1]): OK
DIV(type=f32,ne=[10,5,4,3],nr=[1,2,1,1]): OK
ADD(type=f32,ne=[10,5,4,3],nr=[1,1,2,1]): OK
MUL(type=f32,ne=[10,5,4,3],nr=[1,1,2,1]): OK
DIV(type=f32,ne=[10,5,4,3],nr=[1,1,2,1]): OK
ADD(type=f32,ne=[10,5,4,3],nr=[1,1,1,2]): OK
MUL(type=f32,ne=[10,5,4,3],nr=[1,1,1,2]): OK
DIV(type=f32,ne=[10,5,4,3],nr=[1,1,1,2]): OK
ADD(type=f32,ne=[10,5,4,3],nr=[1,1,2,2]): OK
MUL(type=f32,ne=[10,5,4,3],nr=[1,1,2,2]): OK
DIV(type=f32,ne=[10,5,4,3],nr=[1,1,2,2]): OK
ADD(type=f32,ne=[10,5,4,3],nr=[1,2,2,2]): OK
MUL(type=f32,ne=[10,5,4,3],nr=[1,2,2,2]): OK
DIV(type=f32,ne=[10,5,4,3],nr=[1,2,2,2]): OK
ADD(type=f32,ne=[10,5,4,3],nr=[2,2,2,2]): OK
MUL(type=f32,ne=[10,5,4,3],nr=[2,2,2,2]): OK
DIV(type=f32,ne=[10,5,4,3],nr=[2,2,2,2]): OK
ADD(type=f32,ne=[1280,1,1,1],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[1280,1,1,1],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[1280,1,1,1],nr=[1,1,1,1]): OK
ADD(type=f32,ne=[1280,1,1,1],nr=[1,16,16,1]): OK
MUL(type=f32,ne=[1280,1,1,1],nr=[1,16,16,1]): OK
DIV(type=f32,ne=[1280,1,1,1],nr=[1,16,16,1]): OK
ADD(type=f32,ne=[1280,16,16,1],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[1280,16,16,1],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[1280,16,16,1],nr=[1,1,1,1]): OK
ADD(type=f32,ne=[1280,1,1,1],nr=[1,256,1,1]): OK
MUL(type=f32,ne=[1280,1,1,1],nr=[1,256,1,1]): OK
DIV(type=f32,ne=[1280,1,1,1],nr=[1,256,1,1]): OK
ADD(type=f32,ne=[1,1,1280,1],nr=[16,16,1,1]): OK
MUL(type=f32,ne=[1,1,1280,1],nr=[16,16,1,1]): OK
DIV(type=f32,ne=[1,1,1280,1],nr=[16,16,1,1]): OK
ADD(type=f32,ne=[16,16,1280,1],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[16,16,1280,1],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[16,16,1280,1],nr=[1,1,1,1]): OK
ADD(type=f32,ne=[1,1,1920,1],nr=[16,16,1,1]): OK
MUL(type=f32,ne=[1,1,1920,1],nr=[16,16,1,1]): OK
DIV(type=f32,ne=[1,1,1920,1],nr=[16,16,1,1]): OK
ADD(type=f32,ne=[1,1,2560,1],nr=[16,16,1,1]): OK
MUL(type=f32,ne=[1,1,2560,1],nr=[16,16,1,1]): OK
DIV(type=f32,ne=[1,1,2560,1],nr=[16,16,1,1]): OK
ADD(type=f32,ne=[1,1,1280,1],nr=[32,32,1,1]): OK
MUL(type=f32,ne=[1,1,1280,1],nr=[32,32,1,1]): OK
DIV(type=f32,ne=[1,1,1280,1],nr=[32,32,1,1]): OK
ADD(type=f32,ne=[1,1,1920,1],nr=[32,32,1,1]): OK
MUL(type=f32,ne=[1,1,1920,1],nr=[32,32,1,1]): OK
DIV(type=f32,ne=[1,1,1920,1],nr=[32,32,1,1]): OK
ADD(type=f32,ne=[1,1,640,1],nr=[32,32,1,1]): OK
MUL(type=f32,ne=[1,1,640,1],nr=[32,32,1,1]): OK
DIV(type=f32,ne=[1,1,640,1],nr=[32,32,1,1]): OK
ADD(type=f32,ne=[5120,1,1,1],nr=[1,256,1,1]): OK
MUL(type=f32,ne=[5120,1,1,1],nr=[1,256,1,1]): OK
DIV(type=f32,ne=[5120,1,1,1],nr=[1,256,1,1]): OK
ADD(type=f32,ne=[640,1,1,1],nr=[1,1,1,1]): OK
MUL(type=f32,ne=[640,1,1,1],nr=[1,1,1,1]): OK
DIV(type=f32,ne=[640,1,1,1],nr=[1,1,1,1]): OK
ADD1(type=f32,ne=[10,5,4,3]): not supported [Vulkan0]
SCALE(type=f32,ne=[10,10,10,10],scale=2.000000): OK
NORM(type=f32,ne=[64,5,4,3],eps=0.000001): OK
RMS_NORM(type=f32,ne=[64,5,4,3],eps=0.000001): OK
NORM(type=f32,ne=[64,5,4,3],eps=0.000010): OK
RMS_NORM(type=f32,ne=[64,5,4,3],eps=0.000010): OK
NORM(type=f32,ne=[64,5,4,3],eps=0.001000): OK
RMS_NORM(type=f32,ne=[64,5,4,3],eps=0.001000): OK
NORM(type=f32,ne=[64,5,4,3],eps=0.100000): OK
RMS_NORM(type=f32,ne=[64,5,4,3],eps=0.100000): OK
SSM_CONV(type=f32,ne_a=[4,1536,1,1],ne_b=[4,1536,1,1]): not supported [Vulkan0]
SSM_CONV(type=f32,ne_a=[8,1536,1,1],ne_b=[4,1536,1,1]): not supported [Vulkan0]
SSM_CONV(type=f32,ne_a=[4,1536,4,1],ne_b=[4,1536,1,1]): not supported [Vulkan0]
SSM_SCAN(type=f32,d_state=16,d_inner=1024,n_seq_tokens=32,n_seqs=4): not supported [Vulkan0]
RWKV_WKV6(type=f32,head_count=32,head_size=64,n_seq_tokens=1,n_seqs=1): OK
RWKV_WKV6(type=f32,head_count=32,head_size=64,n_seq_tokens=32,n_seqs=1): OK
RWKV_WKV6(type=f32,head_count=32,head_size=64,n_seq_tokens=32,n_seqs=4): OK
RWKV_WKV6(type=f32,head_count=32,head_size=64,n_seq_tokens=128,n_seqs=4): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.661085948 > 0.000500000 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.400560787 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.681256329 > 0.000500000 FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.079188357 > 0.000500000 FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.215862548 > 0.000500000 FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.975282763 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.000175833 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.000118234 > 0.000500000 FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.999974845 > 0.000500000 FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=-0.338433 FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=2,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=-4.190618 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.000043822 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.998296223 > 0.000500000 FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=-7.237620 FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 1 (Vulkan0=nan CPU=-9.693287) FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=3,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 0 (Vulkan0=nan CPU=-1.046309) FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.042294041 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=inf CPU=9.104725 FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 1 (Vulkan0=nan CPU=9.292299) FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=1.899602 FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=4,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.000055343 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=-8.745516 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.999567162 > 0.000500000 FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=-6.530345 FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 6 (Vulkan0=nan CPU=-1.844313) FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=5,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 0 (Vulkan0=nan CPU=7.115935) FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 9 (Vulkan0=nan CPU=-1.214520) FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 1 (Vulkan0=nan CPU=6.053637) FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 1 (Vulkan0=nan CPU=-2.437598) FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.998213900 > 0.000500000 FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=6,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=-3.623495 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 5 (Vulkan0=nan CPU=-3.815799) FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=1.870180 FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.012911283 > 0.000500000 FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=inf CPU=-11.407846 FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=7,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 5 (Vulkan0=nan CPU=1.980819) FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=inf CPU=-8.924566 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 1 (Vulkan0=nan CPU=-7.957967) FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.578344182 > 0.000500000 FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.393112502 > 0.000500000 FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=8,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 0 (Vulkan0=nan CPU=2.719533) FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.587496911 > 0.000500000 FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.362449189 > 0.000500000 FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.032061575 > 0.000500000 FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.966385343 > 0.000500000 FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.981731167 > 0.000500000 FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): [MUL_MAT] NMSE = 0.733512549 > 0.000500000 FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): [MUL_MAT] NMSE = 1.013535720 > 0.000500000 FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): [MUL_MAT] NMSE = 0.803323170 > 0.000500000 FAIL
MUL_MAT(type_a=f32,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): OK
MUL_MAT(type_a=f32,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): OK
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [CPU]
MUL_MAT(type_a=f32,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [CPU]
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.125975642 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.987203368 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.934508655 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.995750783 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.020384841 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): [MUL_MAT] NMSE = 1.040987350 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): [MUL_MAT] NMSE = 0.638536524 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): [MUL_MAT] NMSE = 1.546060580 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): OK
MUL_MAT(type_a=f16,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.891958905 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.594799932 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.109942999 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.030334714 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.993919169 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): [MUL_MAT] NMSE = 0.900133733 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): [MUL_MAT] NMSE = 0.768604431 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): [MUL_MAT] NMSE = 1.214501285 > 0.000500000 FAIL
MUL_MAT(type_a=f16,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): OK
MUL_MAT(type_a=f16,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q8_0,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.712161658 > 0.000500000 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.352495666 > 0.000500000 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.087372846 > 0.000500000 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.900193793 > 0.000500000 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.012933385 > 0.000500000 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.999901452 > 0.000500000 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=1.831558 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=0.292956 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=-3.602249 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=inf CPU=5.673249 FAIL
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_0,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.029693619 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.726815056 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.964693936 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.015064556 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.009439662 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=inf CPU=-1.808779 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NaN at index 1 (Vulkan0=nan CPU=-0.240183) FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=inf CPU=4.865494 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=-3.224285 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): [MUL_MAT] inf mismatch: Vulkan0=-inf CPU=2.944630 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_K,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f32,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[10,1],nr=[2,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,1],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[1,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[10,10],nr=[2,2],per=[0,1,2,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=1,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=8,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,2,1,3]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,1,3,2]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=iq2_xxs,type_b=f16,m=16,n=16,k=256,bs=[2,3],nr=[1,1],per=[0,3,2,1]): not supported [Vulkan0] not supported [CPU]
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=32,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.818720397 > 0.000500000 FAIL
MUL_MAT(type_a=q4_1,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.062183374 > 0.000500000 FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=1,k=32,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.015355321 > 0.000500000 FAIL
MUL_MAT(type_a=q5_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.831845315 > 0.000500000 FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=1,k=32,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 1.500386433 > 0.000500000 FAIL
MUL_MAT(type_a=q5_1,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 0.436472787 > 0.000500000 FAIL
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=32,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q2_K,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q3_K,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q5_K,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=q6_K,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=iq2_xs,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq2_s,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq3_xxs,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq1_s,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq1_m,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=1,k=32,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 28.647686329 > 0.000500000 FAIL
MUL_MAT(type_a=iq4_nl,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): [MUL_MAT] NMSE = 3.591837032 > 0.000500000 FAIL
MUL_MAT(type_a=iq3_s,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=iq4_xs,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=bf16,type_b=f32,m=16,n=1,k=1,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=bf16,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): not supported [Vulkan0]
MUL_MAT(type_a=f16,type_b=f32,m=64,n=2,k=128,bs=[8,1],nr=[1,1],per=[0,1,2,3]): sentinel mismatch: sent_3 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=83,n=2,k=128,bs=[8,1],nr=[4,1],per=[0,1,2,3]): sentinel mismatch: sent_3 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=64,n=2,k=64,bs=[8,1],nr=[4,1],per=[0,1,2,3]): sentinel mismatch: sent_3 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=83,n=2,k=64,bs=[8,1],nr=[4,1],per=[0,1,2,3]): sentinel mismatch: sent_3 FAIL
MUL_MAT(type_a=f16,type_b=f32,m=64,n=45,k=128,bs=[8,1],nr=[4,1],per=[0,1,2,3]): OK
MUL_MAT(type_a=f16,type_b=f32,m=128,n=45,k=64,bs=[8,1],nr=[4,1],per=[0,1,2,3]): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.197424061 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.873860072 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.030073208 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.149706361 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.040022251 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.045269493 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.142111286 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.918569203 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.916621015 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.865151598 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.019119892 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.104822114 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f32,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.088325681 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.919258121 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.946592628 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.002545604 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.965189266 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.007085018 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.001002041 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.018313344 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.024731871 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.999255277 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.963149756 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.126771018 > 0.000500000 FAIL
MUL_MAT_ID(type_a=f16,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.052990027 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=inf CPU=9.105417 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.800522499 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=1.494531 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.924356976 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 32 (Vulkan0=nan CPU=4.635605) FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.842441881 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 31 (Vulkan0=nan CPU=2.030237) FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.060519358 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=1.034023 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.006181180 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=inf CPU=-10.940933 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.888523969 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=inf CPU=-7.898067 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.181912295 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=-3.736725 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.022558233 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 11 (Vulkan0=nan CPU=4.183511) FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.053310525 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 26 (Vulkan0=nan CPU=-5.770243) FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.046320924 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 1 (Vulkan0=nan CPU=1.435730) FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.043610654 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_0,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 31 (Vulkan0=nan CPU=1.138597) FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.869594980 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 35 (Vulkan0=nan CPU=-6.290529) FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.921246039 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=inf CPU=-2.897036 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.999138315 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=-6.349037 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.123311991 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 1 (Vulkan0=nan CPU=-1.407337) FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.037888577 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=inf CPU=-0.900343 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.973479012 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=0.114285 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.828011690 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 13 (Vulkan0=nan CPU=1.680809) FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.245510491 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=1.566675 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.942148000 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=-0.005307 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.926056524 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=inf CPU=7.212160 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.979712349 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=1.407257 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.034745032 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=0.210060 FAIL
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q4_K,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=1,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=1,b=1,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=2,b=1,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=4,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=4,n_used=4,b=1,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=1,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=1,b=1,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=2,b=1,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=4,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xxs,type_b=f32,n_mats=8,n_used=4,b=1,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 0.865536822 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q4_1,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): [MUL_MAT_ID] NaN at index 4 (Vulkan0=nan CPU=-6.271731) FAIL
MUL_MAT_ID(type_a=q5_0,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.144338201 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q5_0,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=inf CPU=-4.140865 FAIL
MUL_MAT_ID(type_a=q5_1,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.001393934 > 0.000500000 FAIL
MUL_MAT_ID(type_a=q5_1,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=-2.371538 FAIL
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q8_0,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q2_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q2_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q3_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q3_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q5_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q5_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=q6_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): OK
MUL_MAT_ID(type_a=q6_K,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): OK
MUL_MAT_ID(type_a=iq2_xs,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_xs,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_s,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq2_s,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq3_xxs,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq3_xxs,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq1_s,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq1_s,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq1_m,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq1_m,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq4_nl,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): [MUL_MAT_ID] NMSE = 1.078790674 > 0.000500000 FAIL
MUL_MAT_ID(type_a=iq4_nl,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): [MUL_MAT_ID] inf mismatch: Vulkan0=-inf CPU=-1.357063 FAIL
MUL_MAT_ID(type_a=iq3_s,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq3_s,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq4_xs,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=iq4_xs,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=bf16,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=1,k=256): not supported [Vulkan0]
MUL_MAT_ID(type_a=bf16,type_b=f32,n_mats=4,n_used=2,b=0,m=512,n=32,k=256): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f32,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=f16,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q8_0,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_0,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_1,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=q4_K,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f32,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=1,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=1,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=1,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=16,k=16,bs=[1,1],trans_b=1): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=16,k=16,bs=[10,1],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
OUT_PROD(type_a=iq2_xxs,type_b=f16,m=256,n=16,k=16,bs=[10,10],trans_b=0): not supported [Vulkan0] not supported [CPU]
SQR(type=f32,ne=[10,5,4,3]): OK
SQRT(type=f32,ne=[10,3,3,2]): not supported [Vulkan0]
LOG(type=f32,ne=[10,5,4,3]): not supported [Vulkan0]
SIN(type=f32,ne=[10,2,2,2]): OK
COS(type=f32,ne=[10,2,2,2]): OK
CLAMP(type=f32,ne=[10,5,4,3],min=-0.500000,max=0.500000): OK
DIAG_MASK_INF(type=f32,ne=[10,10,1,1],n_past=5): sentinel mismatch: sent_2 FAIL
DIAG_MASK_INF(type=f32,ne=[10,10,3,1],n_past=5): sentinel mismatch: sent_2 FAIL
DIAG_MASK_INF(type=f32,ne=[10,10,3,2],n_past=5): sentinel mismatch: sent_2 FAIL
SOFT_MAX(type=f32,ne=[16,16,1,1],mask=0,scale=1.000000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[15,15,1,1],mask=0,scale=1.000000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[16,1024,1,1],mask=0,scale=1.000000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[15,1023,1,1],mask=0,scale=1.000000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[1024,16,1,1],mask=0,scale=1.000000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.992216879 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,15,1,1],mask=0,scale=1.000000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.989407522 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1024,1024,1,1],mask=0,scale=1.000000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.970080866 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,1023,1,1],mask=0,scale=1.000000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.998032321 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[16,16,1,1],mask=0,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[15,15,1,1],mask=0,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[16,1024,1,1],mask=0,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[15,1023,1,1],mask=0,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[1024,16,1,1],mask=0,scale=0.100000,max_bias=0.000000): [SOFT_MAX] NMSE = 1.000598591 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,15,1,1],mask=0,scale=0.100000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.995278639 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1024,1024,1,1],mask=0,scale=0.100000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.971442316 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,1023,1,1],mask=0,scale=0.100000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.995985609 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[16,16,1,1],mask=1,scale=1.000000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[15,15,1,1],mask=1,scale=1.000000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[16,1024,1,1],mask=1,scale=1.000000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[15,1023,1,1],mask=1,scale=1.000000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[1024,16,1,1],mask=1,scale=1.000000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.957018904 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,15,1,1],mask=1,scale=1.000000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.997755793 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1024,1024,1,1],mask=1,scale=1.000000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.971153020 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,1023,1,1],mask=1,scale=1.000000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.987904775 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[16,16,1,1],mask=1,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[15,15,1,1],mask=1,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[16,1024,1,1],mask=1,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[15,1023,1,1],mask=1,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[1024,16,1,1],mask=1,scale=0.100000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.993326148 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,15,1,1],mask=1,scale=0.100000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.968625617 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1024,1024,1,1],mask=1,scale=0.100000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.964081629 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,1023,1,1],mask=1,scale=0.100000,max_bias=0.000000): [SOFT_MAX] NMSE = 0.985822218 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[16,16,1,1],mask=1,scale=1.000000,max_bias=8.000000): OK
SOFT_MAX(type=f32,ne=[15,15,1,1],mask=1,scale=1.000000,max_bias=8.000000): OK
SOFT_MAX(type=f32,ne=[16,1024,1,1],mask=1,scale=1.000000,max_bias=8.000000): OK
SOFT_MAX(type=f32,ne=[15,1023,1,1],mask=1,scale=1.000000,max_bias=8.000000): OK
SOFT_MAX(type=f32,ne=[1024,16,1,1],mask=1,scale=1.000000,max_bias=8.000000): [SOFT_MAX] NMSE = 1.017260972 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,15,1,1],mask=1,scale=1.000000,max_bias=8.000000): [SOFT_MAX] NMSE = 1.015725154 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1024,1024,1,1],mask=1,scale=1.000000,max_bias=8.000000): [SOFT_MAX] NMSE = 0.964184743 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,1023,1,1],mask=1,scale=1.000000,max_bias=8.000000): [SOFT_MAX] NMSE = 0.983833291 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[16,16,1,1],mask=1,scale=0.100000,max_bias=8.000000): OK
SOFT_MAX(type=f32,ne=[15,15,1,1],mask=1,scale=0.100000,max_bias=8.000000): OK
SOFT_MAX(type=f32,ne=[16,1024,1,1],mask=1,scale=0.100000,max_bias=8.000000): OK
SOFT_MAX(type=f32,ne=[15,1023,1,1],mask=1,scale=0.100000,max_bias=8.000000): OK
SOFT_MAX(type=f32,ne=[1024,16,1,1],mask=1,scale=0.100000,max_bias=8.000000): [SOFT_MAX] NMSE = 0.997570592 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,15,1,1],mask=1,scale=0.100000,max_bias=8.000000): [SOFT_MAX] NMSE = 0.901592004 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1024,1024,1,1],mask=1,scale=0.100000,max_bias=8.000000): [SOFT_MAX] NMSE = 0.965908813 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[1023,1023,1,1],mask=1,scale=0.100000,max_bias=8.000000): [SOFT_MAX] NMSE = 0.989022277 > 0.000001000 FAIL
SOFT_MAX(type=f32,ne=[16,2,32,1],mask=1,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[16,2,32,1],mask=0,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[32,2,32,1],mask=1,scale=0.100000,max_bias=0.000000): OK
SOFT_MAX(type=f32,ne=[32,2,32,1],mask=1,scale=0.100000,max_bias=8.000000): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,40,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,52,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,64,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,1,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,71,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,8,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[80,32,2,1],n_dims=20,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[80,32,2,1],n_dims=32,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,12,2,1],n_dims=128,mode=8,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,28,2,1],n_dims=128,mode=8,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): not supported [Vulkan0]
ROPE(type=f32,ne_a=[80,16,2,1],n_dims=80,mode=24,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,40,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,52,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,64,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,1,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,71,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,8,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[80,32,2,1],n_dims=20,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[80,32,2,1],n_dims=32,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,12,2,1],n_dims=128,mode=8,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,28,2,1],n_dims=128,mode=8,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): not supported [Vulkan0]
ROPE(type=f32,ne_a=[80,16,2,1],n_dims=80,mode=24,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,40,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,52,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,64,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,1,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,71,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,8,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[80,32,2,1],n_dims=20,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[80,32,2,1],n_dims=32,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,12,2,1],n_dims=128,mode=8,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,28,2,1],n_dims=128,mode=8,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): not supported [Vulkan0]
ROPE(type=f16,ne_a=[80,16,2,1],n_dims=80,mode=24,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,40,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,52,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,64,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,1,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,71,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,8,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[80,32,2,1],n_dims=20,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[80,32,2,1],n_dims=32,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,12,2,1],n_dims=128,mode=8,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,28,2,1],n_dims=128,mode=8,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): not supported [Vulkan0]
ROPE(type=f16,ne_a=[80,16,2,1],n_dims=80,mode=24,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=0): OK
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=0): OK
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=0): OK
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=0): OK
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.000000,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.000000,ef=0.746500,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.000000,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f32,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=1): not supported [Vulkan0]
ROPE(type=f16,ne_a=[64,128,2,1],n_dims=64,mode=2,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=1): not supported [Vulkan0]
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=0,v=0): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=0,v=0): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=1,v=0): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=1,v=0): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=2,v=0): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=2,v=0): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=3,v=0): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=3,v=0): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=0,v=1): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=0,v=1): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=1,v=1): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=1,v=1): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=2,v=1): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=2,v=1): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=3,v=1): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=3,v=1): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=0,v=2): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=0,v=2): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=1,v=2): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=1,v=2): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=2,v=2): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=2,v=2): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=3,v=2): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=3,v=2): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=0,v=3): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=0,v=3): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=1,v=3): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=1,v=3): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=2,v=3): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=2,v=3): OK
CONCAT(type=f32,ne_a=[11,12,13,14],ne_b_d=7,dim=3,v=3): OK
CONCAT(type=i32,ne_a=[11,12,13,14],ne_b_d=7,dim=3,v=3): OK
ARGSORT(type=f32,ne=[8,1,1,1],order=0): OK
ARGSORT(type=f32,ne=[16,10,10,10],order=0): OK
ARGSORT(type=f32,ne=[60,10,10,10],order=0): OK
ARGSORT(type=f32,ne=[8,1,1,1],order=1): OK
ARGSORT(type=f32,ne=[16,10,10,10],order=1): OK
ARGSORT(type=f32,ne=[60,10,10,10],order=1): OK
SUM(type=f32,ne=[10,5,4,3]): not supported [Vulkan0]
SUM_ROWS(type=f32,ne=[10,5,4,3]): OK
MEAN(type=f32,ne=[10,5,4,3]): not supported [Vulkan0]
UPSCALE(type=f32,ne=[512,512,3,1],scale_factor=2,transpose=0): OK
UPSCALE(type=f32,ne=[512,512,3,1],scale_factor=2,transpose=1): OK
UPSCALE(type=f32,ne=[2,5,7,11],ne_tgt=[5,7,11,13]): OK
GROUP_NORM(type=f32,ne=[64,64,320,1],num_groups=32): OK
GROUP_NORM(type=f32,ne=[9,9,1280,1],num_groups=32): OK
ACC(type=f32,ne_a=[256,17,1,1],ne_b=[256,16,1,1]): OK
PAD(type=f32,ne_a=[512,512,1,1],pad_0=1,pad_1=1): OK
PAD_REFLECT_1D(type=f32,ne_a=[512,34,2,1],pad_0=10,pad_1=9): not supported [Vulkan0]
ARANGE(type=f32,start=0.000000,stop=10.000000,step=1.000000): not supported [Vulkan0]
TIMESTEP_EMBEDDING(type=f32,ne_a=[2,1,1,1],dim=320,max_period=10000): OK
LEAKY_RELU(type=f32,ne_a=[10,5,4,3],negative_slope=0.100000): OK
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=64,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=80,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=128,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=10.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=1,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=1,max_bias=8.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=512,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=1,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=3,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=32,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=f16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=bf16): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q8_0): not supported [Vulkan0]
FLASH_ATTN_EXT(hs=256,nh=32,kv=1024,nb=35,mask=0,max_bias=0.000000,logit_softcap=0.000000,type_KV=q4_0): not supported [Vulkan0]
CROSS_ENTROPY_LOSS(type=f32,ne=[10,5,4,3]): not supported [Vulkan0]
OPT_STEP_ADAMW(type=f32,ne=[10,5,4,3]): not supported [Vulkan0]
2012/2216 tests passed
Backend Vulkan0: FAIL

@soerenkampschroer
Copy link
Author

I reinstalled MoltenVk and while it still passes and fails the same tests and overall still fails, it does now work.

There's still one issue though. Every other query, it descends into madness after answering the first query. It will either hallucinate follow-up questions or repeat phrases, words or letters.

Either way, when it works, it's insanely fast. I'm up from 7 token/s on the CPU to 45 token/s using vulkan.

This is all using gemma-2-9b-it.Q5_K_M.gguf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working build Compilation issues Vulkan Issues specific to the Vulkan backend
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants