@zhangjian29
Contributor

Description

This PR fixes a GCC 14.2 internal compiler error (ICE) that occurs when -DDNNL_ARCH_OPT_FLAGS="-march=rv64gcv" is specified to disable the zvfh extension.

I attempted to disable the zvfh extension by adding -DDNNL_ARCH_OPT_FLAGS="-march=rv64gcv" to the CMake options. However, I found two bugs:

  • In platform.cmake, the test that sets DNNL_RISCV_USE_ZVFH_INTRINSICS to true passes before DNNL_ARCH_OPT_FLAGS overrides CMAKE_CCXX_FLAGS.
  • rvv_nhwc_pooling.hpp uses the DNNL_RISCV_USE_ZVFH_INTRINSICS flag for dispatch, but this flag can be undefined.

These issues together cause a GCC ICE during compilation.

See Log
zhangjian@localhost:~/oneDNN/build_rvv $ cmake .. -DDNNL_TARGET_ARCH="RV64" -DDNNL_ARCH_OPT_FLAGS="-march=rv64gcv"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: TRUE
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: TRUE
-- Using RV64 march flag: -march=rv64gcv_zvfh
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.8s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rvv
zhangjian@localhost:~/oneDNN/build_rvv $ make -j8
...
/home/zhangjian/oneDNN/src/cpu/rv64/rvv_1x1_convolution.cpp: In member function 'dnnl::impl::status_t dnnl::impl::cpu::rv64::rvv_1x1_convolution_fwd_t::execute_forward_ncsp(const dnnl::impl::exec_ctx_t&) const':
/home/zhangjian/oneDNN/src/cpu/rv64/rvv_1x1_convolution.cpp:116:17: warning: unused variable 'wei_oc_stride' [-Wunused-variable]
  116 |     const dim_t wei_oc_stride = ic;
      |                 ^~~~~~~~~~~~~
/home/zhangjian/oneDNN/src/cpu/rv64/rvv_1x1_convolution.cpp: In member function 'dnnl::impl::status_t dnnl::impl::cpu::rv64::rvv_1x1_convolution_fwd_t::execute_forward_nspc(const dnnl::impl::exec_ctx_t&) const':
/home/zhangjian/oneDNN/src/cpu/rv64/rvv_1x1_convolution.cpp:272:17: warning: unused variable 'alpha' [-Wunused-variable]
  272 |     const float alpha = 1.0f;
      |                 ^~~~~
during RTL pass: expand
In lambda function,
    inlined from 'constexpr _Res std::__invoke_impl(__invoke_other, _Fn&&, _Args&& ...) [with _Res = void; _Fn = dnnl::impl::cpu::rv64::{anonymous}::AvgPoolingExcludePadding_f16(const void*, void*, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, const dnnl::impl::cpu::rv64::rvv_postops_t&)::<lambda(dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t)>&; _Args = {long int, long int, long int, long int}]' at /home/zhangjian/tools/tool-gcc14.2/riscv64-unknown-linux-gnu/include/c++/14.2.0/bits/invoke.h:61:36,
    inlined from 'std::__enable_if_t<((bool)std::is_void< <template-parameter-1-1> >::value), _Res> std::__invoke_r(_Callable&&, _Args&& ...) [with _Res = void; _Callable = dnnl::impl::cpu::rv64::{anonymous}::AvgPoolingExcludePadding_f16(const void*, void*, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, const dnnl::impl::cpu::rv64::rvv_postops_t&)::<lambda(dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t)>&; _Args = {long int, long int, long int, long int}]' at /home/zhangjian/tools/tool-gcc14.2/riscv64-unknown-linux-gnu/include/c++/14.2.0/bits/invoke.h:150:33,
    inlined from 'static _Res std::_Function_handler<_Res(_ArgTypes ...), _Functor>::_M_invoke(const std::_Any_data&, _ArgTypes&& ...) [with _Res = void; _Functor = dnnl::impl::cpu::rv64::{anonymous}::AvgPoolingExcludePadding_f16(const void*, void*, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, const dnnl::impl::cpu::rv64::rvv_postops_t&)::<lambda(dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t, dnnl::impl::dim_t)>; _ArgTypes = {long int, long int, long int, long int}]' at /home/zhangjian/tools/tool-gcc14.2/riscv64-unknown-linux-gnu/include/c++/14.2.0/bits/std_function.h:290:30:
/home/zhangjian/oneDNN/src/cpu/rv64/rvv_nhwc_pooling.cpp:436:53: internal compiler error: in emit_move_insn, at expr.cc:4615
  436 |                         = __riscv_vfncvt_f_f_w_f16m1(vzero_f32, vl);
      |                           ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~

Bug Fix

We fix these bugs by:

  • Adding detection logic for DNNL_ARCH_OPT_FLAGS to correctly determine the CAN_COMPILE_ZVFH_INTRINSICS flag.
  • Fixing the zvfh runtime check by using the cpu_isa_traits for RV64.
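
As a rough illustration of the second fix, the sketch below moves the f16 decision from a compile-time macro to a runtime query. All names in it (`isa_t`, `set_detected_isa`, `mayiuse`, `select_pooling_kernel`, and the kernel labels) are hypothetical stand-ins and do not reproduce oneDNN's cpu_isa_traits interface; hardware detection is mocked with two booleans.

```cpp
#include <cassert>
#include <string>

// Hypothetical ISA identifiers; the real oneDNN enum is not shown in this PR.
enum class isa_t { v, zvfh };
enum class data_type_t { f32, f16 };

namespace {
// Mocked detection results; a real build would query the CPU at runtime.
bool g_has_v = true;
bool g_has_zvfh = false;
} // namespace

// Test hook that overrides the mocked detection results.
void set_detected_isa(bool has_v, bool has_zvfh) {
    g_has_v = has_v;
    g_has_zvfh = has_zvfh;
}

// Runtime query: always defined, unlike a macro that may be missing.
bool mayiuse(isa_t isa) {
    switch (isa) {
        case isa_t::v: return g_has_v;
        case isa_t::zvfh: return g_has_v && g_has_zvfh;
    }
    return false;
}

// Returns a label for the kernel that would be dispatched, or
// "unimplemented" when this implementation must not be used.
std::string select_pooling_kernel(data_type_t dt) {
    if (!mayiuse(isa_t::v)) return "unimplemented";
    if (dt == data_type_t::f16)
        return mayiuse(isa_t::zvfh) ? "rvv_f16" : "unimplemented";
    assert(dt == data_type_t::f32); // only f32 remains at this point
    return "rvv_f32";
}
```

With V present but Zvfh absent, an f16 request is reported as unimplemented, so the caller can move on to another implementation.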

Tests

After this PR:

  • The DNNL_RISCV_USE_ZVFH_INTRINSICS flag is set correctly and there are no compilation errors.
See log
zhangjian@localhost:~/oneDNN/build_rvv $ cmake .. -DDNNL_TARGET_ARCH="RV64" -DDNNL_ARCH_OPT_FLAGS="-march=rv64gcv"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: FALSE
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: FALSE
-- Using RV64 march flag: -march=rv64gcv
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.8s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rvv
  • f16 tests are skipped on the rv64gcv platform as expected.
See log
[root@localhost build_rvv]# LD_LIBRARY_PATH=$(pwd)/src:$LD_LIBRARY_PATH ./tests/benchdnn/benchdnn --pool --batch=tests/benchdnn/inputs/pool/test_pool_float16 
0:SKIPPED (Data type not supported) (128 ms) __REPRO: --pool --dt=f16:f16 ic64iw32ow16kw3sw2pw0
1:SKIPPED (Data type not supported) (17 ms) __REPRO: --pool --dt=f16:f16 --alg=avg_np ic64iw32ow16kw3sw2pw0
2:SKIPPED (Data type not supported) (4 ms) __REPRO: --pool --dt=f16:f16 --alg=avg_p ic64iw32ow16kw3sw2pw0
3:SKIPPED (Data type not supported) (4 ms) __REPRO: --pool --dt=f16:f16 --tag=axb ic64iw32ow16kw3sw2pw0
4:SKIPPED (Data type not supported) (4 ms) __REPRO: --pool --dt=f16:f16 --tag=axb --alg=avg_np ic64iw32ow16kw3sw2pw0
5:SKIPPED (Data type not supported) (4 ms) __REPRO: --pool --dt=f16:f16 --tag=axb --alg=avg_p ic64iw32ow16kw3sw2pw0
6:SKIPPED (Data type not supported) (5 ms) __REPRO: --pool --dir=BWD_D --dt=f16:f16 ic64iw32ow16kw3sw2pw0
7:SKIPPED (Data type not supported) (4 ms) __REPRO: --pool --dir=BWD_D --dt=f16:f16 --alg=avg_np ic64iw32ow16kw3sw2pw0
8:SKIPPED (Data type not supported) (4 ms) __REPRO: --pool --dir=BWD_D --dt=f16:f16 --alg=avg_p ic64iw32ow16kw3sw2pw0
9:SKIPPED (Data type not supported) (4 ms) __REPRO: --pool --dir=BWD_D --dt=f16:f16 --tag=axb ic64iw32ow16kw3sw2pw0
10:SKIPPED (Data type not supported) (5 ms) __REPRO: --pool --dir=BWD_D --dt=f16:f16 --tag=axb --alg=avg_np ic64iw32ow16kw3sw2pw0
...
tests:4014 passed:0 skipped:4014 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 21.54s; create_pd: 0.00s (0%); create_prim: 0.00s (0%); fill: 0.00s (0%); execute: 0.00s (0%); compute_ref: 0.00s (0%); compare: 0.00s (0%);

@zhangfeiv0
Contributor

If the user sets DNNL_ARCH_OPT_FLAGS to -march=rv64gc, should we also disable the v extension?

If the user does not provide compiler flags, we detect extension support via CMake; however, if the user explicitly specifies flags, we should compile using those specified flags. What do you think about this approach?

@zhangjian29
Contributor Author

> If the user sets DNNL_ARCH_OPT_FLAGS to -march=rv64gc, should we also disable the v extension?

Thank you for pointing this out, @zhangfeiv0. Updated in the new commit. Now we can:

  • Enable both v and zvfh if the user doesn't specify any flags and the compiler supports them.
  • Disable both v and zvfh if the user specifies the rv64gc flag.
  • Enable v only if the user specifies the rv64gcv flag.
See cmake logs:
zhangjian@localhost:~/oneDNN/build_rv64 (main) $ rm -rf * && cmake .. -DDNNL_TARGET_ARCH="RV64"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: 1
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: 1
-- Using RV64 march flag: -march=rv64gcv_zvfh
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.8s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rv64
zhangjian@localhost:~/oneDNN/build_rv64 (main) $ rm -rf * && cmake .. -DDNNL_TARGET_ARCH="RV64" -DDNNL_ARCH_OPT_FLAGS="-march=rv64gc"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: FALSE
-- Can compile Zvfh Intrinsics: FALSE
-- DNNL_RISCV_USE_RVV_INTRINSICS: FALSE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: FALSE
-- Using RV64 march flag: -march=rv64gc
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.8s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rv64
zhangjian@localhost:~/oneDNN/build_rv64 (main) $ rm -rf * && cmake .. -DDNNL_TARGET_ARCH="RV64" -DDNNL_ARCH_OPT_FLAGS="-march=rv64gcv"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: FALSE
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: FALSE
-- Using RV64 march flag: -march=rv64gcv
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.8s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rv64
zhangjian@localhost:~/oneDNN/build_rv64 (main) $ rm -rf * && cmake .. -DDNNL_TARGET_ARCH="RV64" -DDNNL_ARCH_OPT_FLAGS="-march=rv64gcv_zvfh"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: 1
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: 1
-- Using RV64 march flag: -march=rv64gcv_zvfh
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.8s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rv64

> If the user does not provide compiler flags, we detect extension support via CMake; however, if the user explicitly specifies flags, we should compile using those specified flags. What do you think about this approach?

I think detection is still required even when the user specifies custom flags. This PR addresses the case where a user wants fewer extensions than the compiler's default capabilities by specifying DNNL_ARCH_OPT_FLAGS.
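
The three cases listed earlier in this comment amount to a small decision table keyed on the -march string. The PR's actual CMake logic is not reproduced here; the following is only a C++ sketch of that mapping, with `rv64_ext_t` and `parse_march` as hypothetical names. It assumes the ISA string has the shape seen in the logs (`rv64` followed by single-letter extensions, then underscore-separated multi-letter ones).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Extensions requested by a -march string (hypothetical helper type).
struct rv64_ext_t {
    bool v = false;
    bool zvfh = false;
};

// Returns false when no "-march=rv64..." token is present, meaning the
// caller should fall back to whatever the compiler supports by default.
bool parse_march(const std::string &flags, rv64_ext_t &ext) {
    ext = rv64_ext_t();
    const std::string key = "-march=rv64";
    const std::size_t pos = flags.find(key);
    if (pos == std::string::npos) return false;

    // The ISA string runs from the end of the key to the next space.
    const std::size_t begin = pos + key.size();
    std::size_t end = flags.find(' ', begin);
    if (end == std::string::npos) end = flags.size();
    assert(end >= begin);
    const std::string isa = flags.substr(begin, end - begin); // e.g. "gcv_zvfh"

    // Single-letter extensions precede the first underscore.
    const std::size_t us = isa.find('_');
    ext.v = isa.substr(0, us).find('v') != std::string::npos;

    // Multi-letter extensions are underscore-separated tokens.
    std::size_t start = us;
    while (start != std::string::npos) {
        const std::size_t next = isa.find('_', start + 1);
        const std::size_t len = (next == std::string::npos)
                ? std::string::npos
                : next - start - 1;
        if (isa.substr(start + 1, len) == "zvfh") ext.zvfh = true;
        start = next;
    }
    return true;
}
```

The sketch deliberately ignores versioned names, other vector subsets, and anything beyond the strings used in this thread, so it should be read as an illustration of the cases above rather than a general ISA-string parser.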

@zhangjian29 zhangjian29 requested a review from vpirogov December 18, 2025 06:04
Contributor

@dzarukin dzarukin left a comment

Could you please elaborate on why the CMake changes are required and what exactly RV64_MARCH_FLAG is for? Dealing with compiler options in CMake doesn't look like a promising path to me, as it's extremely hard to keep all existing and future options in harmony across oneDNN versions, especially when it comes to inspecting a string value.

I'd assume that changes to the pooling file should be enough to address the issue.

@zhangjian29
Contributor Author

Thank you for the feedback @dzarukin.

Clarify why CMake changes are necessary

You're right that changing rvv_nhwc_pooling.hpp to use mayiuse(zvfh) instead of the compile-time macro fixes the immediate GCC ICE.

However, the purpose of the CMake changes is to support the documented DNNL_ARCH_OPT_FLAGS behavior for RV64, where a user wants fewer extensions than the compiler's default capabilities.

The real issue is that CMake detection happens before DNNL_ARCH_OPT_FLAGS is applied, creating a mismatch between:

  • What CMake thinks the compiler supports (based on early detection)
  • What the compiler actually gets instructed to compile for (after user flags are applied)

Without the CMake fix:

  • No GCC ICE (pooling fix handles this)
  • DNNL_ARCH_OPT_FLAGS doesn't work as documented
  • Users cannot control which RISC-V extensions to enable/disable at build time

About string inspection concerns

I totally understand your concern about string parsing being fragile. Let me propose a cleaner alternative:

Alternative Approach: Re-test compiler capabilities after applying user flags

Instead of parsing the -march string, we can re-run the compiler capability tests after applying DNNL_ARCH_OPT_FLAGS:

if (DNNL_TARGET_ARCH STREQUAL "RV64")
    # Pick the flags for the capability tests: the user's flags if given,
    # otherwise the default RVV baseline.
    if (DNNL_ARCH_OPT_FLAGS STREQUAL "HostOpts")
        set(ARCH_SIMD_TEST_FLAGS "-march=rv64gcv")
    else()
        message(STATUS "Testing Arch Opt Flags for RV64: ${DNNL_ARCH_OPT_FLAGS}")
        set(ARCH_SIMD_TEST_FLAGS "${DNNL_ARCH_OPT_FLAGS}")
    endif()
    # Check if the RVV Intrinsics can be compiled with the current toolchain and flags
    set(CMAKE_REQUIRED_FLAGS_SAVE ${CMAKE_REQUIRED_FLAGS})
    set(CMAKE_REQUIRED_FLAGS "${ARCH_SIMD_TEST_FLAGS}")
    include(CheckCXXSourceCompiles)
    check_cxx_source_compiles("#if !defined(__riscv) || !defined(__riscv_v)
                               #error \"RISC-V or vector extension(RVV) is not supported by the compiler\"
                               #endif

                               #if defined(__riscv_v_intrinsic) && __riscv_v_intrinsic < 12000
                               #error \"RISC-V intrinsics v0.12 or higher is required\"
                               #endif

                               #include <riscv_vector.h>
                               int main() {
                                return 0;
                               }"
                               CAN_COMPILE_RVV_INTRINSICS
    )
See logs:
zhangjian@localhost:~/oneDNN/build_rvv (main) $ rm -rf * && cmake .. -DDNNL_TARGET_ARCH="RV64" 
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: TRUE
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: TRUE
-- Using RV64 march flag: -march=rv64gcv_zvfh
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.8s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rvv
zhangjian@localhost:~/oneDNN/build_rvv (main) $ rm -rf * && cmake .. -DDNNL_TARGET_ARCH="RV64" -DDNNL_ARCH_OPT_FLAGS="-march=rv64gc"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Testing Arch Opt Flags for RV64: -march=rv64gc
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Failed
-- Can compile RVV Intrinsics: FALSE
-- Can compile Zvfh Intrinsics: FALSE
-- DNNL_RISCV_USE_RVV_INTRINSICS: FALSE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: FALSE
-- Using RV64 march flag: -march=rv64gc
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.4s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rvv
zhangjian@localhost:~/oneDNN/build_rvv (main) $ rm -rf * && cmake .. -DDNNL_TARGET_ARCH="RV64" -DDNNL_ARCH_OPT_FLAGS="-march=rv64gcv"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Testing Arch Opt Flags for RV64: -march=rv64gcv
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Failed
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: FALSE
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: FALSE
-- Using RV64 march flag: -march=rv64gcv
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.7s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rvv
zhangjian@localhost:~/oneDNN/build_rvv (main) $ rm -rf * && cmake .. -DDNNL_TARGET_ARCH="RV64" -DDNNL_ARCH_OPT_FLAGS="-march=rv64gcv_zvfh"
-- CMAKE_BUILD_TYPE is unset, defaulting to Release
-- The C compiler identification is GNU 14.2.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/zhangjian/tools/tool-gcc14.2/bin/riscv64-unknown-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- DNNL_TARGET_ARCH: RV64
-- DNNL_LIBRARY_NAME: dnnl
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Testing Arch Opt Flags for RV64: -march=rv64gcv_zvfh
-- Performing Test CAN_COMPILE_RVV_INTRINSICS
-- Performing Test CAN_COMPILE_RVV_INTRINSICS - Success
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS
-- Performing Test CAN_COMPILE_ZVFH_INTRINSICS - Success
-- Can compile RVV Intrinsics: TRUE
-- Can compile Zvfh Intrinsics: TRUE
-- DNNL_RISCV_USE_RVV_INTRINSICS: TRUE
-- DNNL_RISCV_USE_ZVFH_INTRINSICS: TRUE
-- Using RV64 march flag: -march=rv64gcv_zvfh
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- Could NOT find Doxyrest (missing: DOXYREST_EXECUTABLE) 
-- Found Python: /usr/bin/python3.11 (found suitable version "3.11.7", minimum required is "3.7") found components: Interpreter 
-- Could NOT find Sphinx (missing: SPHINX_EXECUTABLE) 
-- Found Git: /usr/bin/git (found version "2.43.0") 
-- Enabled testing coverage: CI
-- Enabled workload: TRAINING
-- Enabled primitives: ALL
-- Enabled primitive CPU ISA: ALL
-- Enabled primitive GPU ISA: ALL
-- Enabled GeMM kernels ISA: ALL
-- Primitive cache is enabled
-- Graph component is enabled
-- Configuring done (1.8s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zhangjian/oneDNN/build_rvv

With this approach, DNNL_ARCH_OPT_FLAGS can be handled correctly for RV64 without string parsing. What do you think of it? Or should I drop all CMake changes in this PR?
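
The re-test approach works because the compiler itself reports, through predefined macros, which extensions the given -march enables. A minimal sketch of that idea follows. `__riscv` and `__riscv_v` appear in the test source above; `__riscv_zvfh` is assumed here to follow the same `__riscv_<ext>` naming, and the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// True when this translation unit is compiled with the V extension enabled.
constexpr bool compiled_with_rvv() {
#if defined(__riscv) && defined(__riscv_v)
    return true;
#else
    return false;
#endif
}

// True when this translation unit is compiled with Zvfh enabled
// (assumes the compiler defines __riscv_zvfh in that case).
constexpr bool compiled_with_zvfh() {
#if defined(__riscv) && defined(__riscv_zvfh)
    return true;
#else
    return false;
#endif
}

// Space-separated list of the vector extensions enabled at compile time,
// or "none" when neither macro is defined.
std::string enabled_vector_exts() {
    std::string out;
    if (compiled_with_rvv()) out += "v";
    if (compiled_with_zvfh()) out += out.empty() ? "zvfh" : " zvfh";
    if (out.empty()) out = "none";
    assert(!out.empty());
    return out;
}
```

Compiling the same file with different -march values changes the result without any inspection of the flag string, which is the property the proposal relies on.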

@dzarukin
Contributor

dzarukin commented Jan 6, 2026

> This approach reaches the point where DNNL_ARCH_OPT_FLAGS could be handled correctly for RV64 without string parsing. What do you think of it? Or should I drop all CMake changes in this PR?

I got the part about user controlling behavior of extensions which is fine with me.

Before jumping into a proposal, could you, please, expand on why CMake detection is needed at all given that there's XByak support for RISC-V now and all supported extensions can be verified at runtime through mayiuse(...) call?

With data type support tucked behind the extension I can see two options:

  1. The primitive implementation uses RISC-V specific code (intrinsics, XByak, etc.). It makes the implementation RISC-V-based and all support must be checked carefully at runtime alongside data types.
  2. The primitive implementation relies on C++ features to access buffers (a.k.a. ref or simple impls). In that case, the has_data_type_support(...) function should check whether the data type can be safely used in that implementation. The function would go through the identification mechanism, which would end up using mayiuse anyway, and thus either allow the implementation if the user agreed to use the specific extension, or reject it if the user disabled that extension.

If this understanding is correct, CMake can safely discard feature detection code unless this vision misses some important pieces...
Please share your thoughts on that. Thanks.
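
The second option can be sketched as a data type gate that defers to a runtime ISA query combining what the hardware reports with what the user permits. Everything below is hypothetical (`ext_t`, `isa_state_t`, the mask layout, and the signatures of `mayiuse` and `has_data_type_support`), and the sketch assumes f16 is tied to Zvfh, as the skipped benchdnn f16 cases earlier in this thread suggest.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit flags for the extensions discussed in this thread.
enum class ext_t : std::uint32_t { v = 1u << 0, zvfh = 1u << 1 };
enum class data_type_t { f32, f16 };

// Hardware capabilities and user-permitted extensions, as bit masks.
struct isa_state_t {
    std::uint32_t hw_mask;   // what the CPU reports
    std::uint32_t user_mask; // what the user allows the library to use
};

// An extension is usable only if the hardware has it and the user allows it.
inline bool mayiuse(const isa_state_t &s, ext_t e) {
    const std::uint32_t bit = static_cast<std::uint32_t>(e);
    return (s.hw_mask & s.user_mask & bit) != 0;
}

// Gate for implementations that touch buffers through plain C++ only.
inline bool has_data_type_support(const isa_state_t &s, data_type_t dt) {
    switch (dt) {
        case data_type_t::f32: return true;
        case data_type_t::f16: return mayiuse(s, ext_t::zvfh);
    }
    assert(false && "unknown data type");
    return false;
}
```

Under this arrangement no build-time feature flag is consulted: rejecting Zvfh through the user mask is enough to make f16 report as unsupported.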

@zhangjian29
Contributor Author

Thank you for the clarification @dzarukin. You're absolutely right.

I will drop all CMake detection changes and keep only the pooling file fix, which uses mayiuse(...), in this PR.

@zhangjian29 zhangjian29 merged commit 7fdc83e into uxlfoundation:main Jan 7, 2026
13 checks passed
@zhangjian29 zhangjian29 deleted the fix-pool-zvfh-gard branch January 7, 2026 04:36