OpenBLAS: New version 0.3.31 #12970

Open
eschnett wants to merge 31 commits into JuliaPackaging:master from eschnett:eschnett/openblas-0.3.31

Conversation

@eschnett
Contributor

This is based on the unmerged 0.3.30 branch.

@giordano
Member

Most issues seem to be related to float16, which is not well supported on a few platforms.

@eschnett
Contributor Author

Can't build on aarch64-apple-darwin because we don't have GCC 15 there. (GCC 12 is failing.)

@giordano
Member

Error is https://buildkite.com/julialang/yggdrasil/builds/27453#019c68ad-9cd3-4712-9af4-98cc9912be2d/L3822

[23:13:00] fatal error: error in backend: Calling convention AArch64_SME_ABI_Support_Routines_PreserveMost_From_X0 is unsupported on Darwin.
[23:13:00] PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
[23:13:00] Stack dump:
[23:13:00] 0.	Program arguments: /opt/x86_64-linux-musl/bin/clang -target arm64-apple-darwin20 --sysroot=/opt/aarch64-apple-darwin20/aarch64-apple-darwin20/sys-root -Wno-unused-command-line-argument -mmacosx-version-min=11.0 -O2 -Wall -fPIC -march=armv9-a+sve2+sme -march=armv9-a+sve2+sme -nostdinc++ -isystem /opt/aarch64-apple-darwin20/aarch64-apple-darwin20/sys-root/usr/include/c++/v1 -DSMALL_MATRIX_OPT -DGEMM_GEMV_FORWARD -DSBGEMM_GEMV_FORWARD -DBGEMM_GEMV_FORWARD -DMAX_STACK_ALLOC=2048 -DF_INTERFACE_GFORT -DDYNAMIC_ARCH -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=32 -DMAX_PARALLEL_NUMBER=1 -DBUILD_BFLOAT16 -DBUILD_HFLOAT16 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.31\" -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=_sgemm_direct_alpha_beta_ARMV9SME -DASMFNAME=_sgemm_direct_alpha_beta_ARMV9SME_ -DNAME=sgemm_direct_alpha_beta_ARMV9SME_ -DCNAME=sgemm_direct_alpha_beta_ARMV9SME -DCHAR_NAME=\"sgemm_direct_alpha_beta_ARMV9SME_\" -DCHAR_CNAME=\"sgemm_direct_alpha_beta_ARMV9SME\" -DNO_AFFINITY -DTS=_ARMV9SME -I.. -DBUILD_KERNEL -DTABLE_NAME=gotoblas_ARMV9SME -DBUILD_KERNEL -DTABLE_NAME=gotoblas_ARMV9SME -UDOUBLE -UCOMPLEX -UDOUBLE -UCOMPLEX -c -fcolor-diagnostics -o sgemm_direct_alpha_beta_ARMV9SME.o ../kernel/arm64/sgemm_direct_alpha_beta_arm64_sme1.c
[23:13:00] 1.	<eof> parser at end of file
[23:13:00] 2.	Code generation
[23:13:00] 3.	Running pass 'Function Pass Manager' on module '../kernel/arm64/sgemm_direct_alpha_beta_arm64_sme1.c'.
[23:13:00] 4.	Running pass 'AArch64 Instruction Selection' on function '@sgemm_direct_alpha_beta_sme1_2VLx2VL_ARMV9SME'
[23:13:00] clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
[23:13:00] clang version 18.1.7 (/home/tim/.cache/BinaryBuilder/downloads/clones/llvm-project.git-1df819a03ecf6890e3787b27bfd4f160aeeeeacd50a98d003be8b0893f11a9be 768118d1ad38bf13c545828f67bd6b474d61fc55)
[23:13:00] Target: arm64-apple-darwin20
[23:13:00] Thread model: posix
[23:13:00] InstalledDir: /opt/x86_64-linux-musl/bin
[23:13:00] clang: note: diagnostic msg: 
[23:13:00] ********************

Why do you say the problem is missing GCC 15? Requiring a less-than-a-year-old compiler would be harsh for OpenBLAS.

@eschnett
Contributor Author

I'm building with GCC 15 to make the riscv64 build work. This uses GCC 15 on all platforms except aarch64-darwin, which still uses GCC 12 (IainS!). Given the error message, I assume that using a newer GCC version would help.

We can switch to an older GCC version if we drop support for riscv64.

@giordano
Member

> Given the error message I assume that using a newer GCC version would help.

But why? The error message is about Clang.

@eschnett
Contributor Author

You are right. The magic sauce is export NO_SME=1.
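
For illustration, here is a minimal sketch of how that setting could be scoped to the failing platform inside a build script. The only thing taken from this thread is `export NO_SME=1`. The helper name `needs_no_sme` is hypothetical, the `target` variable is assumed to hold the platform triplet (e.g. `aarch64-apple-darwin20`, as seen in the Clang invocation above), and restricting the flag to Apple arm64 is an assumption. The actual recipe may set it unconditionally.

```shell
# Hypothetical helper: succeed only for targets where the SME kernels
# are known (from the log above) to crash the Clang backend.
needs_no_sme() {
    case "$1" in
        aarch64-apple-darwin*) return 0 ;;
        *) return 1 ;;
    esac
}

# `target` is assumed to be provided by the build environment.
if needs_no_sme "${target:-}"; then
    # Ask the OpenBLAS build to skip the SME kernels.
    export NO_SME=1
fi
```

Keeping the check in a small function makes it easy to extend the pattern later if other targets turn out to hit the same backend error.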

@eschnett
Contributor Author

Major changes:

  • enable float16 support (where possible)
  • require GCC 15 to make riscv64 build

@giordano
Member

giordano commented Feb 17, 2026

> require GCC 15 to make riscv64 build

That makes me extremely uncomfortable. There are no Julia versions which come with the GCC 15 runtime. (BTW, need to update CompilerSupportLibraries_jll)

@eschnett
Contributor Author

Alternatives:

  • Do not build for riscv64
  • Spend a lot of time (I tried and failed) making things work on riscv64 with an earlier GCC
  • Secretly ship OpenBLAS 0.3.30 for riscv64

Can we require different GCC versions for different targets?

eschnett marked this pull request as ready for review February 17, 2026 17:51

@giordano
Member

> Do not build for riscv64

👎

> Can we require different GCC versions for different targets?

JuliaPackaging/BinaryBuilder.jl#1360

@imciner2
Member

> JuliaPackaging/BinaryBuilder.jl#1360

Yeah, let me get back to that this week.

@imciner2
Member

Can we wait until later in the week so I can finish off that BinaryBuilder PR and get platform selection done natively there? I don't think we need to rush this in.

@eschnett
Contributor Author

Something is weird. I'm looking at build -- O/OpenBLAS/OpenBLASConsistentFPCSR@0.3.31 -- aarch64-linux-gnu-libgfortran5 and I see log output such as

cc -c -O2 -DSMALL_MATRIX_OPT -DMAX_STACK_ALLOC=2048 -Wall -m64 -DF_INTERFACE_GFORT -fPIC -DDYNAMIC_ARCH -DSMP_SERVER -DNO_WARMUP -DCONSISTENT_FPCSR -DMAX_CPU_NUMBER=512 -DMAX_PARALLEL_NUMBER=1 -DBUILD_BFLOAT16 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.31\" -msse3 -mavx -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME=ztrmm_olnncopy_BULLDOZER -DASMFNAME=ztrmm_olnncopy_BULLDOZER_ -DNAME=ztrmm_olnncopy_BULLDOZER_ -DCNAME=ztrmm_olnncopy_BULLDOZER -DCHAR_NAME=\"ztrmm_olnncopy_BULLDOZER_\" -DCHAR_CNAME=\"ztrmm_olnncopy_BULLDOZER\" -DNO_AFFINITY -DTS=_BULLDOZER -I.. -DBUILD_KERNEL -DTABLE_NAME=gotoblas_BULLDOZER -DDOUBLE  -DCOMPLEX -Wno-uninitialized -DDOUBLE -DCOMPLEX -DOUTER -DLOWER -UUNIT generic/ztrmm_lncopy_2.c -o ztrmm_olnncopy_BULLDOZER.o

I don't think this build (aarch64) should use options such as sse3 or avx, or optimize for a "bulldozer" microarchitecture.
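
One way to check how widespread this is would be to scan the saved build log for x86-only options. The sketch below is an assumption-laden illustration, not part of the PR: the function name `find_x86_flags` is hypothetical, and the pattern only covers the `-msse*` and `-mavx*` families seen in the command above.

```shell
# Hypothetical helper: print the distinct x86-only compiler flags
# (-msse*, -mavx*) that appear anywhere in the given log file.
find_x86_flags() {
    # -e is needed because the pattern itself starts with a dash;
    # -o prints only the matching part of each line.
    grep -E -o -e '-m(sse[0-9.]*|avx[0-9a-z]*)' "$1" | sort -u
}
```

Calling `find_x86_flags build.log` (where `build.log` stands for wherever the log was saved) on an aarch64 build should ideally print nothing. Any output lists the flags that leaked in.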

@giordano
Member

I think we'll need to wait for OpenMathLib/OpenBLAS#5643 to be merged; getting wrong matmuls on an architecture popular in HPC sounds bad.


3 participants