Prefix Support Fails During OpenBLAS Build on Windows (WoA) #5154

Open
Harishmcw opened this issue Feb 27, 2025 · 11 comments

@Harishmcw
Contributor

Harishmcw commented Feb 27, 2025

Hi @martin-frbg,

I am attempting to build OpenBLAS on Windows on ARM (WoA) with prefix support enabled. However, I encountered the following error when the build process reaches the objcopy step:

objcopy -v --redefine-syms C:/Users/HCKTest/Desktop/Harish/openblas-libs/OpenBLAS/build/objcopy.def C:/Users/HCKTest/Desktop/Harish/openblas-libs/OpenBLAS/build/lib/libscipy_openblas.so""
llvm-objcopy.exe: error: unknown argument '-v'
ninja: build stopped: subcommand failed.

I used the following CMake command for the build:

cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release -DTARGET=ARMV8 -DBUILD_SHARED_LIBS=ON -DARCH=arm64 -DBINARY=64 -DCMAKE_SYSTEM_PROCESSOR=ARM64 -DCMAKE_C_COMPILER=clang-cl -DCMAKE_Fortran_COMPILER=flang-new -DSYMBOLPREFIX="scipy_" -DLIBNAMEPREFIX="scipy_"
Is there a recommended way to enable prefix support for OpenBLAS on Windows on ARM (WoA)?
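
For context, the file passed to `--redefine-syms` is a plain list of `old new` symbol pairs, one pair per line. A minimal sketch of producing such a list for the `scipy_` prefix follows. The helper name and the two sample symbols are illustrative assumptions and are not taken from the OpenBLAS build scripts.

```python
def redefine_syms_lines(symbols, prefix):
    """Return 'old new' pairs, one per symbol, as read by objcopy --redefine-syms."""
    return [f"{name} {prefix}{name}" for name in symbols]


# Hypothetical sample; a real build would list every exported symbol.
sample_symbols = ["dgemm_", "cblas_dgemm"]
print("\n".join(redefine_syms_lines(sample_symbols, "scipy_")))
```

Writing these lines to a file such as `objcopy.def` would give the input that the failing command above refers to.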

@martin-frbg
Collaborator

Let me see - the -v is probably spurious, useful in the debugging phase but not needed for function. And earlier versions of LLVM did not freak out when they encountered a typical GNU option.

@Harishmcw
Contributor Author

Hi @martin-frbg,

Thanks for the response. I removed the -v flag from CMakeLists.txt and changed the file extension from .so to .dll, which allowed the build to complete successfully. However, the symbols inside scipy_openblas.dll still do not have the scipy_ prefix as expected.

Is there an additional step required to ensure the symbol renaming takes effect?
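
The two changes described above amount to roughly the following sketch. It only shows how the command line differs; it gets the step to run but, as noted, does not by itself rename the exports of the DLL. The function names and the `verbose` switch are hypothetical, and the extension choice only distinguishes Windows from everything else.

```python
import sys


def shared_lib_name(stem, platform=sys.platform):
    """Pick the shared-library extension: .dll on Windows, .so elsewhere (simplified)."""
    ext = ".dll" if platform.startswith("win") else ".so"
    return f"{stem}{ext}"


def objcopy_rename_cmd(defs_file, library, objcopy="objcopy", verbose=False):
    """Build the objcopy argument list, leaving out -v unless explicitly requested."""
    cmd = [objcopy]
    if verbose:
        # llvm-objcopy rejected this flag with "unknown argument '-v'" in the log above.
        cmd.append("-v")
    cmd += ["--redefine-syms", defs_file, library]
    return cmd


print(objcopy_rename_cmd("objcopy.def", shared_lib_name("scipy_openblas", "win32")))
```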

@rgommers
Contributor

Since you're trying to create a scipy_openblas, are you aware that the build configs for that are at https://github.com/MacPython/openblas-libs? It'd be useful to open an issue there; Windows on Arm support isn't yet documented or supported there, but that's where it should land. In numpy/numpy#22530 the suggestion was to cross-compile in CI; once that works then there's no need for local builds anymore.

@martin-frbg
Collaborator

mhh, that's annoying - seems llvm-objcopy only actually works on ELF files, although it does not complain when handed a DLL. I'll try to come up with something based on Microsoft's lib command

@Harishmcw
Contributor Author

> mhh, that's annoying - seems llvm-objcopy only actually works on ELF files, although it does not complain when handed a DLL. I'll try to come up with something based on Microsoft's lib command

Thanks for looking into this! That makes sense—since llvm-objcopy is meant for ELF files, it explains why the symbol renaming didn't work on the DLL.

Do you have any suggestions on how we can proceed with ensuring the symbols get renamed correctly? Let me know if there's anything I can try on my end.

Thanks again!

@martin-frbg
Collaborator

> That makes sense - since llvm-objcopy is meant for ELF files, it explains why the symbol renaming didn't work on the DLL.

Right, but as it is also stated to support COFF I had (perhaps naively) assumed that it would work with DLLs as well, given that it did not produce an error.

> Do you have any suggestions on how we can proceed with ensuring the symbols get renamed correctly? Let me know if there's anything I can try on my end.

I had found some untested suggestions to try something like lib /def /export:newsymbol=oldsymbol but they all seem to trace back to a single bogus response on stackoverflow from like 10 years ago.
https://developercommunity.visualstudio.com/t/libexe-does-not-rename-export-names-from-dll-when/1548994 seems to say there is no way known to Microsoft Support to actually achieve symbol renaming at the lib/dll stage like we do on ELF platforms.

@Harishmcw
Contributor Author

Hi @martin-frbg,

I tried building OpenBLAS with prefix support on x64 Windows using make commands, and the prefixing worked correctly. The resulting binaries were also in PE/COFF format.

However, on Windows on ARM (WoA), I was not able to build OpenBLAS using make commands and had to use CMake and Ninja instead.

If we can identify how symbol prefixing works in the Make-based build, we might be able to replicate that approach in the CMake build instead of relying on llvm-objcopy. This could help ensure prefixing works properly on WoA as well.

Would appreciate your thoughts on this!

@martin-frbg
Collaborator

Certainly - the Makefile build is different in that it creates a static library (.a) first, and then post-processes that into a shared one. On Windows, this static library is almost certainly just a bunch of COFF-formatted objects, so llvm-objcopy works.

The CMake build actually used to proceed similarly, until there were complaints (and associated PRs) claiming that creating both a static and a dynamic library in the same run by default was somehow against the philosophy of CMake. The "old" way can very likely be restored to work around the Windows issue, but I did not manage to check this today.
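
To make the distinction between these file kinds concrete, here is a small sketch that classifies a file by its leading bytes: ELF files start with `\x7fELF`, `ar`-style static libraries with `!<arch>\n`, and PE images such as DLLs with `MZ`. The function names are hypothetical, and the sketch says nothing about which of these llvm-objcopy can actually rename symbols in; that part is only what this thread reports.

```python
def classify_binary(header: bytes) -> str:
    """Classify a file from its first bytes using well-known magic numbers."""
    if header.startswith(b"\x7fELF"):
        return "elf"
    if header.startswith(b"!<arch>\n"):
        return "archive"  # static library (.a / .lib) holding object files
    if header.startswith(b"MZ"):
        return "pe-image"  # linked Windows image such as a .dll or .exe
    return "unknown"


def classify_file(path):
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as fh:
        return classify_binary(fh.read(8))
```

A build script could use a check like this to decide whether to run the renaming step on the static archive before linking, rather than on the finished DLL.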

@Harishmcw
Contributor Author

Thanks for the clarification! That makes sense. If restoring the "old" way in the CMake build can resolve this issue, I’d be happy to test it out. Do you have any pointers on where this change was made, or which PRs introduced it? I can look into reverting or modifying that behavior to see if it restores proper symbol prefixing on Windows.

@Harishmcw
Contributor Author

Hi @martin-frbg

Since the prefix-enabled Make-based build works while the CMake-based one fails to redefine symbols with llvm-objcopy, I'd like to understand how the PREFIX mechanism is handled in the Make-based build. Could you explain how it is applied there? I want to try a similar approach in the CMake-based build to see if it resolves the issue.

@martin-frbg
Collaborator

Please see exports/Makefile - basically it is linking the dllinit.c stub against the static library and applying a def file with the symbol equivalences in the process. However I am still struggling to get this to work with the WoA LLVM19, which is throwing some rather confused and confusing warnings.
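
As a rough illustration of what a def file with symbol equivalences can look like, the sketch below emits a module-definition file whose `EXPORTS` entries use the standard `exported_name=internal_name` form. This assumes that form is what is meant here; the file actually generated by exports/Makefile may differ, and the helper name and sample symbols are hypothetical.

```python
def def_file_text(library, symbols, prefix):
    """Build module-definition text that exports each symbol under a prefixed name."""
    lines = [f"LIBRARY {library}", "EXPORTS"]
    for name in symbols:
        # exported_name=internal_name: the DLL exports the prefixed name,
        # resolved to the unprefixed symbol from the static library.
        lines.append(f"    {prefix}{name}={name}")
    return "\n".join(lines) + "\n"


# Hypothetical sample symbols for illustration only.
print(def_file_text("scipy_openblas", ["dgemm_", "cblas_dgemm"], "scipy_"))
```

The resulting text would be handed to the linker together with the stub and the static library, so the renaming happens at link time instead of as a post-processing step on the DLL.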
