Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Add libs/python3.lib to libpython target for SABI builds on Windows #1820

Merged
merged 11 commits into from
Nov 30, 2024

Conversation

nicholasjng
Copy link
Contributor

@nicholasjng nicholasjng commented Mar 22, 2024

When targeting the Python Stable ABI on Windows (by setting the Py_LIMITED_API macro to a Python minimum version hex), the unversioned python3.lib needs to be linked instead of the versioned one (e.g. python38.lib for Python 3.8).

Python's own config sets the library to link by default in a header called pyconfig.h (https://github.com/python/cpython/blob/9cc9e277254023c0ca08e1a9e379fd89475ca9c2/PC/pyconfig.h#L270), which prompts the linker to search for python3.lib if a stable ABI extension is built using @rules_python toolchains.

Since this library is not exported on Windows in the python_repository() rule, building Python C++ extensions with rules_python toolchains fails in the linking step, because the library is never copied. Consequently, it is added now to allow Python SABI extensions to be built (and linked) on Windows with @rules_python.

Since Python takes responsibility for linking the correct lib on Windows, and never both at the same time, no other changes are made.

@axbycc-mark
Copy link

axbycc-mark commented Mar 24, 2024

I just tried out the method mentioned in your fix. I made a small cc_binary

cc_binary(
    name = "embed",
    srcs = ["embed.cpp"],
    deps = [
        "@rules_python//python/cc:current_py_cc_headers",
        "@rules_python//python/cc:current_py_cc_libs",
    ],
)
//embed.cpp

#include <iostream>

#define Py_LIMITED_ABI 
#include <Python.h>

int main() { 
		std::cout << "Starting main" << "\n";
		std::cout << "Ending main" << "\n";
    return 0;
}

I looked at the resulting linker commands.

/nologo
/OUT:bazel-out/x64_windows-opt/bin/python/experimental/embed.exe
bazel-out/x64_windows-opt/bin/python/experimental/_objs/embed/embed.obj
external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/libs/python311.lib
external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/libs/python3.lib
/SUBSYSTEM:CONSOLE
/MACHINE:X64
/DEFAULTLIB:msvcrt.lib
/OPT:ICF
/OPT:REF

The compiler is now trying to link both python311.lib and python3.lib. I'm not sure what the implications are here and I'm surprised there are no duplicate definition errors produced. Do you know if this is a problem?

@nicholasjng
Copy link
Contributor Author

nicholasjng commented Mar 24, 2024

Yes, that is potentially a problem - when building against the stable ABI, you do not want to link against the versioned python3XY.lib.

This would probably be fine if the python3.lib appeared first in the command. (Without knowing the intricacies of the MSVC linker, I'm guessing the first resolved symbol is taken greedily.)

But, I'm more curious about the defines here: I'm mentioning Py_LIMITED_API, since that designates python3.lib in that pyconfig.h snippet. You don't seem to be defining that either in the code, or in the cc_binary.

Can you share the linker flags when building with cc_binary(..., defines=["Py_LIMITED_API=0x030A0000"]) or something similar?

@axbycc-mark
Copy link

axbycc-mark commented Mar 24, 2024

Sorry, my mistake. I have been misreading "LIMITED_API" as "LIMITED_ABI" the whole time. Let me try once more with the correct spelling.

Update: Same issue with the fixed spelling, now inside the defines=[...] attribute as requested. I verified the object file was compiled with Py_LIMITED_API by looking at the compilation flags which appear below.


/nologo
/DCOMPILER_MSVC
/DNOMINMAX
/D_WIN32_WINNT=0x0601
/D_CRT_SECURE_NO_DEPRECATE
/D_CRT_SECURE_NO_WARNINGS
/bigobj
/Zm500
/EHsc
/wd4351
/wd4291
/wd4250
/wd4996
/I.
/Ibazel-out/x64_windows-opt/bin
/Iexternal/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc
/Ibazel-out/x64_windows-opt/bin/external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc
/Iexternal/bazel_tools
/Ibazel-out/x64_windows-opt/bin/external/bazel_tools
/Iexternal/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include
/Ibazel-out/x64_windows-opt/bin/external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include
/Iexternal/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include/python3.11
/Ibazel-out/x64_windows-opt/bin/external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include/python3.11
/Iexternal/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include/python3.11m
/Ibazel-out/x64_windows-opt/bin/external/rules_python~0.31.0~python~python_3_11_x86_64-pc-windows-msvc/include/python3.11m
/DPy_LIMITED_API=0x030A0000
/showIncludes
/MD
/O2
/Oy-
/DNDEBUG
/wd4117
-D__DATE__=\"redacted\"
-D__TIMESTAMP__=\"redacted\"
-D__TIME__=\"redacted\"
/Gy
/Gw
-DWIN32_LEAN_AND_MEAN
-DNOGDI
/std:c++20
/Fobazel-out/x64_windows-opt/bin/python/experimental/_objs/embed/embed.obj
/c
python/experimental/embed.cpp

I guess whether you define the Py_LIMITED_API or not, the libraries get linked anyway due to Bazel's rules. It makes sense to me that Bazel would instruct the compiler to link all srcs mentioned in cc_library dependency.

@nicholasjng
Copy link
Contributor Author

I mean, so much is expected - the headers appearing in your compilation command there are always versioned to the Python distribution they are from.

Also, from the pyconfig.h snippet, the conditional declaration of either python3.lib or python3X.lib means that the @rules_python//python/cc:current_py_cc_libs need not appear as deps at all if you're building a Python extension on Windows, the libs only need to be both present for the hermetic Python.

You can see that this is true e.g. by checking the CI statuses of nicholasjng/nanobind-bazel#16, where I completely removed the current_py_cc_libs from all default depsets of my Python targets in the second commit of the changeset. On Windows, Python 3.8-3.11 (all without the stable ABI) proceed to build just fine.

TL,DR: Explicitly specifying the current libs on Windows is normally not necessary because of the selection mechanism in pyconfig.h, and perhaps counterproductive for the reasons you mentioned (Bazel links all libs appearing as deps in a command).

@axbycc-mark
Copy link

axbycc-mark commented Mar 24, 2024

I'm thinking about what you said here.

Also, from the pyconfig.h snippet, the conditional declaration of either python3.lib or python3X.lib means that the @rules_python//python/cc:current_py_cc_libs need not appear as deps at all if you're building a Python extension on Windows, the libs only need to be both present for the hermetic Python.

In my project, I have a python_repository provided by python_rules for Python version 3.11. It has registered this version as the default Python toolchain. Outside of the project folder, my system has no globally installed Python version 3.11. Within my project, I want to embed a Python interpreter / write Python extensions for Python 3.11.

I am hitting the hermetic Python use case that you are mentioning right? If I don't include these dependencies, the #pragma will still try to link python311.lib and I will get a linker error (if I still include the current_py_cc_headers).

On the other hand, if I installed Python 3.10 on my system level (and don't include current_py_cc_headers or current_py_cc_libs), msvc would find the system python310.lib/python310.dll and link my project's binaries against those on account of the #pragma directives. But if I write an extension this way, and use it as a dependency for a py_binary, is it going to cause issues? Both python310.dll and python311.dll will be loaded at runtime, defining many of the same symbols. So if I take that route, I must be sure that the system python matches the hermetic python.

Does that match your understanding?

@nicholasjng
Copy link
Contributor Author

nicholasjng commented Mar 24, 2024

I think we mean different things: I am referring to "hermetic Python" as in "a configured rules_python toolchain", not as an embedded interpreter. MSVC invocations will not link against anything on the system other than the configured rules_python toolchain. (To verify, see the 3.12 build failure in the PR that I linked - I install Python 3.12 on the runner early on, but the build still fails due to a linker error, even if the system Python 3.12 has all of the required libs.)

The only way to screw up hermeticity is to link libs from the system Python explicitly in your package setup code, i.e. by supplying --linkopt=/LIBPATH:C:\my\system\python\libs in Python. If you build everything in Bazel anyway, you're good.

EDIT: Ah, now I get it. Yes, the current libs will be taken from the rules_python toolchain as well - if you want to redistribute SABI extensions, you package python3.{lib,dll}, if not, the versioned ones.

@axbycc-mark
Copy link

Okay I still can't get my head around how you can remove the current_py_cc_libs dependency and still have your cc_library compile. On my computer I get the following error, which from my point of view is completely expected. Without the dependency, msvc is never told where the lib file is.

LINK : fatal error LNK1104: cannot open file 'python3.lib'
Target //python/experimental:embed failed to build

But I think I'm a little out of my depth here so I'll need to go off on my own and read more about Windows/Bazel/Python.

Unrelatedly, I did uncover something interesting. Using Process Explorer and launching the Python interpreter, I found it's completely normal for the Python process to load in both python3.dll and pythonxy.dll.
image

But this may just be because there are transitive dependencies being imported at startup, where every single pyd only depends on either python3 or pythonxy.

In any case I imagine it is safe to add the python*.dll into the sources in your PR which would fix #1823

Thanks for taking up this issue 👍

@aignas
Copy link
Collaborator

aignas commented Mar 26, 2024

I don't have a Windows machine around so I am happy that you are investigating this and attempting to make building on Windows better.

Let me know when you would like a review from the maintainer point of view. The checklist would be:

Let me know if you would like any ideas/suggestions from me or other maintainers.

@nicholasjng
Copy link
Contributor Author

@aignas Yes please, this is ready for review.

I can copy the backstory over to a comment in the thread and paste it here, the issue you mentioned is technically about a slightly different use.

Is the PR title (or a slightly more elaborate description thereof) sufficient as a changelog entry?

PS: If you're fine with it, I would add the other versioned DLL mentioned in #1823 to the list, and then it should actually also fix that issue.

@aignas
Copy link
Collaborator

aignas commented Mar 26, 2024

The PR description is the commit message once the PR is merged and the changelog.md file still needs to be modified manually. As for the backstory, having it as a comment in this PR may be also sufficient. No need to create an issue just for that.

I am curious if it would be possible to add a test target (maybe under tests/toolchains?) Where we could ensure that the new code works?

@nicholasjng
Copy link
Contributor Author

The following is the backstory of this PR, previously found in the description.


Backstory

I am currently implementing Bazel support for the nanobind project, which makes creation of C++ Python bindings very easy. It is the successor to pybind11 in that regard. Development efforts on this happen in the nanobind-bazel repository.

One particular feature in nanobind is targeting the Python stable ABI, which can be used to shrink the build matrix for Python wheels by promising compatibility across minor versions of Python even with C++ extensions. Due to later additions of nanobind prerequisites to the Python limited API, targeting the stable ABI using nanobind only became possible starting with Python 3.12.

I added a Bazel config to the easiest, "hello world"-ish nanobind example project here, which used to build just fine on all platforms (Win/MacOS/Linux) and versions (Python 3.8-3.12), but only because I was breaking hermeticity in the Windows case, passing the libdir of the system interpreter to the build as a linkopt on Windows in Python (in the setup.py, to be specific).

(For an example of where this is still done, see https://github.com/google/benchmark/blob/d5c55e8c42a8782cb24f6011d0e88449237ab842/setup.py#L71-L74.)

Once I stopped doing that, I started getting build errors for Win+Python 3.12, as for example in this PR: nicholasjng/nanobind-bazel#16. Note that when building for Python 3.12, I target the stable ABI by setting the Py_LIMITED_API macro to 3.12.

As would be expected from pyconfig.h taking measures to target the unversioned library, Python itself mentions that python3.lib needs to be linked in place of python3XY.lib when targeting the stable ABI: https://docs.python.org/3/c-api/stable.html#stable-abi

As a final point, none of this is an issue on Unix platforms, since those get their Python symbols dynamically at runtime without needing any linkage.

Please let me know your thoughts.

@nicholasjng
Copy link
Contributor Author

I added the changelog entry, and split off the backstory into a comment, removing it from the PR description.

I also took the liberty to add the pythonXY.dll DLL file, as prompted by the linked issue.

As for tests - I believe that since MSVC decides the library to link in the pyconfig.h header, we can add a regression test for this issue by compiling a simple cc_binary with the Py_LIMITED_API macro defined. The only dependency needed are the current cc headers, so this should be grouped with the cc headers tests. Since I have no experience with writing tests for Bazel, I would welcome some guidance.

@rickeylev
Copy link
Collaborator

I think //tests/cc/current_py_cc_libs:python_libs_linking_test is what you want? I created a simple cc_test to verify some linking behavior, but it didn't pass on windows and I couldn't figure out why. This thread sounds like the reason?

Since I have no experience with writing tests for Bazel, I would welcome some guidance.

You're probably looking at e.g. tests/cc/current_py_cc_headers/current_py_cc_headers_tests.bzl ? Those are called "analysis tests" and are for verifying that the bzl logic in rule code is working. I wouldn't recommend trying to use those for this case -- what you have to do is look into the linker args and try to figure out if it looks right, which is fairly brittle and painful. I would just create cc_test targets that are built/run instead.

@nicholasjng
Copy link
Contributor Author

I think //tests/cc/current_py_cc_libs:python_libs_linking_test is what you want? I created a simple cc_test to verify some linking behavior, but it didn't pass on windows and I couldn't figure out why. This thread sounds like the reason?

Yes, that should be it. What do I need to expect in _solib for Windows now? Probably python3.lib, right?

# The shared library should also end up in runfiles
# The `_solib` directory is a special directory CC rules put
# libraries into.
env.expect.that_target(target).runfiles().contains_predicate(
matching.str_matches("*_solib*/libpython3.so"),
)

@rickeylev
Copy link
Collaborator

Yeah. Just use python3. instead of python3.so. The extension can vary, so I don't think there's a point in trying to match it.

@nicholasjng
Copy link
Contributor Author

Local Mac is still broken with matching.str_matches("*_solib*/*python3."):

FAIL: //tests/cc/current_py_cc_libs:python_libs_linking_test (see /private/var/tmp/_bazel_nicholasjunge/1a11c1f7d001d9440f87afb0306a8075/execroot/_main/bazel-out/darwin_arm64-fastbuild/testlogs/tests/cc/current_py_cc_libs/python_libs_linking_test/test.log)
INFO: From Testing //tests/cc/current_py_cc_libs:python_libs_linking_test:
==================== Test output for //tests/cc/current_py_cc_libs:python_libs_linking_test:
dyld[10462]: Library not loaded: /install/lib/libpython3.11.dylib
  Referenced from: <BED7F7C3-FAFD-377B-9B37-AEA0F9EC10C8> /private/var/tmp/_bazel_nicholasjunge/1a11c1f7d001d9440f87afb0306a8075/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/python_libs_linking_test
  Reason: tried: '/install/lib/libpython3.11.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/install/lib/libpython3.11.dylib' (no such file), '/install/lib/libpython3.11.dylib' (no such file)

Looks like the library search path is messed up?
(FWIW, I do see libpython3.11.dylib under the _solib_darwin_arm64 directory in bazel-bin, although in a randomly generated subfolder.)

@nicholasjng
Copy link
Contributor Author

I think the current libs test is fundamentally broken on MacOS (and potentially Windows too). The linkage of the python_libs_linking_test is off:

➜ otool -L bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/python_libs_linking_test
bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/python_libs_linking_test:
        /install/lib/libpython3.11.dylib (compatibility version 3.11.0, current version 3.11.0)
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1700.255.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)
        /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 2420.0.0)
        /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)

The first path is the baked-in dylib path of the hermetic Python's libpythonXY.dylib. The linker options look good for fixing up the path to the actual dylib, but it seems they are not respected:

# python_libs_linking_test-2.params

-o
bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/python_libs_linking_test
-Xlinker
-rpath
-Xlinker
@loader_path/../../../_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
-Xlinker
-rpath
-Xlinker
@loader_path/python_libs_linking_test.runfiles/_main/_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
-Lbazel-out/darwin_arm64-fastbuild/bin/_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib
bazel-out/darwin_arm64-fastbuild/bin/tests/cc/current_py_cc_libs/_objs/python_libs_linking_test/python_libs_linking_test.o
-lpython3.11
-Wl,-S
-mmacos-version-min=10.11
-no-canonical-prefixes
-fobjc-link-runtime
-headerpad_max_install_names
-lc++
-lm

Not sure what to do here. I think the best option is to create a new cc_test with Py_LIMITED_API defined, and restrict it to Windows only.

@nicholasjng
Copy link
Contributor Author

This patch does not fix the SABI build failure in my project, see nicholasjng/nanobind-bazel#18 and the CI run therein.

It seems that the current_py_cc_libs target is more broken than initially anticipated, perhaps because no rpath fixup happens for the shared libs in the repository template.

Possibly related thread (for MacOS): pyinstaller/pyinstaller#7582

@rickeylev
Copy link
Collaborator

more broken than anticipated

I'm not too surprised by that :). There's lots of platform-specific idiosyncrasies and I'm by no means an linker expert, even less so with Windows and Mac. I don't really have a Mac or Windows machine to experiment with, either. I think the reality is someone with more knowledge/experience/motivation for those platforms will need to step up.

The code powering the current_py_cc_libs stuff is pretty simple. All it's doing is forwarding along the cc_library() defined in the runtime's repo. If someone can craft a cc_library that represents python libs coming from within the build, then I can help translate it into the necessary toolchain code.

create a new cc_test with Py_LIMITED_API defined, and restrict it to Windows only.

Yes, that's fine. Incremental progress is good.

linkage and rpath issues

That /install/lib/libpython3.11.dylib entry does look a bit odd. It shouldn't be finding libraries from outside the runfiles when those libraries are coming from a cc_library in the build. What does the binary's RPATH, DTNEEDED, and runfiles solib directory look like?

From what I understand of Bazel's cc_test[1], the way it should be working is:

  • The cc_test has a DTNEEDED entry added for libpython
  • The runfiles solib directory will, somewhere, contain libpython.so
  • The cc_test has RPATH pointing to the runfiles solib directory

There might be some filename mangling and SONAME trickery, but I can't recall.

  • cc_test might have DTNEEDED=_S_S_blabla_S_libpython3 and a corresponding file with a rewritten SONAME in soblib
  • cc_test might have DTNEEDED=libpython3 and solib will have e.g. runfiles/solib/_S_S_blabla/libpython.dylib (symlink to actual file)

[1] cc_binary might behave differently btw. I recall seeing some code paths in the cc rule implementation that special cases cc_test vs cc_binary, but what activated them was convoluted at times, so ymmv.

@nicholasjng
Copy link
Contributor Author

nicholasjng commented Mar 28, 2024

Not to worry, I'm here to (hopefully) see this through. I'm also not an expert, and every new linker topic I see prompts an extensive google search, but I'll get there :)

It seems that on macOS, a binary called install_name_tool is responsible for overwriting / setting rpaths and loader paths in dylibs. That could be an option, but it's in all likelihood not portable.

As a note, here is a blog post detailing the practice when bundling dylibs on macOS.

Here's the full otool -l output on the test binary:

Output
➜ otool -l /Users/nicholasjunge/Workspaces/c++/rules_python/bazel-bin/tests/cc/current_py_cc_libs/python_libs_linking_test
/Users/nicholasjunge/Workspaces/c++/rules_python/bazel-bin/tests/cc/current_py_cc_libs/python_libs_linking_test:
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __PAGEZERO
   vmaddr 0x0000000000000000
   vmsize 0x0000000100000000
  fileoff 0
 filesize 0
  maxprot 0x00000000
 initprot 0x00000000
   nsects 0
    flags 0x0
Load command 1
      cmd LC_SEGMENT_64
  cmdsize 392
  segname __TEXT
   vmaddr 0x0000000100000000
   vmsize 0x0000000000004000
  fileoff 0
 filesize 16384
  maxprot 0x00000005
 initprot 0x00000005
   nsects 4
    flags 0x0
Section
  sectname __text
   segname __TEXT
      addr 0x0000000100003f08
      size 0x0000000000000064
    offset 16136
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x80000400
 reserved1 0
 reserved2 0
Section
  sectname __stubs
   segname __TEXT
      addr 0x0000000100003f6c
      size 0x0000000000000024
    offset 16236
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x80000408
 reserved1 0 (index into indirect symbol table)
 reserved2 12 (size of stubs)
Section
  sectname __cstring
   segname __TEXT
      addr 0x0000000100003f90
      size 0x0000000000000017
    offset 16272
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000002
 reserved1 0
 reserved2 0
Section
  sectname __unwind_info
   segname __TEXT
      addr 0x0000000100003fa8
      size 0x0000000000000058
    offset 16296
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Load command 2
      cmd LC_SEGMENT_64
  cmdsize 152
  segname __DATA_CONST
   vmaddr 0x0000000100004000
   vmsize 0x0000000000004000
  fileoff 16384
 filesize 16384
  maxprot 0x00000003
 initprot 0x00000003
   nsects 1
    flags 0x10
Section
  sectname __got
   segname __DATA_CONST
      addr 0x0000000100004000
      size 0x0000000000000018
    offset 16384
     align 2^3 (8)
    reloff 0
    nreloc 0
     flags 0x00000006
 reserved1 3 (index into indirect symbol table)
 reserved2 0
Load command 3
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __LINKEDIT
   vmaddr 0x0000000100008000
   vmsize 0x0000000000004000
  fileoff 32768
 filesize 824
  maxprot 0x00000001
 initprot 0x00000001
   nsects 0
    flags 0x0
Load command 4
      cmd LC_DYLD_CHAINED_FIXUPS
  cmdsize 16
  dataoff 32768
 datasize 152
Load command 5
      cmd LC_DYLD_EXPORTS_TRIE
  cmdsize 16
  dataoff 32920
 datasize 48
Load command 6
     cmd LC_SYMTAB
 cmdsize 24
  symoff 32976
   nsyms 5
  stroff 33080
 strsize 88
Load command 7
            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 0
     iextdefsym 0
     nextdefsym 2
      iundefsym 2
      nundefsym 3
         tocoff 0
           ntoc 0
      modtaboff 0
        nmodtab 0
   extrefsymoff 0
    nextrefsyms 0
 indirectsymoff 33056
  nindirectsyms 6
      extreloff 0
        nextrel 0
      locreloff 0
        nlocrel 0
Load command 8
          cmd LC_LOAD_DYLINKER
      cmdsize 32
         name /usr/lib/dyld (offset 12)
Load command 9
     cmd LC_UUID
 cmdsize 24
    uuid BED7F7C3-FAFD-377B-9B37-AEA0F9EC10C8
Load command 10
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 14.4
      sdk 14.4
   ntools 1
     tool 3
  version 1053.12
Load command 11
      cmd LC_SOURCE_VERSION
  cmdsize 16
  version 0.0
Load command 12
       cmd LC_MAIN
   cmdsize 24
  entryoff 16136
 stacksize 0
Load command 13
          cmd LC_LOAD_DYLIB
      cmdsize 64
         name /install/lib/libpython3.11.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 3.11.0
compatibility version 3.11.0
Load command 14
          cmd LC_LOAD_DYLIB
      cmdsize 48
         name /usr/lib/libc++.1.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 1700.255.0
compatibility version 1.0.0
Load command 15
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libSystem.B.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 1345.100.2
compatibility version 1.0.0
Load command 16
          cmd LC_LOAD_DYLIB
      cmdsize 96
         name /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 2420.0.0
compatibility version 300.0.0
Load command 17
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libobjc.A.dylib (offset 24)
   time stamp 2 Thu Jan  1 01:00:02 1970
      current version 228.0.0
compatibility version 1.0.0
Load command 18
          cmd LC_RPATH
      cmdsize 136
         path @loader_path/../../../_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 168
         path @loader_path/python_libs_linking_test.runfiles/_main/_solib_darwin_arm64/_U@@_Umain~python~python_U3_U11_Uaarch64-apple-darwin_S_S_Clibpython___Ulib (offset 12)
Load command 20
      cmd LC_FUNCTION_STARTS
  cmdsize 16
  dataoff 32968
 datasize 8
Load command 21
      cmd LC_DATA_IN_CODE
  cmdsize 16
  dataoff 32976
 datasize 0
Load command 22
      cmd LC_CODE_SIGNATURE
  cmdsize 16
  dataoff 33168
 datasize 424

It looks like the first LC_LOAD_DYLIB entry (for libpython3.11) is the culprit, but the rpath that Bazel set with the linker is correct.

I'll try to add the test shortly, would appreciate feedback as your time permits.

@rickeylev
Copy link
Collaborator

It looks like the first LC_LOAD_DYLIB entry (for libpython3.11) is the culprit, but the rpath that Bazel set with the linker is correct.

Yes, I agree. My guess is something is missing from the linker command to generate an LC_LOAD_DYLIB entry like @rpath/...something..., like mentioned in that web page you linked to. There is also a cc_import() rule; IDK if that will help, but you could try it.

@nicholasjng
Copy link
Contributor Author

I added a Windows test with Py_LIMITED_API defined to make MSVC look for libs/python3.lib. It basically reproduces the issue I saw in my nanobind repo.

nicholasjng and others added 2 commits April 5, 2024 19:12
When targeting the Python Stable ABI on Windows (by setting the
Py_LIMITED_API macro to a Python minimum version hex), the unversioned
python3.lib needs to be linked instead of the versioned one
(e.g. python38.lib for Python 3.8).

Python's own config sets the library to link by default in a header
called pyconfig.h (https://github.com/python/cpython/blob/9cc9e277254023c0ca08e1a9e379fd89475ca9c2/PC/pyconfig.h#L270),
which prompts the linker to search for python3.lib if a stable ABI
extension is built using `@rules_python` toolchains.

Since this library is not exported on Windows in the `python_repository()`
rule, it's added now to allow Python SABI extensions to be built (and
linked) on Windows with `@rules_python`.

Since Python takes responsibility for linking the correct lib on Windows,
and never both at the same time, no other changes are made.
Reuses the existing cc_test targeting the stable ABI on Windows. Since
this prompts MSVC to search for libs/python3.lib, it serves as a check
that the library search path is intact on Windows.
This could potentially fix the header test.
@nicholasjng
Copy link
Contributor Author

@rickeylev I think the new cc_import in c567f70 did something, I'm just not sure what the issue is now. The buildkite artifacts don't seem to contain any useful information on the failures.

@rickeylev
Copy link
Collaborator

Wow, empty output? That's sort of impressive. I'd have expected like a message about segfaulting or something.

In any case, perhaps line 5 in python_libs_linking_test isn't actually triggering? That C++ code is buildable, but not runnable. That it's getting past the build step means the (build time) linking is happy enough. That it's failing at runtime is expected.

@nicholasjng
Copy link
Contributor Author

Ah, nice! Do you have a preference on how to proceed? I'm guessing it's not as easy as adding an early return 0, since the compiler might optimize the symbols away.

@rickeylev
Copy link
Collaborator

Yeah, that's why the condition it checks is argc -- that's a runtime condition, so it can't optimize them away.

Maybe print out argv or env vars and see if there's something to trigger in there? If not, adding something to args/env of the cc_test target would let us set a signal to check.

@nicholasjng
Copy link
Contributor Author

Just tried to set an envvar, but that evidently did not work. Am I on the right track?

(I still can't run this test locally because of the macOS dylib failure, btw, so I am a little slow here.)

The former was probably an oversight, the latter is a consequence of the
addition of free-threading, which wasn't there when this PR was first
opened.
@nicholasjng
Copy link
Contributor Author

Come on dude, why will it not work. What's so special about Windows that the test won't even output logs anymore?

tl,dr of my latest attempt: Introduce select() on _IS_FREETHREADED in the cc_import, adjust the formatting in the libpython Windows cases, and change the test by making a string out of argv[0] and printing its refcount. Proof for the SABI-ness of said refcount is here: https://docs.python.org/3/c-api/typeobj.html#c.PyObject.ob_refcnt

Maybe it's time to install that Windows virtualization thingy for Mac.

@rickeylev
Copy link
Collaborator

Looks like the missing part was current_py_cc_libs in the deps of the test? Glad you figured it out!

@rickeylev rickeylev added this pull request to the merge queue Nov 30, 2024
Merged via the queue into bazelbuild:main with commit d5a595c Nov 30, 2024
4 checks passed
@nicholasjng
Copy link
Contributor Author

Thanks! If you'd like, I could also make another PR to enable the other test for Windows now. That effectively tests locating the non-ABI3 libs, which has value in its own right in my opinion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants