Add two small tooldir prefix patches to aid cross-compiling #8

Open
wants to merge 2 commits into
base: master

Member

@chewi chewi commented Jan 4, 2025

These are small but a tad controversial.

There has long been a desire for crossdev to initialise environments with a given profile rather than the embedded profile that few people actually want. I picked up a PR for this, tried it with a merged-usr profile, and found that things break very badly once you point the sysroot somewhere else.

For some bizarre reason, GCC adds /usr/${CHOST}/lib to the library search path before the sysroot's own /lib and /usr/lib. Gentoo has previously got away with this as there were generally no conflicting libraries in the toolchain's /lib, but with merged-usr, this is now the same directory as the toolchain's /usr/lib.
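As a rough illustration of why the ordering matters, here is a minimal C++ model of a first-match library search. It is only a sketch: `FakeFs`, `findLibrary`, the `/sysroot` path and the `aarch64-unknown-linux-gnu` tuple are all hypothetical names chosen for the example, not anything taken from GCC or binutils.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: the "filesystem" is just the set of paths that exist.
using FakeFs = std::set<std::string>;

// Return the first "<dir>/<name>" that exists, walking the directories in
// search order. Returns an empty string when nothing matches.
std::string findLibrary(const FakeFs &fs,
                        const std::vector<std::string> &dirs,
                        const std::string &name) {
  assert(!name.empty() && "library file name must not be empty");
  for (const auto &dir : dirs) {
    const std::string candidate = dir + "/" + name;
    if (fs.count(candidate) != 0)
      return candidate;  // first hit wins; later directories are never checked
  }
  return std::string();
}
```

With a search order of `/usr/aarch64-unknown-linux-gnu/lib`, `/sysroot/lib`, `/sysroot/usr/lib` and a `libc.so` present in both the first and last directory, this model returns the toolchain's copy and never looks at the sysroot's.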

Aside from ABI compatibility issues, the toolchain's /usr/lib/libc.so ld script is now found before the sysroot's. This includes a reference to /lib/libc.so.6. The sysroot is only prepended to this when the ld script itself is within that sysroot, so /lib/libc.so.6 is taken literally, causing the build machine's libc to be used instead, which immediately breaks any linking.
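The sysroot rule described above can be sketched as a small C++ model. This is an assumption-laden simplification, not the linker's actual code: `isInside` and `resolveScriptReference` are hypothetical helpers, paths are compared as plain strings, and the sysroot is assumed to have no trailing slash.

```cpp
#include <cassert>
#include <string>

// True if `path` lies strictly inside directory `root`.
// Assumes `root` is given without a trailing slash (e.g. "/sysroot").
bool isInside(const std::string &path, const std::string &root) {
  assert(!root.empty() && "root directory must not be empty");
  return path.size() > root.size() &&
         path.compare(0, root.size(), root) == 0 &&
         path[root.size()] == '/';
}

// Model of how an absolute path named inside an ld script is resolved:
// the sysroot is prepended only when the script itself was found within it.
std::string resolveScriptReference(const std::string &sysroot,
                                   const std::string &scriptPath,
                                   const std::string &reference) {
  const bool absolute = !reference.empty() && reference[0] == '/';
  if (absolute && isInside(scriptPath, sysroot))
    return sysroot + reference;
  return reference;  // taken literally, i.e. relative to the build machine
}
```

In this model, a `libc.so` script found at `/sysroot/usr/lib/libc.so` resolves `/lib/libc.so.6` to `/sysroot/lib/libc.so.6`, while the same script found under the toolchain's directory resolves it to the build machine's `/lib/libc.so.6`.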

I couldn't see any good reason to use the toolchain's /lib when the sysroot has been changed, so the "lib" patch simply omits it in that case.

The "bin" patch involves the code immediately before. The issue here is similar in that /usr/${CHOST}/bin is searched for toolchain programs, but at least on Gentoo, only non-native binaries live here. It would be better to fail to find a given tool than inadvertently try to execute a non-native build of it via QEMU. While this change is not strictly necessary, I felt it made sense alongside the first change.

Note that I have deliberately not adjusted the whitespace of the existing code to keep the patches minimal. This way, they only add a single line each.

I didn't want to submit the "lib" patch without considering whether Clang would need similar treatment. It sadly does suffer from the same issue. You can kind of work around it by pointing --gcc-install-dir inside the sysroot, but this gets messy when bootstrapping a new system. When I found the code responsible though, it turned out to reveal much more about GCC than it did about Clang.

    // GCC cross compiling toolchains will install target libraries which ship
    // as part of the toolchain under <prefix>/<triple>/<libdir> rather than as
    // any part of the GCC installation in
    // <prefix>/<libdir>/gcc/<triple>/<version>. This decision is somewhat
    // debatable, but is the reality today. We need to search this tree even
    // when we have a sysroot somewhere else. It is the responsibility of
    // whomever is doing the cross build targeting a sysroot using a GCC
    // installation that is *not* within the system root to ensure two things:
    //
    //  1) Any DSOs that are linked in from this tree or from the install path
    //     above must be present on the system root and found via an
    //     appropriate rpath.
    //  2) There must not be libraries installed into
    //     <prefix>/<triple>/<libdir> unless they should be preferred over
    //     those within the system root.
    //
    // Note that this matches the GCC behavior. See the below comment for where
    // Clang diverges from GCC's behavior.
    addPathIfExists(D,
                    LibPath + "/../" + GCCTriple.str() + "/lib/../" + OSLibDir +
                        SelectedMultilibs.back().osSuffix(),
                    Paths);

With merged-usr, we are totally violating that second rule. The original commit from 2013 (llvm/llvm-project@7f8042c) says a little more about how this behaviour is "frustrating" but apparently necessary for Android and MIPS platforms. I take that to mean their SDKs.

I can confirm that gating the above line with if (SysRoot.empty()) fixes the issue, although another line just below is needed to avoid the corresponding /lib (as opposed to /lib64) entry.
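To make the effect of that gate concrete, here is a self-contained C++ sketch. It is not the Clang driver code or the actual patch: `gatedLibraryPaths` is a hypothetical function, the existence checks done by `addPathIfExists` are left out, and the multilib and `/lib64` handling is reduced to two fixed sysroot directories.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the gated search path construction. The toolchain's
// <prefix>/<triple>/lib directory is only added when no sysroot is set.
std::vector<std::string> gatedLibraryPaths(const std::string &sysRoot,
                                           const std::string &tripleLibDir) {
  assert(!tripleLibDir.empty() && "toolchain lib dir must not be empty");
  std::vector<std::string> paths;
  if (sysRoot.empty())
    paths.push_back(tripleLibDir);  // old behaviour, kept for the default case
  // The sysroot's own directories always follow (simplified to two entries).
  paths.push_back(sysRoot + "/lib");
  paths.push_back(sysRoot + "/usr/lib");
  return paths;
}
```

Calling this with a sysroot of `/sysroot` yields only `/sysroot/lib` and `/sysroot/usr/lib`, whereas an empty sysroot still puts the toolchain's directory first.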

Since upstream clearly thought this behaviour is terrible, I don't feel so bad about nullifying it. I don't think we need to worry about what is needed by these SDKs (if they even do still need this) because they probably have their own toolchains. I can't really see any other way around it. merged-usr is now the default, so cross environments really should support it.

As for the "bin" patch, Clang also mirrors GCC's behaviour here, but I think it is less likely to fall back to the problematic directory, as everything it needs should be in /usr/lib/llvm/${V}/bin, which is where it looks first.

@chewi
Member Author

chewi commented Jan 5, 2025

I hear we have a "no Clang patches" policy. I'd prefer to abide by that if we can somehow. I wondered whether we could convince GCC to change this given that major distros use merged-usr now. Clang would then follow suit. But wouldn't other distros have hit this already? They don't do as much cross-compiling as we do, but I had a look.

Fedora does create /usr/aarch64-linux-gnu, but the only directories in there are bin with the binutils programs and an empty sys-root directory. Despite being empty, this is where gcc's default sysroot points to. The sysroot-aarch64-fc41-glibc package installs glibc under /usr/aarch64-redhat-linux/sys-root/fc41 instead, but everything is under the usr subdirectory. There is no lib here. I don't know how they use this in practice, but I guess they always explicitly set the sysroot, leaving the default one empty. What I find most surprising is that even hello.c doesn't build without setting the sysroot. I don't think we want to copy this scheme.

Arch's aarch64-linux-gnu-glibc package installs everything under /usr/aarch64-linux-gnu but with a slightly odd layout. There's practically nothing under the usr subdirectory. They only put GCC libraries (e.g. libgcc_s) under lib64 with everything else like glibc under lib, despite it being 64-bit. This is partly explained by their host /usr/lib64 being symlinked to lib. Wondering what this looks like for a 32-bit toolchain? Well, they don't have any Linux-based ones. To sum up, they avoid the issue by misusing multilib to manipulate the search order. This seems too different to make a useful comparison.

I already know that Debian and NixOS are also too different to make a useful comparison.

@chewi
Member Author

chewi commented Jan 5, 2025

An amusing fact. Want to know when this behaviour was introduced? January 2nd 1993 by rms.
gcc-mirror/gcc@f18fd95
