Failing to build TOT LinuxL #470

Open
mwoodpatrick opened this issue Dec 10, 2023 · 11 comments

Comments

@mwoodpatrick

I'm running:

mkroot/mkroot.sh CROSS_COMPILE=armv5l-linux-musleabihf- LINUX=$GIT_ROOT/linux

with TOT Linux and the build is failing (see below). I also don't see flex in the armv5l and armv7l toolchains from:

https://landley.net/toybox/downloads/binaries/toolchains/latest/

I note that the binaries in the above directory are over a year and a half old. Does anyone have pointers to more recent toolchains?

=== linux-arm
/mnt/wsl/projects/git/linux /mnt/wsl/projects/git/toybox
/mnt/wsl/projects/git/toybox
/mnt/wsl/projects/git/toybox/root/build/armv5l-tmp/linux /mnt/wsl/projects/git/toybox
HOSTCC scripts/basic/fixdep
HOSTCC scripts/kconfig/conf.o
HOSTCC scripts/kconfig/confdata.o
HOSTCC scripts/kconfig/expr.o
LEX scripts/kconfig/lexer.lex.c
/bin/sh: 1: flex: not found
make[2]: *** [scripts/Makefile.host:9: scripts/kconfig/lexer.lex.c] Error 127
make[1]: *** [/mnt/wsl/projects/git/toybox/root/build/armv5l-tmp/linux/Makefile:685: allnoconfig] Error

@landley
Owner

landley commented Dec 11, 2023

flex and bison aren't in the toolchains; the airlock setup still tries to link them from your host $PATH (https://github.com/landley/toybox/blob/master/scripts/install.sh#L105), meaning it expects them to be installed on the host. (I do plan to provide usable lex and yacc versions in toybox, but they're on the todo list after awk and make.)
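
Since the kernel's kconfig step needs flex (and bison) from the host, one way to catch this before a long build is to check the host $PATH first. The following is only a minimal sketch: the function names `missing_tools` and `check_kconfig_host_tools` are made up for illustration and are not part of toybox or mkroot.

```shell
#!/bin/sh
# Hypothetical helpers, not part of toybox/mkroot.

# Print each named tool that is NOT found on the host $PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Return 0 if flex and bison are both available on the host,
# otherwise list the missing ones on stderr and return 1.
check_kconfig_host_tools() {
  missing=$(missing_tools flex bison)
  if [ -n "$missing" ]; then
    echo "missing host tools:" $missing >&2
    return 1
  fi
}
```

Running `check_kconfig_host_tools` before `mkroot/mkroot.sh` would report a missing flex up front instead of failing partway through `allnoconfig`. How the tools get installed depends on the host distribution.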

I have newer toolchain builds here, I should upload them with the coming release. (Alas musl-cross-make stopped updating, so what I really should do is bite the bullet and write my own standalone toolchain build script. But the ones I have still work for me, and when I have spare toolchain brain I mostly spend it fighting with https://github.com/quic/toolchain_for_hexagon/ so... That said, I've locally built arm64 hosted versions and should upload both sets next time...)

Rob

@mwoodpatrick
Author

I am able to build toybox for my host arch

mkroot/mkroot.sh LINUX=$GIT_ROOT/linux

and qemu boots as expected using:

root/host/run-qemu.sh

but I'm unable to ping dns.google.com. Any suggestions on the best way to debug?

However, I'm still seeing the same issue building for armv5l:

mkroot/mkroot.sh CROSS_COMPILE=armv5l-linux-musleabihf- LINUX=$GIT_ROOT/linux

Do you have any suggestions on the best way to debug?

@mwoodpatrick
Author

mwoodpatrick commented Dec 12, 2023

For the ping issue:

# ping dns.google.com
Ping dns.google.com (8.8.4.4): 56(84) bytes.

--- 8.8.4.4 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss
# cat /etc/resolv.conf
nameserver 8.8.8.8

I don't see any issues in the kernel log:

# dmesg
[    0.000000] Linux version 6.7.0-rc5 (mwoodpatrick@mlwphpenvy360) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 Tue Dec 12 06:42:50 PST 2023
[    0.000000] Command line: panic=1 HOST=x86_64 console=ttyS0
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000000ffdffff] usable
[    0.000000] BIOS-e820: [mem 0x000000000ffe0000-0x000000000fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] APIC: Static calls initialized
[    0.000000] SMBIOS 2.8 present.
[    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 2611.179 MHz processor
[    0.003383] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.003511] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.003730] last_pfn = 0xffe0 max_arch_pfn = 0x400000000
[    0.004225] MTRR map: 4 entries (3 fixed + 1 variable; max 19), built from 8 variable MTRRs
[    0.004335] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
[    0.010354] found SMP MP-table at [mem 0x000f5ba0-0x000f5baf]
[    0.013190] RAMDISK: [mem 0x0ff01000-0x0ffdffff]
[    0.014349] Zone ranges:
[    0.014365] DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.014445] DMA32    [mem 0x0000000001000000-0x000000000ffdffff]
[    0.014452] Normal   empty
[    0.014469] Movable zone start for each node
[    0.014488] Early memory node ranges
[    0.014510] node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.014542] node   0: [mem 0x0000000000100000-0x000000000ffdffff]
[    0.014623] Initmem setup node 0 [mem 0x0000000000001000-0x000000000ffdffff]
[    0.015801] On node 0, zone DMA: 1 pages in unavailable ranges
[    0.016080] On node 0, zone DMA: 97 pages in unavailable ranges
[    0.017816] On node 0, zone DMA32: 32 pages in unavailable ranges
[    0.018277] Intel MultiProcessor Specification v1.4
[    0.018389] MPTABLE: OEM ID: BOCHSCPU
[    0.018401] MPTABLE: Product ID: 0.1
[    0.018412] MPTABLE: APIC at: 0xFEE00000
[    0.018832] Processor #0 (Bootup-CPU)
[    0.019124] IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23
[    0.019401] Processors: 1
[    0.019998] [mem 0x10000000-0xfffbffff] available for PCI devices
[    0.020143] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.020671] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[    0.020744] pcpu-alloc: [0] 0
[    0.021198] Kernel command line: panic=1 HOST=x86_64 console=ttyS0
[    0.021889] Unknown kernel command line parameters "HOST=x86_64", will be passed to user space.
[    0.022125] Dentry cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.022198] Inode-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.023032] Built 1 zonelists, mobility grouping on.  Total pages: 64224
[    0.023160] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[    0.026521] Memory: 242976K/261624K available (8192K kernel code, 825K rwdata, 1036K rodata, 680K init, 484K bss, 18392K reserved, 0K cma-reserved)
[    0.029312] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.034346] NR_IRQS: 4352, nr_irqs: 48, preallocated irqs: 16
[    0.043118] printk: legacy console [ttyS0] enabled
[    0.053396] APIC: Switch to symmetric I/O mode setup
[    0.057074] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.076654] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x25a37c6a2c3, max_idle_ns: 440795271571 ns
[    0.077139] Calibrating delay loop (skipped), value calculated using timer frequency.. 5222.35 BogoMIPS (lpj=10444716)
[    0.078507] process: using AMD E400 aware idle routine
[    0.078861] Last level iTLB entries: 4KB 512, 2MB 255, 4MB 127
[    0.078938] Last level dTLB entries: 4KB 512, 2MB 255, 4MB 127, 1GB 0
[    0.079145] CPU: AMD QEMU Virtual CPU version 2.5+ (family: 0xf, model: 0x6b, stepping: 0x1)
[    0.079441] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[    0.079604] Spectre V2 : Kernel not compiled with retpoline; no mitigation available!
[    0.079615] Spectre V2 : Vulnerable
[    0.079834] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.081669] x86/fpu: x87 FPU will use FXSAVE
[    0.089471] pid_max: default: 32768 minimum: 301
[    0.094401] Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear)
[    0.094669] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear)
[    0.110054] Performance Events: PMU not available due to virtualization, using software events only.
[    0.111296] signal: max sigframe size: 1040
[    0.329379] devtmpfs: initialized
[    0.332809] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.333470] futex hash table entries: 256 (order: 0, 6144 bytes, linear)
[    0.335002] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.340420] PCI: Using configuration type 1 for base access
[    0.345174] SCSI subsystem initialized
[    0.345652] libata version 3.00 loaded.
[    0.353303] PCI: Probing PCI hardware
[    0.353447] PCI: root bus 00: using default resources
[    0.353516] PCI: Probing PCI hardware (bus 00)
[    0.353990] PCI host bridge to bus 0000:00
[    0.354303] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[    0.354877] pci_bus 0000:00: root bus resource [mem 0x00000000-0xffffffffff]
[    0.355104] pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
[    0.357269] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
[    0.361168] pci 0000:00:01.0: [8086:7000] type 00 class 0x060100
[    0.361446] pci 0000:00:01.1: [8086:7010] type 00 class 0x010180
[    0.365089] pci 0000:00:01.1: reg 0x20: [io  0xc040-0xc04f]
[    0.366217] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
[    0.366348] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
[    0.366421] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
[    0.366522] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
[    0.366881] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000
[    0.367284] pci 0000:00:01.3: quirk: [io  0x0600-0x063f] claimed by PIIX4 ACPI
[    0.367440] pci 0000:00:01.3: quirk: [io  0x0700-0x070f] claimed by PIIX4 SMB
[    0.367762] pci 0000:00:02.0: [1234:1111] type 00 class 0x030000
[    0.368327] pci 0000:00:02.0: reg 0x10: [mem 0xfd000000-0xfdffffff pref]
[    0.369045] pci 0000:00:02.0: reg 0x18: [mem 0xfebf0000-0xfebf0fff]
[    0.373217] pci 0000:00:02.0: reg 0x30: [mem 0xfebe0000-0xfebeffff pref]
[    0.373481] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[    0.373911] pci 0000:00:03.0: [8086:100e] type 00 class 0x020000
[    0.374449] pci 0000:00:03.0: reg 0x10: [mem 0xfebc0000-0xfebdffff]
[    0.374971] pci 0000:00:03.0: reg 0x14: [io  0xc000-0xc03f]
[    0.377045] pci 0000:00:03.0: reg 0x30: [mem 0xfeb80000-0xfebbffff pref]
[    0.378315] pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to 00
[    0.381549] pci 0000:00:01.0: PIIX/ICH IRQ router [8086:7000]
[    0.381797] PCI: pci_cache_line_size set to 64 bytes
[    0.382123] e820: reserve RAM buffer [mem 0x0009fc00-0x0009ffff]
[    0.382287] e820: reserve RAM buffer [mem 0x0ffe0000-0x0fffffff]
[    0.385469] pci 0000:00:02.0: vgaarb: setting as boot VGA device
[    0.385609] pci 0000:00:02.0: vgaarb: bridge control possible
[    0.385683] pci 0000:00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.385781] vgaarb: loaded
[    0.386689] clocksource: Switched to clocksource tsc-early
[    0.388631] NET: Registered PF_INET protocol family
[    0.388631] IP idents hash table entries: 4096 (order: 3, 32768 bytes, linear)
[    0.390248] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
[    0.390522] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.390739] TCP established hash table entries: 2048 (order: 2, 16384 bytes, linear)
[    0.390966] TCP bind hash table entries: 2048 (order: 3, 32768 bytes, linear)
[    0.391153] TCP: Hash tables configured (established 2048 bind 2048)
[    0.392236] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.392424] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.393089] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.393785] pci_bus 0000:00: resource 4 [io  0x0000-0xffff]
[    0.393863] pci_bus 0000:00: resource 5 [mem 0x00000000-0xffffffffff]
[    0.394132] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[    0.394318] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[    0.394513] PCI: CLS 0 bytes, default 64
[    0.395619] platform rtc_cmos: registered platform RTC device (no PNP device found)
[    0.400410] Unpacking initramfs...
[    0.403609] workingset: timestamp_bits=62 max_order=16 bucket_order=0
[    0.404572] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.413556] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.416133] serial8250: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    0.433529] loop: module loaded
[    0.434124] ata_piix 0000:00:01.1: version 2.13
[    0.452049] scsi host0: ata_piix
[    0.453087] scsi host1: ata_piix
[    0.453388] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc040 irq 14 lpm-pol 0
[    0.453589] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc048 irq 15 lpm-pol 0
[    0.454222] e1000: Intel(R) PRO/1000 Network Driver
[    0.454285] e1000: Copyright (c) 1999-2006 Intel Corporation.
[    0.455405] e1000 0000:00:03.0: PCI->APIC IRQ transform: INT A -> IRQ 11
[    0.466518] Freeing initrd memory: 892K
[    0.620545] ata2: found unknown device (class 0)
[    0.632762] ata2.00: ATAPI: QEMU DVD-ROM, 2.5+, max UDMA/100
[    0.648937] scsi 1:0:0:0: CD-ROM            QEMU     QEMU DVD-ROM     2.5+ PQ: 0 ANSI: 5
[    0.786504] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
[    0.787089] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
[    0.787940] NET: Registered PF_INET6 protocol family
[    0.792125] Segment Routing with IPv6
[    0.792318] In-situ OAM (IOAM) with IPv6
[    0.792730] NET: Registered PF_PACKET protocol family
[    0.796172] sched_clock: Marking stable (783270697, 12360226)->(808951701, -13320778)
[    0.802997] printk: legacy console [netcon0] enabled
[    0.803161] netconsole: network logging started
[    0.831608] Freeing unused kernel image (initmem) memory: 680K
[    0.832048] Write protecting the kernel read-only data: 10240k
[    0.833427] Freeing unused kernel image (rodata/data gap) memory: 1012K
[    0.833639] Run /init as init process
[    0.833682] with arguments:
[    0.833698] /init
[    0.833713] with environment:
[    0.833722] HOME=/
[    0.833732] TERM=linux
[    0.833735] HOST=x86_64
[    1.432289] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x25a37c6a2c3, max_idle_ns: 440795271571 ns
[    1.432434] clocksource: Switched to clocksource tsc
[    2.970983] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX

@landley
Owner

landley commented Dec 13, 2023

Because glibc doesn't support static linking, which is why I provide musl-libc toolchains. You may have noticed the warnings going past:

net.c:(.text.xgetaddrinfo+0x54): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

This is a well-known glibc bug:

https://stackoverflow.com/questions/2725255/create-statically-linked-binary-that-uses-getaddrinfo

Because its ex-maintainer, Ulrich Drepper, has a personal dislike for static linking, so he sabotaged support for it.

You fundamentally can't dlopen() from a static binary for a bunch of reasons (https://www.openwall.com/lists/musl/2012/12/08/4 and https://inbox.vuxu.org/musl/20200423022531.502e9d26@zenbook/t/ and so on).

The most fundamental of which is that the "heap" that malloc() and free() track memory in is basically a global variable in libc. When you use a dynamic libc, the global variable pointing to the heap lives in the shared library's address space. When you static link, the global variable tracking the heap lives in your executable's address space. If you load a library from a static executable, you now have TWO heap pointers tracking two different heaps, and if you malloc() from one context and free() into the other, you've leaked from one heap and corrupted the other.

I have part of a mkroot/packages/dynamic to do dynamic instead of static linking, copying the shared libraries from your toolchain into resulting filesystem. Unfortunately, if you do that on debian's host toolchain, it copies 1.7 gigabytes of libraries.

https://landley.net/notes-2023.html#12-07-2023
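
The static-link warning quoted earlier in this comment is easy to miss in a long build. As a rough sketch, assuming the build output was captured to a file (for example with `2>&1 | tee build.log`, where `build.log` is just an illustrative name), something like the following could flag it. The function names are hypothetical and not part of toybox.

```shell
#!/bin/sh
# Hypothetical helpers, not part of toybox/mkroot.

# Return 0 if the given build log contains glibc's static-link warning.
has_static_glibc_warning() {
  grep -q "in statically linked applications requires at runtime the shared libraries from the glibc version" "$1"
}

# Print the functions named in those warnings (such as getaddrinfo),
# one per line, with duplicates removed.
static_glibc_warning_functions() {
  sed -n "s/.*warning: Using '\([^']*\)' in statically linked applications.*/\1/p" "$1" | sort -u
}
```

If `has_static_glibc_warning build.log` succeeds, the binary was statically linked against glibc and the listed functions may not work at runtime, which is the situation described above.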

@landley
Owner

landley commented Dec 13, 2023

The ping issue isn't in the kernel log, it's in the libc code. If you statically link glibc, glibc breaks. This is because glibc is terrible. And more stuff breaks over time: looking up usernames from UIDs switched to a shared library plugin, so that silently fails now and ls -l shows numbers instead of names when statically linked against glibc.

The problem is glibc. This works fine with musl. Or uclibc. Or bionic. Or sufficiently old versions of glibc....

@landley
Owner

landley commented Dec 13, 2023

There's a little bit of explanation about this in https://landley.net/toybox/faq.html#cross1 but I mean to add a README to the mkroot subdirectory, and explaining this issue succinctly without coming out and saying "the glibc devs are INSANE" is one of the blockers...

@enh-google
Collaborator

You fundamentally can't dlopen() from a static binary for a bunch of reasons...

(note that those are just implementation details --- there's no theoretical reason you couldn't make dlopen() work from static binaries [and it seems like glibc folks may be starting to rethink this themselves] ... you just probably don't want to have to go to all the trouble. musl especially doesn't want to because musl really cares about having tiny static binaries, and -- because you don't know what's going to be needed later -- if static binaries supported dlopen(), they'd need to pull everything in. again, that's mostly an implementation detail too --- a hypothetical sufficiently sophisticated system could only do this to you if you actually have a call to dlopen() and gc everything otherwise. but it comes back to the "ridiculous amounts of work for unconvincing benefit [and massively increased testing costs]". which is one reason why bionic static binaries only have a dlopen() that always returns NULL, for example.)

@richfelker

Indeed it's not fundamental that you can't dlopen from static binaries. The fundamental constraint is that it's broken to have two versions of libc (or potentially of any library) present in the same program, and that's what you'd get with naive dlopen from a static program: you'd get both whatever library code got linked into the main program as a static dependency, and the copy of the library loaded as a dynamic dependency, both present. This is a problem for at least 2 reasons:

  1. Both copies may be assuming they have exclusive access to the same singleton resource.
  2. Non-public interface boundaries within the library become public ABI since some functions operating on shared state may come from the static linked copy of the library and others from the dynamic copy.

I've worked out a concept by which we could do static dlopen in musl without facing this issue, but it would make unavailable any libc interfaces that you didn't static link - so for practical purposes, you'd end up using -Wl,--whole-archive and getting very large static binaries. As such, it's very questionable whether the amount of effort needed to do it makes sense.

@enh-google
Collaborator

The fundamental constraint is that it's broken to have two versions of libc

my assumption there, when i've [only very idly] thought about this for bionic is that we'd just ignore any libc.so DT_NEEDED. i've long been tempted by the idea of deprecating and removing libm.so that way, which is a lot less work and so might actually happen one day...

but, yes, the trouble people already get themselves into with mixtures of static and dynamic for their own libraries makes me very much not interested in going down this path, and continuing Android's "static binary support is for init and the dynamic linker" philosophy.

@landley
Owner

landley commented Jan 27, 2024

Reviewing this thread, my todo item here is being sure to upload new toolchain builds for the next toybox release, which is somewhat gated by Rich not having updated musl-cross-make in 2 years so I may need to forward port patches to newer glibc/binutils versions myself...

@landley
Owner

landley commented Mar 22, 2024

FYI, Rich has updated musl-cross-make. I've been fighting with my arm64 build environment trying to get arm hosted toolchains as well as x86-64 ones, but have yet to boot a vanilla kernel on an Orange Pi 3B, and I recently moved from the place where I could easily plug it into the house router.

Working on it...
