diff --git a/docs/design_decisions/DR-001-infra-extension.md b/docs/design_decisions/DR-001-infra-extension.md new file mode 100644 index 00000000000..ca14ebc1b3d --- /dev/null +++ b/docs/design_decisions/DR-001-infra-extension.md @@ -0,0 +1,284 @@ + + + +# DR-001-B-Infra-Extension: S-CORE Build Execution Contract + + +* **Date:** 2026-01-20 + +```{dec_rec} S-CORE Build Execution Contract +:id: dec_rec__infra__execution_contract +:status: accepted +:context: Infrastructure +:decision: Adopt a layered execution contract + +``` + +--- + +## Purpose + +This document defines the execution contract for S-CORE builds across developer machines +and CI infrastructure. +Its goal is to ensure **long-term reproducibility (≥10 years)**, **traceability** and +**practical hermeticity**, despite changes in underlying infrastructure such as +GitHub-hosted runners. + +It builds on [DR-001], which was concerned about the same topics, but was focused on tools only. This adds details where the original description was too fuzzy. + +The contract is intentionally **layered**, because different parts of the system control +different capabilities and failure modes. + +--- + +## Core Requirements + +### R1 — Long-Term Reproducibility +S-CORE builds must be reproducible for **at least 10 years** after creation, given: +- the source revision +- archived execution context +- archived toolchains +- recorded build metadata + +This must remain possible even if: +- GitHub runner images change or are retired +- upstream toolchains are no longer available +- external services are unavailable + +For release-grade reproducibility, the required artifacts should be recoverable without +depending on live internet access, for example via archived execution contexts, mirrors, +or pre-populated artifact stores. + +--- + +### R2 — Traceability +For every build artifact, it must be possible to determine **exactly**: +- which sources were used +- which toolchains and tool versions were used +- which Bazel version and flags were used +- which execution context (container image) was used +- which host baseline constraints applied + +This information must be recorded in a **build manifest** and stored alongside the build +outputs. + +--- + +### R3 — Hermeticity (Practical) +Build actions must not depend on **undeclared inputs**. + +In practice: +- A tool is **output-affecting** if changing its version or behavior can change: + - build artifacts or generated source code + - metadata consumed by downstream actions + - CI gating results such as test, lint, or coverage outcomes +- Output-affecting tools must be: + - managed by Bazel, or + - explicitly injected as Bazel action inputs, or + - represented by a declared version or fingerprint that participates in cache keys +- Documentation improves traceability, but documentation alone is not sufficient for + hermeticity or long-term reproducibility. +- Reliance on host state must be minimized and documented where unavoidable. + +Perfect hermeticity is not required, but **undeclared variability is not acceptable**. + +--- + +## Three-Layer Execution Contract + +### Layer 1 — Host Platform Contract + +This layer defines the **non-virtualized constraints** imposed by the machine running +the build. + +#### Scope +- GitHub-hosted runners +- self-hosted runners (VM or bare metal) +- kernel-level features shared with containers + +#### Responsibilities +- Linux kernel version and configuration +- Host CPU architecture compatibility for distributed binaries (e.g. `x86_64`, `arm64`) +- Security mechanisms (AppArmor / LSM) +- Filesystems, networking, namespaces +- Support for: + - Bazel `linux-sandbox` + - QEMU / `binfmt` + - privileged operations where required + +#### Requirements +- Linux host OS with **Ubuntu 24.04 LTS** as the minimum supported value (MSV) +- Kernel must support: + - unprivileged user namespaces + - mount operations required by Bazel sandboxing +- Host security policies must not block Bazel `linux-sandbox` unless explicitly documented + +#### Known Constraints +- Ubuntu 24.04+ AppArmor may block Bazel sandbox mount operations +- Containers **cannot** mitigate host-kernel restrictions + +#### Policy +- The **Reference Integration Host Baseline** is **Ubuntu 24.04 LTS**. +- Temporary use of older hosts due to infrastructure constraints does **not** change the + contract and must be treated as an explicit exception. +- Deviations (e.g. privileged runners, sandbox disabled) must be explicit and isolated. + +--- + +### Layer 2 — Execution Context Contract (Default: Devcontainer) + +This layer defines the **user-space environment** in which builds are executed. +The default execution context is a versioned devcontainer image, but a compatible native +environment may also satisfy this layer. + +#### Purpose +- Provide consistent runtime ABI (`glibc`, `libstdc++`) +- Ensure tool binaries (e.g. rustc) can execute reliably +- Eliminate “works on my machine” discrepancies +- Enable local reproduction of CI builds +- Define which user-space properties must be equivalent between CI and local execution + +#### Compatibility Requirements +- A compatible execution context does not need to be byte-identical, but it must provide + the same build-relevant properties: + - runtime ABI relevant for executing distributed binaries + - declared tool and toolchain versions, or equivalent pinned artifacts + - environment settings intentionally relied upon by the build or test actions + +#### Definition +- A **versioned devcontainer image** is the default execution context for CI and local builds. +- The container image must be: + - built from a **defined Ubuntu LTS baseline** + - compatible with common developer tooling (e.g. by following the [specification](https://containers.dev/)) + - referenced by an **immutable image digest** + - managed in a way that supports archival for **long-term reproducibility** + +#### Baseline Preservation and Reproducibility +- Once a devcontainer image is used in CI, its image digest becomes part of the build provenance +- All such images must be designed to be archived and retrievable for **at least 10 years** +- Reproducing historical builds may rely on legacy container runtimes or CLI-only execution, + and does not require continued IDE support + +#### Responsibilities +- User-space runtime libraries +- Bootstrap tooling (git, bash, coreutils, python, etc.) +- Bazel entrypoint (preferably Bazelisk) +- If Bazelisk is used, the referenced Bazel binaries must be pinned and available from a + source that can be mirrored or archived +- Development UX tooling (optional) +- Environment settings intentionally exposed as part of the build contract + +#### Non-Goals +- The devcontainer must **not silently override** repository-declared Bazel versions. +- Build-relevant versions and dependency selections must remain reviewable in + repository-controlled metadata (e.g. `.bazelversion`, module lockfiles, pinned + toolchain definitions). +- The devcontainer must **not be the only place** where critical tool versions are defined. + +#### Policy +- The devcontainer defines the **default** environment, not the **only** supported one. +- A compatible native environment is acceptable if it satisfies the same build-relevant + contract. +- Builds should still be possible on compatible bare-metal hosts. + +--- + +### Layer 3 — Bazel Contract + +This layer defines **what Bazel controls and guarantees**. + +#### Bazel Versioning +- Each Bazel-based repository relevant to this contract must declare the Bazel version it + expects in repository-controlled metadata, preferably via `.bazelversion`. +- S-CORE should align on a **single Bazel version** across repositories that participate in + the shared build environment. +- CI should enforce version consistency where that alignment is required. + +#### Toolchains and Tools +- Toolchains (e.g. Rust/Ferrocene, C/C++) must be: + - versioned + - immutable + - built against a documented baseline +- Tools that can change build artifacts, generated metadata, or CI gating results must be + known to Bazel or reflected in declared action inputs. +- Module overrides and external artifact references must be pinned to immutable revisions + or checksums when they are part of the reproducibility story. +- Externally fetched build inputs must be pinned and suitable for mirroring or archival + when long-term reproducibility depends on them. +- Bazel management improves traceability and cache correctness, but does **not** by itself + guarantee long-term reproducibility. +- Long-term reproducibility also requires pinning, checksums, archival or mirroring, and + periodic offline verification for release-grade builds. + +#### Hermeticity Guarantees +- Bazel sandboxing provides reproducibility **given runnable tools**. +- Bazel does **not** virtualize: + - kernel + - `glibc` + - host security configuration + +These constraints must be handled in Layer 1 and Layer 2. + +--- + +## Minimum Supported Baselines + +### Host Platform Baseline +- Minimum supported value (MSV) for the host platform: **Ubuntu 24.04 LTS** +- This applies to developer hosts, CI runners, and other environments that execute Bazel + actions directly on the host +- Temporary execution on older hosts due to runner limitations or sandbox workarounds does + **not** redefine the supported baseline and must remain a documented exception + +### Toolchain Runtime Compatibility +- Distributed toolchains must be runnable on the host MSV or be executed inside a versioned + execution context that provides the required runtime ABI +- A temporary downgrade of a toolchain build baseline to accommodate current CI constraints + does **not** redefine the host platform contract +- Older host environments are **not supported** unless explicitly documented as exceptions + +### Rationale +We explicitly do **not** support all historical `glibc` or kernel versions. +Portability is achieved by choosing and documenting a baseline, not by unlimited +backward compatibility. The host platform contract must not be dictated by temporary +workarounds in individual modules or CI pipelines. Layer 2 can be virtualized as needed +for future reproducibility. + +--- + +## Build Provenance and Archiving + +Each CI build must produce and archive: +- build manifest (metadata) +- container image digest +- toolchain identifiers +- source revision(s) + +These artifacts form the basis for: +- long-term reproducibility +- forensic analysis +- compliance and auditing + +--- + +## Summary + +- **Layer 1** defines what the host *must* provide. +- **Layer 2** defines the default execution environment. +- **Layer 3** defines how Bazel achieves reproducibility and caching. +- Reproducibility, traceability, and hermeticity are enforced through +**explicit contracts**, not assumptions. + +This separation allows S-CORE to scale infrastructure, evolve toolchains, and still +reproduce builds years into the future.