Skip to content

Conversation

@edolstra
Copy link
Collaborator

@edolstra edolstra commented Jan 5, 2026

Motivation

This PR adds support for extending Nix using Wasm (specifically wasmtime) in two ways:

  • It adds a new builtin function, builtins.wasm, that allows Nix functions to be written in any language that compiles to Wasm. For example,

    builtins.wasm ./nix_wasm_plugin_fib.wasm "fib" 40

    The goal is to allow complex functionality (like YAML parsers) to be written in more convenient languages (like Rust) and executed faster, without having to resort to import-from-derivation.

    The Wasm function can lazily traverse its argument and can produce arbitrary Nix values (except functions). Each Wasm function call is executed as a fresh instance of the Wasm VM, ensuring determinism (i.e. you can't use the Wasm memory to pass state between function calls).

    See Initial bindings/examples for writing Nix functions in Rust nix-wasm-rust#1 for examples of Nix functions written in Rust.

  • It adds a new system type, wasm32-wasip1, that allows for system-independent derivations. For example:

    derivation {
      name = "wasi-test";
      system = "wasm32-wasip1";
      builder = ./nix-wasi-builder-hello.wasm;
      args = [ "World!" ];
    }

    Example here: https://github.com/DeterminateSystems/nix-wasi-rust

Context

Summary by CodeRabbit

  • New Features

    • WASM support in Nix expressions via a new "wasm" primitive to call WebAssembly functions.
    • WASI-based derivations enabling local builds that run WASM builders for wasm32-wasip1.
  • Packaging

    • Wasmtime runtime added as a packaged dependency.
  • Build / Integration

    • Core components now link with Wasmtime to enable WASM execution.
  • Experimental Features

    • New flags: "wasm-builtin" and "wasm-derivations".
  • Documentation

    • New guide for the builtins.wasm host interface and usage.

✏️ Tip: You can customize this high-level summary in your review settings.

It adds ~400 MB to the Nix closure which is a lot.
Also, require WASM plugins to export a `nix_wasm_init_v1` function.
We don't want to reuse instances between WASM calls, since that's
impure: the WASM function could use some mutable global state to
return a different value every time. So instead, pre-instantiate
instances, allowing them to be instantiated efficiently.

Unfortunately, while this is much faster than not using
pre-instantiation, it is slower than reusing an instance: about 1.3s
compared to 0.7s for the call.nix test in nix-wasm-plugin-test (which
performs 100K WASM calls).
@edolstra edolstra marked this pull request as draft January 5, 2026 13:03
@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

Walkthrough

Adds Wasmtime packaging and linking, implements a Wasm primop and WASI derivation builder for wasm32-wasip1, moves path realisation into EvalState::realisePath and renames sandbox path APIs to realPathInHost, and registers two wasm-related experimental flags plus documentation.

Changes

Cohort / File(s) Summary
Packaging & Dependencies
packaging/dependencies.nix, packaging/wasmtime.nix
Add wasmtime Nix package and expose it in project dependencies.
Expr: WASM Primop & Sources
src/libexpr/primops/wasm.cc, src/libexpr/primops/meson.build
New Wasmtime-backed wasm primop implementation; add wasm.cc to primops sources and register host functions.
Expr: Build & API
src/libexpr/meson.build, src/libexpr/package.nix, src/libexpr/primops.cc, src/libexpr/include/nix/expr/eval.hh
Link -lwasmtime into nixexpr, add wasmtime to build inputs, add EvalState::realisePath declaration and update call sites to member form.
Store: WASI Derivation Builder
src/libstore/unix/build/wasi-derivation-builder.cc, src/libstore/unix/build/derivation-builder.cc
Add WasiDerivationBuilder that runs builder WASM via Wasmtime/WASI; include new source and dispatch for wasm32-wasip1; rename sandbox path API and dispatch accordingly.
Store: Build Config & Platform Support
src/libstore/meson.build, src/libstore/derivation-options.cc, src/libstore/package.nix, src/libstore/unix/build/chroot-derivation-builder.cc
Link nixstore with -lwasmtime, add wasmtime to build inputs, allow wasm32-wasip1 for local builds, and rename realPathInSandboxrealPathInHost.
Experimental Features
src/libutil/experimental-features.cc, src/libutil/include/nix/util/experimental-features.hh
Add wasm-builtin and wasm-derivations experimental flags and enum entries.
Docs
doc/manual/source/SUMMARY.md.in, doc/manual/source/protocols/wasm.md
Add documentation entry and protocol doc for builtins.wasm host interface.

Sequence Diagram(s)

sequenceDiagram
    participant User as Nix User
    participant Eval as EvalState
    participant Cache as InstanceCache
    participant Engine as Wasmtime Engine/Linker
    participant Module as WASM Module
    participant Instance as WASM Instance
    participant Host as Host Functions

    User->>Eval: prim_wasm(wasmPath, entry, arg)
    Eval->>Cache: lookup(wasmPath)
    alt cached
        Cache-->>Eval: pre-instantiated data
    else not cached
        Eval->>Engine: prepare Engine & Linker
        Engine->>Host: register host functions
        Eval->>Module: load & compile module
        Engine->>Instance: instantiate module with Linker
        Instance-->>Eval: return pre-instantiated data
    end
    Eval->>Instance: create per-call Store/Context
    Eval->>Instance: write arg into Memory
    Eval->>Instance: call entry(ptr)
    Instance->>Host: invoke host functions as needed
    Instance-->>Eval: return result pointer
    Eval->>Instance: read result from Memory
    Eval-->>User: return Nix Value
Loading
sequenceDiagram
    participant Store as Nix Store
    participant Dispatcher as Builder Dispatcher
    participant WasiBuilder as WASI Derivation Builder
    participant Wasmtime as Wasmtime Engine/Linker
    participant WasmModule as WASM Module
    participant Instance as WASM Instance
    participant WasiEnv as WASI Environment

    Store->>Dispatcher: makeDerivationBuilder(drv: wasm32-wasip1)
    Dispatcher->>WasiBuilder: instantiate WASI builder
    WasiBuilder->>Wasmtime: new Engine & Linker
    WasiBuilder->>WasiEnv: configure WASI (stdio, argv, env, preopens)
    WasiBuilder->>WasmModule: load builder wasm path
    Wasmtime->>Instance: instantiate module
    WasiBuilder->>Instance: locate _start export
    WasiBuilder->>Instance: invoke _start()
    Instance->>WasiEnv: perform WASI syscalls
    Instance-->>WasiBuilder: execution complete
    WasiBuilder-->>Store: build finished
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

flake-regression-test

Suggested reviewers

  • cole-h

Poem

🐰 I hopped in with Wasmtime bright,

Tiny modules sprang to light,
WASI builders start to hum,
Nix and wasm now dance as one,
Hooray — more hops, more fun! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.48% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title "Wasm support" is vague and generic, failing to clearly convey the main changes which encompass adding WebAssembly support via builtins.wasm, WASI derivations, wasmtime integration, and multiple code refactorings. Provide a more descriptive title that reflects the main additions, such as "Add WebAssembly support with builtins.wasm and WASI derivations" or "Implement wasmtime-based Wasm runtime for Nix expressions".
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Fix all issues with AI Agents 🤖
In @src/libexpr/primops/wasm.cc:
- Around line 255-267: The code reads ValueId uint32_t values directly from WASM
memory in make_list, copy_list, make_attrset, and copy_attrset, assuming host
endianness; fix this by converting each 32-bit word from little-endian to host
order when constructing ValueId (e.g., apply le32toh or an equivalent portable
little-endian-to-host conversion) for every read from the memory subspan (the
vs[] accesses) so ValueId values are correct on big-endian hosts; update all
occurrences in the listed functions to perform this conversion before indexing
into values or assigning into lists/attrsets.
- Around line 334-350: In copy_attrname, replace the compile-time asserts with
runtime validations: check that attrIdx < attrs.size() and that len ==
name.size(); if either check fails, raise a runtime error (e.g., throw
std::runtime_error with a clear message mentioning valueId, attrIdx, len and
name.size()) or otherwise signal a trap/error path instead of proceeding; only
perform memcpy when both checks pass and keep the rest of the logic
(state.forceAttrs, obtaining name from state.symbols, and writing into memory())
unchanged.
- Around line 433-439: The static variable name `primop_fromTOML` is misleading
for the WASM primop registration; rename the variable to a descriptive
identifier like `primop_wasm` (i.e., change `static RegisterPrimOp
primop_fromTOML(...` to `static RegisterPrimOp primop_wasm(...`) so it matches
the registered `.name = "wasm"` and `.fun = prim_wasm`; update any local
references to this static variable if present in the file to use the new name.
- Around line 309-332: The copy_attrset function writes attributes into WASM
memory using enumerate(*value.attrs()) which is non-deterministic; to fix,
gather the attributes into a temporary vector (e.g., vector of pairs or pointers
to attr), sort that vector lexicographically by the attribute name string via
state.symbols[attr.name], then iterate the sorted vector to populate
buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().
- Around line 411-418: The static unordered_map instancesPre is not thread-safe
and leaks entries; replace it with a thread-safe concurrent map (use
boost::concurrent_flat_map keyed by SourcePath) and store weak references
instead of owning refs so entries can be reclaimed (e.g., map value -> weak
pointer to NixWasmInstancePre or equivalent weak_ref). Update the lookup/insert
logic around instancesPre and the use of make_ref<NixWasmInstancePre>(wasmPath)
so you first attempt to lock the weak entry, create a new instance only if
expired, and store a weak reference in the concurrent_flat_map; optionally add
an eviction or LRU policy if unbounded growth remains a concern.
- Around line 49-53: The subspan function currently uses assert to check buffer
size which is omitted in release builds; replace the assert in
subspan(std::span<uint8_t> s, size_t len) with a runtime check that throws a
descriptive exception (e.g., std::out_of_range or std::length_error) when
s.size() < len * sizeof(T), include the required header ( <stdexcept> ), and
keep the returned span construction (use reinterpret_cast<T*>(s.data()) for
clarity) so callers get a safe failure instead of potential buffer overruns.
🧹 Nitpick comments (12)
src/libstore/derivation-options.cc (1)

399-400: Consider requiring a system feature for WASM support.

The special-casing of "wasm32-wasip1" allows any system to build WASM derivations, which aligns with the PR's goal of system-independent derivations. However, this assumes Wasmtime is always available.

Consider whether wasm32-wasip1 builds should require a system feature (e.g., "wasm" or "wasmtime") to ensure the runtime is actually available on the build machine. This would be checked by the existing getRequiredSystemFeatures logic at lines 406-408.

🔎 Alternative approach using system features

If you want to gate WASM builds on explicit system capability:

  1. Have wasm32-wasip1 derivations declare requiredSystemFeatures = ["wasm"];
  2. Remove the special case here, letting the normal system feature check at lines 406-408 handle it:
-    if (drv.platform != settings.thisSystem.get() && drv.platform != "wasm32-wasip1"
-        && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin())
+    if (drv.platform != settings.thisSystem.get()
+        && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin())
         return false;

This would make WASM support more explicit and consistent with how other cross-compilation scenarios are handled.

src/libstore/meson.build (1)

383-383: Consider using Meson's dependency system instead of hardcoded linker flags.

Directly appending -lwasmtime to link_args bypasses Meson's dependency resolution, which can lead to portability and configuration issues:

  • No verification that the wasmtime library is actually available
  • Library search paths may not be configured correctly
  • Platform-specific library naming conventions (e.g., .dll vs .so) are not handled
  • Build will fail with cryptic linker errors rather than clear dependency errors
🔎 Recommended approach using Meson dependency system

Near the top of the file with other dependencies (around line 109-161), add:

+wasmtime = dependency('wasmtime', required : true)
+deps_private += wasmtime

Then remove the hardcoded linker flag:

-  link_args : linker_export_flags + [ '-lwasmtime' ],
+  link_args : linker_export_flags,

This provides better error messages, respects pkg-config paths, and handles platform differences automatically.

src/libexpr/meson.build (1)

242-242: Consider using Meson's dependency system instead of hardcoded linker flags.

This has the same portability concerns as the identical change in src/libstore/meson.build at line 383. Hardcoded -lwasmtime bypasses Meson's dependency resolution and can cause issues with library paths, platform-specific naming, and error reporting.

🔎 Recommended approach using Meson dependency system

Near the dependencies section (around lines 41-92), add:

+wasmtime = dependency('wasmtime', required : true)
+deps_private += wasmtime

Then remove the hardcoded linker flag:

-  link_args : linker_export_flags + [ '-lwasmtime' ],
+  link_args : linker_export_flags,

This approach provides better error messages, respects pkg-config configurations, and handles cross-platform differences automatically.

src/libstore/unix/build/derivation-builder.cc (1)

2033-2035: Early return for wasm32-wasip1 bypasses later sandbox / uid‑range checks

Routing params.drv.platform == "wasm32-wasip1" directly to WasiDerivationBuilder happens before:

  • The diverted store check (storeDir != realStoreDir) and
  • The !useSandbox && params.drvOptions.useUidRange(drv) guard.

That means for WASI builds:

  • uid-range requests will no longer be rejected when sandboxing is effectively disabled here.
  • Diverted stores on non‑Linux platforms will also be allowed for this path, unlike other builders.

If this is intentional (relying on the WASI runtime rather than OS sandboxing), consider making that explicit and either:

  • Reject uid-range in WasiDerivationBuilder’s path (or in this branch) so configuration errors are surfaced, or
  • Document that uid-range is ignored for wasm32-wasip1 and that OS sandboxing semantics differ from other systems.
src/libexpr/primops.cc (1)

168-188: Making realisePath non‑static for reuse is reasonable

Exposing SourcePath realisePath(...) (no longer static) with the optional SymlinkResolution parameter allows other primop modules (e.g. WASM/WASI ones) to share the same path + context resolution logic. The implementation is consistent with existing callers, and the default (Full) preserves previous behavior.

Just ensure there is a matching declaration in a header (e.g. primops.hh) so external uses stay type‑checked and the function remains the single canonical implementation.

src/libstore/unix/build/wasi-derivation-builder.cc (1)

5-18: Deduplicate unwrap / string2span helpers and tighten const‑correctness

Both unwrap and string2span appear to be cut‑and‑paste from src/libexpr/primops/wasm.cc, which means future changes to the Wasmtime API or error handling will have to be done in two places.

Consider:

  • Moving these helpers to a small shared header (e.g. a wasm-runtime-utils.hh) used by both primops and the WASI builder, and
  • Switching string2span to use a std::span<const uint8_t> (and avoiding the C‑style cast) if the Wasmtime C++ API accepts a const span, to keep const‑correctness and avoid accidental mutation.

Please double‑check the current wasmtime::Module::compile signature in your pinned Wasmtime version to confirm whether std::span<const uint8_t> is acceptable and adjust the helper accordingly.

src/libexpr/primops/wasm.cc (6)

6-6: Consider removing using namespace wasmtime; for clarity.

While acceptable in implementation files, explicitly qualifying wasmtime types improves code clarity and prevents potential name collisions with future additions to the nix namespace.


10-15: Refactor: Remove duplicate declaration and include proper header.

The realisePath function is already defined in src/libexpr/primops.cc (lines 167-187). Instead of forward-declaring it here, consider extracting it to a shared header file that both files can include, or making it part of the EvalState API.


94-103: Consider factory method for clearer error handling.

The constructor performs heavy operations (WASM compilation via unwrap()) that can throw. Consider using a static factory method that returns Result<ref<NixWasmInstancePre>, Error> for clearer error handling and easier testing.


182-196: Simplify error throwing pattern.

Line 196 uses an immediately-invoked lambda to throw an error, which is less clear than throwing directly. The ternary chain could also benefit from a switch statement for clarity.

🔎 Proposed refactor
 uint32_t get_type(ValueId valueId)
 {
     auto & value = *values.at(valueId);
     state.forceValue(value, noPos);
-    auto t = value.type();
-    return t == nInt        ? 1
-           : t == nFloat    ? 2
-           : t == nBool     ? 3
-           : t == nString   ? 4
-           : t == nPath     ? 5
-           : t == nNull     ? 6
-           : t == nAttrs    ? 7
-           : t == nList     ? 8
-           : t == nFunction ? 9
-                            : []() -> int { throw Error("unsupported type"); }();
+    switch (value.type()) {
+        case nInt: return 1;
+        case nFloat: return 2;
+        case nBool: return 3;
+        case nString: return 4;
+        case nPath: return 5;
+        case nNull: return 6;
+        case nAttrs: return 7;
+        case nList: return 8;
+        case nFunction: return 9;
+        default: throw Error("unsupported type");
+    }
 }

230-238: Consider adding documentation for two-phase copy pattern.

This function uses a common but potentially confusing pattern: it only copies data if the buffer is large enough, but always returns the actual size. A brief comment explaining that callers should check returnValue <= maxLen before using the buffer would improve clarity.


426-426: Improve readability by splitting complex expression.

Line 426 chains multiple operations in a single expression, making it difficult to read and debug. Consider breaking it into separate steps.

🔎 Proposed refactor
-        v = *instance.values.at(instance.runFunction(functionName, {(int32_t) instance.addValue(args[2])}).at(0).i32());
+        auto argId = instance.addValue(args[2]);
+        auto results = instance.runFunction(functionName, {(int32_t) argId});
+        auto resultId = results.at(0).i32();
+        v = *instance.values.at(resultId);
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51dacdd and 1eaba77.

📒 Files selected for processing (11)
  • packaging/dependencies.nix
  • packaging/wasmtime.nix
  • src/libexpr/meson.build
  • src/libexpr/package.nix
  • src/libexpr/primops.cc
  • src/libexpr/primops/meson.build
  • src/libexpr/primops/wasm.cc
  • src/libstore/derivation-options.cc
  • src/libstore/meson.build
  • src/libstore/unix/build/derivation-builder.cc
  • src/libstore/unix/build/wasi-derivation-builder.cc
🧰 Additional context used
🧬 Code graph analysis (3)
src/libstore/derivation-options.cc (1)
src/libstore/include/nix/store/derivation-options.hh (2)
  • drv (184-184)
  • drv (201-201)
src/libstore/unix/build/wasi-derivation-builder.cc (1)
src/libexpr/primops/wasm.cc (7)
  • unwrap (20-25)
  • unwrap (20-20)
  • v (131-136)
  • v (131-131)
  • v (138-143)
  • string2span (38-41)
  • string2span (38-38)
src/libexpr/primops/wasm.cc (1)
src/libexpr/primops.cc (4)
  • state (3218-3224)
  • state (3218-3218)
  • realisePath (168-188)
  • realisePath (168-172)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (6)
src/libexpr/primops/meson.build (1)

12-12: LGTM!

Adding wasm.cc to the primops sources is straightforward and correct.

src/libexpr/package.nix (1)

17-17: LGTM!

The wasmtime dependency is correctly added following the same pattern as toml11 and other build-time dependencies.

Also applies to: 68-68

packaging/dependencies.nix (1)

78-79: Wiring wasmtime into the dependency scope looks correct

wasmtime = pkgs.callPackage ./wasmtime.nix { }; is consistent with how other custom packages are exposed here; no further changes needed.

src/libstore/unix/build/derivation-builder.cc (2)

294-298: Naming comment on realPathInSandbox

The // FIXME: misnomer note correctly reflects that this helper currently just resolves store paths rather than honoring a sandbox path mapping. No functional impact, so this can remain as a low‑priority cleanup marker.


1996-2001: Including wasi-derivation-builder.cc alongside other platform builders is consistent

Adding #include "wasi-derivation-builder.cc" next to the other platform-specific builder includes matches the existing “single TU with included .cc files” pattern. No issues from an integration standpoint.

packaging/wasmtime.nix (1)

1-121: Wasmtime packaging looks solid; minor note on header-copy path robustness

Overall this derivation is well-structured:

  • Uses a pinned Rust toolchain (rust_1_89) and buildRustPackage.
  • Restricts doCheck to platforms with the required SIMD features and skips problematic aarch64 runs.
  • Splits outputs into out, dev, and lib, pruning .so / .a according to enableShared / enableStatic.
  • Handles shell completions correctly for both native and cross builds.
  • Adds an install-time --version check and an updateScript with an appropriate version regex.

One thing to keep in mind is the header copy:

cp -r target/${cargoShortTarget}/release/build/wasmtime-c-api-impl-*/out/include $dev/include

This relies on the exact crate/build directory naming and release profile path; if upstream layout or build profile changes, this will break somewhat opaquely. It may be worth:

  • Asserting that exactly one wasmtime-c-api-impl-* match exists, or
  • Failing with a clearer message if the glob expands to zero or multiple directories.

Otherwise the packaging is in good shape.

Please double-check against your pinned Wasmtime 40.0.0 sources that the wasmtime-c-api-impl-* build output layout (and the cargoShortTarget/release path) is stable; if upstream changes it, the cp -r .../out/include step will be the first thing to break.

Comment on lines 309 to 332
uint32_t copy_attrset(ValueId valueId, uint32_t ptr, uint32_t maxLen)
{
auto & value = *values.at(valueId);
state.forceAttrs(value, noPos, "while copying an attrset into WASM");

if (value.attrs()->size() <= maxLen) {
// FIXME: endianness.
struct Attr
{
ValueId value;
uint32_t nameLen;
};

auto buf = subspan<Attr>(memory().subspan(ptr), maxLen);

// FIXME: for determinism, we should return attributes in lexicographically sorted order.
for (const auto & [n, attr] : enumerate(*value.attrs())) {
buf[n].value = addValue(attr.value);
buf[n].nameLen = state.symbols[attr.name].size();
}
}

return value.attrs()->size();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix: Ensure deterministic attribute ordering.

The FIXME on line 324 correctly identifies that attributes should be returned in lexicographically sorted order for determinism. Nix's evaluation model requires consistent attribute ordering across evaluations.

🔎 Proposed fix
         if (value.attrs()->size() <= maxLen) {
             // FIXME: endianness.
             struct Attr
             {
                 ValueId value;
                 uint32_t nameLen;
             };
 
             auto buf = subspan<Attr>(memory().subspan(ptr), maxLen);
 
-            // FIXME: for determinism, we should return attributes in lexicographically sorted order.
-            for (const auto & [n, attr] : enumerate(*value.attrs())) {
+            // Collect and sort attributes by name for determinism
+            std::vector<std::pair<std::string_view, const Attr*>> sortedAttrs;
+            for (const auto & attr : *value.attrs()) {
+                sortedAttrs.push_back({state.symbols[attr.name], &attr});
+            }
+            std::sort(sortedAttrs.begin(), sortedAttrs.end(),
+                [](const auto & a, const auto & b) { return a.first < b.first; });
+            
+            for (const auto & [n, pair] : enumerate(sortedAttrs)) {
+                const auto & attr = *pair.second;
                 buf[n].value = addValue(attr.value);
                 buf[n].nameLen = state.symbols[attr.name].size();
             }
         }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 309-332, The copy_attrset function
writes attributes into WASM memory using enumerate(*value.attrs()) which is
non-deterministic; to fix, gather the attributes into a temporary vector (e.g.,
vector of pairs or pointers to attr), sort that vector lexicographically by the
attribute name string via state.symbols[attr.name], then iterate the sorted
vector to populate buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().

@github-actions github-actions bot temporarily deployed to pull request January 5, 2026 15:17 Inactive
@github-actions
Copy link

github-actions bot commented Jan 5, 2026

This function was named incorrectly. It denotes the path in the host
filesystem, not in the sandbox.
@github-actions github-actions bot temporarily deployed to pull request January 5, 2026 17:20 Inactive
Even though it's in a separate output, disabling it reduces the size
of libwasmtime.so by around 12 MB.
This reduces the size of the Nix closure by another 12 MB.
@github-actions github-actions bot temporarily deployed to pull request January 6, 2026 14:08 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 21, 2026 12:52 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 21, 2026 13:11 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/libexpr/include/nix/expr/eval.hh`:
- Around line 1037-1039: Update the doc comment for the function realisePath to
correct the misspelled helper name: change "reliaseContext()" to
"realiseContext()". Ensure the comment text refers to realiseContext()
accurately so readers can find the corresponding function.

In `@src/libexpr/primops.cc`:
- Around line 168-173: The coercion call inside EvalState::realisePath currently
uses noPos which discards source location on errors; change the coerceToPath
invocation in EvalState::realisePath to pass the incoming PosIdx pos (not noPos)
so any coercion failures preserve and report the original call-site position;
update the call site referencing coerceToPath(...) accordingly.
♻️ Duplicate comments (4)
src/libexpr/primops/wasm.cc (4)

160-176: Add a checked subspan for untrusted WASM pointers.

memory().subspan(ptr, len) is called with untrusted ptr/len, which can lead to OOB access. Introduce a bounds-checked helper and use it at all access sites (panic/warn/make_string/copy_string/…/read_file).

🛡️ Suggested fix (helper + example usage)
     auto memory()
     {
         return memory_.data(wasmCtx);
     }
+
+    std::span<uint8_t> checkedSubspan(uint32_t ptr, uint32_t len)
+    {
+        auto mem = memory();
+        if (ptr > mem.size() || len > mem.size() - ptr)
+            throw Error("Wasm memory access out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size());
+        return mem.subspan(ptr, len);
+    }
 
     std::monostate panic(uint32_t ptr, uint32_t len)
     {
-        throw Error("Wasm panic: %s", Uncolored(span2string(memory().subspan(ptr, len))));
+        throw Error("Wasm panic: %s", Uncolored(span2string(checkedSubspan(ptr, len))));
     }

281-289: Fix endianness for ValueId reads from WASM memory.

WASM is little‑endian; reading ValueId directly is wrong on big‑endian hosts. Convert each 32‑bit word to host order before indexing values.

🧭 Suggested fix (pattern)
-        for (const auto & [n, v] : enumerate(list))
-            v = values.at(vs[n]); // FIXME: endianness
+        for (const auto & [n, v] : enumerate(list)) {
+            auto id = boost::endian::little_to_native(vs[n]);
+            v = values.at(id);
+        }

504-510: Make the Wasm instance cache thread-safe.

instancesPre is a static unordered_map with no synchronization and will race under parallel evaluation. Add locking or replace it with a concurrent map.

🔒 Suggested fix (minimal locking)
-        static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre;
+        static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre;
+        static std::mutex instancesPreMutex;
 
-        auto instancePre = instancesPre.find(wasmPath);
+        std::lock_guard<std::mutex> lock(instancesPreMutex);
+        auto instancePre = instancesPre.find(wasmPath);

519-525: Validate returned ValueId before indexing instance.values.

res[0].i32() is untrusted. An out‑of‑range id throws std::out_of_range (not caught here), which can abort evaluation. Validate the id and raise Error instead.

✅ Suggested fix
-        auto vRes = instance.values.at(res[0].i32());
+        auto resultId = res[0].i32();
+        if (resultId < 0 || static_cast<size_t>(resultId) >= instance.values.size())
+            throw Error(
+                "Wasm function '%s' from '%s' returned invalid ValueId %d",
+                functionName,
+                wasmPath,
+                resultId);
+        auto vRes = instance.values.at(static_cast<size_t>(resultId));

This allows it to be used by primops defined outside of primops.cc.
@github-actions github-actions bot temporarily deployed to pull request January 21, 2026 13:47 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 519-525: The code must validate the ValueId returned by the WASM
call before indexing instance.values or dereferencing: after calling
instance.runFunction(...) and confirming res[0].kind() == ValKind::I32, extract
the id (from res[0].i32()), check that it is within the valid range (>= 1 and <
instance.values.size()) and not the reserved null (0), and if invalid throw an
Error that includes functionName and wasmPath; only then retrieve
instance.values.at(id) and pass it to state.forceValue(*vRes, pos). Ensure error
paths provide the same contextual message pattern used elsewhere so the original
WASM error context is preserved.
♻️ Duplicate comments (4)
src/libexpr/primops.cc (1)

168-173: Preserve call-site position in path coercion.

noPos discards error locations; pass pos so coercion failures point to the caller.

🐛 Suggested fix
-    auto path = coerceToPath(noPos, v, context, "while realising the context of a path");
+    auto path = coerceToPath(pos, v, context, "while realising the context of a path");
src/libexpr/primops/wasm.cc (3)

160-176: Add bounds-checked WASM memory access.

memory().subspan(ptr, len) is used with untrusted ptr/len and can go OOB. Add a checked helper and use it for all subspan reads/writes (panic/warn/make_* /copy_* /read_file/etc.).

🔒 Suggested helper + example usage
     auto memory()
     {
         return memory_.data(wasmCtx);
     }
+
+    std::span<uint8_t> checkedSubspan(uint32_t ptr, uint32_t len)
+    {
+        auto mem = memory();
+        uint64_t end = (uint64_t) ptr + (uint64_t) len;
+        if (end > mem.size())
+            throw Error("Wasm memory access out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size());
+        return mem.subspan(ptr, len);
+    }
@@
     std::monostate panic(uint32_t ptr, uint32_t len)
     {
-        throw Error("Wasm panic: %s", Uncolored(span2string(memory().subspan(ptr, len))));
+        throw Error("Wasm panic: %s", Uncolored(span2string(checkedSubspan(ptr, len))));
     }
@@
     std::monostate warn(uint32_t ptr, uint32_t len)
     {
         nix::warn(
             "'%s' function '%s': %s",
             pre->wasmPath,
             functionName.value_or("<unknown>"),
-            span2string(memory().subspan(ptr, len)));
+            span2string(checkedSubspan(ptr, len)));
         return {};
     }

335-355: Ensure deterministic attribute ordering when copying attrsets.

Enumerating *value.attrs() is not guaranteed to be lexicographically sorted, which can make WASM results nondeterministic. Sort by attribute name before writing to memory.

♻️ Suggested fix
-            // FIXME: for determinism, we should return attributes in lexicographically sorted order.
-            for (const auto & [n, attr] : enumerate(*value.attrs())) {
-                buf[n].value = addValue(attr.value);
-                buf[n].nameLen = state.symbols[attr.name].size();
-            }
+            std::vector<const Attr *> sorted;
+            sorted.reserve(value.attrs()->size());
+            for (const auto & attr : *value.attrs())
+                sorted.push_back(&attr);
+            std::sort(sorted.begin(), sorted.end(), [&](const Attr * a, const Attr * b) {
+                return state.symbols[a->name] < state.symbols[b->name];
+            });
+            for (const auto & [n, attr] : enumerate(sorted)) {
+                buf[n].value = addValue(attr->value);
+                buf[n].nameLen = state.symbols[attr->name].size();
+            }

503-510: Make instancesPre cache thread-safe.

Parallel evaluation can race on the static instancesPre map. Add locking or switch to a concurrent map.

🧵 Simple locking fix
-        static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre;
+        static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre;
+        static std::mutex instancesPreMutex;
+        std::lock_guard<std::mutex> lock(instancesPreMutex);

@github-actions github-actions bot temporarily deployed to pull request January 22, 2026 13:13 Inactive
Copy link
Member

@cole-h cole-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

;: ./result/bin/nix repl --extra-experimental-features wasm-builtin
Nix (Determinate Nix 3.15.1) 2.33.1
Type :? for help.
nix-repl> wasm = builtins.wasm "/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm"

nix-repl> fib = wasm "fib"

nix-repl> fib 5
warning: '/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm' function 'fib': greetings from Wasm!
8

nix-repl> fib 10
warning: '/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm' function 'fib': greetings from Wasm!
89

nix-repl> fib 12
warning: '/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm' function 'fib': greetings from Wasm!
233

nix-repl> fib 0
warning: '/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm' function 'fib': greetings from Wasm!
1

@github-actions github-actions bot temporarily deployed to pull request January 23, 2026 09:12 Inactive
@github-actions github-actions bot temporarily deployed to pull request January 23, 2026 10:17 Inactive
This reverts commit 3aa4799. macOS
doesn't have endian.h.
@edolstra edolstra enabled auto-merge January 23, 2026 10:33
@github-actions github-actions bot temporarily deployed to pull request January 23, 2026 10:36 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 143-148: getValue currently only checks upper bounds and then
dereferences values[id], but values[0] is intentionally reserved and set to
nullptr, so calling getValue(ValueId 0) can null-deref; update getValue(ValueId
id) to also reject id==0 or to check that values[id] != nullptr and throw a
clear Error (e.g., "invalid or reserved ValueId %d") before dereferencing;
reference the getValue function, the ValueId parameter, and the values container
when making this change.
- Around line 532-537: The code uses res[0].i32() (int32_t) directly as a
ValueId (uint32_t) for instance.getValue which can wrap negative returns to
large indices; modify the path around instance.runFunction / res / getValue to
validate the return: check that res.size()==1 and res[0].kind()==ValKind::I32 as
before, then ensure the int32_t value (from res[0].i32()) is non-negative (and
optionally within the valid ValueId range for your instance) and throw an Error
(including functionName and wasmPath) if it is negative/invalid; only after that
safely cast to ValueId/uint32_t and call instance.getValue with that validated
id.
♻️ Duplicate comments (1)
src/libexpr/primops/wasm.cc (1)

358-362: Non-deterministic attribute ordering persists.

The FIXME on line 358 notes that attributes should be returned in lexicographically sorted order for determinism, but this hasn't been implemented. This can cause non-reproducible behavior across different evaluations.

This was flagged in a previous review but remains unaddressed.

🧹 Nitpick comments (5)
src/libexpr/primops/wasm.cc (5)

13-19: Code duplication with wasi-derivation-builder.cc.

This unwrap() helper is identical to the one in src/libstore/unix/build/wasi-derivation-builder.cc. Consider extracting it to a shared header to avoid duplication.


111-111: Unused member variable ex.

The std::exception_ptr ex member is declared but never used in this file. Consider removing it or documenting its intended purpose if it's for future use.


173-176: Memory access uses std::span::subspan which throws std::out_of_range.

memory().subspan(ptr, len) will throw std::out_of_range on out-of-bounds access, not nix::Error. While this is caught by the generic catch (std::exception & e) in regFun, the error message will lack WASM-specific context.

Consider using a checked helper similar to the subspan template that throws Error with a descriptive message.

🔧 Suggested helper
std::span<uint8_t> checkedMemorySubspan(uint32_t ptr, uint32_t len)
{
    auto mem = memory();
    if (ptr > mem.size() || len > mem.size() - ptr)
        throw Error("Wasm memory access out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size());
    return mem.subspan(ptr, len);
}

388-398: Clarify the sentinel value convention for get_attr.

Returning 0 for a missing attribute is now valid since ValueId 0 is reserved (line 125-126). However, this convention should be documented in a comment here, as it's not immediately obvious to readers that 0 means "not found" rather than being a valid value ID.

📝 Suggested documentation
     ValueId get_attr(ValueId valueId, uint32_t ptr, uint32_t len)
     {
         auto attrName = span2string(memory().subspan(ptr, len));

         auto & value = getValue(valueId);
         state.forceAttrs(value, noPos, "while getting an attribute from Wasm");

         auto attr = value.attrs()->get(state.symbols.create(attrName));

-        return attr ? addValue(attr->value) : 0;
+        // Return 0 (reserved sentinel) if attribute not found
+        return attr ? addValue(attr->value) : 0;
     }

512-515: Cache grows unbounded without eviction.

The static instancesPre cache accumulates entries indefinitely. As noted in the FIXME, this could lead to memory bloat in long-running processes that evaluate many different WASM modules.

Consider tracking cache size and implementing LRU eviction, or moving this to EvalState where it can be cleaned up with the evaluator.

@edolstra edolstra added this pull request to the merge queue Jan 23, 2026
Merged via the queue into main with commit 370987f Jan 23, 2026
28 checks passed
@edolstra edolstra deleted the wasm branch January 23, 2026 11:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants