-
Notifications
You must be signed in to change notification settings - Fork 6
Wasm support #309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wasm support #309
Conversation
It adds ~400 MB to the Nix closure which is a lot.
Also, require WASM plugins to export a `nix_wasm_init_v1` function.
We don't want to reuse instances between WASM calls, since that's impure: the WASM function could use some mutable global state to return a different value every time. So instead, pre-instantiate instances, allowing them to be instantiated efficiently. Unfortunately, while this is much faster than not using pre-instantiation, it is slower than reusing an instance: about 1.3s compared to 0.7s for the call.nix test in nix-wasm-plugin-test (which performs 100K WASM calls).
📝 WalkthroughWalkthroughAdds Wasmtime packaging and linking, implements a Wasm primop and WASI derivation builder for wasm32-wasip1, moves path realisation into EvalState::realisePath and renames sandbox path APIs to realPathInHost, and registers two wasm-related experimental flags plus documentation. Changes
Sequence Diagram(s)sequenceDiagram
participant User as Nix User
participant Eval as EvalState
participant Cache as InstanceCache
participant Engine as Wasmtime Engine/Linker
participant Module as WASM Module
participant Instance as WASM Instance
participant Host as Host Functions
User->>Eval: prim_wasm(wasmPath, entry, arg)
Eval->>Cache: lookup(wasmPath)
alt cached
Cache-->>Eval: pre-instantiated data
else not cached
Eval->>Engine: prepare Engine & Linker
Engine->>Host: register host functions
Eval->>Module: load & compile module
Engine->>Instance: instantiate module with Linker
Instance-->>Eval: return pre-instantiated data
end
Eval->>Instance: create per-call Store/Context
Eval->>Instance: write arg into Memory
Eval->>Instance: call entry(ptr)
Instance->>Host: invoke host functions as needed
Instance-->>Eval: return result pointer
Eval->>Instance: read result from Memory
Eval-->>User: return Nix Value
sequenceDiagram
participant Store as Nix Store
participant Dispatcher as Builder Dispatcher
participant WasiBuilder as WASI Derivation Builder
participant Wasmtime as Wasmtime Engine/Linker
participant WasmModule as WASM Module
participant Instance as WASM Instance
participant WasiEnv as WASI Environment
Store->>Dispatcher: makeDerivationBuilder(drv: wasm32-wasip1)
Dispatcher->>WasiBuilder: instantiate WASI builder
WasiBuilder->>Wasmtime: new Engine & Linker
WasiBuilder->>WasiEnv: configure WASI (stdio, argv, env, preopens)
WasiBuilder->>WasmModule: load builder wasm path
Wasmtime->>Instance: instantiate module
WasiBuilder->>Instance: locate _start export
WasiBuilder->>Instance: invoke _start()
Instance->>WasiEnv: perform WASI syscalls
Instance-->>WasiBuilder: execution complete
WasiBuilder-->>Store: build finished
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Fix all issues with AI Agents 🤖
In @src/libexpr/primops/wasm.cc:
- Around line 255-267: The code reads ValueId uint32_t values directly from WASM
memory in make_list, copy_list, make_attrset, and copy_attrset, assuming host
endianness; fix this by converting each 32-bit word from little-endian to host
order when constructing ValueId (e.g., apply le32toh or an equivalent portable
little-endian-to-host conversion) for every read from the memory subspan (the
vs[] accesses) so ValueId values are correct on big-endian hosts; update all
occurrences in the listed functions to perform this conversion before indexing
into values or assigning into lists/attrsets.
- Around line 334-350: In copy_attrname, replace the compile-time asserts with
runtime validations: check that attrIdx < attrs.size() and that len ==
name.size(); if either check fails, raise a runtime error (e.g., throw
std::runtime_error with a clear message mentioning valueId, attrIdx, len and
name.size()) or otherwise signal a trap/error path instead of proceeding; only
perform memcpy when both checks pass and keep the rest of the logic
(state.forceAttrs, obtaining name from state.symbols, and writing into memory())
unchanged.
- Around line 433-439: The static variable name `primop_fromTOML` is misleading
for the WASM primop registration; rename the variable to a descriptive
identifier like `primop_wasm` (i.e., change `static RegisterPrimOp
primop_fromTOML(...` to `static RegisterPrimOp primop_wasm(...`) so it matches
the registered `.name = "wasm"` and `.fun = prim_wasm`; update any local
references to this static variable if present in the file to use the new name.
- Around line 309-332: The copy_attrset function writes attributes into WASM
memory using enumerate(*value.attrs()) which is non-deterministic; to fix,
gather the attributes into a temporary vector (e.g., vector of pairs or pointers
to attr), sort that vector lexicographically by the attribute name string via
state.symbols[attr.name], then iterate the sorted vector to populate
buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().
- Around line 411-418: The static unordered_map instancesPre is not thread-safe
and leaks entries; replace it with a thread-safe concurrent map (use
boost::concurrent_flat_map keyed by SourcePath) and store weak references
instead of owning refs so entries can be reclaimed (e.g., map value -> weak
pointer to NixWasmInstancePre or equivalent weak_ref). Update the lookup/insert
logic around instancesPre and the use of make_ref<NixWasmInstancePre>(wasmPath)
so you first attempt to lock the weak entry, create a new instance only if
expired, and store a weak reference in the concurrent_flat_map; optionally add
an eviction or LRU policy if unbounded growth remains a concern.
- Around line 49-53: The subspan function currently uses assert to check buffer
size which is omitted in release builds; replace the assert in
subspan(std::span<uint8_t> s, size_t len) with a runtime check that throws a
descriptive exception (e.g., std::out_of_range or std::length_error) when
s.size() < len * sizeof(T), include the required header ( <stdexcept> ), and
keep the returned span construction (use reinterpret_cast<T*>(s.data()) for
clarity) so callers get a safe failure instead of potential buffer overruns.
🧹 Nitpick comments (12)
src/libstore/derivation-options.cc (1)
399-400: Consider requiring a system feature for WASM support.The special-casing of "wasm32-wasip1" allows any system to build WASM derivations, which aligns with the PR's goal of system-independent derivations. However, this assumes Wasmtime is always available.
Consider whether wasm32-wasip1 builds should require a system feature (e.g., "wasm" or "wasmtime") to ensure the runtime is actually available on the build machine. This would be checked by the existing
getRequiredSystemFeatureslogic at lines 406-408.🔎 Alternative approach using system features
If you want to gate WASM builds on explicit system capability:
- Have wasm32-wasip1 derivations declare
requiredSystemFeatures = ["wasm"];- Remove the special case here, letting the normal system feature check at lines 406-408 handle it:
- if (drv.platform != settings.thisSystem.get() && drv.platform != "wasm32-wasip1" - && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin()) + if (drv.platform != settings.thisSystem.get() + && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin()) return false;This would make WASM support more explicit and consistent with how other cross-compilation scenarios are handled.
src/libstore/meson.build (1)
383-383: Consider using Meson's dependency system instead of hardcoded linker flags.Directly appending
-lwasmtimetolink_argsbypasses Meson's dependency resolution, which can lead to portability and configuration issues:
- No verification that the wasmtime library is actually available
- Library search paths may not be configured correctly
- Platform-specific library naming conventions (e.g.,
.dllvs.so) are not handled- Build will fail with cryptic linker errors rather than clear dependency errors
🔎 Recommended approach using Meson dependency system
Near the top of the file with other dependencies (around line 109-161), add:
+wasmtime = dependency('wasmtime', required : true) +deps_private += wasmtimeThen remove the hardcoded linker flag:
- link_args : linker_export_flags + [ '-lwasmtime' ], + link_args : linker_export_flags,This provides better error messages, respects pkg-config paths, and handles platform differences automatically.
src/libexpr/meson.build (1)
242-242: Consider using Meson's dependency system instead of hardcoded linker flags.This has the same portability concerns as the identical change in
src/libstore/meson.buildat line 383. Hardcoded-lwasmtimebypasses Meson's dependency resolution and can cause issues with library paths, platform-specific naming, and error reporting.🔎 Recommended approach using Meson dependency system
Near the dependencies section (around lines 41-92), add:
+wasmtime = dependency('wasmtime', required : true) +deps_private += wasmtimeThen remove the hardcoded linker flag:
- link_args : linker_export_flags + [ '-lwasmtime' ], + link_args : linker_export_flags,This approach provides better error messages, respects pkg-config configurations, and handles cross-platform differences automatically.
src/libstore/unix/build/derivation-builder.cc (1)
2033-2035: Early return forwasm32-wasip1bypasses later sandbox / uid‑range checksRouting
params.drv.platform == "wasm32-wasip1"directly toWasiDerivationBuilderhappens before:
- The diverted store check (
storeDir != realStoreDir) and- The
!useSandbox && params.drvOptions.useUidRange(drv)guard.That means for WASI builds:
uid-rangerequests will no longer be rejected when sandboxing is effectively disabled here.- Diverted stores on non‑Linux platforms will also be allowed for this path, unlike other builders.
If this is intentional (relying on the WASI runtime rather than OS sandboxing), consider making that explicit and either:
- Reject
uid-rangeinWasiDerivationBuilder’s path (or in this branch) so configuration errors are surfaced, or- Document that
uid-rangeis ignored forwasm32-wasip1and that OS sandboxing semantics differ from other systems.src/libexpr/primops.cc (1)
168-188: MakingrealisePathnon‑static for reuse is reasonableExposing
SourcePath realisePath(...)(no longerstatic) with the optionalSymlinkResolutionparameter allows other primop modules (e.g. WASM/WASI ones) to share the same path + context resolution logic. The implementation is consistent with existing callers, and the default (Full) preserves previous behavior.Just ensure there is a matching declaration in a header (e.g.
primops.hh) so external uses stay type‑checked and the function remains the single canonical implementation.src/libstore/unix/build/wasi-derivation-builder.cc (1)
5-18: Deduplicateunwrap/string2spanhelpers and tighten const‑correctnessBoth
unwrapandstring2spanappear to be cut‑and‑paste fromsrc/libexpr/primops/wasm.cc, which means future changes to the Wasmtime API or error handling will have to be done in two places.Consider:
- Moving these helpers to a small shared header (e.g. a
wasm-runtime-utils.hh) used by both primops and the WASI builder, and- Switching
string2spanto use astd::span<const uint8_t>(and avoiding the C‑style cast) if the Wasmtime C++ API accepts a const span, to keep const‑correctness and avoid accidental mutation.Please double‑check the current
wasmtime::Module::compilesignature in your pinned Wasmtime version to confirm whetherstd::span<const uint8_t>is acceptable and adjust the helper accordingly.src/libexpr/primops/wasm.cc (6)
6-6: Consider removingusing namespace wasmtime;for clarity.While acceptable in implementation files, explicitly qualifying wasmtime types improves code clarity and prevents potential name collisions with future additions to the nix namespace.
10-15: Refactor: Remove duplicate declaration and include proper header.The
realisePathfunction is already defined insrc/libexpr/primops.cc(lines 167-187). Instead of forward-declaring it here, consider extracting it to a shared header file that both files can include, or making it part of theEvalStateAPI.
94-103: Consider factory method for clearer error handling.The constructor performs heavy operations (WASM compilation via
unwrap()) that can throw. Consider using a static factory method that returnsResult<ref<NixWasmInstancePre>, Error>for clearer error handling and easier testing.
182-196: Simplify error throwing pattern.Line 196 uses an immediately-invoked lambda to throw an error, which is less clear than throwing directly. The ternary chain could also benefit from a switch statement for clarity.
🔎 Proposed refactor
uint32_t get_type(ValueId valueId) { auto & value = *values.at(valueId); state.forceValue(value, noPos); - auto t = value.type(); - return t == nInt ? 1 - : t == nFloat ? 2 - : t == nBool ? 3 - : t == nString ? 4 - : t == nPath ? 5 - : t == nNull ? 6 - : t == nAttrs ? 7 - : t == nList ? 8 - : t == nFunction ? 9 - : []() -> int { throw Error("unsupported type"); }(); + switch (value.type()) { + case nInt: return 1; + case nFloat: return 2; + case nBool: return 3; + case nString: return 4; + case nPath: return 5; + case nNull: return 6; + case nAttrs: return 7; + case nList: return 8; + case nFunction: return 9; + default: throw Error("unsupported type"); + } }
230-238: Consider adding documentation for two-phase copy pattern.This function uses a common but potentially confusing pattern: it only copies data if the buffer is large enough, but always returns the actual size. A brief comment explaining that callers should check
returnValue <= maxLenbefore using the buffer would improve clarity.
426-426: Improve readability by splitting complex expression.Line 426 chains multiple operations in a single expression, making it difficult to read and debug. Consider breaking it into separate steps.
🔎 Proposed refactor
- v = *instance.values.at(instance.runFunction(functionName, {(int32_t) instance.addValue(args[2])}).at(0).i32()); + auto argId = instance.addValue(args[2]); + auto results = instance.runFunction(functionName, {(int32_t) argId}); + auto resultId = results.at(0).i32(); + v = *instance.values.at(resultId);
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
packaging/dependencies.nixpackaging/wasmtime.nixsrc/libexpr/meson.buildsrc/libexpr/package.nixsrc/libexpr/primops.ccsrc/libexpr/primops/meson.buildsrc/libexpr/primops/wasm.ccsrc/libstore/derivation-options.ccsrc/libstore/meson.buildsrc/libstore/unix/build/derivation-builder.ccsrc/libstore/unix/build/wasi-derivation-builder.cc
🧰 Additional context used
🧬 Code graph analysis (3)
src/libstore/derivation-options.cc (1)
src/libstore/include/nix/store/derivation-options.hh (2)
drv(184-184)drv(201-201)
src/libstore/unix/build/wasi-derivation-builder.cc (1)
src/libexpr/primops/wasm.cc (7)
unwrap(20-25)unwrap(20-20)v(131-136)v(131-131)v(138-143)string2span(38-41)string2span(38-38)
src/libexpr/primops/wasm.cc (1)
src/libexpr/primops.cc (4)
state(3218-3224)state(3218-3218)realisePath(168-188)realisePath(168-172)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (6)
src/libexpr/primops/meson.build (1)
12-12: LGTM!Adding wasm.cc to the primops sources is straightforward and correct.
src/libexpr/package.nix (1)
17-17: LGTM!The wasmtime dependency is correctly added following the same pattern as toml11 and other build-time dependencies.
Also applies to: 68-68
packaging/dependencies.nix (1)
78-79: Wiringwasmtimeinto the dependency scope looks correct
wasmtime = pkgs.callPackage ./wasmtime.nix { };is consistent with how other custom packages are exposed here; no further changes needed.src/libstore/unix/build/derivation-builder.cc (2)
294-298: Naming comment onrealPathInSandboxThe
// FIXME: misnomernote correctly reflects that this helper currently just resolves store paths rather than honoring a sandbox path mapping. No functional impact, so this can remain as a low‑priority cleanup marker.
1996-2001: Includingwasi-derivation-builder.ccalongside other platform builders is consistentAdding
#include "wasi-derivation-builder.cc"next to the other platform-specific builder includes matches the existing “single TU with included .cc files” pattern. No issues from an integration standpoint.packaging/wasmtime.nix (1)
1-121: Wasmtime packaging looks solid; minor note on header-copy path robustnessOverall this derivation is well-structured:
- Uses a pinned Rust toolchain (
rust_1_89) andbuildRustPackage.- Restricts
doCheckto platforms with the required SIMD features and skips problematicaarch64runs.- Splits outputs into
out,dev, andlib, pruning.so/.aaccording toenableShared/enableStatic.- Handles shell completions correctly for both native and cross builds.
- Adds an install-time
--versioncheck and an updateScript with an appropriate version regex.One thing to keep in mind is the header copy:
cp -r target/${cargoShortTarget}/release/build/wasmtime-c-api-impl-*/out/include $dev/includeThis relies on the exact crate/build directory naming and
releaseprofile path; if upstream layout or build profile changes, this will break somewhat opaquely. It may be worth:
- Asserting that exactly one
wasmtime-c-api-impl-*match exists, or- Failing with a clearer message if the glob expands to zero or multiple directories.
Otherwise the packaging is in good shape.
Please double-check against your pinned Wasmtime 40.0.0 sources that the
wasmtime-c-api-impl-*build output layout (and thecargoShortTarget/releasepath) is stable; if upstream changes it, thecp -r .../out/includestep will be the first thing to break.
| uint32_t copy_attrset(ValueId valueId, uint32_t ptr, uint32_t maxLen) | ||
| { | ||
| auto & value = *values.at(valueId); | ||
| state.forceAttrs(value, noPos, "while copying an attrset into WASM"); | ||
|
|
||
| if (value.attrs()->size() <= maxLen) { | ||
| // FIXME: endianness. | ||
| struct Attr | ||
| { | ||
| ValueId value; | ||
| uint32_t nameLen; | ||
| }; | ||
|
|
||
| auto buf = subspan<Attr>(memory().subspan(ptr), maxLen); | ||
|
|
||
| // FIXME: for determinism, we should return attributes in lexicographically sorted order. | ||
| for (const auto & [n, attr] : enumerate(*value.attrs())) { | ||
| buf[n].value = addValue(attr.value); | ||
| buf[n].nameLen = state.symbols[attr.name].size(); | ||
| } | ||
| } | ||
|
|
||
| return value.attrs()->size(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix: Ensure deterministic attribute ordering.
The FIXME on line 324 correctly identifies that attributes should be returned in lexicographically sorted order for determinism. Nix's evaluation model requires consistent attribute ordering across evaluations.
🔎 Proposed fix
if (value.attrs()->size() <= maxLen) {
// FIXME: endianness.
struct Attr
{
ValueId value;
uint32_t nameLen;
};
auto buf = subspan<Attr>(memory().subspan(ptr), maxLen);
- // FIXME: for determinism, we should return attributes in lexicographically sorted order.
- for (const auto & [n, attr] : enumerate(*value.attrs())) {
+ // Collect and sort attributes by name for determinism
+ std::vector<std::pair<std::string_view, const Attr*>> sortedAttrs;
+ for (const auto & attr : *value.attrs()) {
+ sortedAttrs.push_back({state.symbols[attr.name], &attr});
+ }
+ std::sort(sortedAttrs.begin(), sortedAttrs.end(),
+ [](const auto & a, const auto & b) { return a.first < b.first; });
+
+ for (const auto & [n, pair] : enumerate(sortedAttrs)) {
+ const auto & attr = *pair.second;
buf[n].value = addValue(attr.value);
buf[n].nameLen = state.symbols[attr.name].size();
}
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 309-332, The copy_attrset function
writes attributes into WASM memory using enumerate(*value.attrs()) which is
non-deterministic; to fix, gather the attributes into a temporary vector (e.g.,
vector of pairs or pointers to attr), sort that vector lexicographically by the
attribute name string via state.symbols[attr.name], then iterate the sorted
vector to populate buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().
This function was named incorrectly. It denotes the path in the host filesystem, not in the sandbox.
Even though it's in a separate output, disabling it reduces the size of libwasmtime.so by around 12 MB.
This reduces the size of the Nix closure by another 12 MB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/libexpr/include/nix/expr/eval.hh`:
- Around line 1037-1039: Update the doc comment for the function realisePath to
correct the misspelled helper name: change "reliaseContext()" to
"realiseContext()". Ensure the comment text refers to realiseContext()
accurately so readers can find the corresponding function.
In `@src/libexpr/primops.cc`:
- Around line 168-173: The coercion call inside EvalState::realisePath currently
uses noPos which discards source location on errors; change the coerceToPath
invocation in EvalState::realisePath to pass the incoming PosIdx pos (not noPos)
so any coercion failures preserve and report the original call-site position;
update the call site referencing coerceToPath(...) accordingly.
♻️ Duplicate comments (4)
src/libexpr/primops/wasm.cc (4)
160-176: Add a checked subspan for untrusted WASM pointers.
memory().subspan(ptr, len)is called with untrustedptr/len, which can lead to OOB access. Introduce a bounds-checked helper and use it at all access sites (panic/warn/make_string/copy_string/…/read_file).🛡️ Suggested fix (helper + example usage)
auto memory() { return memory_.data(wasmCtx); } + + std::span<uint8_t> checkedSubspan(uint32_t ptr, uint32_t len) + { + auto mem = memory(); + if (ptr > mem.size() || len > mem.size() - ptr) + throw Error("Wasm memory access out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size()); + return mem.subspan(ptr, len); + } std::monostate panic(uint32_t ptr, uint32_t len) { - throw Error("Wasm panic: %s", Uncolored(span2string(memory().subspan(ptr, len)))); + throw Error("Wasm panic: %s", Uncolored(span2string(checkedSubspan(ptr, len)))); }
281-289: Fix endianness for ValueId reads from WASM memory.WASM is little‑endian; reading
ValueIddirectly is wrong on big‑endian hosts. Convert each 32‑bit word to host order before indexingvalues.🧭 Suggested fix (pattern)
- for (const auto & [n, v] : enumerate(list)) - v = values.at(vs[n]); // FIXME: endianness + for (const auto & [n, v] : enumerate(list)) { + auto id = boost::endian::little_to_native(vs[n]); + v = values.at(id); + }
504-510: Make the Wasm instance cache thread-safe.
instancesPreis a staticunordered_mapwith no synchronization and will race under parallel evaluation. Add locking or replace it with a concurrent map.🔒 Suggested fix (minimal locking)
- static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre; + static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre; + static std::mutex instancesPreMutex; - auto instancePre = instancesPre.find(wasmPath); + std::lock_guard<std::mutex> lock(instancesPreMutex); + auto instancePre = instancesPre.find(wasmPath);
519-525: Validate returned ValueId before indexinginstance.values.
res[0].i32()is untrusted. An out‑of‑range id throwsstd::out_of_range(not caught here), which can abort evaluation. Validate the id and raiseErrorinstead.✅ Suggested fix
- auto vRes = instance.values.at(res[0].i32()); + auto resultId = res[0].i32(); + if (resultId < 0 || static_cast<size_t>(resultId) >= instance.values.size()) + throw Error( + "Wasm function '%s' from '%s' returned invalid ValueId %d", + functionName, + wasmPath, + resultId); + auto vRes = instance.values.at(static_cast<size_t>(resultId));
This allows it to be used by primops defined outside of primops.cc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 519-525: The code must validate the ValueId returned by the WASM
call before indexing instance.values or dereferencing: after calling
instance.runFunction(...) and confirming res[0].kind() == ValKind::I32, extract
the id (from res[0].i32()), check that it is within the valid range (>= 1 and <
instance.values.size()) and not the reserved null (0), and if invalid throw an
Error that includes functionName and wasmPath; only then retrieve
instance.values.at(id) and pass it to state.forceValue(*vRes, pos). Ensure error
paths provide the same contextual message pattern used elsewhere so the original
WASM error context is preserved.
♻️ Duplicate comments (4)
src/libexpr/primops.cc (1)
168-173: Preserve call-site position in path coercion.
noPosdiscards error locations; passposso coercion failures point to the caller.🐛 Suggested fix
- auto path = coerceToPath(noPos, v, context, "while realising the context of a path"); + auto path = coerceToPath(pos, v, context, "while realising the context of a path");src/libexpr/primops/wasm.cc (3)
160-176: Add bounds-checked WASM memory access.
memory().subspan(ptr, len)is used with untrustedptr/lenand can go OOB. Add a checked helper and use it for all subspan reads/writes (panic/warn/make_* /copy_* /read_file/etc.).🔒 Suggested helper + example usage
auto memory() { return memory_.data(wasmCtx); } + + std::span<uint8_t> checkedSubspan(uint32_t ptr, uint32_t len) + { + auto mem = memory(); + uint64_t end = (uint64_t) ptr + (uint64_t) len; + if (end > mem.size()) + throw Error("Wasm memory access out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size()); + return mem.subspan(ptr, len); + } @@ std::monostate panic(uint32_t ptr, uint32_t len) { - throw Error("Wasm panic: %s", Uncolored(span2string(memory().subspan(ptr, len)))); + throw Error("Wasm panic: %s", Uncolored(span2string(checkedSubspan(ptr, len)))); } @@ std::monostate warn(uint32_t ptr, uint32_t len) { nix::warn( "'%s' function '%s': %s", pre->wasmPath, functionName.value_or("<unknown>"), - span2string(memory().subspan(ptr, len))); + span2string(checkedSubspan(ptr, len))); return {}; }
335-355: Ensure deterministic attribute ordering when copying attrsets.Enumerating
*value.attrs()is not guaranteed to be lexicographically sorted, which can make WASM results nondeterministic. Sort by attribute name before writing to memory.♻️ Suggested fix
- // FIXME: for determinism, we should return attributes in lexicographically sorted order. - for (const auto & [n, attr] : enumerate(*value.attrs())) { - buf[n].value = addValue(attr.value); - buf[n].nameLen = state.symbols[attr.name].size(); - } + std::vector<const Attr *> sorted; + sorted.reserve(value.attrs()->size()); + for (const auto & attr : *value.attrs()) + sorted.push_back(&attr); + std::sort(sorted.begin(), sorted.end(), [&](const Attr * a, const Attr * b) { + return state.symbols[a->name] < state.symbols[b->name]; + }); + for (const auto & [n, attr] : enumerate(sorted)) { + buf[n].value = addValue(attr->value); + buf[n].nameLen = state.symbols[attr->name].size(); + }
503-510: MakeinstancesPrecache thread-safe.Parallel evaluation can race on the static
instancesPremap. Add locking or switch to a concurrent map.🧵 Simple locking fix
- static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre; + static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre; + static std::mutex instancesPreMutex; + std::lock_guard<std::mutex> lock(instancesPreMutex);
cole-h
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
;: ./result/bin/nix repl --extra-experimental-features wasm-builtin
Nix (Determinate Nix 3.15.1) 2.33.1
Type :? for help.
nix-repl> wasm = builtins.wasm "/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm"
nix-repl> fib = wasm "fib"
nix-repl> fib 5
warning: '/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm' function 'fib': greetings from Wasm!
8
nix-repl> fib 10
warning: '/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm' function 'fib': greetings from Wasm!
89
nix-repl> fib 12
warning: '/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm' function 'fib': greetings from Wasm!
233
nix-repl> fib 0
warning: '/nix/store/m68xkk3halld3xl491wmnxr0hwdi3hk1-nix-wasm-rust-0.1.0/nix_wasm_plugin_fib.wasm' function 'fib': greetings from Wasm!
1
This reverts commit 3aa4799. macOS doesn't have endian.h.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/libexpr/primops/wasm.cc`:
- Around line 143-148: getValue currently only checks upper bounds and then
dereferences values[id], but values[0] is intentionally reserved and set to
nullptr, so calling getValue(ValueId 0) can null-deref; update getValue(ValueId
id) to also reject id==0 or to check that values[id] != nullptr and throw a
clear Error (e.g., "invalid or reserved ValueId %d") before dereferencing;
reference the getValue function, the ValueId parameter, and the values container
when making this change.
- Around line 532-537: The code uses res[0].i32() (int32_t) directly as a
ValueId (uint32_t) for instance.getValue which can wrap negative returns to
large indices; modify the path around instance.runFunction / res / getValue to
validate the return: check that res.size()==1 and res[0].kind()==ValKind::I32 as
before, then ensure the int32_t value (from res[0].i32()) is non-negative (and
optionally within the valid ValueId range for your instance) and throw an Error
(including functionName and wasmPath) if it is negative/invalid; only after that
safely cast to ValueId/uint32_t and call instance.getValue with that validated
id.
♻️ Duplicate comments (1)
src/libexpr/primops/wasm.cc (1)
358-362: Non-deterministic attribute ordering persists.The FIXME on line 358 notes that attributes should be returned in lexicographically sorted order for determinism, but this hasn't been implemented. This can cause non-reproducible behavior across different evaluations.
This was flagged in a previous review but remains unaddressed.
🧹 Nitpick comments (5)
src/libexpr/primops/wasm.cc (5)
13-19: Code duplication withwasi-derivation-builder.cc.This
unwrap()helper is identical to the one insrc/libstore/unix/build/wasi-derivation-builder.cc. Consider extracting it to a shared header to avoid duplication.
111-111: Unused member variableex.The
std::exception_ptr exmember is declared but never used in this file. Consider removing it or documenting its intended purpose if it's for future use.
173-176: Memory access usesstd::span::subspanwhich throwsstd::out_of_range.
memory().subspan(ptr, len)will throwstd::out_of_rangeon out-of-bounds access, notnix::Error. While this is caught by the genericcatch (std::exception & e)inregFun, the error message will lack WASM-specific context.Consider using a checked helper similar to the
subspantemplate that throwsErrorwith a descriptive message.🔧 Suggested helper
std::span<uint8_t> checkedMemorySubspan(uint32_t ptr, uint32_t len) { auto mem = memory(); if (ptr > mem.size() || len > mem.size() - ptr) throw Error("Wasm memory access out of bounds: ptr=%u len=%u size=%zu", ptr, len, mem.size()); return mem.subspan(ptr, len); }
388-398: Clarify the sentinel value convention forget_attr.Returning
0for a missing attribute is now valid sinceValueId 0is reserved (line 125-126). However, this convention should be documented in a comment here, as it's not immediately obvious to readers that0means "not found" rather than being a valid value ID.📝 Suggested documentation
ValueId get_attr(ValueId valueId, uint32_t ptr, uint32_t len) { auto attrName = span2string(memory().subspan(ptr, len)); auto & value = getValue(valueId); state.forceAttrs(value, noPos, "while getting an attribute from Wasm"); auto attr = value.attrs()->get(state.symbols.create(attrName)); - return attr ? addValue(attr->value) : 0; + // Return 0 (reserved sentinel) if attribute not found + return attr ? addValue(attr->value) : 0; }
512-515: Cache grows unbounded without eviction.The static
instancesPrecache accumulates entries indefinitely. As noted in the FIXME, this could lead to memory bloat in long-running processes that evaluate many different WASM modules.Consider tracking cache size and implementing LRU eviction, or moving this to
EvalStatewhere it can be cleaned up with the evaluator.
Motivation
This PR adds support for extending Nix using Wasm (specifically
wasmtime) in two ways:It adds a new builtin function,
builtins.wasm, that allows Nix functions to be written in any language that compiles to Wasm. For example,The goal is to allow complex functionality (like YAML parsers) to be written in more convenient languages (like Rust) and executed faster, without having to resort to import-from-derivation.
The Wasm function can lazily traverse its argument and can produce arbitrary Nix values (except functions). Each Wasm function call is executed as a fresh instance of the Wasm VM, ensuring determinism (i.e. you can't use the Wasm memory to pass state between function calls).
See Initial bindings/examples for writing Nix functions in Rust nix-wasm-rust#1 for examples of Nix functions written in Rust.
It adds a new system type,
wasm32-wasip1, that allows for system-independent derivations. For example:Example here: https://github.com/DeterminateSystems/nix-wasi-rust
Context
Summary by CodeRabbit
New Features
Packaging
Build / Integration
Experimental Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.