fix: harden Windows repo setup #1770

Closed
dutchiono wants to merge 5 commits into milady-ai:develop from dutchiono:codex/windows-setup-scripts

@dutchiono
Collaborator

@dutchiono dutchiono commented Apr 9, 2026

Summary

  • treat plugin submodules as incomplete until their workspace package manifests exist
  • repair empty initialized submodule worktrees before declaring setup success
  • add Windows-safe plugin bin shims and a bunx tsup fallback during upstream setup
  • rebuild plugin outputs when dist/ exists but declared type artifacts are missing
  • refresh bun.lock so frozen installs match the manifests on this branch

Why

Windows installs could fail even after git pull because required plugin submodules were present but empty, and plugin build steps relied on bin-resolution behavior that was brittle under Bun on Windows.

CI also surfaced a separate lockfile problem on this branch: bun install --frozen-lockfile --ignore-scripts wanted to rewrite bun.lock under the same MILADY_SKIP_LOCAL_UPSTREAMS=1 setup used by the Windows and test workflows. Refreshing the lockfile fixes that determinism issue.

CI Notes

  • MILADY_SKIP_LOCAL_UPSTREAMS: "1" is set in the Windows-focused workflows because CI should not depend on a repo-local eliza checkout.
  • Windows frozen installs stay pinned to Bun 1.3.9 only where Bun 1.3.11 still reports false root lockfile drift; the broader matrix in windows-dev-smoke.yml remains on 1.3.10 and canary to keep coverage.
  • The lockfile refresh also removes stale workspace lock entries that were no longer represented by the actual manifests visible to Bun on this branch.

Review Notes

  • The bun run build fallback in scripts/setup-upstreams.mjs is still present; one review appears to have relied on a truncated GitHub patch view that cut off before the unconditional fallback.
  • normalizeTsupVersionSpec already maps non-exact specs such as workspace:^8 to latest, and that path is covered by tests.
  • SUBMODULE_READINESS_MARKERS still take precedence over the generic plugin-manifest fallback.
  • The app-manager.test.ts Hyperscape assertion updates are intentional test corrections for the runtime-backed launch path, not formatter-only churn.

Validation

  • bun install --frozen-lockfile --ignore-scripts with and without MILADY_SKIP_LOCAL_UPSTREAMS=1
  • bunx bun@1.3.9 install --frozen-lockfile --ignore-scripts with and without MILADY_SKIP_LOCAL_UPSTREAMS=1
  • bunx vitest run scripts/init-submodules.test.ts packages/app-core/src/runtime/init-submodules-script.test.ts scripts/setup-upstreams.test.ts
  • bun run typecheck
  • bun run check (currently still reports the existing Biome docs-ignore issue in apps/web/src/docs/content/**, unrelated to this PR's changes)

@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Here is the full review — display only since the GitHub post was blocked:


Note: This PR is currently in DRAFT state.


Review: PR #1770 — fix: harden Windows repo setup


1. Classification (derived)

build/setup fix — consistent with author's claim. All changes are in scripts/ and test files; nothing touches runtime, plugins, connectors, or UI.

2. Rubric (derived)

Setup scripts (init-submodules.mjs, setup-upstreams.mjs) and their tests. Key dimensions: correctness of readiness detection logic, safety of the git read-tree repair, completeness of the Windows shim approach, robustness of the bunx version spec handling.

3. Scope verdict

in scope — Windows setup reliability is load-bearing for onboarding and CI.

4. Universal invariants

All intact. NODE_PATH sites, patch-deps.mjs, Electrobun guards, namespace, ports, plugin imports, uiShellMode, StartupPhase, VrmViewer, RPC schema, dev observability, access control files — none touched.

5. Judgment

Needed? OK. Empty plugin submodule worktrees silently passing the readiness check is a real Windows bug. Incomplete dist/ (present dir, missing types artifact) is also real.

Better than existing? OK for hasPluginBuildOutputs and the read-tree repair. One concern on the isSubmoduleCheckoutReady restructuring (below).

Duplication? OK.

Blast radius covered? OK. All three changed functions have updated tests.

Logic sound? CONCERN on two points:

  1. isSubmoduleCheckoutReady silently bypasses SUBMODULE_READINESS_MARKERS for all plugins/* paths. The new early-return runs before the map lookup, so the explicit plugins/plugin-agent-orchestrator entry is now dead code. Today the behavior is equivalent (both just check package.json), but if anyone adds a plugin to that map with a non-standard or additional marker, it will be silently ignored. The correct fix is to invert the priority: check the map first, fall back to the generic plugin-manifest check only when the path is absent from the map.

  2. getBuildCommandFallback version stripping is incomplete. preferredVersionSpec.replace(/^[~^]/, "") handles ^8.3.5 and ~8.3.5 correctly but leaves >=8.0.0, 8.x, or workspace:^8 intact, producing a bunx tsup@>=8.0.0 invocation that would fail. elizaOS plugins in practice use ^X.Y.Z, so the risk is low, but add a guard or fall back to "latest" for unrecognized formats.
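
The second point can be illustrated with a minimal sketch. `toBunxVersionSpec` and `tsupFallbackCommand` are hypothetical names, not the helpers in scripts/setup-upstreams.mjs. The sketch assumes that only an exact x.y.z version (optionally prefixed by ^ or ~) is safe to pin and that anything else should resolve to latest.

```javascript
// Hypothetical helpers for illustration; the real script's logic may differ.
// Accepts "8.3.5", "^8.3.5", "~8.3.5" (optionally with a prerelease tag)
// and returns "latest" for every other format.
const EXACT_SEMVER = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$/;

function toBunxVersionSpec(spec) {
  if (typeof spec !== "string") return "latest";
  // Strip a single leading range operator, then require an exact version.
  const stripped = spec.trim().replace(/^[~^]/, "");
  return EXACT_SEMVER.test(stripped) ? stripped : "latest";
}

function tsupFallbackCommand(spec, args = []) {
  return ["bunx", `tsup@${toBunxVersionSpec(spec)}`, ...args].join(" ");
}
```

With this shape, `>=8.0.0`, `8.x`, and `workspace:^8` all resolve to `tsup@latest` instead of producing a malformed invocation.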

Complexity appropriate? OK. Small focused helpers.

Tested meaningfully? CONCERN: the Windows .cmd shim test wraps all assertions in if (process.platform === "win32"). On Linux/macOS CI (where this runs), the test body is a no-op. The feature is never actually verified in CI. Either extract ensureWindowsCmdShim into a unit test that mocks writeFileSync, or add an explicit comment noting this is intentionally platform-gated with no CI coverage.
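
One way to close that gap is to separate the shim text from the platform check, so the text can be asserted on any OS. The following is a minimal sketch under stated assumptions: `renderCmdShim` and `shouldWriteCmdShim` are hypothetical names, the `@echo off` line and CRLF endings are assumptions, and the actual ensureWindowsCmdShim may emit different content.

```javascript
// Hypothetical pure helpers; not the PR's ensureWindowsCmdShim.
function renderCmdShim(targetPath) {
  // Inside a quoted CMD argument, a literal double quote is written as "".
  const escaped = targetPath.replace(/"/g, '""');
  // %* forwards every argument the shim received to the target script.
  return `@echo off\r\nnode "${escaped}" %*\r\n`;
}

// The platform is a parameter so tests can exercise both branches anywhere.
function shouldWriteCmdShim(platform) {
  return platform === "win32";
}
```

A caller would pass `process.platform` to `shouldWriteCmdShim` and write the rendered text only when it returns true, which leaves the rendering itself testable on Linux and macOS runners.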

Matches conventions? OK. TypeScript strict, no added deps, under 500 LOC per file.

Plausible breakage mode: A plugins/* submodule whose workspace manifest lives at an atypical path (e.g., src/package.json) would permanently fail isSubmoduleCheckoutReady, causing git submodule update + read-tree to run on every bun install. Silent churn that masks real failures downstream.

6. PR-type-specific checks

Check | Result
init-submodules.mjs and tests updated together | OK
setup-upstreams.mjs and tests updated together | OK
read-tree failure caught by outer try/catch | OK
.cmd shim uses real target path, not symlink path | OK
bunx fallback only fires when local tsup bin is absent | OK
ensureWindowsCmdShim is a no-op on non-Windows | OK
SUBMODULE_READINESS_MARKERS priority respected | CONCERN (early-return bypasses map)
bunx version spec handles all semver range formats | CONCERN (incomplete strip)
Windows cmd shim assertions execute in CI | CONCERN (platform-gated, never runs)

7. Security

Clear. .cmd shim content derives from local package.json bin entries — no user-controlled data. Path interpolation in shell commands matches the existing pattern already present in the base code. bunx tsup@<version> is dev-time only and trusts the same version source as bun run build.

8. Decision

REQUEST CHANGES

Three issues before this leaves draft:

  1. Ordering in isSubmoduleCheckoutReady — check SUBMODULE_READINESS_MARKERS first; only use the generic plugins/ manifest check as a fallback when the path is not in the map. The current early return makes the explicit map entry for plugins/plugin-agent-orchestrator unreachable dead code.

  2. Version spec robustness in getBuildCommandFallback — guard against non-~/^ prefixes (>=, workspace:, etc.) and fall back to "latest" rather than emitting a malformed bunx invocation.

  3. Windows shim test coverage — extract a direct unit test for ensureWindowsCmdShim that doesn't depend on process.platform, or explicitly document the CI gap.
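
The ordering requested in item 1 could look roughly like the sketch below. All names and data shapes are hypothetical: the real SUBMODULE_READINESS_MARKERS map and isSubmoduleCheckoutReady signature may differ, and requiring every explicit marker to exist is an assumption. The `exists` predicate is injected so the logic can be tested without touching disk.

```javascript
// Hypothetical shapes for illustration only.
const READINESS_MARKERS = {
  "plugins/plugin-agent-orchestrator": ["package.json"],
};
const PLUGIN_MANIFEST_CANDIDATES = ["package.json", "typescript/package.json"];

function isCheckoutReady(submodulePath, exists, markers = READINESS_MARKERS) {
  const explicit = markers[submodulePath] ?? [];
  if (explicit.length > 0) {
    // Explicit markers win: every listed file must be present (assumed).
    return explicit.every((rel) => exists(`${submodulePath}/${rel}`));
  }
  if (submodulePath.startsWith("plugins/")) {
    // Generic fallback: at least one workspace manifest must exist.
    return PLUGIN_MANIFEST_CANDIDATES.some((rel) =>
      exists(`${submodulePath}/${rel}`),
    );
  }
  // No markers and not a plugin: nothing to verify.
  return true;
}
```

Because the map lookup happens first, a plugin added to the map with a non-standard marker is honored rather than shadowed by the generic manifest check.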

github-actions bot added the category:feature (auto-managed semantic PR category), trust:probationary (building trust, closer scrutiny; auto-managed), and ci labels on Apr 9, 2026
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Here is my review for PR #1770:


1. Classification (derived): workflow/CI + build tooling hardening — reclassified. Pre-classification says "feature" but the diff is entirely build scripts, CI workflows, and their tests. No runtime, plugin, or UI code is touched.

2. Rubric (derived): Correctness of the unified Windows/non-Windows submodule init path, safety of the read-tree empty-worktree repair, validity of the bunx tsup fallback control flow, and version consistency across CI workflow files.

3. Scope verdict: in scope

4. Universal invariants: all intact — NODE_PATH (3 sites), patch-deps.mjs, Electrobun try/catch guards, namespace, ports, dynamic imports, StartupPhase, VrmViewer, RPC schema, dev observability endpoints, and access control files are all untouched.

5. Judgment:

  • Needed? — OK. The pre-existing Windows path initialized only plugins/plugin-agent-orchestrator via a hardcoded step; the rest of the plugin submodule ecosystem was silently skipped. Real Windows build failures were the consequence.
  • Better than existing? — OK. Unifying submodule init under init-submodules.mjs for all platforms eliminates the drift risk between the Windows-specific and general paths.
  • Duplication? — OK. No reinvention of existing helpers.
  • Blast radius covered? — CONCERN: see items below.
  • Logic sound? — OK with one note.
  • Complexity appropriate? — OK. getBuildCommandFallback, hasPluginBuildOutputs, ensureWindowsCmdShim are each under 20 lines and single-purpose.
  • Tested meaningfully? — OK. Every new code path has a unit test. Drift-guard tests are updated. prepare-package-dist.test.ts runs the real script against a temp fixture.
  • Matches conventions? — OK. Biome-clean, no secrets, no new deps.
  • Plausible breakage mode: A new plugin submodule is added with Windows-incompatible filenames in HEAD (like the existing plugin-openrouter). The unified path will now attempt to init it on Windows CI and fail, whereas the old path would simply not have touched it. Mitigation is the existing SKIP_SUBMODULES set — contributors must remember to add entries there.

6. PR-type-specific checks:

Workflow unification:

  • OK — All four affected workflow files switch to node scripts/init-submodules.mjs. No shell: bash required; node is in PATH on GitHub Actions Windows runners.
  • OK — ci-workflow-drift.test.ts is updated to assert both Windows workflow files contain the unified step.
  • CONCERN: Bun version asymmetry. test-electrobun-release.yml bumps 1.3.9 → 1.3.11, but release-electrobun.yml, release-electrobun-build-linux-x64-testbox.yml, and release-electrobun-build-windows-x64-testbox.yml stay at 1.3.9. The audit/drift tests are updated to match only the test workflow. This creates a deliberate test/release version mismatch with no explanation.

init-submodules.mjs — plugin readiness semantics:

  • OK — isSubmoduleCheckoutReady for plugins/ now checks for at least one of package.json or typescript/package.json via .some(). Previously, plugins not in SUBMODULE_READINESS_MARKERS returned true unconditionally (empty markers → considered ready), which was the root cause of the "empty worktree passes readiness check" bug.
  • OK — The git -C "${submodule.path}" read-tree --reset -u HEAD repair only fires when the worktree is empty after git submodule update returns without error. An empty worktree has no local changes to destroy.
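
The init, re-check, repair, re-check sequence described above can be sketched as follows. This is a hypothetical orchestration, not the code in init-submodules.mjs: `run` (executes a shell command) and `isReady` (the readiness check) are injected stand-ins, and the exact `git submodule update` flags are an assumption.

```javascript
// Hypothetical flow; nothing here touches git unless `run` does.
function initSubmodule(path, { run, isReady }) {
  if (isReady(path)) return "ready";
  // Assumed flags; the real script may pass different options.
  run(`git submodule update --init "${path}"`);
  if (isReady(path)) return "initialized";
  // The update reported success but the worktree is still empty:
  // ask git to re-populate the index and worktree from HEAD.
  run(`git -C "${path}" read-tree --reset -u HEAD`);
  if (isReady(path)) return "repaired";
  // Fail loudly instead of declaring setup success on an empty checkout.
  throw new Error(`Submodule ${path} is still incomplete after repair`);
}
```

The final throw is what lets an outer try/catch count the submodule as failed rather than silently continuing.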

setup-upstreams.mjs bunx fallback:

  • OK — The fallback condition buildCommandFallback && !pathExists(...requiredBinPath) is correct: if local tsup is present, falls through to bun run build. The continue after bunx prevents double-build.
  • OK — hasPluginBuildOutputs correctly catches the "dist/ present, types file absent" partial-build scenario.
  • CONCERN (minor): ensureWindowsCmdShim hardcodes node as interpreter. For tsup this is correct. But the function is called for every bin entry unconditionally — a non-Node bin in a future plugin would get a broken .cmd shim silently.
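
The control flow in the first bullet can be sketched as a pure decision function. `chooseBuildCommand` and its parameter names are hypothetical; the sketch assumes the fallback applies only when the plugin's build script is plain `tsup` or `tsup <args>` and the local tsup bin is missing.

```javascript
// Hypothetical decision helper; names and shapes are illustrative only.
function chooseBuildCommand({ buildScript, tsupVersion, hasLocalTsupBin }) {
  // Match "tsup" alone or "tsup <args>", capturing the arguments if present.
  const match = /^tsup(?:\s+(.*))?$/.exec((buildScript ?? "").trim());
  if (match && !hasLocalTsupBin) {
    const args = match[1] ? ` ${match[1]}` : "";
    return `bunx tsup@${tsupVersion}${args}`;
  }
  // Local bin present, or a build script this sketch does not understand:
  // use the normal path.
  return "bun run build";
}
```

Returning exactly one command per plugin is what prevents a double build; in the script under review the same effect comes from the continue after the bunx invocation.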

prepare-package-dist.mjs:

  • OK — readInstalledPackageVersion is well-guarded. Scoped package path splitting is correct. Fallback only activates when workspace map misses the dep.
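
For readers unfamiliar with the scoped-package detail, here is a minimal sketch of the idea. `installedManifestPath` and `readInstalledVersion` are hypothetical names, not the PR's readInstalledPackageVersion, and `readText` is an injected stand-in (for example a wrapper around fs.readFileSync) so the logic can run without a real node_modules.

```javascript
// Hypothetical sketch; the real helper may resolve paths differently.
function installedManifestPath(rootDir, packageName) {
  // "@scope/name" becomes two path segments; unscoped names stay as one.
  const segments = packageName.split("/");
  return [rootDir, "node_modules", ...segments, "package.json"].join("/");
}

function readInstalledVersion(rootDir, packageName, readText) {
  try {
    const manifest = JSON.parse(
      readText(installedManifestPath(rootDir, packageName)),
    );
    return typeof manifest.version === "string" ? manifest.version : undefined;
  } catch {
    // Missing or unreadable manifest: let the caller decide what to do.
    return undefined;
  }
}
```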

Submodule pointer bumps:

  • CONCERN: steward-fi and test/contracts/lib/openzeppelin-contracts are bumped to new commits. Neither is explained in the PR body nor related to Windows build hardening. Scope creep from a new contributor.

7. Security: clear. No new outbound endpoints, no credential exposure, no prompt injection vectors. The bunx tsup@<version> fallback makes a network call to npm at setup time — expected for a build-time fallback and equivalent to bun install.

8. Decision: REQUEST CHANGES

Two changes required before merge:

Blocking:

  1. Bun version asymmetry — test-electrobun-release.yml is on 1.3.11 while all release Electrobun workflows stay on 1.3.9. Either bump all Electrobun workflow pins to 1.3.11 in this PR, or revert the test-workflow bump and do it as a coordinated PR covering all files.
  2. Unrelated submodule bumps — The steward-fi and test/contracts/lib/openzeppelin-contracts pointer updates are unexplained and orthogonal to Windows setup hardening. Remove them from this PR.

Non-blocking (address before or in a follow-up):

  • ensureWindowsCmdShim hardcodes node — add a comment that this is intentionally Node-only so future contributors know to revisit if non-Node bins are added.

The core Windows hardening logic is sound and the test coverage is good. Two administrative issues are blocking merge.

github-actions bot added the category:feature and trust:probationary labels and removed the trust:probationary and category:feature labels on Apr 9, 2026
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Here is the completed review for PR #1770:


1. Classification (derived): build/CI fix — reclassified from "feature" (label); the title is accurate, all meaningful changes are in setup scripts and CI workflows with no new user-facing capability.

2. Rubric (derived): Logic correctness of the submodule repair paths, idempotency of the new shim/build logic, CI workflow consistency, and shell command construction safety.

3. Scope verdict: in scope

4. Universal invariants: all intact — the PR does not touch packages/agent/src/runtime/eliza.ts, scripts/run-node.mjs, apps/app/electrobun/src/native/agent.ts, patch-deps.mjs, or any access control files.

5. Judgment:

  • Needed? — OK. The Windows-only git submodule update --depth 1 plugins/plugin-agent-orchestrator hardcode was a poor workaround that excluded all other plugin submodules. The empty-worktree repair and manifest-based readiness check address a real class of failures.
  • Better than existing? — OK. Consolidating platform-conditional workflow steps into a cross-platform script is strictly better.
  • Duplication? — OK.
  • Blast radius covered? — CONCERN: two unrelated submodule hash bumps (steward-fi, test/contracts/lib/openzeppelin-contracts) included with no explanation.
  • Logic sound? — OK. isSubmoduleCheckoutReady correctly short-circuits on plugins/ with .some() semantics. The read-tree --reset -u HEAD repair is a well-known fix for empty post-init worktrees. hasPluginBuildOutputs (checking declared types artifact, not just dist/) is correct. The prepare-package-dist fallback to installed package version for workspace:* deps is sound.
  • Complexity appropriate? — OK.
  • Tested meaningfully? — OK. init-submodules-script.test.ts and setup-upstreams.test.ts cover the repair paths. prepare-package-dist.test.ts uses a real temp directory. Tests fail without the fix.
  • Matches conventions? — CONCERN: see specific issues below.
  • Plausible breakage mode: bunx tsup@<version> fetches from npm at setup time. If the extracted version was yanked, the fallback silently skips the build (continue runs without validating dist/ exists) — the missing dist/ is only caught at runtime.
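
One possible mitigation for that breakage mode is to re-check the outputs after any build attempt. The sketch below is hypothetical: `hasBuildOutputs` only approximates what hasPluginBuildOutputs is described as doing, `exists` and `runBuild` are injected stand-ins, and the `types` path is assumed to be relative without a leading `./`.

```javascript
// Hypothetical sketch; not the PR's ensurePluginBuildOutputs.
function hasBuildOutputs(pluginDir, manifest, exists) {
  if (!exists(`${pluginDir}/dist`)) return false;
  // A declared `types` entry must exist too; dist/ alone is not enough.
  return !manifest.types || exists(`${pluginDir}/${manifest.types}`);
}

function buildAndVerify(pluginDir, manifest, { exists, runBuild }) {
  if (hasBuildOutputs(pluginDir, manifest, exists)) return "up-to-date";
  runBuild(pluginDir);
  // Validate after the build so a skipped or failed fallback surfaces at
  // setup time rather than at runtime.
  if (!hasBuildOutputs(pluginDir, manifest, exists)) {
    throw new Error(`Build finished but outputs are missing in ${pluginDir}`);
  }
  return "built";
}
```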

6. PR-type-specific checks:

  • CI consolidation — OK: three workflows unified to node scripts/init-submodules.mjs; drift test updated.
  • Bun version bump (partial) - CONCERN: test-electrobun-release.yml is bumped 1.3.9 → 1.3.11. The other electrobun release workflows stay at 1.3.9 with an explicit comment: "pinned: 1.3.10 clean-lockfile breaks Windows frozen installs." The PR skips 1.3.10, jumps to 1.3.11 in one workflow only, with no explanation of whether 1.3.11 resolves the Windows frozen-install regression. This creates a version skew between test and release workflows that could mask a Windows breakage in production releases.
  • ensureWindowsCmdShim uses node not bun — CONCERN: The shim calls node "${target}" %*. elizaOS plugin bins are ES modules that may use Bun-specific APIs or #!/usr/bin/env bun shebangs. Calling them with node instead of bun may silently fail on Windows. Needs either a change to bun or an explicit comment explaining the Node.js compatibility assumption.
  • CMD escaping - low risk: only " is escaped (as ""); Windows filesystem paths can't contain &, | etc. in practice, so this is fine but worth a comment.
  • Submodule pointer bumps — CONCERN: steward-fi and openzeppelin-contracts bumped with no mention in the PR description. Should be explained or split out.
  • readInstalledPackageVersion — OK: correctly handles scoped packages and returns undefined on read errors.

7. Security: clear.

8. Decision: REQUEST CHANGES

Required before merge:

  1. Bun version bump: explain why only test-electrobun-release.yml gets 1.3.11 while release workflows stay at 1.3.9. If 1.3.11 fixes the frozen-install regression, bump all electrobun workflows together and update/remove the pin comment. If it doesn't, revert.

  2. Unrelated submodule bumps: remove steward-fi + openzeppelin-contracts pointer changes and commit them separately with explanation, or justify them in the PR description.

  3. ensureWindowsCmdShim uses node: change to bun (the project runtime) or add an explicit comment documenting the Node.js-compatibility assumption for all plugin bins.


The gh pr comment command was blocked for approval. Re-run the command or approve it to post this review to GitHub.

github-actions bot added the category:feature and trust:probationary labels and removed the trust:probationary and category:feature labels on Apr 9, 2026
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Here is the completed review for PR #1770. Since posting required tool approval that wasn't granted, I'll display it here:


1. Classification (derived): workflow/CI + build infrastructure — reclassified because: the diff is entirely build scripts, test fixtures, and workflow YAML; the "feature" label is misleading — this is a bug fix / infrastructure hardening for Windows dev setup

2. Rubric (derived): Logic soundness in the new init/repair paths, correctness of the CI workflow consolidation, no regressions for non-Windows platforms, test coverage for each new code path, and absence of unexplained bundled changes.

3. Scope verdict: in scope

4. Universal invariants: all intact — NODE_PATH, patch-deps, Electrobun guards, namespacing, ports, plugin imports, StartupPhase, VrmViewer, RPC schema, dev observability, and access control files are untouched.

5. Judgment:

  • Needed? — OK. Real issue: Windows git submodule update can complete without populating the worktree, and bun run build fails when the plugin bin isn't a .cmd shim. Both are confirmed by the pre-existing Windows workarounds in the workflow files.

  • Better than existing? — OK. The old workaround only initialized plugin-agent-orchestrator on Windows, silently skipping all other plugins. The new approach handles all tracked plugins uniformly.

  • Duplication? — OK.

  • Blast radius covered? — CONCERN: steward-fi and test/contracts/lib/openzeppelin-contracts submodule pointers are bumped with zero explanation. These are unrelated to Windows repo-setup hardening. A probationary contributor bundling unexplained external-repo pointer changes is a flag.

  • Logic sound? — CONCERN: Dead code in the modified isSubmoduleCheckoutReady. After the two early-return branches (markerPaths.length > 0 and pluginWorkspaceManifests.length > 0), the remaining code:

    if (markerPaths.length === 0) {
      return true;
    }
    return true;

    markerPaths.length is provably 0 here (the > 0 branch already returned), so the if is dead. Functionally harmless but should be a single return true;.

    ensurePluginBuildOutputs restructure is sound: bunx tsup@<version> fires only when the local tsup bin is absent; if present, falls through to bun run build. hasPluginBuildOutputs correctly gates on the types artifact — a partial dist missing declarations was previously treated as complete.

    git -C "${submodule.path}" read-tree --reset -u HEAD is legitimate Git worktree recovery. The quoting is consistent with the existing pre-PR pattern in the same file.

  • Complexity appropriate? — OK. All new helpers are single-purpose and directly used by their callers.

  • Tested meaningfully? — OK. Coverage is excellent: manifest-based readiness detection, explicit-marker precedence, re-init on missing manifest, worktree repair, cmd-shim writing, bunx fallback with exact/ranged/workspace tsup versions, dist-with-missing-types rebuild. Each test would fail without its corresponding change.

  • Matches conventions? — OK.

  • Plausible breakage mode: A plugin submodule whose workspace manifest lives at neither package.json nor typescript/package.json would be permanently detected as incomplete and trigger spurious re-inits on every install. Watch as the plugin set grows.

6. PR-type-specific checks:

  • Workflow consolidation (Windows fork removed, unified step added): OK
  • ci-workflow-drift.test.ts updated for Windows workflows: OK
  • Drift test count of 6 in test.yml unchanged: OK
  • node scripts/init-submodules.mjs used (not bun) for pre-setup-step compatibility: OK
  • No hardcoded ports, no new postinstall scripts, no patch-deps expansion: OK
  • Unexplained submodule bumps: CONCERN

7. Security: clear.

8. Decision: REQUEST CHANGES

Required before merge:

  1. Explain or revert the steward-fi and test/contracts/lib/openzeppelin-contracts submodule bumps. Not mentioned in the summary, no connection to the PR goal. If they crept in from a rebase, revert them.
  2. Clean up dead code in isSubmoduleCheckoutReady. Replace the trailing if (markerPaths.length === 0) { return true; } return true; block with a single return true;.
  3. Remove draft status when the above are resolved.

The core logic is correct and the test coverage is exemplary — two small fixes away from mergeable.

github-actions bot added the category:feature and trust:probationary labels and removed the trust:probationary and category:feature labels on Apr 9, 2026
- treat plugin submodules as incomplete until workspace manifests exist
- repair empty initialized submodule worktrees before declaring success
- add Windows-safe plugin bin shims with bunx tsup fallback
- rebuild plugin outputs when dist/ exists but type artifacts are missing
- fix CI workflow bootstrap for Windows smoke and preload jobs
- add tests covering submodule init, setup-upstreams, and workflow drift

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dutchiono dutchiono force-pushed the codex/windows-setup-scripts branch from b69fa9b to 7d8827c Compare April 9, 2026 14:45
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

The GitHub comment requires approval. Here is my complete structured review:


⚠️ Note: this is a DRAFT PR. Review is still actionable — flagging issues for pre-merge cleanup.

Classification (derived): workflow/CI + build tooling — consistent with author's claim. The category:feature label is wrong; this is a bug fix / robustness improvement.

Rubric (derived): Windows-path submodule init logic in init-submodules.mjs, setup-upstreams.mjs Windows cmd shim, CI workflow unification, and test coverage of new failure paths.

Scope verdict: in scope

Universal invariants: all intact — NODE_PATH sites, patch-deps.mjs, Electrobun startup guards, namespace/port/dynamic-import/StartupPhase/VrmViewer invariants all unaffected.


Judgment

  • Needed? OK. Windows submodule init was skipping all plugins except plugin-agent-orchestrator via a hardcoded git submodule update command. The fix generalizes this correctly.
  • Better than existing? OK. Replacing shell: bash (requires Git Bash) + hardcoded submodule path with node scripts/init-submodules.mjs (portable, extensible) is a clear improvement.
  • Duplication? OK.
  • Blast radius covered? OK. All three Windows CI workflows updated, drift test updated, tests added.
  • Logic sound? OK with one nit below. Happy path: isSubmoduleCheckoutReady now correctly fails for plugin submodules with empty worktrees; read-tree --reset -u HEAD repairs; second check either passes or throws. Failure paths: if read-tree throws, the outer catch increments failed++, which is correct. If the bunx fallback throws, the error propagates, which is also correct.
  • Complexity appropriate? OK. Each new function is small and focused.
  • Tested meaningfully? OK. Tests cover: plugin manifest requirement, explicit-markers precedence, re-init on missing manifest, read-tree repair path, cmd shim content, tsup bunx fallback with version pinning, tsup arg forwarding, loose-version fallback to latest, dist-present-but-types-missing rebuild.
  • Matches conventions? OK.
  • Plausible breakage mode identified? — A plugin submodule that has neither package.json nor typescript/package.json (e.g., a non-TS plugin, or a differently-nested manifest) would be stuck in a re-init loop. All current elizaOS plugins are TypeScript so this is theoretical, but getPluginWorkspaceManifestPaths should carry a comment acknowledging the assumption.

PR-type-specific checks

  • Workflow pinned versions: N/A — no new action versions.
  • Concurrency groups: OK — preserved.
  • Drift test count: OK — the .toBe(6) assertion targets ci.yml only; test.yml changes don't affect it. New toContain assertions correctly guard the Windows smoke workflows.
  • windows-dev-smoke.yml checkout: OK - actions/checkout@v4 defaults to submodules: false; install uses --ignore-scripts, so the explicit init step is required.
  • Double-init risk in preload smoke: acceptable — idempotent.
  • normalizeTsupVersionSpec injection safety: OK — version passes strict semver regex or falls back to "latest"; no shell-special characters in either path.
  • ensureWindowsCmdShim quoting: OK for internal use — paths come from trusted package.json bin entries, not user input.

Bug: dead code in isSubmoduleCheckoutReady

After the refactor, the tail of the function is:

if (markerPaths.length === 0) {   // always true at this point
  return true;
}
return true;  // unreachable

At that point markerPaths.length is always 0 because the > 0 case already returned, so the condition is redundant and the trailing return can never run. Biome may flag the unreachable return. Simplify the tail to just:

return true;

Security

Clear. No prompt injection vectors, no credential exposure, no unexpected outbound HTTP. The bunx tsup@<version> invocation sources the version from devDependencies in plugin package.json files we control — no greater supply-chain exposure than bun install.


Decision: REQUEST CHANGES

Single required fix: remove the dead if (markerPaths.length === 0) { return true; } + trailing return true; from isSubmoduleCheckoutReady — replace with a single return true;. Everything else is solid work.

github-actions bot added the category:feature and trust:probationary labels and removed the trust:probationary and category:feature labels on Apr 9, 2026
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Here is my review of PR #1770:


1. Classification (derived): workflow/CI + build-tooling fix — consistent with author's claim. The core is script hardening (init-submodules readiness, read-tree repair, cmd shims, bunx tsup fallback); the workflow changes consolidate a Windows-only workaround into the general script.

2. Rubric (derived): CI correctness, script logic soundness, lock file integrity (was bun install run in a state that corrupts workspace resolutions?), and meaningful test coverage for new code paths.

3. Scope verdict: in scope

4. Universal invariants: all intact — only scripts, tests, and CI workflows touched; none of the three NODE_PATH sites, patch-deps.mjs, Electrobun guards, namespace config, port constants, or access control files modified.


5. Judgment:

  • Needed? — OK. The Windows-specific init path was a maintenance hazard that could drift. The empty-worktree repair and cmd-shim additions address real installation failure modes.

  • Better than existing? — OK. The OS-conditional in test.yml was a fragility. The new code is more principled: detect incompleteness, repair, re-check, fail loudly if still broken.

  • Duplication? — OK. No existing equivalent for any of the new helpers.

  • Blast radius covered? CONCERN: The bun.lock was committed while @elizaos/plugin-openrouter resolved from npm (2.0.0-alpha.13) instead of the local workspace submodule. The .gitmodules still tracks plugins/plugin-openrouter as a submodule, but the lock now pins the npm release. Any developer running bun install after merge will silently get the npm-published package instead of the workspace checkout. The most likely cause: bun install was run with MILADY_SKIP_LOCAL_UPSTREAMS=1 (the same env var now added to Windows CI), and that lock was committed verbatim. Must be regenerated with the submodule present and MILADY_SKIP_LOCAL_UPSTREAMS unset.

  • Logic sound? — OK on the core paths.

    1. isSubmoduleCheckoutReady now requires a workspace manifest for any plugins/* without explicit markers. Correct — the old "return true with no markers" was the root cause.
    2. git -C "${submodule.path}" read-tree --reset -u HEAD is the right plumbing command for the "initialized but empty working tree" Windows edge case. The double-check after repair correctly surfaces unrecoverable states.
    3. normalizeTsupVersionSpec pins ^8.3.5 → tsup@8.3.5 (strips range prefix). Intentional, acceptable for a bootstrap path, but deserves an inline comment.
  • Complexity appropriate? — OK. Each new helper is single-responsibility.

  • Tested meaningfully? — OK. Tests cover: plugin submodule incomplete without manifest, explicit markers take precedence, reinit on empty worktree, read-tree repair path, cmd shim contents, tsup fallback with exact version, args passthrough, non-semver version → latest, partial build (dist exists but .d.ts missing). Notably thorough for a first contribution.

  • Matches conventions? — CONCERN (minor): plugins/plugin-groq is deleted as a tracked submodule (mode 160000) with no mention in the PR description. The .gitmodules on the base already omits it (stale gitlink), so removal is likely valid cleanup — but deserves a sentence of explanation from a new contributor.

  • Plausible breakage mode: Developer merges, runs bun install, @elizaos/plugin-openrouter silently resolves to npm 2.0.0-alpha.13 instead of the workspace checkout. Local changes to that plugin become invisible to the runtime.


6. PR-type-specific checks:

  • Workflow consolidation uses node scripts/init-submodules.mjs on all platforms: OK — enforced by new ci-workflow-drift.test.ts assertions
  • MILADY_SKIP_LOCAL_UPSTREAMS: "1" added to Windows CI: OK — correct for CI without local elizaOS checkouts
  • Pinned action versions unchanged: OK
  • No runtime deps added: OK — jsdom is dev-only
  • @elizaos/plugin-openrouter workspace resolution intact in lock file: FAIL

7. Security: clear. ensureWindowsCmdShim escapes " → "" (correct for CMD). The bunx tsup@version fallback only triggers when the build script is literally tsup or tsup <args>. Supply chain risk is equivalent to the existing bun run build path.


8. Decision: REQUEST CHANGES

Required before merge:

  1. Lock file: Regenerate bun.lock with plugins/plugin-openrouter submodule initialized and MILADY_SKIP_LOCAL_UPSTREAMS unset. The current lock silently downgrades it to the npm release.
  2. plugins/plugin-groq removal: If intentional cleanup of a stale gitlink, say so in the PR description. If accidental, revert it.
  3. Draft status: Mark as ready when the above are resolved.

The core hardening logic is sound and the test coverage is thorough. Once the lock file is clean this should be fast to re-review.
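
The readiness rule this review approves (explicit markers first, then a workspace manifest for anything under plugins/, otherwise ready) can be sketched as follows. The `submodule` object shape, the injected `exists` predicate, and the use of `package.json` as the manifest file are assumptions made for illustration; the PR's actual implementation may differ.

```javascript
// Hypothetical sketch of the readiness precedence; not the PR's implementation.
// Assumed shape: submodule = { path: string, markers?: string[] }
// `exists` is an injected predicate (e.g. a wrapper around fs.existsSync).
function isSubmoduleCheckoutReady(submodule, exists) {
  const join = (...parts) => parts.join("/");

  // 1. Explicit markers take precedence when they are configured.
  if (Array.isArray(submodule.markers) && submodule.markers.length > 0) {
    return submodule.markers.every((marker) => exists(join(submodule.path, marker)));
  }

  // 2. Plugin submodules count as incomplete until their manifest exists.
  //    Assumption: the workspace manifest is package.json.
  if (submodule.path.startsWith("plugins/")) {
    return exists(join(submodule.path, "package.json"));
  }

  // 3. Everything else keeps the old behavior and is treated as ready.
  return true;
}
```

Injecting `exists` keeps the rule testable without touching the filesystem, which is how an "initialized but empty" worktree can be simulated in a unit test.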

@github-actions github-actions bot added category:feature Auto-managed semantic PR category trust:probationary Building trust, closer scrutiny (auto-managed) and removed trust:probationary Building trust, closer scrutiny (auto-managed) category:feature Auto-managed semantic PR category labels Apr 9, 2026
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

Here is my completed review. I'll show it to you here — let me know if you want it posted.


⚠️ Note: This PR is in DRAFT state. Reviewing as submitted — feedback is for when you're ready to mark it ready-for-review.


1. Classification (derived): workflow/CI + build-scripts — consistent with author's claim. Three overlapping concerns: submodule initialization hardening, Windows Bun version pinning, and plugin build fallback logic.

2. Rubric (derived): Does the cross-platform submodule init preserve all existing initialization paths? Does the plugin build fallback introduce a regression for non-tsup packages? Are tests asserting the right outcomes?

3. Scope verdict: in scope

4. Universal invariants: all intact — NODE_PATH, patch-deps, Electrobun guards, namespace, ports, imports, UI defaults, StartupPhase, VrmViewer, RPC sync, dev observability, access control files all untouched.

5. Judgment:

  • Needed? — OK. Windows CI was broken by an incomplete submodule workaround; this addresses a real, reproducible failure mode.

  • Better than existing? — OK for the submodule unification. The --depth 1 Windows-only branch is correctly replaced with a generic script that handles plugin readiness. git -C "${path}" read-tree --reset -u HEAD as a repair step for empty initialized worktrees is an established Git pattern.

  • Duplication? — OK.

  • Blast radius covered? — CONCERN: windows-dev-smoke.yml hardcodes bun-version: "1.3.9" and test.yml uses an expression matrix.os == 'windows-latest' && '1.3.9' || env.BUN_VERSION — two sources of truth for the same pin. Bumping one will miss the other.

  • Logic sound? — CONCERN: The diff for ensurePluginBuildOutputs shows the original bun run build block removed (- lines) and replaced only with a tsup-specific bunx fallback. If getBuildCommandFallback returns null (non-tsup build) or the tsup binary already exists, the code falls through the bunx block and does nothing — no build runs. The existing test at line 362 ("builds root-level plugin packages…") uses build: "bun run build.ts" (not tsup), so getBuildCommandFallback returns null, and if bun run build is gone, that test should fail. Either the diff is truncated and bun run build survives as an unconditional fallback — which is the intended design but invisible in the diff — or the existing test is now broken. This must be resolved before merge.

  • Complexity appropriate? — getBuildCommandFallback's split(/\s+/).slice(1) to re-parse build script args is brittle for quoted arguments. Acceptable for current plugin corpus, but flagged.

  • Tested meaningfully? — CONCERN: "rebuilds plugin packages when dist exists but the declared types file is missing" has an inconsistent expected value. The mock has node_modules/.bin/tsup returning true, so the bunx path is not triggered. The code should fall through to bun run build with label: "bun run build (@elizaos/plugin-local-embedding)". But the test asserts label: "bunx tsup@8.5.0 (@elizaos/plugin-local-embedding)" — the bunx label on a bun command. This is a copy-paste error or tests behavior that doesn't exist.

  • Matches conventions? — OK.

  • Plausible breakage mode: A plugin with a partial dist/ (e.g., index.js present, index.d.ts missing after an interrupted build) is recognized as needing rebuild by hasPluginBuildOutputs, but if the build system isn't tsup and bun run build was removed, setup completes silently without rebuilding. Developer gets a broken import at runtime with no error.
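
The intended control flow discussed above (a tsup-only bunx fallback, with bun run build as the unconditional default) might be sketched like this. The function `chooseBuildCommand` and its parameter object are hypothetical names introduced here; only getBuildCommandFallback, the `split(/\s+/)` parsing, and the two label formats come from the review text.

```javascript
// Hypothetical sketch; names other than getBuildCommandFallback are invented.
// Returns a bunx command only when the build script is literally `tsup [args]`.
function getBuildCommandFallback(buildScript, tsupSpec) {
  // Whitespace splitting mirrors the parsing the review calls brittle:
  // a quoted argument containing spaces would be split incorrectly.
  const [tool, ...args] = String(buildScript).trim().split(/\s+/);
  if (tool !== "tsup") return null;
  return ["bunx", tsupSpec, ...args];
}

function chooseBuildCommand({ name, buildScript, tsupSpec, hasLocalTsup }) {
  // Use bunx only when the script is tsup AND the local bin is missing.
  const fallback = hasLocalTsup ? null : getBuildCommandFallback(buildScript, tsupSpec);
  if (fallback) {
    return { cmd: fallback, label: `bunx ${tsupSpec} (${name})` };
  }
  // Unconditional default: non-tsup builds and resolvable tsup bins land here.
  return { cmd: ["bun", "run", "build"], label: `bun run build (${name})` };
}
```

Under this reading, a plugin whose node_modules/.bin/tsup exists would be labeled `bun run build (...)`, which is the expectation the review says the test should assert.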

6. PR-type-specific checks:

  • Submodule init unified across all three workflow files: OK
  • MILADY_SKIP_LOCAL_UPSTREAMS: "1" in Windows smoke workflows (prevents unnecessary eliza checkout): OK
  • git read-tree --reset -u HEAD tried once then hard-fail — doesn't loop: OK
  • isSubmoduleCheckoutReady precedence (explicit markers → plugin manifests → true): OK, logic correct
  • ensureWindowsCmdShim sanitization: CONCERN — only " is escaped. A bin path containing \n, %, &, | would break or inject into the CMD script. Unrealistic in practice for npm bin entries, but worth a hard reject-if-invalid check rather than partial escaping.
  • Bun pin documented with inline comment: OK
  • Drift test updated for new patterns: OK
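
The "hard reject-if-invalid" alternative suggested for ensureWindowsCmdShim could be sketched as below. The function name `buildWindowsCmdShim`, the shim body, and the exact rejected character set are assumptions; the review itself names only `\n`, `%`, `&`, and `|` as problematic.

```javascript
// Hypothetical sketch of a reject-instead-of-escape shim builder.
// Assumption: the shim simply forwards all arguments to a JS entry point.
function buildWindowsCmdShim(targetPath) {
  // Refuse characters that CMD interprets specially rather than trying to
  // escape each one. The set beyond \n % & | is an extra precaution.
  if (/[\r\n"%&|<>^]/.test(targetPath)) {
    throw new Error(`Refusing to write cmd shim for unsafe bin path: ${JSON.stringify(targetPath)}`);
  }
  // %* forwards every argument passed to the shim.
  return `@ECHO off\r\nnode "${targetPath}" %*\r\n`;
}
```

Rejecting outright keeps the shim writer simple to audit, at the cost of failing setup for a bin path that legitimately contains one of these characters.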

7. Security: clear. No credentials, no outbound HTTP, no auth changes. CMD shim sanitization gap limited to local dev machines.

8. Decision: REQUEST CHANGES

Blockers before merge:

  1. Resolve the bun run build fate. The diff marks it removed; the existing test requires it. If it survives as a fallback after the bunx block, make that visible in the diff. If removed, add it back and test the non-tsup path explicitly.

  2. Fix the test label in "rebuilds plugin packages when dist exists but the declared types file is missing" — expected label should be "bun run build (@elizaos/plugin-local-embedding)", not the bunx label.

  3. Single source of truth for the Windows Bun pin. Define 1.3.9 as a shared env var or add a drift assertion verifying both workflow files agree on the same string.
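
A drift assertion for the Windows Bun pin, as suggested in blocker 3, might look something like the following. Both helper names and the two regular expressions are hypothetical; they assume the pin appears either as a quoted `bun-version` literal or inside the `matrix.os == 'windows-latest' && '…'` expression quoted above, and they would need adjusting to the real workflow files.

```javascript
// Hypothetical sketch of a pin drift check; patterns are assumptions.
function extractWindowsBunPins(workflowText) {
  const pins = new Set();
  // Literal form, e.g.  bun-version: "1.3.9"
  for (const m of workflowText.matchAll(/bun-version:\s*["'](\d+\.\d+\.\d+)["']/g)) {
    pins.add(m[1]);
  }
  // Expression form, e.g.  matrix.os == 'windows-latest' && '1.3.9' || ...
  for (const m of workflowText.matchAll(/'windows-latest'\s*&&\s*'(\d+\.\d+\.\d+)'/g)) {
    pins.add(m[1]);
  }
  return [...pins];
}

// Takes a map of file name -> file contents and throws on disagreement.
function assertSingleWindowsBunPin(workflows) {
  const all = new Set(Object.values(workflows).flatMap(extractWindowsBunPins));
  if (all.size !== 1) {
    throw new Error(`Expected exactly one Windows Bun pin, found: ${[...all].join(", ") || "none"}`);
  }
  return [...all][0];
}
```

A check of this shape could sit alongside the existing workflow drift assertions so that bumping one file without the other fails a test rather than going unnoticed.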


@github-actions github-actions bot added category:feature Auto-managed semantic PR category trust:probationary Building trust, closer scrutiny (auto-managed) and removed trust:probationary Building trust, closer scrutiny (auto-managed) category:feature Auto-managed semantic PR category labels Apr 9, 2026
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

The review is complete. Here's my verdict:


1. Classification (derived): build/CI + aesthetic — reclassified from feature. ~20% real fix, ~80% Biome formatter churn across 37 unrelated files.

2. Rubric: Correctness of submodule init logic, safety of new shell commands, bunx fallback path correctness, workflow invariant compliance, justification for bundled formatting mass.

3. Scope verdict: in scope (core fix); out of scope (37 formatter-only files)

4. Universal invariants: all intact

5. Key judgment calls:

  • Needed? OK — real Windows breakage (empty submodule worktrees, missing cmd shims)
  • Logic sound? CONCERN — app-manager.test.ts changes three assertions from toEqual({authToken, characterId}) → toBeUndefined() with no explanation. Semantic regression or silent test correction?
  • Tested? OK for new code — ensureWindowsCmdShim, bunx fallback, submodule repair all have unit tests
  • Conventions? CONCERN — 1,684 lines of Biome reformatting obscures the 503 lines of actual logic change

6. PR-specific checks: Bun 1.3.9 pinned on Windows with no tracking issue to unpin when upstream fixes the frozen-lockfile false-positive. The registry-client.test.ts switch to refreshRegistry() and the @hyperscape/plugin-hyperscape correction are both legitimate fixes.

7. Security: clear

Decision: REQUEST CHANGES

Required before merge:

  1. Exit DRAFT state (self-blocking)
  2. Explain the authMessage assertion changes in app-manager.test.ts
  3. Isolate or justify the 37-file formatter churn
  4. Add tracking reference for the Bun 1.3.9 pin

@github-actions github-actions bot added category:feature Auto-managed semantic PR category trust:probationary Building trust, closer scrutiny (auto-managed) and removed trust:probationary Building trust, closer scrutiny (auto-managed) category:feature Auto-managed semantic PR category labels Apr 9, 2026
@github-actions
Contributor

github-actions bot commented Apr 9, 2026

The review is ready. The command to post it was blocked — please approve it to publish, or let me know if you'd like any changes to the review text first.

Summary of the verdict:

REQUEST CHANGES — The Windows CI hardening work itself (workflow consolidation, init-submodules.mjs empty-worktree repair, setup-upstreams.mjs cmd shims + tsup fallback) is solid and well-tested. However, the PR has three blocking concerns that need resolution before merge:

  1. Auth assertion weakening in app-manager.test.ts — tests that previously verified auth tokens are forwarded now assert toBeUndefined(). No clear explanation in the code.
  2. Lockfile removes groq plugin workspace entry without explaining whether the submodule was intentionally removed or if the lockfile was refreshed without it checked out (a frozen-install trap for other developers).
  3. git read-tree --reset -u HEAD runs with no guard on local dev, silently discarding uncommitted changes in plugin submodules.

Plus a strong recommendation to split the ~800 lines of Biome reformatting from the behavioral changes.
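
One possible guard for concern 3 is to inspect `git status --porcelain` before running the reset and to proceed only when every entry is a plain worktree deletion, which is what an initialized-but-empty worktree would be expected to report. The sketch below is an assumption about how such a guard could work, not the PR's code; `repairEmptyWorktree` and the injected `run` helper are hypothetical names.

```javascript
// Hypothetical sketch of a guarded repair; not the PR's implementation.
// `run(cmd, args)` must return the command's raw, untrimmed stdout as a string.
function repairEmptyWorktree(submodulePath, run) {
  const status = run("git", ["-C", submodulePath, "status", "--porcelain"]);

  // Porcelain v1 lines look like "XY path". " D" means the file is tracked
  // but missing from the worktree. Anything else (modified, staged,
  // untracked) is local work that the reset could discard or hide.
  const blocking = status
    .split(/\r?\n/)
    .filter((line) => line !== "" && !line.startsWith(" D "));

  if (blocking.length > 0) {
    throw new Error(
      `Refusing to reset ${submodulePath}: local changes present:\n${blocking.join("\n")}`,
    );
  }

  run("git", ["-C", submodulePath, "read-tree", "--reset", "-u", "HEAD"]);
  return true;
}
```

In a real script, `run` could wrap `child_process.execFileSync` with `{ encoding: "utf8" }`. The output must not be trimmed, since the leading space in ` D` is significant.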

@github-actions github-actions bot added category:feature Auto-managed semantic PR category trust:probationary Building trust, closer scrutiny (auto-managed) and removed trust:probationary Building trust, closer scrutiny (auto-managed) category:feature Auto-managed semantic PR category labels Apr 9, 2026
@dutchiono
Collaborator Author

Consolidated into #1774. Closing this PR so the recovery work is reviewed and validated in one place.

@dutchiono dutchiono closed this Apr 9, 2026
