Skip to content

sync: rebase on upstream (2026-03-26)#959

Closed
AgentWOPR wants to merge 17 commits intoNVIDIA:mainfrom
wopr-network:sync/upstream-2026-03-26
Closed

sync: rebase on upstream (2026-03-26)#959
AgentWOPR wants to merge 17 commits intoNVIDIA:mainfrom
wopr-network:sync/upstream-2026-03-26

Conversation

@AgentWOPR
Copy link
Copy Markdown

@AgentWOPR AgentWOPR commented Mar 26, 2026

Automated upstream sync

Rebased our WOPR sidecar commits onto upstream/main (NVIDIA/NemoClaw).

What this does

  • Pulls in latest upstream changes (security fixes, features, CI improvements)
  • Resolves any rebase conflicts (preserving wopr/ sidecar)
  • Verifies sidecar + Dockerfile integrity

Verify

  • Build passes
  • wopr/sidecar.js intact
  • Dockerfile includes sidecar setup

Summary by CodeRabbit

  • New Features

    • WOPR sidecar service with health and authenticated provisioning endpoints for tenant onboarding; returns tenant info on success.
    • Sidecar listens on port 3100 and can run as the container foreground process.
    • Automated upstream sync flow with optional PR/push/dry-run modes and conflict-resolution assistance.
  • Chores

    • CI workflow for building and publishing container images.
    • Improved container initialization to seed default configs and ensure runtime directories.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

📝 Walkthrough

Walkthrough

Adds a WOPR sidecar (HTTP health/provisioning), CI and upstream-sync GitHub Actions, Dockerfile/runtime initialization changes to include sidecar and default configs, and a new upstream-sync script that automates rebasing and conflict resolution using Claude agent SDK.

Changes

Cohort / File(s) Summary
CI Workflows
.github/workflows/ci.yml, .github/workflows/upstream-sync.yml
Adds CI pipeline to build/push Docker images and a scheduled/manual upstream-sync workflow that fetches upstream, prepares environment, and runs the sync script with `pr
WOPR Sidecar
wopr/sidecar.js, wopr/package.json
New Node.js sidecar exposing GET /internal/health and POST /internal/provision for tenant provisioning (bearer auth), configures ~/.openclaw/openclaw.json, and returns tenant metadata.
Docker image / Runtime defaults
Dockerfile
Copies wopr/ into image, pre-creates OpenClaw runtime dirs with restrictive perms, snapshots /sandbox/.openclaw and /sandbox/.nemoclaw to /opt/nemoclaw-defaults/, sets USER sandbox, HOME=/data, and exposes port 3100.
Entrypoint / Startup
scripts/nemoclaw-start.sh
On startup, restores defaults into ${HOME}/.openclaw and ${HOME}/.nemoclaw if missing, ensures agent directory exists, creates /tmp/gateway.log, and conditionally execs the WOPR sidecar (/opt/wopr/sidecar.js) as the foreground process.
Upstream sync automation
scripts/upstream-sync.mjs
New sync script that fetches upstream, creates backup branch, rebases, invokes Claude agent SDK to resolve conflicts and fix sidecar syntax issues, runs basic checks, and either opens a PR, force-pushes, or dry-runs based on flags.
Misc / Artifacts
agent-events.log (produced), ~/.claude/.credentials.json (written by workflow)
Workflows/scripts emit agent-events.log and workflows write Claude credentials for agent SDK runtime (artifact upload retained).

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Workflow as "upstream-sync workflow"
    participant Script as "scripts/upstream-sync.mjs"
    participant Git as "Git (local/upstream/origin)"
    participant Agent as "Claude Agent SDK"
    participant GH as "GitHub (gh CLI)"

    User->>Workflow: trigger (cron or manual with mode)
    Workflow->>Script: run with mode (pr/push/dry-run), supply tokens
    Script->>Git: fetch upstream, compute behind-count
    alt behind > 0
        Script->>Git: create backup branch, attempt rebase upstream/main
        alt rebase succeeds
            Script->>Script: continue checks
        else rebase conflicts
            Script->>Git: list conflicting files
            Script->>Agent: send conflict markers & rules
            Agent->>Script: return resolved patches
            Script->>Git: stage resolved files, git rebase --continue
        end
        Script->>Script: run syntax checks (e.g., node --check wopr/sidecar.js)
        alt syntax issues
            Script->>Agent: request code fixes
            Agent->>Script: return fixes
            Script->>Git: apply fixes, continue
        end
        alt mode == dry-run
            Script->>Script: skip push
        else mode == push
            Script->>Git: force-push origin/main (with GH_TOKEN)
        else mode == pr
            Script->>Git: create sync branch, push
            Script->>GH: create PR via gh CLI
        end
    end
    Script->>Workflow: upload agent-events.log artifact
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 I hopped in the build with a hop and a spin,

I watched the sidecar guard the gateway within.
The sync agent stitched branches with nimble delight,
Workflows hum softly through day and through night.
A rabbit-approved release—ready to flight! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.78% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'sync: rebase on upstream (2026-03-26)' is generic and lacks clarity about the actual content of the changeset, which introduces a WOPR sidecar component with multiple workflow and script additions beyond just rebasing. Consider a more descriptive title such as 'Add WOPR sidecar with upstream sync automation' to better reflect the primary changes (sidecar endpoints, workflows, and helper scripts).
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@TSavo
Copy link
Copy Markdown

TSavo commented Mar 26, 2026

Accidental PR from automated fork sync — closing. Sorry!

TSavo pushed a commit to wopr-network/nemoclaw that referenced this pull request Mar 26, 2026
gh pr create defaults to parent (NVIDIA/NemoClaw) on forks.
Add --repo wopr-network/nemoclaw to target our fork.
Also clean up debug output from credential investigation.

Note: accidental PR NVIDIA#959 was opened on NVIDIA/NemoClaw — needs manual close.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@TSavo
Copy link
Copy Markdown

TSavo commented Mar 26, 2026

Accidental PR from automated fork sync script. Apologies!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/ci.yml:
- Around line 3-11: The workflow currently allows pull_request events to run the
"build-and-push" job on a self-hosted runner via runs-on: [self-hosted, Linux,
X64]; change this so untrusted fork PRs never run on your private runner by
splitting or conditioning the job: keep the self-hosted runner for trusted push
events (push to main) and add a separate job (or conditional job) for
pull_request that uses a GitHub-hosted runner like ubuntu-latest; update the
"on:" triggers and adjust the "build-and-push" job (or create a new
"build-and-push-pr" job) so that pull_request runs on github-hosted runners
while push continues to use runs-on: [self-hosted, Linux, X64].

In @.github/workflows/upstream-sync.yml:
- Around line 1-23: The workflow "Upstream Sync" lacks a concurrency guard so
overlapping runs of the sync job can race and force-push main; add a top-level
concurrency section to the workflow (above jobs) that defines a stable group
name (e.g., "upstream-sync-${{ github.workflow }}" or include event/input like
mode) and set cancel-in-progress: true to ensure any running sync is cancelled
when a new run starts; this change affects the workflow header and protects the
sync job named "sync" from concurrent executions that could rebase/push the same
branch.

In `@Dockerfile`:
- Around line 160-164: The final USER sandbox and ENV HOME=/data in the
Dockerfile force scripts/nemoclaw-start.sh into the non-root branch so it resets
HOME and never reaches the WOPR sidecar startup; fix by keeping the container
entrypoint executed as a privileged user so the privileged branch in
scripts/nemoclaw-start.sh runs and the sidecar starts: either remove or move the
final "USER sandbox" to after the ENTRYPOINT/CMD invocation (or explicitly set
USER root before the ENTRYPOINT), and ensure ENV HOME remains appropriate for
the privileged startup path so the code paths that start the WOPR sidecar (the
privileged branch in scripts/nemoclaw-start.sh) are reachable.
- Around line 157-159: The default snapshot copies /sandbox/.openclaw preserving
symlinks that point back to /sandbox/.openclaw-data; change the copy to
dereference symlinks so targets are copied instead of links. Update the RUN that
currently uses "cp -a /sandbox/.openclaw /opt/nemoclaw-defaults && cp -a
/sandbox/.nemoclaw /opt/nemoclaw-defaults/.nemoclaw" to a form that dereferences
links (e.g., use "cp -aL" or an rsync equivalent) so the files under
/opt/nemoclaw-defaults no longer reference /sandbox and
scripts/nemoclaw-start.sh will copy a self-contained tree into ${HOME} without
pointing back to /sandbox.

In `@scripts/nemoclaw-start.sh`:
- Around line 310-315: The current script does an exec on the sidecar (exec node
/opt/wopr/sidecar.js) which makes the gateway unreachable and prevents PID 1
from supervising both processes; instead start the sidecar in background (&) and
capture its PID (e.g. SIDECAR_PID=$!), ensure the gateway is also started in
background earlier and its PID captured (GATEWAY_PID), add traps to forward
SIGTERM/SIGINT to both PIDs (kill -TERM "$GATEWAY_PID" "$SIDECAR_PID"), then use
wait -n "$GATEWAY_PID" "$SIDECAR_PID" so PID 1 exits when either process dies
(optionally exit with that process's status); replace the exec node
/opt/wopr/sidecar.js line with this supervised background + wait/trap pattern.

In `@scripts/upstream-sync.mjs`:
- Around line 289-293: The debug log currently prints a substring of the GH
token (ghToken.slice(0, 8)) which leaks secret material; remove that emission
and instead either omit any GH_TOKEN-derived output or replace it with a
non-secret indicator (e.g., "GH_TOKEN: [REDACTED]" or a boolean/opt-in flag) in
the debug block that uses log and tryRun; ensure the ghToken variable is never
logged, and gate any auth diagnostics behind an explicit debug flag (or
environment opt-in) before calling log with auth-related messages from tryRun/gh
auth status.
- Around line 243-257: After invoking runAgent to repair wopr/sidecar.js when
sidecarCheck.ok is false, re-run the syntax check (call tryRun("node --check
wopr/sidecar.js") again) and verify the new result; if the second sidecar check
still fails, log an explicit error and fail the process (e.g., exit non-zero or
throw) so buildCheck() cannot fall through to "Build check passed."; update the
logic around sidecarCheck, tryRun, and runAgent to perform this re-validation
and abort on persistent syntax errors.
- Around line 101-120: The agent is being granted shell access and bypassed
permissions in runAgent, exposing live tokens; update runAgent to remove
dangerous tools and flags by default (change the default tools list in runAgent
from including "Bash" to only ["Read","Edit","Write","Glob","Grep"] and ensure
the options passed to _query do not set permissionMode: "bypassPermissions" or
allowDangerouslySkipPermissions: true), keep all git/git-add/rebase operations
in the host script (not performed by the agent), and in buildCheck ensure that
after the agent attempts any fixes you re-run the host process check (run "node
--check" or equivalent from the host) to validate the repair succeeded; also
remove any logging of GH_PAT (do not log first 8 chars or any substring) —
replace uses that log GH_PAT with a no-op or redacted message referencing the
GH_PAT variable name.

In `@wopr/sidecar.js`:
- Around line 15-24: The code currently treats an empty SECRET as a valid
credential; update the authentication to fail-closed by making assertSecret
return false if SECRET is empty (or otherwise invalid) before comparing tokens,
or enforce presence at startup and exit if SECRET is falsy; specifically, modify
the assertSecret function (and/or the top-level SECRET initialization check) so
that when SECRET is undefined/empty the function always rejects the request,
ensuring /internal/provision cannot be written when the secret is not
configured.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f0da510a-13fd-47fc-9b91-3f44e61478ea

📥 Commits

Reviewing files that changed from the base of the PR and between f0f53e4 and 7aacae8.

📒 Files selected for processing (7)
  • .github/workflows/ci.yml
  • .github/workflows/upstream-sync.yml
  • Dockerfile
  • scripts/nemoclaw-start.sh
  • scripts/upstream-sync.mjs
  • wopr/package.json
  • wopr/sidecar.js

Comment on lines +3 to +11
on:
push:
branches: [main]
pull_request:
branches: [main]

jobs:
build-and-push:
runs-on: [self-hosted, Linux, X64]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Don't execute fork PRs on the self-hosted runner.

pull_request plus runs-on: [self-hosted, ...] lets untrusted PR code run arbitrary Docker build steps on your private runner. Keep self-hosted execution for trusted pushes/same-repo branches only, or move PR builds to GitHub-hosted runners.

🔒 Minimal guard
 jobs:
   build-and-push:
+    if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository
     runs-on: [self-hosted, Linux, X64]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
build-and-push:
runs-on: [self-hosted, Linux, X64]
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
build-and-push:
if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository
runs-on: [self-hosted, Linux, X64]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml around lines 3 - 11, The workflow currently allows
pull_request events to run the "build-and-push" job on a self-hosted runner via
runs-on: [self-hosted, Linux, X64]; change this so untrusted fork PRs never run
on your private runner by splitting or conditioning the job: keep the
self-hosted runner for trusted push events (push to main) and add a separate job
(or conditional job) for pull_request that uses a GitHub-hosted runner like
ubuntu-latest; update the "on:" triggers and adjust the "build-and-push" job (or
create a new "build-and-push-pr" job) so that pull_request runs on github-hosted
runners while push continues to use runs-on: [self-hosted, Linux, X64].

Comment on lines +1 to +23
name: Upstream Sync

on:
schedule:
# Daily at 06:30 UTC (offset from paperclip's 06:00)
- cron: "30 6 * * *"
workflow_dispatch:
inputs:
mode:
description: "Sync mode"
required: true
default: "pr"
type: choice
options:
- pr
- push
- dry-run

jobs:
sync:
runs-on: self-hosted
timeout-minutes: 20

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add workflow concurrency before this can push main.

This workflow can be started by schedule, manual dispatch, and reruns, but there is no concurrency guard. Two overlapping runs can rebase from the same starting point and both force-push main or reuse the same sync/upstream-YYYY-MM-DD branch.

⛔ Serialize sync runs
 name: Upstream Sync
+
+concurrency:
+  group: upstream-sync-main
+  cancel-in-progress: true
 
 on:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/upstream-sync.yml around lines 1 - 23, The workflow
"Upstream Sync" lacks a concurrency guard so overlapping runs of the sync job
can race and force-push main; add a top-level concurrency section to the
workflow (above jobs) that defines a stable group name (e.g., "upstream-sync-${{
github.workflow }}" or include event/input like mode) and set
cancel-in-progress: true to ensure any running sync is cancelled when a new run
starts; this change affects the workflow header and protects the sync job named
"sync" from concurrent executions that could rebase/push the same branch.

Comment on lines +157 to +159
# Save build-time config as defaults — startup script copies to writable HOME
RUN cp -a /sandbox/.openclaw /opt/nemoclaw-defaults \
&& cp -a /sandbox/.nemoclaw /opt/nemoclaw-defaults/.nemoclaw
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

The default snapshot still points back into /sandbox/.openclaw-data.

Dockerfile.base makes /sandbox/.openclaw/* symlinks to absolute /sandbox/.openclaw-data/* targets. cp -a preserves those links here, so when scripts/nemoclaw-start.sh later copies this tree into ${HOME}/.openclaw, writes under HOME=/data still go back to /sandbox and fail on a read-only rootfs.

🐛 One minimal fix
-RUN cp -a /sandbox/.openclaw /opt/nemoclaw-defaults \
+RUN cp -aL /sandbox/.openclaw /opt/nemoclaw-defaults \
     && cp -a /sandbox/.nemoclaw /opt/nemoclaw-defaults/.nemoclaw
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Save build-time config as defaults — startup script copies to writable HOME
RUN cp -a /sandbox/.openclaw /opt/nemoclaw-defaults \
&& cp -a /sandbox/.nemoclaw /opt/nemoclaw-defaults/.nemoclaw
# Save build-time config as defaults — startup script copies to writable HOME
RUN cp -aL /sandbox/.openclaw /opt/nemoclaw-defaults \
&& cp -a /sandbox/.nemoclaw /opt/nemoclaw-defaults/.nemoclaw
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` around lines 157 - 159, The default snapshot copies
/sandbox/.openclaw preserving symlinks that point back to
/sandbox/.openclaw-data; change the copy to dereference symlinks so targets are
copied instead of links. Update the RUN that currently uses "cp -a
/sandbox/.openclaw /opt/nemoclaw-defaults && cp -a /sandbox/.nemoclaw
/opt/nemoclaw-defaults/.nemoclaw" to a form that dereferences links (e.g., use
"cp -aL" or an rsync equivalent) so the files under /opt/nemoclaw-defaults no
longer reference /sandbox and scripts/nemoclaw-start.sh will copy a
self-contained tree into ${HOME} without pointing back to /sandbox.

Comment on lines +310 to +315
# Start WOPR sidecar in foreground — keeps the container alive.
# Serves /internal/health + /internal/provision for nemoclaw-platform.
if [ -f /opt/wopr/sidecar.js ]; then
echo "[wopr-sidecar] starting in foreground (port ${PORT:-3100})"
exec node /opt/wopr/sidecar.js
fi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep PID 1 supervising both the gateway and the sidecar.

After the background launch at Lines 303-305, exec node /opt/wopr/sidecar.js makes Line 319 unreachable. If the gateway dies later, the container stays up behind the sidecar process and can continue serving stale health. Use wait -n/traps or a small supervisor so either process brings the container down.

🔁 One way to keep both processes under supervision
 if [ -f /opt/wopr/sidecar.js ]; then
   echo "[wopr-sidecar] starting in foreground (port ${PORT:-3100})"
-  exec node /opt/wopr/sidecar.js
+  node /opt/wopr/sidecar.js &
+  SIDECAR_PID=$!
+  trap 'kill "$GATEWAY_PID" "$SIDECAR_PID" 2>/dev/null || true' EXIT INT TERM
+  wait -n "$GATEWAY_PID" "$SIDECAR_PID"
+  exit $?
 fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Start WOPR sidecar in foreground — keeps the container alive.
# Serves /internal/health + /internal/provision for nemoclaw-platform.
if [ -f /opt/wopr/sidecar.js ]; then
echo "[wopr-sidecar] starting in foreground (port ${PORT:-3100})"
exec node /opt/wopr/sidecar.js
fi
# Start WOPR sidecar in foreground — keeps the container alive.
# Serves /internal/health + /internal/provision for nemoclaw-platform.
if [ -f /opt/wopr/sidecar.js ]; then
echo "[wopr-sidecar] starting in foreground (port ${PORT:-3100})"
node /opt/wopr/sidecar.js &
SIDECAR_PID=$!
trap 'kill "$GATEWAY_PID" "$SIDECAR_PID" 2>/dev/null || true' EXIT INT TERM
wait -n "$GATEWAY_PID" "$SIDECAR_PID"
exit $?
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/nemoclaw-start.sh` around lines 310 - 315, The current script does an
exec on the sidecar (exec node /opt/wopr/sidecar.js) which makes the gateway
unreachable and prevents PID 1 from supervising both processes; instead start
the sidecar in background (&) and capture its PID (e.g. SIDECAR_PID=$!), ensure
the gateway is also started in background earlier and its PID captured
(GATEWAY_PID), add traps to forward SIGTERM/SIGINT to both PIDs (kill -TERM
"$GATEWAY_PID" "$SIDECAR_PID"), then use wait -n "$GATEWAY_PID" "$SIDECAR_PID"
so PID 1 exits when either process dies (optionally exit with that process's
status); replace the exec node /opt/wopr/sidecar.js line with this supervised
background + wait/trap pattern.

Comment on lines +101 to +120
async function runAgent(prompt, opts = {}) {
await loadSdk();
const phase = opts.phase ?? "unknown";
const tools = opts.tools ?? ["Read", "Edit", "Write", "Bash", "Glob", "Grep"];
let result = "";
let turnCount = 0;

log(`Agent [${phase}] starting (model: ${opts.model ?? "claude-sonnet-4-6"}, maxTurns: ${opts.maxTurns ?? 60})`);
logEvent(phase, { type: "agent_start", model: opts.model, maxTurns: opts.maxTurns ?? 60 });

for await (const message of _query({
prompt,
options: {
cwd: CWD,
allowedTools: tools,
permissionMode: "bypassPermissions",
allowDangerouslySkipPermissions: true,
maxTurns: opts.maxTurns ?? 60,
model: opts.model ?? "claude-sonnet-4-6",
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the complete runAgent function and surrounding code
cat -n scripts/upstream-sync.mjs | head -150 | tail -70

Repository: NVIDIA/NemoClaw

Length of output: 3016


🏁 Script executed:

# Check the GitHub workflow file for credential injection
cat -n .github/workflows/upstream-sync.yml | grep -A 5 -B 5 "CLAUDE_CODE_OAUTH_TOKEN\|GH_TOKEN"

Repository: NVIDIA/NemoClaw

Length of output: 535


🏁 Script executed:

# Examine the buildCheck function mentioned in the review
sed -n '243,257p' scripts/upstream-sync.mjs

Repository: NVIDIA/NemoClaw

Length of output: 528


🏁 Script executed:

# Examine the gitPush function and token logging
sed -n '289,293p' scripts/upstream-sync.mjs

Repository: NVIDIA/NemoClaw

Length of output: 459


The agent receives shell access while processing untrusted upstream content with live credentials in scope.

The workflow (lines 77-79 in .github/workflows/upstream-sync.yml) injects CLAUDE_CODE_OAUTH_TOKEN and GH_TOKEN into the process environment. The runAgent() function at line 104 grants the model the Bash tool by default, and line 116-117 set permissionMode: "bypassPermissions" and allowDangerouslySkipPermissions: true. This allows prompt injection in a conflicted upstream file to execute arbitrary commands on the runner with push rights. When buildCheck() (line 243-257) invokes the agent to fix syntax errors, those errors can come from untrusted upstream sources.

Restrict the agent to read/edit tools only (["Read", "Edit", "Write", "Glob", "Grep"]) and keep git operations (git add, git rebase --continue) in the host script.

Additionally, line 291 logs the first 8 characters of GH_PAT, which should be removed from logs entirely, and the agent fix at line 243-257 should re-run node --check afterward to validate the repair succeeded.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/upstream-sync.mjs` around lines 101 - 120, The agent is being granted
shell access and bypassed permissions in runAgent, exposing live tokens; update
runAgent to remove dangerous tools and flags by default (change the default
tools list in runAgent from including "Bash" to only
["Read","Edit","Write","Glob","Grep"] and ensure the options passed to _query do
not set permissionMode: "bypassPermissions" or allowDangerouslySkipPermissions:
true), keep all git/git-add/rebase operations in the host script (not performed
by the agent), and in buildCheck ensure that after the agent attempts any fixes
you re-run the host process check (run "node --check" or equivalent from the
host) to validate the repair succeeded; also remove any logging of GH_PAT (do
not log first 8 chars or any substring) — replace uses that log GH_PAT with a
no-op or redacted message referencing the GH_PAT variable name.

Comment on lines +243 to +257
// Check sidecar syntax
const sidecarCheck = tryRun("node --check wopr/sidecar.js");
if (!sidecarCheck.ok) {
log("Sidecar syntax check failed. Invoking agent to fix...");
await runAgent(
`The WOPR sidecar has a syntax error after upstream sync:

\`\`\`
${sidecarCheck.output.slice(0, 2000)}
\`\`\`

Fix the syntax error in wopr/sidecar.js. Do NOT remove sidecar functionality.`,
{ model: "claude-sonnet-4-6", phase: "sidecar-fix" },
);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Re-validate the sidecar after the auto-fix.

If the agent leaves wopr/sidecar.js broken, buildCheck() still falls through to Build check passed. and the sync can push a syntax error. Re-run node --check and fail if it still does not parse.

✅ Make the syntax repair verifiable
   const sidecarCheck = tryRun("node --check wopr/sidecar.js");
   if (!sidecarCheck.ok) {
     log("Sidecar syntax check failed. Invoking agent to fix...");
     await runAgent(
       `The WOPR sidecar has a syntax error after upstream sync:
@@
       { model: "claude-sonnet-4-6", phase: "sidecar-fix" },
     );
+    const sidecarRetry = tryRun("node --check wopr/sidecar.js");
+    if (!sidecarRetry.ok) {
+      log(`Sidecar syntax check still failing:\n${sidecarRetry.output}`);
+      return false;
+    }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/upstream-sync.mjs` around lines 243 - 257, After invoking runAgent to
repair wopr/sidecar.js when sidecarCheck.ok is false, re-run the syntax check
(call tryRun("node --check wopr/sidecar.js") again) and verify the new result;
if the second sidecar check still fails, log an explicit error and fail the
process (e.g., exit non-zero or throw) so buildCheck() cannot fall through to
"Build check passed."; update the logic around sidecarCheck, tryRun, and
runAgent to perform this re-validation and abort on persistent syntax errors.

Comment on lines +15 to +24
const SECRET = process.env.WOPR_PROVISION_SECRET ?? "";
const GATEWAY_URL = process.env.WOPR_GATEWAY_URL ?? "";
const PORT = parseInt(process.env.PORT ?? process.env.WOPR_SIDECAR_PORT ?? "3100", 10);
const OPENCLAW_CONFIG_PATH = path.join(os.homedir(), ".openclaw", "openclaw.json");

function assertSecret(req) {
const auth = req.headers["authorization"] ?? "";
if (!auth.startsWith("Bearer ")) return false;
return auth.slice("Bearer ".length).trim() === SECRET;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail closed when the provision secret is missing.

SECRET defaults to "", so Authorization: Bearer passes assertSecret() in an unconfigured deployment. That turns /internal/provision into a writable config endpoint exactly when the secret is absent.

🔐 Minimal fail-closed fix
 function assertSecret(req) {
+  if (!SECRET) return false;
   const auth = req.headers["authorization"] ?? "";
   if (!auth.startsWith("Bearer ")) return false;
   return auth.slice("Bearer ".length).trim() === SECRET;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const SECRET = process.env.WOPR_PROVISION_SECRET ?? "";
const GATEWAY_URL = process.env.WOPR_GATEWAY_URL ?? "";
const PORT = parseInt(process.env.PORT ?? process.env.WOPR_SIDECAR_PORT ?? "3100", 10);
const OPENCLAW_CONFIG_PATH = path.join(os.homedir(), ".openclaw", "openclaw.json");
function assertSecret(req) {
const auth = req.headers["authorization"] ?? "";
if (!auth.startsWith("Bearer ")) return false;
return auth.slice("Bearer ".length).trim() === SECRET;
}
const SECRET = process.env.WOPR_PROVISION_SECRET ?? "";
const GATEWAY_URL = process.env.WOPR_GATEWAY_URL ?? "";
const PORT = parseInt(process.env.PORT ?? process.env.WOPR_SIDECAR_PORT ?? "3100", 10);
const OPENCLAW_CONFIG_PATH = path.join(os.homedir(), ".openclaw", "openclaw.json");
function assertSecret(req) {
if (!SECRET) return false;
const auth = req.headers["authorization"] ?? "";
if (!auth.startsWith("Bearer ")) return false;
return auth.slice("Bearer ".length).trim() === SECRET;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wopr/sidecar.js` around lines 15 - 24, The code currently treats an empty
SECRET as a valid credential; update the authentication to fail-closed by making
assertSecret return false if SECRET is empty (or otherwise invalid) before
comparing tokens, or enforce presence at startup and exit if SECRET is falsy;
specifically, modify the assertSecret function (and/or the top-level SECRET
initialization check) so that when SECRET is undefined/empty the function always
rejects the request, ensuring /internal/provision cannot be written when the
secret is not configured.

WOPR and others added 17 commits March 26, 2026 05:01
Adds /internal/health and /internal/provision endpoints so nemoclaw-platform
can provision NemoClaw containers using the same contract as paperclip-platform.

On provision: rewrites openclaw.json to route inference through WOPR's gateway
instead of NVIDIA's, enabling metered billing via platform-core credit ledger.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Dockerfile: change ENTRYPOINT from /bin/bash to /usr/local/bin/nemoclaw-start
  so the container actually runs the startup script instead of exiting immediately
- nemoclaw-start.sh: run sidecar in foreground via exec (keeps container alive)
  with tail -f fallback if no sidecar present
- sidecar.js: read PORT env var (set by fleet.ts to 3100) instead of
  WOPR_SIDECAR_PORT defaulting to 3001

Co-authored-by: WOPR <wopr@localhost>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FleetManager creates containers with ReadonlyRootfs=true. The startup
script was writing to /sandbox/.openclaw/ which is in the read-only
rootfs, causing OSError: Read-only file system.

Fix:
- Dockerfile: save build-time openclaw config to /opt/nemoclaw-defaults/,
  set HOME=/data so all writes go to the writable volume mount
- nemoclaw-start.sh: copy default config from /opt/nemoclaw-defaults/
  to $HOME/.openclaw/ on first boot before running openclaw commands

Co-authored-by: WOPR <wopr@localhost>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The platform's checkHealth expects { ok: true, provisioning: true }
but the sidecar only returned { ok, gateway }. Without the provisioning
field, health checks always fail, blocking subdomain routing and
container provision.

Co-authored-by: WOPR <wopr@localhost>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
)

provision-client's checkHealth() hits /internal/provision/health, not
/internal/health. The sidecar only served /internal/health, so health
checks always got 404 → always unhealthy → no subdomain routing, no
provisioning.

Co-authored-by: WOPR <wopr@localhost>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agent SDK-powered rebase that:
1. Fetches NVIDIA/NemoClaw upstream daily at 06:30 UTC
2. Rebases our 5 WOPR sidecar commits on top
3. Resolves conflicts via Claude Agent SDK
4. Verifies sidecar + Dockerfile integrity
5. Creates a PR (or force-pushes with --push flag)

Uses extraheader auth to bypass self-hosted runner credential helpers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
extraheader alone wasn't enough — system credential manager still
intercepted. Now uses: token-embedded URL + extraheader + env overrides
(GIT_ASKPASS=/bin/true, GCM_INTERACTIVE=never) + system-level
credential.helper unset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous approaches (config unset, env vars, URL tokens) all failed
because the self-hosted runner's system-level GCM intercepts before
any of those take effect. Inline -c flags on the git command itself
are the only override that takes absolute precedence.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Push :staging tag alongside :latest, SSH to staging VPS,
copy prod DB, restart staging containers, verify health.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Self-hosted runner had leftover @anthropic-ai/claude-code dir causing
ENOTEMPTY on npm install -g. Remove old install before reinstalling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-c flags had shell quoting issues with the AUTHORIZATION header space.
GIT_CONFIG_COUNT/KEY/VALUE env vars set via execSync env option bypass
all config levels cleanly without shell interpolation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Write a tiny shell script that echoes username/password, then override
credential.helper via GIT_CONFIG_COUNT env. No base64, no extraheader,
no quoting issues — git's native credential protocol handles it all.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Credential managers intercept the 'origin' remote. Create a fresh
temporary remote with token embedded in URL — credential managers
have no cached creds for it. Clean up after push.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lper

Need to see what's intercepting git push on the self-hosted runner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
gh pr create defaults to parent (NVIDIA/NemoClaw) on forks.
Add --repo wopr-network/nemoclaw to target our fork.
Also clean up debug output from credential investigation.

Note: accidental PR NVIDIA#959 was opened on NVIDIA/NemoClaw — needs manual close.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agent SDK reads ~/.claude/.credentials.json, not env vars.
Write CLAUDE_CREDENTIALS_JSON org secret to disk before sync.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@AgentWOPR AgentWOPR force-pushed the sync/upstream-2026-03-26 branch from 7aacae8 to a86248c Compare March 26, 2026 05:03
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
.github/workflows/upstream-sync.yml (1)

1-23: ⚠️ Potential issue | 🟠 Major

Add top-level workflow concurrency to prevent overlapping sync pushes.

This workflow can overlap across schedule/manual/reruns and race on the same rebase/push path.

Suggested fix
 name: Upstream Sync
+
+concurrency:
+  group: upstream-sync-main
+  cancel-in-progress: true
 
 on:
GitHub Actions docs: confirm top-level `concurrency` behavior for cancelling in-progress runs and recommended usage for scheduled workflows that can push branches.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/upstream-sync.yml around lines 1 - 23, Add a top-level
GitHub Actions concurrency block to prevent overlapping runs of the "Upstream
Sync" workflow (affecting jobs under jobs.sync) by configuring a unique
concurrency group and enabling cancel-in-progress; for example, add a
concurrency mapping that includes workflow name and ref/inputs (so scheduled and
manual runs dedupe correctly) and set cancel-in-progress: true to ensure any
in-flight sync run is cancelled before a new one proceeds.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/upstream-sync.yml:
- Around line 85-93: The node invocation is expanding $FLAG unquoted which can
trigger shellword-splitting/globbing (SC2086); update the call that runs
scripts/upstream-sync.mjs to quote the variable (use "$FLAG") so MODE/FLAG are
passed as a single argument to node and avoid unintended splitting or glob
expansion when invoking scripts/upstream-sync.mjs.
- Around line 64-68: Remove the "Write Claude credentials" step that creates
~/.claude/.credentials.json and stops echoing the secret; the upstream-sync flow
uses CLAUDE_CODE_OAUTH_TOKEN passed to the Agent SDK, so eliminate the workflow
step referencing secrets.CLAUDE_CREDENTIALS_JSON (the block with name "Write
Claude credentials") to avoid writing secrets to disk and rely solely on the
CLAUDE_CODE_OAUTH_TOKEN env var used by upstream-sync.mjs.

---

Duplicate comments:
In @.github/workflows/upstream-sync.yml:
- Around line 1-23: Add a top-level GitHub Actions concurrency block to prevent
overlapping runs of the "Upstream Sync" workflow (affecting jobs under
jobs.sync) by configuring a unique concurrency group and enabling
cancel-in-progress; for example, add a concurrency mapping that includes
workflow name and ref/inputs (so scheduled and manual runs dedupe correctly) and
set cancel-in-progress: true to ensure any in-flight sync run is cancelled
before a new one proceeds.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f0649cf0-ba27-4916-850f-b6fabb814499

📥 Commits

Reviewing files that changed from the base of the PR and between 7aacae8 and a86248c.

📒 Files selected for processing (7)
  • .github/workflows/ci.yml
  • .github/workflows/upstream-sync.yml
  • Dockerfile
  • scripts/nemoclaw-start.sh
  • scripts/upstream-sync.mjs
  • wopr/package.json
  • wopr/sidecar.js
✅ Files skipped from review due to trivial changes (3)
  • wopr/package.json
  • .github/workflows/ci.yml
  • scripts/upstream-sync.mjs
🚧 Files skipped from review as they are similar to previous changes (2)
  • scripts/nemoclaw-start.sh
  • wopr/sidecar.js

Comment on lines +64 to +68
- name: Write Claude credentials
run: |
mkdir -p ~/.claude
echo '${{ secrets.CLAUDE_CREDENTIALS_JSON }}' > ~/.claude/.credentials.json

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether repository scripts reference the credentials file path.
rg -n --hidden -S '\.claude/\.credentials\.json|CLAUDE_CREDENTIALS_JSON|CLAUDE_CODE_OAUTH_TOKEN|ANTHROPIC_API_KEY'

Repository: NVIDIA/NemoClaw

Length of output: 1535


🏁 Script executed:

cat scripts/upstream-sync.mjs | head -100

Repository: NVIDIA/NemoClaw

Length of output: 3273


🏁 Script executed:

grep -n "credentials\|\.claude" scripts/upstream-sync.mjs

Repository: NVIDIA/NemoClaw

Length of output: 41


🏁 Script executed:

wc -l scripts/upstream-sync.mjs

Repository: NVIDIA/NemoClaw

Length of output: 88


🏁 Script executed:

cat -n scripts/upstream-sync.mjs | tail -300

Repository: NVIDIA/NemoClaw

Length of output: 11904


Remove the unused Claude credentials file write step.

The upstream-sync.mjs script authenticates exclusively via the CLAUDE_CODE_OAUTH_TOKEN environment variable passed directly to the Agent SDK, not via the credentials file. Writing CLAUDE_CREDENTIALS_JSON to disk increases secret-at-rest risk without providing any benefit.

-      - name: Write Claude credentials
-        run: |
-          mkdir -p ~/.claude
-          echo '${{ secrets.CLAUDE_CREDENTIALS_JSON }}' > ~/.claude/.credentials.json
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Write Claude credentials
run: |
mkdir -p ~/.claude
echo '${{ secrets.CLAUDE_CREDENTIALS_JSON }}' > ~/.claude/.credentials.json
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/upstream-sync.yml around lines 64 - 68, Remove the "Write
Claude credentials" step that creates ~/.claude/.credentials.json and stops
echoing the secret; the upstream-sync flow uses CLAUDE_CODE_OAUTH_TOKEN passed
to the Agent SDK, so eliminate the workflow step referencing
secrets.CLAUDE_CREDENTIALS_JSON (the block with name "Write Claude credentials")
to avoid writing secrets to disk and rely solely on the CLAUDE_CODE_OAUTH_TOKEN
env var used by upstream-sync.mjs.

Comment on lines +85 to +93
run: |
MODE="${{ github.event.inputs.mode || 'pr' }}"
case "$MODE" in
push) FLAG="--push" ;;
pr) FLAG="--pr" ;;
dry-run) FLAG="--dry-run" ;;
esac
node scripts/upstream-sync.mjs $FLAG

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Quick check for unquoted variable expansions in workflow run blocks.
rg -n --hidden -S '\$[A-Za-z_][A-Za-z0-9_]*' .github/workflows/upstream-sync.yml

Repository: NVIDIA/NemoClaw

Length of output: 292


Quote $FLAG when invoking the sync script.

Line 92 should quote the variable to satisfy SC2086 and prevent word-splitting/globbing issues.

Suggested fix
-          node scripts/upstream-sync.mjs $FLAG
+          node scripts/upstream-sync.mjs "$FLAG"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
run: |
MODE="${{ github.event.inputs.mode || 'pr' }}"
case "$MODE" in
push) FLAG="--push" ;;
pr) FLAG="--pr" ;;
dry-run) FLAG="--dry-run" ;;
esac
node scripts/upstream-sync.mjs $FLAG
run: |
MODE="${{ github.event.inputs.mode || 'pr' }}"
case "$MODE" in
push) FLAG="--push" ;;
pr) FLAG="--pr" ;;
dry-run) FLAG="--dry-run" ;;
esac
node scripts/upstream-sync.mjs "$FLAG"
🧰 Tools
🪛 actionlint (1.7.11)

[error] 85-85: shellcheck reported issue in this script: SC2086:info:7:32: Double quote to prevent globbing and word splitting

(shellcheck)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/upstream-sync.yml around lines 85 - 93, The node
invocation is expanding $FLAG unquoted which can trigger
shellword-splitting/globbing (SC2086); update the call that runs
scripts/upstream-sync.mjs to quote the variable (use "$FLAG") so MODE/FLAG are
passed as a single argument to node and avoid unintended splitting or glob
expansion when invoking scripts/upstream-sync.mjs.

@AgentWOPR AgentWOPR closed this Mar 26, 2026
@AgentWOPR AgentWOPR deleted the sync/upstream-2026-03-26 branch March 26, 2026 07:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants