fix: start OpenClaw gateway and port forwarding in `nemoclaw start` #381

futhgar wants to merge 2 commits into NVIDIA:main from
Conversation
📝 Walkthrough

Adds management for an OpenClaw gateway and dashboard port forwarding.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant NemoClaw as "nemoclaw start"
    participant Validate as "validate_name()"
    participant Resolve as "resolve_sandbox()"
    participant OpenShell as "openshell"
    participant Sandbox as "Sandbox (OpenClaw)"
    User->>NemoClaw: Run "nemoclaw start"
    NemoClaw->>Validate: Check SANDBOX_NAME validity
    alt Invalid name
        Validate-->>NemoClaw: Reject
        NemoClaw-->>User: Warn and skip gateway services
    else Valid name
        Validate-->>NemoClaw: Accept
        NemoClaw->>Resolve: Determine sandbox (SANDBOX_NAME or auto-detect)
        alt Sandbox found
            Resolve-->>NemoClaw: Return sandbox name
            NemoClaw->>OpenShell: exec in sandbox -> "openclaw gateway run"
            OpenShell->>Sandbox: Start gateway process
            Sandbox-->>OpenShell: Gateway running (PID)
            NemoClaw->>OpenShell: Start port forward -> $DASHBOARD_PORT
            OpenShell-->>NemoClaw: Port forward active
            NemoClaw-->>User: Report gateway + forward started
        else No sandbox
            Resolve-->>NemoClaw: Not found
            NemoClaw-->>User: Warn and skip gateway services
        end
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🧹 Nitpick comments (2)
test/service-env.test.js (1)
138-189: Tests duplicate logic instead of testing the actual implementation.

The `validate_name` tests inline a copy of the function logic rather than sourcing and invoking the actual function from `start-services.sh`. This means the tests won't catch regressions if the real implementation changes. Consider using the `callValidateName` helper (or a similar approach) to test the actual script:

♻️ Suggested refactor to test actual implementation
```diff
 it("accepts valid sandbox names", () => {
-  // validate_name is a case statement that calls fail on bad input.
-  // We source just the function definition, not the whole script.
-  const result = spawnSync(
-    "bash",
-    ["-c", `
-      validate_name() {
-        case "$1" in
-          (*[!A-Za-z0-9._-]*|'') echo "INVALID"; return 1 ;;
-        esac
-        echo "VALID"
-      }
-      validate_name "my-sandbox"
-    `],
-    { encoding: "utf-8" }
-  );
-  assert.equal(result.stdout.trim(), "VALID");
+  // Source functions from script and test actual implementation
+  const result = spawnSync("bash", ["-c", `
+    fail() { echo "INVALID"; exit 1; }
+    source <(sed -n '/^validate_name/,/^}/p' "${START_SERVICES_SH}")
+    validate_name "my-sandbox" && echo "VALID"
+  `], { encoding: "utf-8" });
+  assert.equal(result.stdout.trim(), "VALID");
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/service-env.test.js` around lines 138 - 189: The tests currently inline a duplicate validate_name implementation instead of exercising the real function from start-services.sh; update the three test cases in test/service-env.test.js to source the script (or use the existing callValidateName helper) and invoke the actual validate_name from start-services.sh (or callValidateName) so the tests exercise the real implementation and catch regressions — replace the embedded function blocks with sourcing/start-services.sh invocation and then call validate_name (or callValidateName) with the same inputs ("my-sandbox", "foo;rm -rf /", and empty string) and assert on the real output.

scripts/start-services.sh (1)
174-180: Race condition: gateway may not be ready after fixed 3-second sleep.

The hardcoded `sleep 3` assumes the gateway binds its port within 3 seconds. On slower systems or under load, this may not be sufficient, causing the port forwarding to fail silently if the gateway isn't ready. Consider adding a readiness check loop instead of a fixed sleep:
♻️ Suggested improvement
```diff
 start_service openclaw-gateway \
   openshell sandbox exec "$sandbox" -- openclaw gateway run

-# Give the gateway a moment to bind its port
-sleep 3
+# Wait for gateway to be ready (up to 30s)
+for _ in $(seq 1 30); do
+  if openshell sandbox exec "$sandbox" -- curl -sf "http://localhost:$DASHBOARD_PORT/health" >/dev/null 2>&1; then
+    break
+  fi
+  sleep 1
+done
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/start-services.sh` around lines 174 - 180, The fixed "sleep 3" after starting openclaw-gateway is racy; replace it with a readiness loop that polls the gateway until it binds its port or a timeout is reached. After calling start_service openclaw-gateway (and using openshell sandbox exec "$sandbox" -- ... to run inside the sandbox), repeatedly test the gateway endpoint/port (e.g., with curl against a health URL or nc/tcp probe) and break when successful, logging progress and failing cleanly if the timeout elapses; encapsulate this in a small helper (e.g., wait_for_openclaw or wait_for_port) and call it instead of sleep 3 so is_running/start_service logic reliably waits for the gateway to be ready.
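A helper like the one the prompt suggests could be sketched as follows — a sketch only, assuming bash's `/dev/tcp` redirection is available where the script runs; the helper name `wait_for_port` and the timeout default are illustrative, not the repository's actual code:

```shell
#!/usr/bin/env bash
# Illustrative helper (assumed name): poll a TCP port until it accepts
# connections, or fail after a timeout in seconds. Uses bash's /dev/tcp
# so no extra tooling (curl, nc) is needed.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-30}" i
  for ((i = 0; i < timeout; i++)); do
    # Opening /dev/tcp succeeds only once something is listening
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  echo "timed out waiting for $host:$port" >&2
  return 1
}
```

Called as `wait_for_port localhost "$DASHBOARD_PORT" 30` in place of `sleep 3`, this blocks until the gateway binds its port and fails cleanly on timeout.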
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/service-env.test.js`:
- Around line 126-136: The helper function callValidateName is dead code; either
delete the entire callValidateName function or refactor the subsequent tests
that inline the validate_name invocation to call callValidateName instead. If
removing, simply delete the callValidateName definition (which uses spawnSync,
START_SERVICES_SH and sets env SANDBOX_NAME). If refactoring, replace the inline
spawnSync calls in the tests with calls to callValidateName(name) and reuse its
return value, ensuring the same encoding and env behavior is preserved.
---
Nitpick comments:
In `@scripts/start-services.sh`:
- Around line 174-180: The fixed "sleep 3" after starting openclaw-gateway is
racy; replace it with a readiness loop that polls the gateway until it binds its
port or a timeout is reached. After calling start_service openclaw-gateway (and
using openshell sandbox exec "$sandbox" -- ... to run inside the sandbox),
repeatedly test the gateway endpoint/port (e.g., with curl against a health URL
or nc/tcp probe) and break when successful, logging progress and failing cleanly
if the timeout elapses; encapsulate this in a small helper (e.g.,
wait_for_openclaw or wait_for_port) and call it instead of sleep 3 so
is_running/start_service logic reliably waits for the gateway to be ready.
In `@test/service-env.test.js`:
- Around line 138-189: The tests currently inline a duplicate validate_name
implementation instead of exercising the real function from start-services.sh;
update the three test cases in test/service-env.test.js to source the script (or
use the existing callValidateName helper) and invoke the actual validate_name
from start-services.sh (or callValidateName) so the tests exercise the real
implementation and catch regressions—replace the embedded function blocks with
sourcing/start-services.sh invocation and then call validate_name (or
callValidateName) with the same inputs ("my-sandbox", "foo;rm -rf /", and empty
string) and assert on the real output.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 745963ad-56ba-4982-8f48-03379dad0a1a
📒 Files selected for processing (4)
- bin/nemoclaw.js
- scripts/start-services.sh
- test/cli.test.js
- test/service-env.test.js
`nemoclaw start` previously only started the Telegram bridge and cloudflared tunnel. After a system reboot, the OpenClaw gateway inside the sandbox and the dashboard port forwarding were lost, requiring manual intervention.

Add `openclaw-gateway` and `gateway-forward` as managed services in start-services.sh so `nemoclaw start` / `nemoclaw stop` / `nemoclaw status` manage the full service lifecycle.

- resolve_sandbox() auto-detects the active sandbox when not explicit
- validate_name() rejects identifiers with shell metacharacters
- Gateway started via `openshell sandbox exec` (foreground, PID tracked)
- Port forwarding started via `openshell forward start` (PID tracked)
- Status and banner updated to show all four services

Fixes NVIDIA#311

Signed-off-by: Josue Gomez <josue@guatulab.com>
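Based on the bullet list above, the two helpers might look roughly like this — an illustrative sketch, not the actual start-services.sh source; the `fail` message format and the `default` sentinel handling are assumptions:

```shell
#!/usr/bin/env bash
# Sketch of the two helpers described in the PR (names from the PR,
# bodies illustrative).

fail() { echo "error: $1" >&2; return 1; }

# Reject names containing anything outside [A-Za-z0-9._-], including
# the empty string, so names are safe to splice into shell commands.
validate_name() {
  case "$1" in
    (*[!A-Za-z0-9._-]*|'') fail "invalid sandbox name: '$1'" ;;
  esac
}

# Use SANDBOX_NAME when set explicitly; otherwise fall back to the
# first sandbox that `openshell sandbox list` reports as Ready.
resolve_sandbox() {
  if [ "$SANDBOX_NAME" != "default" ]; then
    printf '%s\n' "$SANDBOX_NAME"
    return
  fi
  if command -v openshell >/dev/null 2>&1; then
    openshell sandbox list 2>/dev/null | awk '/Ready/{print $1; exit}' || true
  fi
}
```

The case-pattern rejection is why a name like `foo;rm -rf /` fails validation before it can reach an `openshell sandbox exec` command line.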
Address CodeRabbit review: the callValidateName helper sourced the full start-services.sh, which has side effects that break in the test environment. Replace it with a focused testValidateName function that defines just the validate_name case pattern directly.

Signed-off-by: Josue Gomez <josue@guatulab.com>
Signed-off-by: futhgar <jmaldonado.rosa@gmail.com>
ed0a561 to c087e2b
🧹 Nitpick comments (1)
test/service-env.test.js (1)
367-386: Consider adding test coverage for the fallback branch.

This test only verifies the explicit `SANDBOX_NAME != "default"` path. The real `resolve_sandbox` function has a second branch that calls `openshell sandbox list` when `SANDBOX_NAME` is `"default"`. Consider adding:

- A test case verifying the "default" branch is entered when `SANDBOX_NAME="default"`.
- A test verifying the function returns empty/nothing when `openshell` is unavailable (the `|| true` fallback).

📝 Example additional test case
```js
it("returns empty when SANDBOX_NAME is default and openshell unavailable", () => {
  const result = spawnSync(
    "bash",
    ["-c", `
      SANDBOX_NAME="default"
      resolve_sandbox() {
        if [ "$SANDBOX_NAME" != "default" ]; then
          printf '%s\\n' "$SANDBOX_NAME"
          return
        fi
        if command -v openshell > /dev/null 2>&1; then
          openshell sandbox list 2>/dev/null | awk '/Ready/{print $1; exit}' || true
        fi
      }
      resolve_sandbox
    `],
    { encoding: "utf-8" }
  );
  expect(result.status).toBe(0);
  expect(result.stdout.trim()).toBe("");
});
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/service-env.test.js` around lines 367 - 386, Add tests covering the "default" branch of resolve_sandbox: create one test where SANDBOX_NAME="default" and stub an openshell that emits a Ready line (so resolve_sandbox parses and returns that sandbox id) and another test where SANDBOX_NAME="default" and openshell is unavailable (ensure command -v openshell fails) to assert resolve_sandbox returns empty; implement both using spawnSync with a small inline bash function resolve_sandbox (same name used in existing test) and either manipulate PATH to point to a temp directory containing a fake openshell script that prints a Ready line or leave openshell missing to exercise the || true fallback, then assert result.status is 0 and stdout.trim() matches expected values (sandbox id or empty).
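The PATH-stub technique the prompt describes can be sketched in plain bash. Everything here is illustrative: the fake `openshell` output format is assumed, and the inline `resolve_sandbox` body mirrors the example above rather than the real script:

```shell
#!/usr/bin/env bash
# Sketch: make resolve_sandbox's fallback branch testable by putting a
# fake `openshell` first on PATH. Output format of the fake is assumed.
tmp=$(mktemp -d)
cat > "$tmp/openshell" <<'EOF'
#!/usr/bin/env bash
# Pretend `openshell sandbox list` reports one Ready sandbox
echo "my-box   Ready   2d"
EOF
chmod +x "$tmp/openshell"

resolve_sandbox() {
  if [ "$SANDBOX_NAME" != "default" ]; then
    printf '%s\n' "$SANDBOX_NAME"
    return
  fi
  if command -v openshell >/dev/null 2>&1; then
    openshell sandbox list 2>/dev/null | awk '/Ready/{print $1; exit}' || true
  fi
}

# With the stub on PATH, the "default" branch finds the fake sandbox
SANDBOX_NAME=default PATH="$tmp:$PATH" resolve_sandbox   # prints: my-box
```

Removing the stub directory from PATH (or pointing PATH at an empty directory) exercises the other case, where `command -v openshell` fails and the function prints nothing.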
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/service-env.test.js`:
- Around line 367-386: Add tests covering the "default" branch of
resolve_sandbox: create one test where SANDBOX_NAME="default" and stub an
openshell that emits a Ready line (so resolve_sandbox parses and returns that
sandbox id) and another test where SANDBOX_NAME="default" and openshell is
unavailable (ensure command -v openshell fails) to assert resolve_sandbox
returns empty; implement both using spawnSync with a small inline bash function
resolve_sandbox (same name used in existing test) and either manipulate PATH to
point to a temp directory containing a fake openshell script that prints a Ready
line or leave openshell missing to exercise the || true fallback, then assert
result.status is 0 and stdout.trim() matches expected values (sandbox id or
empty).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 293f8a4a-2a0b-4558-a063-a5453d8c61d5
📒 Files selected for processing (4)
- bin/nemoclaw.js
- scripts/start-services.sh
- test/cli.test.js
- test/service-env.test.js
✅ Files skipped from review due to trivial changes (2)
- test/cli.test.js
- bin/nemoclaw.js
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/start-services.sh
Summary
- `nemoclaw start` now manages the OpenClaw gateway (inside sandbox) and dashboard port forwarding alongside the existing Telegram bridge and cloudflared tunnel
- `nemoclaw stop` cleanly tears down all four services
- `nemoclaw status` shows the state of all four services
- `validate_name()` to reject sandbox names with shell metacharacters
- `resolve_sandbox()` to auto-detect the active sandbox when `--sandbox` is not provided

Problem
After a system reboot, the OpenClaw gateway inside the sandbox stops and port forwarding is lost. Users must manually SSH into the sandbox, restart the gateway, and re-establish port forwarding. This makes integration with external dashboards (e.g. Mission Control) fragile.
Approach
Rather than introducing SSH proxy tunnels, this uses the existing `openshell sandbox exec` and `openshell forward start` commands that the codebase already relies on for sandbox interaction and port forwarding during onboarding.

The gateway is run in the foreground inside the sandbox via `openshell sandbox exec`, which keeps a live host-side PID so the existing PID-based service tracking works correctly.

Test plan
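The PID-based tracking referred to above could be sketched like this — hypothetical helper bodies (the names `start_service` and `is_running` appear in the review comments, but these implementations are illustrative, and the real script's bookkeeping may differ):

```shell
#!/usr/bin/env bash
# Illustrative PID-file-based service tracking.
PID_DIR="${PID_DIR:-/tmp/nemoclaw-pids}"
mkdir -p "$PID_DIR"

# Launch a command in the background and record its host-side PID.
# A foreground `openshell sandbox exec` works here because the host
# process stays alive as long as the gateway runs in the sandbox.
start_service() {
  local name="$1"; shift
  "$@" &
  echo $! > "$PID_DIR/$name.pid"
}

# A service is running if its PID file exists and the PID is alive.
is_running() {
  local pidfile="$PID_DIR/$1.pid"
  [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null
}

# Stop the tracked process and clear its PID file.
stop_service() {
  local pidfile="$PID_DIR/$1.pid"
  if [ -f "$pidfile" ]; then
    kill "$(cat "$pidfile")" 2>/dev/null || true
    rm -f "$pidfile"
  fi
}
```

This is why running the gateway in the foreground matters: a detached or daemonized gateway would leave no live host-side PID for `is_running` to probe.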
- `node --test test/cli.test.js test/service-env.test.js` (25/25 passing)
- `validate_name` (accepts valid names, rejects metacharacters, rejects empty)
- `resolve_sandbox` (returns explicit name when set)
- `show_status` iterates all four services and `do_stop` stops gateway services
- `nemoclaw start` on a DGX Spark with an existing sandbox

Fixes #311
Summary by CodeRabbit
New Features
Documentation
Tests
Signed-off-by: Josue Balandrano Coronel <josuebc@pm.me>