Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds resumable per-user onboarding sessions with versioned JSON persistence, sensitive-data redaction, atomic writes, and lifecycle helpers; introduces resume/repair logic for gateways and sandboxes, runtime-recovery heuristics and CLI Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI as "nemoclaw CLI"
participant Session as "Onboard Session\n(persisted JSON)"
participant Onboard as "Onboard Engine"
participant Gateway as "OpenShell Gateway"
participant Sandbox as "OpenShell Sandbox"
rect rgba(200,150,255,0.5)
Note over User,Onboard: Initial run — failure mid-step
User->>CLI: nemoclaw onboard
CLI->>Session: loadSession()
Session-->>CLI: null
CLI->>Onboard: start onboarding
Onboard->>Onboard: run steps (preflight → gateway → sandbox ...)
Onboard->>Gateway: start gateway
Gateway-->>Onboard: ready
Onboard->>Onboard: markStepStarted("sandbox")
Onboard->>Sandbox: create/push image
Sandbox-->>Onboard: error (OOM/timeout)
Onboard->>Onboard: markStepFailed("sandbox")
Onboard->>Session: saveSession(redacted failure)
Session-->>Onboard: persisted
end
rect rgba(150,200,255,0.5)
Note over User,Onboard: Resume run — skip completed steps
User->>CLI: nemoclaw onboard --resume
CLI->>Session: loadSession()
Session-->>CLI: {steps, lastCompletedStep:"gateway", resumable:true}
CLI->>Onboard: initialize(resume=true)
Onboard->>Onboard: skip completed steps (preflight, gateway)
Onboard->>Gateway: inspect gateway status
Gateway-->>Onboard: healthy / reuse
Onboard->>Sandbox: resume/create remaining steps
Sandbox-->>Onboard: success
Onboard->>Onboard: markStepComplete("sandbox")
Onboard->>Session: completeSession()
Session-->>Onboard: persisted complete
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Thanks for submitting this draft PR, it addresses issues with the installer and onboarding process, and the changes are well-documented and easy to follow, which could improve the user experience for new users. Let us know when you are ready for review. |
…iliency # Conflicts: # bin/lib/onboard.js # test/onboard.test.js
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (3)
test/e2e/test-onboard-repair.sh (1)
253-255: Hardcoded model name may cause test fragility.The assertion expects the exact model string
nvidia/nemotron-3-super-120b-a12b. If the default model changes in the future, this test will fail even though the conflict detection logic is working correctly.💡 Consider extracting expected model to a variable
+EXPECTED_DEFAULT_MODEL="${EXPECTED_DEFAULT_MODEL:-nvidia/nemotron-3-super-120b-a12b}" + echo "$provider_conflict_output" | grep -q "Resumable state recorded model 'nvidia/nemotron-3-super-120b-a12b', not 'nvidia/conflicting-model'." \ +echo "$provider_conflict_output" | grep -q "Resumable state recorded model '${EXPECTED_DEFAULT_MODEL}', not 'nvidia/conflicting-model'." \ && pass "Conflicting model message is explicit" \ || fail "Conflicting model message missing or incorrect"Alternatively, use a more relaxed pattern that just checks for the conflict structure without the exact model name.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-onboard-repair.sh` around lines 253 - 255, The test is brittle because it hardcodes the exact model string in the grep assertion; update the assertion in the test that checks provider_conflict_output to either use a variable (e.g., define expected_model and substitute it into the check) or switch to a relaxed regex that verifies the conflict message structure (e.g., checks for "Resumable state recorded model .* not .*" or "Resumable state recorded model '.*', not '.*'") so the test verifies conflict semantics without depending on a specific default model name.test/e2e/test-onboard-resume.sh (1)
253-253: Consider adding a timeout guard for the E2E test.Long-running E2E tests can hang indefinitely if external services become unresponsive. A top-level timeout would prevent CI from stalling.
💡 Optional: Add timeout wrapper
Add near the top of the script after
set -uo pipefail:# Fail if the entire test takes longer than 10 minutes TIMEOUT_SECONDS=600 if command -v timeout >/dev/null 2>&1; then if [[ "${NEMOCLAW_E2E_NO_TIMEOUT:-}" != "1" ]]; then exec timeout --signal=TERM "$TIMEOUT_SECONDS" "$0" "$@" || exit $? fi fi export NEMOCLAW_E2E_NO_TIMEOUT=1🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-onboard-resume.sh` at line 253, Add a top-level timeout guard to the E2E script to prevent indefinite hangs: after the existing set -uo pipefail line, wrap the whole script in a conditional re-exec that uses the system timeout utility (respect the NEMOCLAW_E2E_NO_TIMEOUT env var to allow disabling in local runs) and export that flag so the helper doesn’t reapply on the re-exec; set a sane default (e.g., 600 seconds) for TIMEOUT_SECONDS and ensure the re-exec uses TERM signal on timeout and returns the timeout exit code so CI fails appropriately.install.sh (1)
247-255: TheZSH_VERSIONcheck is ineffective in this context.Since the script runs under
bash(per the shebang),ZSH_VERSIONwill never be set. TheSHELLbasename fallback handles the intended case, so this isn't a bug, but the first condition is dead code.💡 Consider simplifying
detect_shell_profile() { local profile="$HOME/.bashrc" - if [[ -n "${ZSH_VERSION:-}" ]] || [[ "$(basename "${SHELL:-}")" == "zsh" ]]; then + if [[ "$(basename "${SHELL:-}")" == "zsh" ]]; then profile="$HOME/.zshrc" elif [[ ! -f "$HOME/.bashrc" && -f "$HOME/.profile" ]]; then profile="$HOME/.profile" fi printf "%s" "$profile" }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install.sh` around lines 247 - 255, The ZSH_VERSION check in detect_shell_profile is dead code because the script runs under bash; remove the redundant `[[ -n "${ZSH_VERSION:-}" ]]` clause and simplify the conditional to rely on the SHELL basename check (i.e., keep the `[[ "$(basename "${SHELL:-}")" == "zsh" ]]` branch), leaving the rest of detect_shell_profile (the fallback to .profile and default to .bashrc) unchanged to preserve behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 2588-2590: The create step currently calls startRecordedStep then
awaits createSandbox and only persists sandboxName when createSandbox returns;
move persistence earlier so the chosen sandboxName is saved before the streaming
create begins: obtain or compute the sandboxName up-front (the value assigned to
sandboxName), call onboardSession.markStepComplete or another persistence method
to record the sandboxName (with provider, model, nimContainer as needed)
immediately after startRecordedStep and before awaiting createSandbox(), then
proceed to call createSandbox(gpu, model, provider, preferredInferenceApi) and
update the step result if needed; modify usages around startRecordedStep,
createSandbox, sandboxName, and onboardSession.markStepComplete accordingly.
- Around line 1239-1246: requestedProvider (from getRequestedProviderHint) is a
selection key and must be normalized to the effective provider name before
comparing with session.provider to avoid false conflicts; update the conflict
check in onboard.js to map/resolve requestedProvider to the runtime provider
name (use the same mapping/resolver used elsewhere in the codebase) and then
compare the resolved name to session.provider (i.e., resolve requestedProvider
-> effectiveProviderName, then compare effectiveProviderName !==
session.provider before pushing the conflicts entry).
- Around line 189-217: isGatewayHealthy currently treats any active gateway as
"healthy" which prevents getGatewayReuseState from ever returning
"active-unnamed"; change isGatewayHealthy(statusOutput, gwInfoOutput,
activeGatewayInfoOutput) to require the named gateway
(hasStaleGateway(gwInfoOutput) or a positive check for the named NemoClaw) in
addition to connected before returning true, so only a named NemoClaw gateway
yields "healthy"; keep the separate active-unnamed detection logic in
getGatewayReuseState (the existing activeGatewayInfoOutput checks) so
getGatewayReuseState can return "active-unnamed" when there's an active gateway
but no named NemoClaw.
- Around line 1562-1589: recoverGatewayRuntime currently returns true as soon as
any "Connected" status appears even if the attempt to select GATEWAY_NAME
failed; change it to verify the selection before signaling recovery by checking
the actually selected gateway after calling runOpenshell(["gateway","select",
GATEWAY_NAME]) (for example call runCaptureOpenshell(["gateway","current"]) or
parse an appropriate runCaptureOpenshell(["status"]) field) and only set
process.env.OPENSHELL_GATEWAY = GATEWAY_NAME and return true when the
current/active gateway equals GATEWAY_NAME; apply this check in both the initial
pre-loop check and inside the loop before treating the runtime as recovered
(references: recoverGatewayRuntime, runOpenshell, runCaptureOpenshell,
GATEWAY_NAME, process.env.OPENSHELL_GATEWAY).
- Around line 2504-2506: The three direct shell-outs that call openshell should
use the wrapper helper to respect resolved-binary behavior: replace the
runCapture invocations that populate gatewayStatus, gatewayInfo, and
activeGatewayInfo with calls to runCaptureOpenshell(...) (preserving the same
command string arguments and options), and apply the same change to any other
new resume-time openshell shell-outs in this file so they all use
runCaptureOpenshell instead of raw runCapture.
- Around line 2541-2560: When resumeProviderSelection is true the code restores
credentialEnv but does not populate the actual secret value into process.env,
causing setupInference (which reads process.env[credentialEnv]) to fail; update
the resume branch (the block guarded by resumeProviderSelection in onboard.js)
to hydrate the secret into process.env[credentialEnv] from the saved onboarding
state (e.g., the value stored on onboardSession or the persisted session object)
before proceeding to setupInference, ensuring credentialEnv and its secret are
available just like when setupNim() returns them.
---
Nitpick comments:
In `@install.sh`:
- Around line 247-255: The ZSH_VERSION check in detect_shell_profile is dead
code because the script runs under bash; remove the redundant `[[ -n
"${ZSH_VERSION:-}" ]]` clause and simplify the conditional to rely on the SHELL
basename check (i.e., keep the `[[ "$(basename "${SHELL:-}")" == "zsh" ]]`
branch), leaving the rest of detect_shell_profile (the fallback to .profile and
default to .bashrc) unchanged to preserve behavior.
In `@test/e2e/test-onboard-repair.sh`:
- Around line 253-255: The test is brittle because it hardcodes the exact model
string in the grep assertion; update the assertion in the test that checks
provider_conflict_output to either use a variable (e.g., define expected_model
and substitute it into the check) or switch to a relaxed regex that verifies the
conflict message structure (e.g., checks for "Resumable state recorded model .*
not .*" or "Resumable state recorded model '.*', not '.*'") so the test verifies
conflict semantics without depending on a specific default model name.
In `@test/e2e/test-onboard-resume.sh`:
- Line 253: Add a top-level timeout guard to the E2E script to prevent
indefinite hangs: after the existing set -uo pipefail line, wrap the whole
script in a conditional re-exec that uses the system timeout utility (respect
the NEMOCLAW_E2E_NO_TIMEOUT env var to allow disabling in local runs) and export
that flag so the helper doesn’t reapply on the re-exec; set a sane default
(e.g., 600 seconds) for TIMEOUT_SECONDS and ensure the re-exec uses TERM signal
on timeout and returns the timeout exit code so CI fails appropriately.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 897bdfbd-283f-4038-b3c2-af2d60b1dc16
📒 Files selected for processing (17)
bin/lib/onboard-session.jsbin/lib/onboard.jsbin/lib/runtime-recovery.jsbin/nemoclaw.jsinstall.shscripts/debug.shscripts/install.shtest/cli.test.jstest/e2e/test-double-onboard.shtest/e2e/test-onboard-repair.shtest/e2e/test-onboard-resume.shtest/install-preflight.test.jstest/onboard-session.test.jstest/onboard.test.jstest/runtime-recovery.test.jstest/uninstall.test.jsuninstall.sh
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
test/e2e/test-onboard-resume.sh (1)
225-247: Relax hardcoded step-number assertions to reduce false failures.The fixed
[x/7]checks are fragile to benign workflow evolution. The semantic resume assertions above already validate the intended behavior.💡 Suggested simplification
-if echo "$resume_output" | grep -q "\[1/7\] Preflight checks"; then - fail "Resume reran preflight unexpectedly" -else - pass "Resume did not rerun preflight" -fi - -if echo "$resume_output" | grep -q "\[2/7\] Starting OpenShell gateway"; then - fail "Resume reran gateway startup unexpectedly" -else - pass "Resume did not rerun gateway startup" -fi - -if echo "$resume_output" | grep -q "\[3/7\] Creating sandbox"; then - fail "Resume reran sandbox creation unexpectedly" -else - pass "Resume did not rerun sandbox creation" -fi +# Keep semantic checks based on explicit resume markers: +# - [resume] Skipping preflight (cached) +# - [resume] Skipping gateway (running) +# - [resume] Skipping sandbox (<name>)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-onboard-resume.sh` around lines 225 - 247, The tests currently assert exact "[x/7]" step prefixes in the resume_output (the grep checks for "\[1/7\] Preflight checks", "\[2/7\] Starting OpenShell gateway", "\[3/7\] Creating sandbox", "\[5/7\] Setting up inference provider") which is fragile; update each grep to match the semantic message without the hardcoded step numbers (e.g., grep for "Preflight checks", "Starting OpenShell gateway", "Creating sandbox", "Setting up inference provider" or use a regex that allows an optional "[*/*]" prefix) so the assertions still verify the intended behavior using resume_output but no longer fail on benign reordering/renumbering of steps.test/e2e/test-onboard-repair.sh (1)
267-277: Avoid hardcoding recorded provider/model in conflict checks.These assertions depend on current defaults rather than the actual recorded session values, so they can fail after legitimate default changes.
💡 Suggested refactor (derive expectations from session file)
+RECORDED_PROVIDER="$(node -e 'const fs=require("fs");const s=JSON.parse(fs.readFileSync(process.argv[1],"utf8"));process.stdout.write(s.provider||"");' "$SESSION_FILE")" +RECORDED_MODEL="$(node -e 'const fs=require("fs");const s=JSON.parse(fs.readFileSync(process.argv[1],"utf8"));process.stdout.write(s.model||"");' "$SESSION_FILE")" + -if echo "$provider_conflict_output" | grep -q "Resumable state recorded provider 'nvidia-nim', not 'cloud'."; then +if echo "$provider_conflict_output" | grep -q "Resumable state recorded provider '${RECORDED_PROVIDER}', not 'cloud'."; then pass "Conflicting provider message is explicit" else fail "Conflicting provider message missing or incorrect" fi -if echo "$provider_conflict_output" | grep -q "Resumable state recorded model 'nvidia/nemotron-3-super-120b-a12b', not 'nvidia/conflicting-model'."; then +if echo "$provider_conflict_output" | grep -q "Resumable state recorded model '${RECORDED_MODEL}', not 'nvidia/conflicting-model'."; then pass "Conflicting model message is explicit" else fail "Conflicting model message missing or incorrect" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-onboard-repair.sh` around lines 267 - 277, The test currently hardcodes expected provider/model strings when asserting on provider_conflict_output; instead parse the actual recorded values from the session (e.g., read the session JSON used in the test) and build the expected substrings dynamically, then assert that provider_conflict_output contains those derived messages. Concretely, in test/e2e/test-onboard-repair.sh replace the static grep literals with values extracted from the session file (the same file used to record resumable state) and reference those variables when checking provider_conflict_output and model conflict messages so the assertions follow the recorded session values rather than fixed defaults.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/test-onboard-repair.sh`:
- Around line 63-80: Add a trap-backed guaranteed teardown so cleanup always
runs on early exits: at the top of the script (after
SANDBOX_NAME/OTHER_SANDBOX_NAME/SESSION_FILE are set) register a trap for EXIT
and ERR that invokes a cleanup function; implement that cleanup function to run
the same teardown commands (run_nemoclaw "$SANDBOX_NAME" destroy, run_nemoclaw
"$OTHER_SANDBOX_NAME" destroy, openshell sandbox delete for both names,
openshell forward stop 18789, openshell gateway destroy -g nemoclaw, and rm -f
"$SESSION_FILE") and ensure the trap is idempotent (safe to call multiple times)
so all exit paths no longer leak sandbox/gateway/session artifacts.
In `@test/e2e/test-onboard-resume.sh`:
- Around line 274-278: The test currently uses grep against $REGISTRY which can
match incidental substrings; change the assertion to parse the registry JSON and
verify a real sandbox entry exists. Replace the grep check with a jq-based check
that reads $REGISTRY and tests for an object with the sandbox name (e.g. jq
'any(.sandboxes[]; .name == env.SANDBOX_NAME)' or similar), and use that jq exit
status to call pass "Registry contains resumed sandbox entry" or fail "Registry
does not contain resumed sandbox entry" while keeping the same variables
($REGISTRY, $SANDBOX_NAME) and pass/fail calls.
- Around line 64-79: Define a reusable cleanup function (e.g., cleanup) that
runs the existing teardown steps (calling run_nemoclaw "$SANDBOX_NAME" destroy,
openshell sandbox delete "$SANDBOX_NAME", openshell forward stop 18789,
openshell gateway destroy -g nemoclaw and rm -f "$SESSION_FILE") and install it
as an EXIT trap near the top of the script so it always runs even if the script
aborts early; ensure the trap is registered immediately after initializing
SANDBOX_NAME/SESSION_FILE (and before any early-exit branches) and make the
cleanup function idempotent/silent on missing resources so repeated runs won’t
error.
---
Nitpick comments:
In `@test/e2e/test-onboard-repair.sh`:
- Around line 267-277: The test currently hardcodes expected provider/model
strings when asserting on provider_conflict_output; instead parse the actual
recorded values from the session (e.g., read the session JSON used in the test)
and build the expected substrings dynamically, then assert that
provider_conflict_output contains those derived messages. Concretely, in
test/e2e/test-onboard-repair.sh replace the static grep literals with values
extracted from the session file (the same file used to record resumable state)
and reference those variables when checking provider_conflict_output and model
conflict messages so the assertions follow the recorded session values rather
than fixed defaults.
In `@test/e2e/test-onboard-resume.sh`:
- Around line 225-247: The tests currently assert exact "[x/7]" step prefixes in
the resume_output (the grep checks for "\[1/7\] Preflight checks", "\[2/7\]
Starting OpenShell gateway", "\[3/7\] Creating sandbox", "\[5/7\] Setting up
inference provider") which is fragile; update each grep to match the semantic
message without the hardcoded step numbers (e.g., grep for "Preflight checks",
"Starting OpenShell gateway", "Creating sandbox", "Setting up inference
provider" or use a regex that allows an optional "[*/*]" prefix) so the
assertions still verify the intended behavior using resume_output but no longer
fail on benign reordering/renumbering of steps.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a4191510-d125-41e7-88d8-a61fe074532f
📒 Files selected for processing (3)
scripts/debug.shtest/e2e/test-onboard-repair.shtest/e2e/test-onboard-resume.sh
| SANDBOX_NAME="${NEMOCLAW_SANDBOX_NAME:-e2e-repair}" | ||
| OTHER_SANDBOX_NAME="${NEMOCLAW_OTHER_SANDBOX_NAME:-e2e-other}" | ||
| SESSION_FILE="$HOME/.nemoclaw/onboard-session.json" | ||
| RESTORE_API_KEY="${NVIDIA_API_KEY:-}" | ||
|
|
||
| # ══════════════════════════════════════════════════════════════════ | ||
| # Phase 0: Pre-cleanup | ||
| # ══════════════════════════════════════════════════════════════════ | ||
| section "Phase 0: Pre-cleanup" | ||
| info "Destroying any leftover sandbox/gateway from previous runs..." | ||
| run_nemoclaw "$SANDBOX_NAME" destroy 2>/dev/null || true | ||
| run_nemoclaw "$OTHER_SANDBOX_NAME" destroy 2>/dev/null || true | ||
| openshell sandbox delete "$SANDBOX_NAME" 2>/dev/null || true | ||
| openshell sandbox delete "$OTHER_SANDBOX_NAME" 2>/dev/null || true | ||
| openshell forward stop 18789 2>/dev/null || true | ||
| openshell gateway destroy -g nemoclaw 2>/dev/null || true | ||
| rm -f "$SESSION_FILE" | ||
| pass "Pre-cleanup complete" |
There was a problem hiding this comment.
Add guaranteed teardown on early exit paths.
There are multiple exit 1 branches before final cleanup, so failed runs can leak sandbox/gateway/session artifacts and poison subsequent runs.
💡 Suggested hardening (trap-backed cleanup)
SANDBOX_NAME="${NEMOCLAW_SANDBOX_NAME:-e2e-repair}"
OTHER_SANDBOX_NAME="${NEMOCLAW_OTHER_SANDBOX_NAME:-e2e-other}"
SESSION_FILE="$HOME/.nemoclaw/onboard-session.json"
RESTORE_API_KEY="${NVIDIA_API_KEY:-}"
+
+cleanup() {
+ run_nemoclaw "$SANDBOX_NAME" destroy 2>/dev/null || true
+ run_nemoclaw "$OTHER_SANDBOX_NAME" destroy 2>/dev/null || true
+ openshell sandbox delete "$SANDBOX_NAME" 2>/dev/null || true
+ openshell sandbox delete "$OTHER_SANDBOX_NAME" 2>/dev/null || true
+ openshell forward stop 18789 2>/dev/null || true
+ openshell gateway destroy -g nemoclaw 2>/dev/null || true
+ rm -f "$SESSION_FILE"
+}
+
+trap cleanup EXIT📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| SANDBOX_NAME="${NEMOCLAW_SANDBOX_NAME:-e2e-repair}" | |
| OTHER_SANDBOX_NAME="${NEMOCLAW_OTHER_SANDBOX_NAME:-e2e-other}" | |
| SESSION_FILE="$HOME/.nemoclaw/onboard-session.json" | |
| RESTORE_API_KEY="${NVIDIA_API_KEY:-}" | |
| # ══════════════════════════════════════════════════════════════════ | |
| # Phase 0: Pre-cleanup | |
| # ══════════════════════════════════════════════════════════════════ | |
| section "Phase 0: Pre-cleanup" | |
| info "Destroying any leftover sandbox/gateway from previous runs..." | |
| run_nemoclaw "$SANDBOX_NAME" destroy 2>/dev/null || true | |
| run_nemoclaw "$OTHER_SANDBOX_NAME" destroy 2>/dev/null || true | |
| openshell sandbox delete "$SANDBOX_NAME" 2>/dev/null || true | |
| openshell sandbox delete "$OTHER_SANDBOX_NAME" 2>/dev/null || true | |
| openshell forward stop 18789 2>/dev/null || true | |
| openshell gateway destroy -g nemoclaw 2>/dev/null || true | |
| rm -f "$SESSION_FILE" | |
| pass "Pre-cleanup complete" | |
| SANDBOX_NAME="${NEMOCLAW_SANDBOX_NAME:-e2e-repair}" | |
| OTHER_SANDBOX_NAME="${NEMOCLAW_OTHER_SANDBOX_NAME:-e2e-other}" | |
| SESSION_FILE="$HOME/.nemoclaw/onboard-session.json" | |
| RESTORE_API_KEY="${NVIDIA_API_KEY:-}" | |
| cleanup() { | |
| run_nemoclaw "$SANDBOX_NAME" destroy 2>/dev/null || true | |
| run_nemoclaw "$OTHER_SANDBOX_NAME" destroy 2>/dev/null || true | |
| openshell sandbox delete "$SANDBOX_NAME" 2>/dev/null || true | |
| openshell sandbox delete "$OTHER_SANDBOX_NAME" 2>/dev/null || true | |
| openshell forward stop 18789 2>/dev/null || true | |
| openshell gateway destroy -g nemoclaw 2>/dev/null || true | |
| rm -f "$SESSION_FILE" | |
| } | |
| trap cleanup EXIT | |
| # ══════════════════════════════════════════════════════════════════ | |
| # Phase 0: Pre-cleanup | |
| # ══════════════════════════════════════════════════════════════════ | |
| section "Phase 0: Pre-cleanup" | |
| info "Destroying any leftover sandbox/gateway from previous runs..." | |
| run_nemoclaw "$SANDBOX_NAME" destroy 2>/dev/null || true | |
| run_nemoclaw "$OTHER_SANDBOX_NAME" destroy 2>/dev/null || true | |
| openshell sandbox delete "$SANDBOX_NAME" 2>/dev/null || true | |
| openshell sandbox delete "$OTHER_SANDBOX_NAME" 2>/dev/null || true | |
| openshell forward stop 18789 2>/dev/null || true | |
| openshell gateway destroy -g nemoclaw 2>/dev/null || true | |
| rm -f "$SESSION_FILE" | |
| pass "Pre-cleanup complete" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/test-onboard-repair.sh` around lines 63 - 80, Add a trap-backed
guaranteed teardown so cleanup always runs on early exits: at the top of the
script (after SANDBOX_NAME/OTHER_SANDBOX_NAME/SESSION_FILE are set) register a
trap for EXIT and ERR that invokes a cleanup function; implement that cleanup
function to run the same teardown commands (run_nemoclaw "$SANDBOX_NAME"
destroy, run_nemoclaw "$OTHER_SANDBOX_NAME" destroy, openshell sandbox delete
for both names, openshell forward stop 18789, openshell gateway destroy -g
nemoclaw, and rm -f "$SESSION_FILE") and ensure the trap is idempotent (safe to
call multiple times) so all exit paths no longer leak sandbox/gateway/session
artifacts.
| SANDBOX_NAME="${NEMOCLAW_SANDBOX_NAME:-e2e-resume}" | ||
| SESSION_FILE="$HOME/.nemoclaw/onboard-session.json" | ||
| REGISTRY="$HOME/.nemoclaw/sandboxes.json" | ||
| RESTORE_API_KEY="${NVIDIA_API_KEY:-}" | ||
|
|
||
| # ══════════════════════════════════════════════════════════════════ | ||
| # Phase 0: Pre-cleanup | ||
| # ══════════════════════════════════════════════════════════════════ | ||
| section "Phase 0: Pre-cleanup" | ||
| info "Destroying any leftover sandbox/gateway from previous runs..." | ||
| run_nemoclaw "$SANDBOX_NAME" destroy 2>/dev/null || true | ||
| openshell sandbox delete "$SANDBOX_NAME" 2>/dev/null || true | ||
| openshell forward stop 18789 2>/dev/null || true | ||
| openshell gateway destroy -g nemoclaw 2>/dev/null || true | ||
| rm -f "$SESSION_FILE" | ||
| pass "Pre-cleanup complete" |
There was a problem hiding this comment.
Guarantee cleanup even when the test aborts early.
This script also exits before final cleanup in several branches, which can leave artifacts behind and destabilize later E2E runs.
💡 Suggested hardening (shared cleanup + EXIT trap)
SANDBOX_NAME="${NEMOCLAW_SANDBOX_NAME:-e2e-resume}"
SESSION_FILE="$HOME/.nemoclaw/onboard-session.json"
REGISTRY="$HOME/.nemoclaw/sandboxes.json"
RESTORE_API_KEY="${NVIDIA_API_KEY:-}"
+
+cleanup() {
+ run_nemoclaw "$SANDBOX_NAME" destroy 2>/dev/null || true
+ openshell sandbox delete "$SANDBOX_NAME" 2>/dev/null || true
+ openshell forward stop 18789 2>/dev/null || true
+ openshell gateway destroy -g nemoclaw 2>/dev/null || true
+ rm -f "$SESSION_FILE"
+}
+
+trap cleanup EXIT🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/test-onboard-resume.sh` around lines 64 - 79, Define a reusable
cleanup function (e.g., cleanup) that runs the existing teardown steps (calling
run_nemoclaw "$SANDBOX_NAME" destroy, openshell sandbox delete "$SANDBOX_NAME",
openshell forward stop 18789, openshell gateway destroy -g nemoclaw and rm -f
"$SESSION_FILE") and install it as an EXIT trap near the top of the script so it
always runs even if the script aborts early; ensure the trap is registered
immediately after initializing SANDBOX_NAME/SESSION_FILE (and before any
early-exit branches) and make the cleanup function idempotent/silent on missing
resources so repeated runs won’t error.
| if [ -f "$REGISTRY" ] && grep -q "$SANDBOX_NAME" "$REGISTRY"; then | ||
| pass "Registry contains resumed sandbox entry" | ||
| else | ||
| fail "Registry does not contain resumed sandbox entry" | ||
| fi |
There was a problem hiding this comment.
Use structured JSON validation for registry assertions.
String grep on JSON can pass on incidental matches; parse the registry to verify an actual sandbox entry.
💡 Suggested assertion
-if [ -f "$REGISTRY" ] && grep -q "$SANDBOX_NAME" "$REGISTRY"; then
+if [ -f "$REGISTRY" ] && node -e '
+const fs = require("fs");
+const name = process.argv[1];
+const file = process.argv[2];
+const data = JSON.parse(fs.readFileSync(file, "utf8"));
+const entries = Array.isArray(data) ? data : Object.values(data || {});
+process.exit(entries.some((e) => e && e.name === name) ? 0 : 1);
+' "$SANDBOX_NAME" "$REGISTRY"; then
pass "Registry contains resumed sandbox entry"
else
fail "Registry does not contain resumed sandbox entry"
fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if [ -f "$REGISTRY" ] && grep -q "$SANDBOX_NAME" "$REGISTRY"; then | |
| pass "Registry contains resumed sandbox entry" | |
| else | |
| fail "Registry does not contain resumed sandbox entry" | |
| fi | |
| if [ -f "$REGISTRY" ] && node -e ' | |
| const fs = require("fs"); | |
| const name = process.argv[1]; | |
| const file = process.argv[2]; | |
| const data = JSON.parse(fs.readFileSync(file, "utf8")); | |
| const entries = Array.isArray(data) ? data : Object.values(data || {}); | |
| process.exit(entries.some((e) => e && e.name === name) ? 0 : 1); | |
| ' "$SANDBOX_NAME" "$REGISTRY"; then | |
| pass "Registry contains resumed sandbox entry" | |
| else | |
| fail "Registry does not contain resumed sandbox entry" | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/test-onboard-resume.sh` around lines 274 - 278, The test currently
uses grep against $REGISTRY which can match incidental substrings; change the
assertion to parse the registry JSON and verify a real sandbox entry exists.
Replace the grep check with a jq-based check that reads $REGISTRY and tests for
an object with the sandbox name (e.g. jq 'any(.sandboxes[]; .name ==
env.SANDBOX_NAME)' or similar), and use that jq exit status to call pass
"Registry contains resumed sandbox entry" or fail "Registry does not contain
resumed sandbox entry" while keeping the same variables ($REGISTRY,
$SANDBOX_NAME) and pass/fail calls.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@install.sh`:
- Around line 106-133: print_done() currently prints full PATH/profile recovery
instructions when NEMOCLAW_READY_NOW=false and
NEMOCLAW_RECOVERY_EXPORT_DIR/_PROFILE are set, duplicating the same instructions
later in post_install_message(); remove the duplication by either (A) deleting
the blocks in print_done() that print the "%s$%s export PATH=..." and "source
..." lines and keep the detailed recovery instructions only in
post_install_message(), or (B) replace those blocks in print_done() with a
single brief line (e.g., "See post-install instructions below for PATH/profile
recovery") so post_install_message() remains the sole source of detailed
recovery steps; edit the print_done() function accordingly and reference
NEMOCLAW_RECOVERY_EXPORT_DIR and NEMOCLAW_RECOVERY_PROFILE conditionals when
making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 78dc209c-7d60-4d1e-85f9-0b6d8544b9fc
📒 Files selected for processing (7)
bin/lib/onboard-session.jsbin/lib/onboard.jsinstall.shtest/e2e/test-onboard-repair.shtest/e2e/test-onboard-resume.shtest/onboard-session.test.jstest/onboard.test.js
✅ Files skipped from review due to trivial changes (3)
- test/onboard-session.test.js
- bin/lib/onboard-session.js
- test/e2e/test-onboard-resume.sh
🚧 Files skipped from review as they are similar to previous changes (2)
- test/e2e/test-onboard-repair.sh
- bin/lib/onboard.js
| if [[ "$ONBOARD_RAN" == true ]]; then | ||
| local sandbox_name | ||
| sandbox_name="$(resolve_default_sandbox_name)" | ||
| printf " ${C_GREEN}Your OpenClaw Sandbox is live.${C_RESET}\n" | ||
| printf " ${C_DIM}Sandbox in, break things, and tell us what you find.${C_RESET}\n" | ||
| printf "\n" | ||
| printf " ${C_GREEN}Next:${C_RESET}\n" | ||
| printf " %s$%s nemoclaw %s connect\n" "$C_GREEN" "$C_RESET" "$sandbox_name" | ||
| printf " %ssandbox@%s$%s openclaw tui\n" "$C_GREEN" "$sandbox_name" "$C_RESET" | ||
| elif [[ "$NEMOCLAW_READY_NOW" == true ]]; then | ||
| printf " ${C_GREEN}NemoClaw CLI is ready in this shell.${C_RESET}\n" | ||
| printf " ${C_DIM}Onboarding has not run yet.${C_RESET}\n" | ||
| printf "\n" | ||
| printf " ${C_GREEN}Next:${C_RESET}\n" | ||
| printf " %s$%s nemoclaw onboard\n" "$C_GREEN" "$C_RESET" | ||
| else | ||
| printf " ${C_GREEN}NemoClaw CLI is installed.${C_RESET}\n" | ||
| printf " ${C_DIM}Onboarding did not run because this shell cannot resolve 'nemoclaw' yet.${C_RESET}\n" | ||
| printf "\n" | ||
| printf " ${C_GREEN}Next:${C_RESET}\n" | ||
| if [[ -n "$NEMOCLAW_RECOVERY_EXPORT_DIR" ]]; then | ||
| printf " %s$%s export PATH=\"%s:\$PATH\"\n" "$C_GREEN" "$C_RESET" "$NEMOCLAW_RECOVERY_EXPORT_DIR" | ||
| fi | ||
| if [[ -n "$NEMOCLAW_RECOVERY_PROFILE" ]]; then | ||
| printf " %s$%s source %s\n" "$C_GREEN" "$C_RESET" "$NEMOCLAW_RECOVERY_PROFILE" | ||
| fi | ||
| printf " %s$%s nemoclaw onboard\n" "$C_GREEN" "$C_RESET" | ||
| fi |
There was a problem hiding this comment.
Potential duplicate recovery instructions.
When NEMOCLAW_READY_NOW=false and NEMOCLAW_RECOVERY_EXPORT_DIR is set, both print_done() (lines 126-132) and post_install_message() (lines 581-596) will print PATH recovery instructions, resulting in duplicate output for the user.
Consider either:
- Removing the recovery instructions from
print_done()and relying solely onpost_install_message(), or - Having
print_done()only print a brief message directing users to the instructions below.
🛠️ Proposed fix to avoid duplication
else
printf " ${C_GREEN}NemoClaw CLI is installed.${C_RESET}\n"
printf " ${C_DIM}Onboarding did not run because this shell cannot resolve 'nemoclaw' yet.${C_RESET}\n"
printf "\n"
- printf " ${C_GREEN}Next:${C_RESET}\n"
- if [[ -n "$NEMOCLAW_RECOVERY_EXPORT_DIR" ]]; then
- printf " %s$%s export PATH=\"%s:\$PATH\"\n" "$C_GREEN" "$C_RESET" "$NEMOCLAW_RECOVERY_EXPORT_DIR"
- fi
- if [[ -n "$NEMOCLAW_RECOVERY_PROFILE" ]]; then
- printf " %s$%s source %s\n" "$C_GREEN" "$C_RESET" "$NEMOCLAW_RECOVERY_PROFILE"
- fi
- printf " %s$%s nemoclaw onboard\n" "$C_GREEN" "$C_RESET"
+ printf " ${C_DIM}See recovery instructions below.${C_RESET}\n"
fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if [[ "$ONBOARD_RAN" == true ]]; then | |
| local sandbox_name | |
| sandbox_name="$(resolve_default_sandbox_name)" | |
| printf " ${C_GREEN}Your OpenClaw Sandbox is live.${C_RESET}\n" | |
| printf " ${C_DIM}Sandbox in, break things, and tell us what you find.${C_RESET}\n" | |
| printf "\n" | |
| printf " ${C_GREEN}Next:${C_RESET}\n" | |
| printf " %s$%s nemoclaw %s connect\n" "$C_GREEN" "$C_RESET" "$sandbox_name" | |
| printf " %ssandbox@%s$%s openclaw tui\n" "$C_GREEN" "$sandbox_name" "$C_RESET" | |
| elif [[ "$NEMOCLAW_READY_NOW" == true ]]; then | |
| printf " ${C_GREEN}NemoClaw CLI is ready in this shell.${C_RESET}\n" | |
| printf " ${C_DIM}Onboarding has not run yet.${C_RESET}\n" | |
| printf "\n" | |
| printf " ${C_GREEN}Next:${C_RESET}\n" | |
| printf " %s$%s nemoclaw onboard\n" "$C_GREEN" "$C_RESET" | |
| else | |
| printf " ${C_GREEN}NemoClaw CLI is installed.${C_RESET}\n" | |
| printf " ${C_DIM}Onboarding did not run because this shell cannot resolve 'nemoclaw' yet.${C_RESET}\n" | |
| printf "\n" | |
| printf " ${C_GREEN}Next:${C_RESET}\n" | |
| if [[ -n "$NEMOCLAW_RECOVERY_EXPORT_DIR" ]]; then | |
| printf " %s$%s export PATH=\"%s:\$PATH\"\n" "$C_GREEN" "$C_RESET" "$NEMOCLAW_RECOVERY_EXPORT_DIR" | |
| fi | |
| if [[ -n "$NEMOCLAW_RECOVERY_PROFILE" ]]; then | |
| printf " %s$%s source %s\n" "$C_GREEN" "$C_RESET" "$NEMOCLAW_RECOVERY_PROFILE" | |
| fi | |
| printf " %s$%s nemoclaw onboard\n" "$C_GREEN" "$C_RESET" | |
| fi | |
| if [[ "$ONBOARD_RAN" == true ]]; then | |
| local sandbox_name | |
| sandbox_name="$(resolve_default_sandbox_name)" | |
| printf " ${C_GREEN}Your OpenClaw Sandbox is live.${C_RESET}\n" | |
| printf " ${C_DIM}Sandbox in, break things, and tell us what you find.${C_RESET}\n" | |
| printf "\n" | |
| printf " ${C_GREEN}Next:${C_RESET}\n" | |
| printf " %s$%s nemoclaw %s connect\n" "$C_GREEN" "$C_RESET" "$sandbox_name" | |
| printf " %ssandbox@%s$%s openclaw tui\n" "$C_GREEN" "$sandbox_name" "$C_RESET" | |
| elif [[ "$NEMOCLAW_READY_NOW" == true ]]; then | |
| printf " ${C_GREEN}NemoClaw CLI is ready in this shell.${C_RESET}\n" | |
| printf " ${C_DIM}Onboarding has not run yet.${C_RESET}\n" | |
| printf "\n" | |
| printf " ${C_GREEN}Next:${C_RESET}\n" | |
| printf " %s$%s nemoclaw onboard\n" "$C_GREEN" "$C_RESET" | |
| else | |
| printf " ${C_GREEN}NemoClaw CLI is installed.${C_RESET}\n" | |
| printf " ${C_DIM}Onboarding did not run because this shell cannot resolve 'nemoclaw' yet.${C_RESET}\n" | |
| printf "\n" | |
| printf " ${C_DIM}See recovery instructions below.${C_RESET}\n" | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@install.sh` around lines 106 - 133, print_done() currently prints full
PATH/profile recovery instructions when NEMOCLAW_READY_NOW=false and
NEMOCLAW_RECOVERY_EXPORT_DIR/_PROFILE are set, duplicating the same instructions
later in post_install_message(); remove the duplication by either (A) deleting
the blocks in print_done() that print the "%s$%s export PATH=..." and "source
..." lines and keep the detailed recovery instructions only in
post_install_message(), or (B) replace those blocks in print_done() with a
single brief line (e.g., "See post-install instructions below for PATH/profile
recovery") so post_install_message() remains the sole source of detailed
recovery steps; edit the print_done() function accordingly and reference
NEMOCLAW_RECOVERY_EXPORT_DIR and NEMOCLAW_RECOVERY_PROFILE conditionals when
making the change.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
bin/lib/onboard.js (2)
2569-2594:⚠️ Potential issue | 🟠 MajorHydrate stored credentials when resuming provider selection.
When
resumeProviderSelectionis true,credentialEnvis restored from the session (e.g.,"NVIDIA_API_KEY"), but the actual secret value is never loaded intoprocess.env.setupInference()at Line 2598 builds provider env fromprocess.env[credentialEnv], which will beundefinedfor resumed remote-provider sessions unless the user re-exported the key in their current shell.Proposed fix
if (resumeProviderSelection) { + if (credentialEnv && !process.env[credentialEnv]) { + const savedCredential = getCredential(credentialEnv); + if (savedCredential) { + process.env[credentialEnv] = savedCredential; + } + } resumeStepMessage("provider selection", `${provider} / ${model}`); } else {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 2569 - 2594, When resumeProviderSelection is true you restore credentialEnv (the env var name) but not the secret value itself, so before calling setupInference / continuing the flow set process.env[credentialEnv] to the stored secret from the session (e.g. session.steps.provider_selection.<credentialValueField>); update the resume branch that uses resumeStepMessage to also read the provider_selection step payload (the stored credential value) and assign it into process.env[credentialEnv] so setupInference sees the secret when using process.env[credentialEnv].
240-246:⚠️ Potential issue | 🟠 MajorUse
runOpenshell()for sandbox repair commands.Similar to the previous issue,
repairRecordedSandboxuses rawopenshellcommands viarun()instead of the resolved-binary wrapper.Proposed fix
function repairRecordedSandbox(sandboxName) { if (!sandboxName) return; note(` [resume] Cleaning up recorded sandbox '${sandboxName}' before recreating it.`); - run(`openshell forward stop 18789 2>/dev/null || true`, { ignoreError: true }); - run(`openshell sandbox delete "${sandboxName}" 2>/dev/null || true`, { ignoreError: true }); + runOpenshell(["forward", "stop", "18789"], { ignoreError: true }); + runOpenshell(["sandbox", "delete", sandboxName], { ignoreError: true }); registry.removeSandbox(sandboxName); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 240 - 246, The repairRecordedSandbox function currently calls raw commands via run(...) — replace those calls with the resolved-binary wrapper runOpenshell to ensure the correct openshell binary is used: in repairRecordedSandbox, change run(`openshell forward stop 18789 2>/dev/null || true`, { ignoreError: true }) to use runOpenshell('forward', ['stop','18789'], { ignoreError: true }) (or equivalent API your runOpenshell exposes), and change run(`openshell sandbox delete "${sandboxName}" 2>/dev/null || true`, { ignoreError: true }) to runOpenshell('sandbox', ['delete', sandboxName], { ignoreError: true }); keep the note(...) and registry.removeSandbox(sandboxName) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 233-238: The getSandboxReuseState function currently calls
runCapture to invoke openshell which can fail if openshell is in ~/.local/bin
and PATH wasn't reloaded; change both calls that set getOutput and listOutput to
use runCaptureOpenshell instead of runCapture so openshell is resolved reliably,
keeping the same arguments (including ignoreError: true) and then pass those
outputs into getSandboxStateFromOutputs as before; update references to
getOutput and listOutput in getSandboxReuseState accordingly.
---
Duplicate comments:
In `@bin/lib/onboard.js`:
- Around line 2569-2594: When resumeProviderSelection is true you restore
credentialEnv (the env var name) but not the secret value itself, so before
calling setupInference / continuing the flow set process.env[credentialEnv] to
the stored secret from the session (e.g.
session.steps.provider_selection.<credentialValueField>); update the resume
branch that uses resumeStepMessage to also read the provider_selection step
payload (the stored credential value) and assign it into
process.env[credentialEnv] so setupInference sees the secret when using
process.env[credentialEnv].
- Around line 240-246: The repairRecordedSandbox function currently calls raw
commands via run(...) — replace those calls with the resolved-binary wrapper
runOpenshell to ensure the correct openshell binary is used: in
repairRecordedSandbox, change run(`openshell forward stop 18789 2>/dev/null ||
true`, { ignoreError: true }) to use runOpenshell('forward', ['stop','18789'], {
ignoreError: true }) (or equivalent API your runOpenshell exposes), and change
run(`openshell sandbox delete "${sandboxName}" 2>/dev/null || true`, {
ignoreError: true }) to runOpenshell('sandbox', ['delete', sandboxName], {
ignoreError: true }); keep the note(...) and registry.removeSandbox(sandboxName)
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0cbb97c6-c79d-4ab2-9330-5b2eb2d6e1b7
📒 Files selected for processing (1)
bin/lib/onboard.js
|
Superseded by the signed-commit replacement PR created from branch . Closing this one to keep review on the replacement branch. |
|
New PR: #961 |
Summary
Hardens NemoClaw install and onboarding resiliency across repeat runs, partial failures, and restart/runtime drift.
Addresses:
What Changed
nemoclaw onboard --resumenemoclaw debugHow To Test
Start from the PR branch:
Please focus on resiliency and recovery behavior:
Fresh onboard
./install.shnemoclaw onboardnemoclaw <sandbox> statusandnemoclaw <sandbox> connectInterrupted onboard + resume
nemoclaw onboard --resumeRepeat onboard
Runtime/gateway drift
nemoclaw <sandbox> statusandnemoclaw <sandbox> connectonboard --resumeguidance when neededDebug visibility
nemoclaw debug --quickPlease include in any bug report:
nemoclaw debug --quickopenshell statusopenshell sandbox listopenshell gateway infoValidation