fix: v0.2.1 - sandbox spawn reliability and auto-topup #246
…rent All replication tools (start_child, check_child_status, verify_child_constitution) were running commands on the parent's sandbox instead of the child's. Fixed by using createScopedClient(child.sandboxId) to route exec/writeFile to the correct sandbox. Also fixed spawn to clone from GitHub (matching README) instead of installing a non-existent npm package, and added proper error handling to start_child with process verification. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… tools editFile() was committing snapshots to ~/.automaton/ (state dir) instead of the repo root where source code lives. Now uses gitCommit() targeting process.cwd(). Also triggers `npm run build` after editing .ts/.js/.tsx/.jsx files so changes take effect at runtime. Adds revert_last_edit (git revert HEAD) and reset_to_upstream (git reset --hard origin/main) tools so the agent can recover from bad self-modifications. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
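The rebuild trigger described in this commit can be sketched as a simple extension check. This is a minimal illustration, not the code from the PR: `REBUILD_EXTENSIONS` and `needsRebuild` are hypothetical names, and only the four extensions come from the commit message.

```typescript
// Hypothetical helper: decides whether an edited file should trigger
// `npm run build`. Only the extension list is taken from the commit message.
const REBUILD_EXTENSIONS: readonly string[] = [".ts", ".js", ".tsx", ".jsx"];

function needsRebuild(filePath: string): boolean {
  // A plain suffix match; the real tool may normalize paths differently.
  return REBUILD_EXTENSIONS.some((ext) => filePath.endsWith(ext));
}
```

A caller would run the build only when `needsRebuild(path)` is true, so edits to documentation or JSON files skip the rebuild.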
…rker When sandbox creation fails with 402 INSUFFICIENT_CREDITS, attempt to top up credits via x402 and retry once before falling back to a local worker. Uses a 60s cooldown to prevent hammering the topup endpoint. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
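A rough sketch of the topup-and-retry flow with a cooldown, under stated assumptions. `TopupGate`, `createWithTopup`, and `isInsufficientCredits` are hypothetical names, and the error shape (a numeric `status` field) is assumed; only the 402 trigger, the single retry, the 60s cooldown, and the local-worker fallback come from the commit message.

```typescript
// Hypothetical sketch; names and the error shape are assumptions.
const TOPUP_COOLDOWN_MS = 60_000;

// Assumption: the API error carries a numeric `status` field.
function isInsufficientCredits(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: number }).status === 402
  );
}

// Allows at most one topup attempt per cooldown window.
class TopupGate {
  private lastAttemptAt: number | null = null;
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Returns true (and records the attempt) only when the cooldown has elapsed.
  tryAcquire(): boolean {
    const t = this.now();
    if (this.lastAttemptAt !== null && t - this.lastAttemptAt < TOPUP_COOLDOWN_MS) {
      return false;
    }
    this.lastAttemptAt = t;
    return true;
  }
}

async function createWithTopup<T>(
  create: () => Promise<T>,
  topup: () => Promise<boolean>,
  gate: TopupGate,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await create();
  } catch (err) {
    if (isInsufficientCredits(err) && gate.tryAcquire()) {
      try {
        // Retry exactly once, and only if the topup reports success.
        if (await topup()) return await create();
      } catch {
        // Topup or retry failed; fall through to the fallback.
      }
    }
    // Assumption: any remaining failure falls back to a local worker.
    return fallback();
  }
}
```

Injecting the clock into `TopupGate` keeps the cooldown testable without real delays.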
The hardcoded diskGb=5 with 1024MB memory doesn't match any pricing tier (Medium requires 10GB disk). Add a SANDBOX_TIERS lookup so the disk and vCPU are always consistent with the requested memory size. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
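The lookup idea might look roughly like the sketch below. Only the pairing of 1024 MB memory with a 10 GB disk is stated in the commit message; the other tier rows, all vCPU counts, and the names `SandboxSpec` and `specForMemory` are placeholders for illustration and should not be read as Conway's actual pricing tiers.

```typescript
interface SandboxSpec {
  vcpu: number;
  memoryMb: number;
  diskGb: number;
}

// Keyed by requested memory in MB. Only the 1024 MB -> 10 GB disk pairing
// comes from the commit message; every other number here is a placeholder.
const SANDBOX_TIERS = new Map<number, SandboxSpec>([
  [512, { vcpu: 1, memoryMb: 512, diskGb: 5 }],    // placeholder row
  [1024, { vcpu: 1, memoryMb: 1024, diskGb: 10 }], // vCPU is a placeholder
  [2048, { vcpu: 2, memoryMb: 2048, diskGb: 20 }], // placeholder row
]);

// Resolves a full spec from the memory size so disk and vCPU cannot drift
// out of sync with it. Unknown sizes fail loudly instead of guessing.
function specForMemory(memoryMb: number): SandboxSpec {
  const spec = SANDBOX_TIERS.get(memoryMb);
  if (spec === undefined) {
    throw new Error(`No sandbox tier defined for ${memoryMb} MB`);
  }
  return spec;
}
```

Deriving the whole spec from one key avoids the mismatch described above, where a hardcoded disk size was combined with an unrelated memory size.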
The spawn_child tool calls spawnChild() directly without the topup-and-retry logic that the orchestrator's spawnAgent has. Add the same pattern: catch 402, topup credits, retry once. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Conway API no longer supports DELETE /v1/sandboxes — sandboxes are prepaid and non-refundable. Make deleteSandbox a no-op, remove cleanup calls in spawn error paths, update the delete_sandbox tool to inform the agent, and let SandboxCleanup transition to cleaned_up directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
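A no-op `deleteSandbox` could be sketched as follows. The return shape (`DeleteSandboxResult`) and the message wording are assumptions made for illustration; the commit message only states that the function becomes a no-op and that the tool informs the agent.

```typescript
// Hypothetical result shape; the real function may return something else.
interface DeleteSandboxResult {
  deleted: false;
  reason: string;
}

// Sends no request: the API no longer supports DELETE /v1/sandboxes, so the
// call resolves immediately with an explanation the tool can relay.
async function deleteSandbox(sandboxId: string): Promise<DeleteSandboxResult> {
  return {
    deleted: false,
    reason: `Sandbox ${sandboxId} is prepaid and non-refundable; deletion is not supported.`,
  };
}
```

Keeping the function in place, rather than removing it, lets existing callers compile while they are migrated away from cleanup-on-error.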
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
    const topup = await topupForSandbox({
      apiUrl: ctx.config.conwayApiUrl,
      account: ctx.identity.account,
      error: err,
    });
    if (topup?.success) {
      const retryLifecycle = new ChildLifecycle(ctx.db.raw);
      const retryGenesis = generateGenesisConfig(ctx.identity, ctx.config, {
        name: args.name as string,
        specialization: args.specialization as string | undefined,
        message: args.message as string | undefined,
      });
      child = await spawnChild(
        ctx.conway,
        ctx.identity,
        ctx.db,
        retryGenesis,
        retryLifecycle,
      );
    }
🔴 Unhandled exception in spawn_child retry path causes original error to be lost
In the spawn_child tool's auto-topup retry logic, the retry spawnChild() call on line 1606 is not wrapped in a try-catch. If the retry throws (e.g., max children reached because the first attempt already created a child record in failed state, or any other spawn error), the exception escapes the outer catch block entirely, and the if (!child) throw err on line 1616 is never reached.
Detailed comparison with orchestrator's spawnAgent
The orchestrator's spawnAgent in src/agent/loop.ts:287-309 correctly wraps the retry spawn in its own try-catch:
    try {
      const child = await retrySpawn(...);
      return { ... };
    } catch (retryError) {
      logger.warn("Spawn retry after topup failed", { ... });
    }

But the spawn_child tool in src/agent/tools.ts:1594-1613 does NOT wrap either topupForSandbox() or the retry spawnChild() in a try-catch:

    const topup = await topupForSandbox({...}); // could throw
    if (topup?.success) {
      child = await spawnChild(...); // CAN THROW, escapes catch block
    }

If the retry spawnChild throws, the new error propagates out of the catch block, the original err is lost, and the if (!child) throw err on line 1616 is never reached. This can result in confusing error messages and the failed first-attempt child record being left in an inconsistent state.
Impact: When a 402 topup succeeds but the retry spawn fails, the user gets an unhandled error from the retry instead of graceful fallback behavior. The original error context is lost.
Suggested change:

    const topup = await topupForSandbox({
      apiUrl: ctx.config.conwayApiUrl,
      account: ctx.identity.account,
      error: err,
    });
    if (topup?.success) {
      try {
        const retryLifecycle = new ChildLifecycle(ctx.db.raw);
        const retryGenesis = generateGenesisConfig(ctx.identity, ctx.config, {
          name: args.name as string,
          specialization: args.specialization as string | undefined,
          message: args.message as string | undefined,
        });
        child = await spawnChild(
          ctx.conway,
          ctx.identity,
          ctx.db,
          retryGenesis,
          retryLifecycle,
        );
      } catch { /* retry failed, will throw original err below */ }
    }
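The error-preservation pattern this review asks for can be shown in isolation. The sketch below is a simplified stand-in, not the tool's code: `spawnWithRetry` is a hypothetical name, the caller is assumed to have already established that the first failure was a 402, and, unlike the suggested change, it also guards the topup call, since the review notes that call could throw too.

```typescript
// Hypothetical stand-in for the spawn_child retry path.
async function spawnWithRetry<T>(
  spawn: () => Promise<T>,
  topup: () => Promise<{ success: boolean } | undefined>,
): Promise<T> {
  try {
    return await spawn();
  } catch (err) {
    let child: T | undefined;
    try {
      const result = await topup();
      if (result?.success) {
        child = await spawn(); // single retry
      }
    } catch {
      // Topup or retry failed; the original error is rethrown below.
    }
    if (child === undefined) throw err; // original error context survives
    return child;
  }
}
```

Because the inner `try` swallows only the retry-path failure, the caller always sees the error from the first attempt, which is the one that explains why the topup was triggered.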
Summary
- … `INSUFFICIENT_CREDITS`, automatically top up credits via x402 and retry once (both orchestrator `spawnAgent` and `spawn_child` tool paths)
- … `createSandbox` specs with valid Conway tiers (was sending `1 vCPU / 1024 MB / 5 GB`, which doesn't match any tier; now uses a lookup table)
- … `DELETE /v1/sandboxes`: make `deleteSandbox` a no-op, clean up all callers

Test plan
- `pnpm exec tsc --noEmit`: zero type errors
- `pnpm vitest run`: 141 tests passing across replication, lifecycle, and tools-security suites

🤖 Generated with Claude Code