fix: prevent unbounded Claude subprocess spawning in worker daemon by jayvenn21 · Pull Request #1008 · thedotmack/claude-mem

jayvenn21 · 2026-02-07T17:15:15Z

Summary

This PR introduces defensive process lifecycle controls to prevent the
worker-service.cjs --daemon from spawning unbounded Claude CLI subprocesses
that are never cleaned up, which can lead to catastrophic memory, swap, and disk
exhaustion over time.

This change does not alter core functionality or user-facing behavior under
normal workloads. It adds guardrails and cleanup to ensure long-running daemon
sessions remain stable.

Problem

On macOS (and potentially other platforms), the daemon currently:

spawns Claude CLI subprocesses without a hard upper bound
does not reliably terminate or reap completed / idle children
allows orphaned subprocesses to accumulate indefinitely

Over multi-day usage, this results in:

hundreds of orphaned claude processes
tens of GB of RAM + swap consumption
disk exhaustion and cascading service failures (e.g. Redis, Docker)

Multiple related issues report the same failure mode.

Changes

This PR introduces minimal but critical safety mechanisms:

Track spawned Claude subprocesses explicitly
Enforce a maximum concurrent subprocess limit
Ensure child processes are terminated and removed on task completion
Add defensive cleanup on daemon startup to reap stale children
Log warnings when limits are hit instead of silently spawning indefinitely

These changes are intentionally conservative and avoid architectural rewrites.

Why this approach

This is a production-safety fix:

prevents resource exhaustion
avoids breaking existing workflows
buys time for a more comprehensive worker-pool redesign if desired

The goal is to fail safely rather than catastrophically.

Testing

Verified subprocess cleanup under repeated task execution
Verified daemon remains stable over extended runtime
Confirmed no behavior change for normal interactive usage

Related issues

greptile-apps · 2026-02-07T17:16:40Z

Greptile Overview

Greptile Summary

This PR adds defensive lifecycle controls around Claude CLI subprocesses in the worker daemon:

ProcessRegistry now tracks spawned subprocesses and enforces a hard cap (MAX_CONCURRENT_CLAUDE_SUBPROCESSES).
WorkerService invokes an extra startup reap to kill stale/orphaned Claude processes from prior runs, and continues running the periodic orphan reaper.

Overall this fits the existing approach of process management/cleanup in src/services/worker/ProcessRegistry.ts and the worker’s background initialization flow in src/services/worker-service.ts to keep long-running daemons stable and avoid runaway resource usage.

Confidence Score: 3/5

This PR is directionally safe but has a couple of correctness edge cases that can kill valid subprocesses or wedge spawning under load.
The added guardrails are valuable, but the startup reap currently calls reapOrphanedProcesses with an empty active-session set (which can SIGKILL any registry-tracked children if any are running at that moment). Separately, cap enforcement is based on registry size, so if the registry gets out of sync with live processes the worker can start refusing spawns until restart.
src/services/worker-service.ts, src/services/worker/ProcessRegistry.ts

Important Files Changed

Filename	Overview
src/services/worker-service.ts	Adds a startup call to reap orphaned Claude subprocesses after background initialization; current implementation passes an empty active session set which can SIGKILL valid in-flight subprocesses.
src/services/worker/ProcessRegistry.ts	Adds a max concurrent subprocess cap and exposes active process count; cap enforcement throws when registry size reaches limit, which can wedge spawning if the registry gets out of sync with live processes.

Sequence Diagram

sequenceDiagram
  autonumber
  participant WS as WorkerService
  participant PR as ProcessRegistry
  participant OS as OS/ps
  participant CP as Claude subprocess

  WS->>WS: initializeBackground()
  WS->>WS: initializationCompleteFlag=true
  WS->>WS: resolveInitialization()
  WS->>PR: reapOrphanedProcesses(activeSessionIds=∅)
  PR->>PR: iterate processRegistry
  PR-->>CP: SIGKILL tracked children not in activeSessionIds
  PR->>OS: ps | grep claude... (killSystemOrphans)
  OS-->>PR: orphan list (ppid=1)
  PR-->>CP: SIGKILL system orphans
  WS->>PR: startOrphanReaper(getActiveSessionIds)
  loop every 5 minutes
    PR->>PR: reapOrphanedProcesses(activeSessionIds)
    PR->>OS: killSystemOrphans()
  end

  Note over PR,CP: createPidCapturingSpawn enforces MAX_CONCURRENT_CLAUDE_SUBPROCESSES
  WS->>PR: createPidCapturingSpawn(sessionDbId)
  PR-->>CP: spawn(command,args)
  PR->>PR: registerProcess(pid, sessionDbId)
  CP-->>PR: 'exit' event
  PR->>PR: unregisterProcess(pid)

greptile-apps

_{2 files reviewed, 2 comments}

_{Edit Code Review Agent Settings | Greptile}

src/services/worker-service.ts

src/services/worker/ProcessRegistry.ts

…p check

jayvenn21 · 2026-02-07T17:22:48Z

It looks like the failing claude-review check is due to missing OIDC
permissions rather than a code issue.

The error indicates the GitHub Action cannot fetch an ID token
(ACTIONS_ID_TOKEN_REQUEST_URL is not set), which commonly happens on
forked PRs unless the workflow explicitly enables:

permissions:
id-token: write

Since this is a contributor PR from a fork, the action may need to be
disabled for forks, made non-blocking, or the permissions adjusted by a
maintainer.

Happy to rebase or rerun once that’s resolved.

JGrubb · 2026-02-10T14:37:14Z

See here also - #1010

jayvenn21 · 2026-02-10T16:16:26Z

See here also - #1010

Sounds good. I'll take a look at it.

thedotmack · 2026-02-16T05:25:12Z

Closing in favor of #1085 (rodboev) which provides a more comprehensive solution to the same problem. Thank you for the contribution!

fix: prevent unbounded Claude subprocess spawning in worker daemon

3e6f8bb

greptile-apps bot reviewed Feb 7, 2026

View reviewed changes

src/services/worker-service.ts Outdated Show resolved Hide resolved

src/services/worker/ProcessRegistry.ts Show resolved Hide resolved

fix(process): safe startup reap (OS orphans only) and prune before ca…

faaef98

…p check

VedanthB mentioned this pull request Feb 9, 2026

Bug: worker daemon leaks 233+ claude subprocesses (load avg 151, M4 Pro) #1040

Closed

thedotmack closed this Feb 16, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: prevent unbounded Claude subprocess spawning in worker daemon#1008

fix: prevent unbounded Claude subprocess spawning in worker daemon#1008
jayvenn21 wants to merge 2 commits intothedotmack:mainfrom
jayvenn21:fix/unbounded-claude-subprocess-spawn

jayvenn21 commented Feb 7, 2026

Uh oh!

greptile-apps bot commented Feb 7, 2026

Uh oh!

greptile-apps bot left a comment

Uh oh!

Uh oh!

Uh oh!

jayvenn21 commented Feb 7, 2026

Uh oh!

JGrubb commented Feb 10, 2026

Uh oh!

jayvenn21 commented Feb 10, 2026

Uh oh!

thedotmack commented Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

jayvenn21 commented Feb 7, 2026

Summary

Problem

Changes

Why this approach

Testing

Related issues

Uh oh!

greptile-apps bot commented Feb 7, 2026

Greptile Overview

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jayvenn21 commented Feb 7, 2026

Uh oh!

JGrubb commented Feb 10, 2026

Uh oh!

jayvenn21 commented Feb 10, 2026

Uh oh!

thedotmack commented Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants