feat(pty): implement lifecycle PTY management for task execution by yashdev9274 · Pull Request #1364 · generalaction/emdash

yashdev9274 · 2026-03-09T13:42:44Z

Summary

Added an interface and a function to manage PTY processes.
Integrated PTY handling for improved task execution and lifecycle event management.
Updated tests to reflect changes in PTY management and ensure proper handling of lifecycle events.

Fixes

Fixes #1304

Type of change

Bug fix (non-breaking change which fixes an issue)
Chore (refactoring code, technical debt, workflow improvements)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactor (does not change functionality, e.g. code style improvements, linting)
This change requires a documentation update

Mandatory Tasks

I have self-reviewed the code
A decent size PR without self-review might be rejected

Checklist

I have read the contributing guide
My code follows the style guidelines of this project (pnpm run format)
I have commented my code, particularly in hard-to-understand areas
I have checked if my PR needs changes to the documentation
I have checked if my changes generate no new warnings (pnpm run lint)
I have added tests that prove my fix is effective or that my feature works
I have checked if new and existing unit tests pass locally with my changes

vercel · 2026-03-09T13:42:48Z

@yashdev9274 is attempting to deploy a commit to the General Action Team on Vercel.

A member of the Team first needs to authorize it.

greptile-apps · 2026-03-09T13:45:32Z

Greptile Summary

This PR replaces the raw child_process.spawn calls inside TaskLifecycleService with a PTY-backed execution model (startLifecyclePty in ptyManager.ts), falling back to a wrapped ChildProcess when PTY support is disabled or unavailable. The intent is to give lifecycle scripts a proper terminal environment (colours, interactive tools, etc.).

While the overall direction is sound, there are three critical logic bugs introduced by the change:

runTeardown ignores PTY-managed runs — it still checks this.runProcesses.get(taskId) (the old map) instead of this.runPtys. When a PTY run is active, teardown launches immediately without waiting for the run to stop, causing a race condition between concurrent run and teardown scripts.
lifecyclePtys is never killed on clearTask or shutdown — setup/teardown PTY handles are stored in lifecyclePtys but that map is not iterated or cleared in either cleanup path, leaking OS PTY file descriptors on task removal or service shutdown.
spawnWithFallback has no 'error' event handler — the original spawn paths caught OS-level spawn errors (e.g., ENOENT) and resolved the phase promise as failed. The fallback omits this, so a spawn error silently leaves the phase status as 'running' forever and the corresponding setupInflight/teardownInflight promise never settles.

Additionally, tests were significantly weakened: EMDASH_DISABLE_PTY=1 forces all tests through the fallback path (never exercising PTY code), and two previously meaningful tests ('keeps setup failed when child emits error and exit' and 'clearTask stops in-flight setup/teardown processes') were hollowed out to the point of no longer testing their original scenarios.

Confidence Score: 1/5

Not safe to merge — contains critical race conditions and resource leaks that were not present before this change.
Three logic-level bugs: runTeardown never waits for PTY-managed runs to finish (race condition), lifecyclePtys leaks PTY resources on task clear/shutdown, and the fallback spawn path can hang forever due to missing error handling. Tests were also significantly weakened, leaving the new PTY code path entirely untested.
Primary attention required on src/main/services/TaskLifecycleService.ts (teardown wait logic, lifecyclePtys cleanup, spawnWithFallback error handling). src/test/main/TaskLifecycleService.test.ts needs the hollowed-out tests restored.

Important Files Changed

Filename	Overview
src/main/services/TaskLifecycleService.ts	Integrates PTY-based process management for run/lifecycle phases, but introduces critical bugs: `runTeardown` checks the old `runProcesses` map instead of `runPtys` so it never waits for PTY runs to stop; `lifecyclePtys` is never cleaned up in `clearTask`/`shutdown`; and `spawnWithFallback` drops the `'error'` event handler causing potential infinite hangs.
src/main/services/ptyManager.ts	Adds `LifecyclePtyHandle` interface and `startLifecyclePty` function; functionally reasonable but spawns the shell with `-il` (interactive + login) flags which source profile scripts and can contaminate lifecycle output and slow startup.
src/test/main/TaskLifecycleService.test.ts	Sets `EMDASH_DISABLE_PTY=1` to force fallback paths in tests; several tests were significantly weakened or hollowed out (e.g., the setup error/exit test and clearTask in-flight test no longer validate their originally intended behavior).

Sequence Diagram

sequenceDiagram
    participant Caller
    participant TaskLifecycleService
    participant ptyManager
    participant FallbackSpawn

    Caller->>TaskLifecycleService: startRun(taskId, ...)
    TaskLifecycleService->>ptyManager: startLifecyclePty({ id, command, cwd, env })
    alt PTY available
        ptyManager-->>TaskLifecycleService: LifecyclePtyHandle
        TaskLifecycleService->>TaskLifecycleService: runPtys.set(taskId, handle)
    else PTY unavailable (EMDASH_DISABLE_PTY=1 or error)
        ptyManager--xTaskLifecycleService: throws Error
        TaskLifecycleService->>FallbackSpawn: spawnWithFallback(id, script, cwd, env)
        FallbackSpawn-->>TaskLifecycleService: LifecyclePtyHandle (wraps ChildProcess)
        TaskLifecycleService->>TaskLifecycleService: runPtys.set(taskId, handle)
    end
    TaskLifecycleService-->>Caller: { ok: true }

    Note over TaskLifecycleService: handle.onData → emitLifecycleEvent 'line'
    Note over TaskLifecycleService: handle.onExit → update state, emitLifecycleEvent 'exit'

    Caller->>TaskLifecycleService: stopRun(taskId)
    TaskLifecycleService->>TaskLifecycleService: stopIntents.add(taskId)
    TaskLifecycleService->>TaskLifecycleService: ptyHandle.kill()
    TaskLifecycleService->>TaskLifecycleService: state.run = idle (EAGER - before actual exit)
    TaskLifecycleService-->>Caller: { ok: true }

    Caller->>TaskLifecycleService: runTeardown(taskId, ...)
    TaskLifecycleService->>TaskLifecycleService: existingRun = runProcesses.get(taskId)
    Note over TaskLifecycleService: ⚠️ BUG: always undefined when PTY is used
    TaskLifecycleService->>TaskLifecycleService: runFinite(..., 'teardown') — no wait for PTY exit

Comments Outside Diff (3)

src/main/services/TaskLifecycleService.ts, line 520-538 (link)

runTeardown never waits for PTY-managed run to stop

runTeardown still guards against an active run process by checking this.runProcesses.get(taskId) (line 520), but when startLifecyclePty succeeds the run handle is stored in this.runPtys, not runProcesses. So existingRun is always undefined when a PTY-based run is active, and teardown immediately proceeds while the run process is still alive. This is a race condition that can corrupt state and cause conflicts between the concurrent run and teardown scripts.

The fix requires also checking runPtys and awaiting its exit before continuing. Because LifecyclePtyHandle has no once('exit', …) API, a small helper (e.g., wrapping onExit in a Promise) is needed:
```
// Ensure PTY-managed run is stopped before teardown starts.
const existingPty = this.runPtys.get(taskId);
if (existingPty) {
  this.stopRun(taskId);
  await new Promise<void>((resolve) => {
    const timer = setTimeout(() => {
      log.warn('Timed out waiting for run PTY to exit before teardown', { taskId });
      resolve();
    }, 10_000);
    existingPty.onExit(() => {
      clearTimeout(timer);
      resolve();
    });
  });
}

// (existing runProcesses block stays for the fallback path)
const existingRun = this.runProcesses.get(taskId);
if (existingRun) { … }
```
src/main/services/TaskLifecycleService.ts, line 559-601 (link)

lifecyclePtys not cleaned up in clearTask or shutdown

lifecyclePtys is populated for every setup/teardown phase invocation (this.lifecyclePtys.set(ptyId, ptyHandle)), and the entry is removed in the onExit callback. However, clearTask() and shutdown() iterate over runPtys and runProcesses but never touch lifecyclePtys.

If a task is cleared (or the service shuts down) while a setup or teardown PTY is in-flight, those PTY processes will keep running in the background, leaking both OS-level PTY file descriptors and the ptys map entries in ptyManager.ts.
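
One way to close this gap is sketched below. It is a minimal illustration, not the PR's code: `LifecyclePtyRegistry`, `killForTask`, and `killAll` are hypothetical names, and the `${taskId}:${phase}` key format is an assumption because the PR's actual `ptyId` scheme is not shown here.

```typescript
// Minimal stand-in for the handle type; only kill() matters for cleanup.
interface KillableHandle {
  kill(): void;
}

// Hypothetical registry mirroring the `lifecyclePtys` map.
// Assumption: keys look like `${taskId}:${phase}`; the real key format may differ.
class LifecyclePtyRegistry {
  private readonly handles = new Map<string, KillableHandle>();

  set(ptyId: string, handle: KillableHandle): void {
    this.handles.set(ptyId, handle);
  }

  count(): number {
    return this.handles.size;
  }

  // What clearTask(taskId) could call: kill every in-flight setup/teardown PTY of one task.
  killForTask(taskId: string): number {
    let killed = 0;
    for (const [ptyId, handle] of Array.from(this.handles.entries())) {
      if (!ptyId.startsWith(`${taskId}:`)) continue;
      this.safeKill(ptyId, handle);
      killed++;
    }
    return killed;
  }

  // What shutdown() could call: kill everything that is still tracked.
  killAll(): number {
    const entries = Array.from(this.handles.entries());
    for (const [ptyId, handle] of entries) {
      this.safeKill(ptyId, handle);
    }
    return entries.length;
  }

  // Remove the entry first so a throwing kill() cannot leave a stale handle behind.
  private safeKill(ptyId: string, handle: KillableHandle): void {
    this.handles.delete(ptyId);
    try {
      handle.kill();
    } catch {
      // The process may already have exited; nothing more to clean up.
    }
  }
}
```

Deleting the entry before calling `kill()` means the map is emptied even if the later `onExit` callback never fires.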
src/test/main/TaskLifecycleService.test.ts, line 265-278 (link)

Test no longer exercises the intended behavior

The original test 'clearTask stops in-flight setup/teardown processes' verified that calling clearTask while a setup process was in-flight would kill the running child and clean up finiteProcesses. The new version only calls getState (which creates empty state) and then clearTask, and asserts that the state is deleted — which is trivially true and exercises no meaningful lifecycle behavior.

The key regression is that the test no longer validates that in-flight PTY (or fallback process) resources are actually killed when clearTask is called mid-execution. The lifecyclePtys cleanup gap noted elsewhere makes this especially risky to leave untested.

Last reviewed commit: 84cef58

greptile-apps · 2026-03-09T13:45:42Z

src/main/services/TaskLifecycleService.ts

    });
  }

+  private spawnWithFallback(
+    id: string,
+    script: string,
+    cwd: string,
+    env: NodeJS.ProcessEnv
+  ): LifecyclePtyHandle {
+    const child = spawn(script, {
+      cwd,
+      shell: true,
+      env,
+      detached: true,
+    });
+    this.trackFiniteProcess(id, child);
+    const dataCallbacks: ((data: string) => void)[] = [];
+    const exitCallbacks: ((exitCode: number | null, signal: string | null) => void)[] = [];
+
+    const onData = (buf: Buffer) => {
+      const line = buf.toString();
+      for (const cb of dataCallbacks) {
+        cb(line);
+      }
+    };
+    child.stdout?.on('data', onData);
+    child.stderr?.on('data', onData);
+
+    child.on('exit', (code) => {
+      for (const cb of exitCallbacks) {
+        cb(code, null);
+      }
+    });
+
+    return {
+      onData: (cb) => dataCallbacks.push(cb),
+      onExit: (cb) => exitCallbacks.push(cb),
+      kill: () => {
+        this.killProcessTree(child, 'SIGTERM');
+      },
+    };
+  }
+
  async runSetup(
    taskId: string,
    taskPath: string,


spawnWithFallback drops the 'error' event handler

The original spawn path attached a child.on('error', …) listener that called finish({ ok: false, … }) when the process failed to start (e.g., ENOENT, permission denied). spawnWithFallback omits this listener entirely.

If the shell process fails to spawn (or crashes at the OS level before writing to stdout/stderr), neither exit nor data will ever fire, and the finish callback inside runFinite will never be called. This leaves the task's phase status stuck at 'running' indefinitely and causes setupInflight / teardownInflight promises to hang forever.
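
A minimal sketch of how the fallback wrapper could forward `'error'` is shown below. `wireExit` and `ChildLike` are hypothetical names, and reporting a spawn failure through the exit callbacks with a `null` exit code is an assumption about how the caller wants failures surfaced. The single-settle guard covers the case where a child emits both `'error'` and `'exit'`.

```typescript
type ExitCallback = (exitCode: number | null, signal: string | null) => void;

// Structural subset of a child process used by this sketch (assumption:
// the real ChildProcess object is passed in by the caller).
interface ChildLike {
  on(event: string, listener: (...args: any[]) => void): unknown;
}

// Hypothetical helper: notifies exit callbacks exactly once, on whichever
// of 'error' or 'exit' fires first.
function wireExit(
  child: ChildLike,
  onError: (message: string) => void
): { onExit(cb: ExitCallback): void } {
  const exitCallbacks: ExitCallback[] = [];
  let settled = false;

  const settle = (code: number | null, signal: string | null): void => {
    if (settled) return; // ignore a late 'exit' after 'error' (or vice versa)
    settled = true;
    for (const cb of exitCallbacks) cb(code, signal);
  };

  // Spawn failures such as ENOENT arrive here, possibly without any 'exit' event.
  child.on('error', (err: Error) => {
    onError(err.message);
    settle(null, null);
  });

  child.on('exit', (code: number | null, signal: string | null) => {
    settle(code, signal);
  });

  return {
    onExit: (cb) => {
      exitCallbacks.push(cb);
    },
  };
}
```

With this shape, the phase logic can treat a `null` exit code preceded by an error message as a failed phase instead of waiting indefinitely.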

greptile-apps · 2026-03-09T13:45:43Z

src/main/services/TaskLifecycleService.ts

  stopRun(taskId: string): LifecycleResult {
+    const ptyHandle = this.runPtys.get(taskId);
+    if (ptyHandle) {
+      this.stopIntents.add(taskId);
+      try {
+        ptyHandle.kill();
+        const cur = this.ensureState(taskId);
+        cur.run = {
+          ...cur.run,
+          status: 'idle',
+          finishedAt: this.nowIso(),
+          pid: null,
+          error: null,
+        };
+        return { ok: true };
+      } catch (error) {
+        this.stopIntents.delete(taskId);
+        const message = error instanceof Error ? error.message : String(error);
+        log.warn('Failed to stop run PTY', { taskId, error: message });
+        return { ok: false, error: message };
+      }


Premature state update in stopRun before PTY actually exits

When ptyHandle.kill() is called (line 457), stopRun immediately updates cur.run.status to 'idle' and returns { ok: true } — before the PTY process has actually terminated. PTY processes can take a moment to die after SIGTERM is delivered (or may ignore it entirely).

Meanwhile, the onExit callback registered in startRun will fire later with wasStopped = true and overwrite status with 'idle' again (harmless, but inconsistent). More importantly, if kill() silently swallows an error and the process does not exit, the state remains 'idle' while the process is still running.

Consider updating state in onExit (where it already happens correctly via the wasStopped flag) rather than eagerly here, and removing the early state mutation from stopRun.
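
The suggestion can be sketched as follows. This is an illustration under stated assumptions, not the service's code: `RunTracker` and `RunHandle` are hypothetical, and the `'failed'` status for a non-zero exit without a stop request is assumed for the example.

```typescript
type RunStatus = 'idle' | 'running' | 'failed';

interface RunHandle {
  onExit(cb: (exitCode: number | null, signal: string | null) => void): void;
  kill(): void;
}

// Hypothetical tracker: stop() only records intent and sends the kill;
// the status changes in the exit callback, once the process is really gone.
class RunTracker {
  private status: RunStatus = 'idle';
  private stopRequested = false;
  private handle: RunHandle | null = null;

  getStatus(): RunStatus {
    return this.status;
  }

  start(handle: RunHandle): void {
    this.handle = handle;
    this.stopRequested = false;
    this.status = 'running';
    handle.onExit((exitCode) => {
      const wasStopped = this.stopRequested;
      this.stopRequested = false;
      this.handle = null;
      // A requested stop ends in 'idle' regardless of exit code.
      this.status = wasStopped || exitCode === 0 ? 'idle' : 'failed';
    });
  }

  stop(): { ok: boolean; error?: string } {
    if (!this.handle) return { ok: true };
    this.stopRequested = true;
    try {
      this.handle.kill();
      return { ok: true }; // status stays 'running' until the exit callback fires
    } catch (error) {
      this.stopRequested = false;
      return { ok: false, error: error instanceof Error ? error.message : String(error) };
    }
  }
}
```

If the process ignores the signal, the status keeps reporting `'running'`, which matches reality and lets callers decide whether to escalate.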

greptile-apps · 2026-03-09T13:45:44Z

src/main/services/ptyManager.ts

+  const proc = pty.spawn(defaultShell, ['-ilc', command], {
+    name: 'xterm-256color',
+    cols: 120,
+    rows: 32,
+    cwd: cwd || os.homedir(),
+    env: useEnv,
+  });


Interactive shell flags may interfere with script output

The PTY is spawned as pty.spawn(defaultShell, ['-ilc', command], …). The -i (interactive) and -l (login) flags cause shells like bash/zsh to source profile files (~/.bashrc, ~/.zshrc, /etc/profile, etc.). This can:

Produce extra output (prompts, greeting messages, echo in profile files) that contaminates the data delivered to onData callbacks and appears as lifecycle log lines.

Significantly slow startup on systems with heavyweight profile scripts.

Alter environment variables (e.g., PATH) in unexpected ways that differ from the env object explicitly passed to startLifecyclePty.

For non-interactive lifecycle scripts, -c alone (or --norc --noprofile combined with -c) is typically sufficient.
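
A small sketch of that argument selection follows. `lifecycleShellArgs` is a hypothetical helper, and the path handling assumes POSIX-style shell paths. The `--noprofile` and `--norc` options are applied only when the shell is bash, since other shells may not accept them.

```typescript
// Hypothetical helper: builds non-interactive shell arguments for a lifecycle command.
// Assumption: shellPath is a POSIX-style path such as /bin/bash or /bin/zsh.
function lifecycleShellArgs(shellPath: string, command: string): string[] {
  const name = shellPath.split('/').pop() ?? shellPath;
  if (name === 'bash') {
    // bash expects multi-character options before single-character ones.
    return ['--noprofile', '--norc', '-c', command];
  }
  // Other shells: plain -c, without the interactive (-i) or login (-l) flags.
  return ['-c', command];
}
```

The result could replace the `['-ilc', command]` array in the `pty.spawn` call, so that only the explicitly passed `env` shapes the script's environment.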

yashdev9274 · 2026-03-10T03:13:03Z

Hey @arnestrickmann, please review this!

yashdev9274 · 2026-03-14T03:21:43Z

Hey @arnestrickmann, any update on this PR?

greptile-apps bot reviewed Mar 9, 2026

yashdev9274 force-pushed the feat-yd-3 branch from 84cef58 to e377376 Compare March 11, 2026 04:48

ckafrouni mentioned this pull request Mar 11, 2026

Bug: Lifecycle scripts use child_process.spawn instead of PTY, breaking interactive tools #1304

Closed

yashdev9274 wants to merge 1 commit into generalaction:main from yashdev9274:feat-yd-3
