Skip to content

fix(cli): add missing subprocess.run() timeouts in doctor and status#3693

Open
dieutx wants to merge 1 commit intoNousResearch:mainfrom
dieutx:fix/cli-subprocess-timeouts
Open

fix(cli): add missing subprocess.run() timeouts in doctor and status#3693
dieutx wants to merge 1 commit intoNousResearch:mainfrom
dieutx:fix/cli-subprocess-timeouts

Conversation

@dieutx
Copy link
Copy Markdown
Contributor

@dieutx dieutx commented Mar 29, 2026

Summary

4 subprocess.run() calls in CLI utilities have no timeout parameter and can hang indefinitely if the child process blocks. The rest of the CLI already uses timeouts (clipboard.py: 3-15s, banner.py: 5-10s, doctor.py npm audit: 30s) — these 4 are the only ones missing.

Same class of bug as #3469 (git/ripgrep subprocess calls hanging on large repos).

Root Cause

subprocess.run() defaults to timeout=None, meaning it waits forever. If the docker daemon is unresponsive, systemctl is waiting for D-Bus, or an SSH host is unreachable past its ConnectTimeout, hermes doctor and hermes status hang with no feedback.

Fix

  • hermes_cli/doctor.py:409docker info: add timeout=10
  • hermes_cli/doctor.py:429ssh ... echo ok: add timeout=15 (slightly above SSH's own ConnectTimeout=5)
  • hermes_cli/status.py:285systemctl --user is-active: add timeout=5
  • hermes_cli/status.py:296launchctl list: add timeout=5

Each call site catches subprocess.TimeoutExpired and treats it as failure, consistent with how non-zero return codes are already handled.

Tests

1 new AST-based regression test in tests/hermes_cli/test_subprocess_timeouts.py — parameterized across all 4 CLI modules that use subprocess.run(). Parses each file and asserts every call has a timeout keyword. Catches future regressions automatically.

11 pre-existing doctor/status tests + 4 new = 15 passed.

Add timeout parameters to 4 subprocess.run() calls that could hang
indefinitely if the child process blocks (e.g., unresponsive docker
daemon, systemctl waiting for D-Bus):

- doctor.py: docker info (timeout=10), ssh check (timeout=15)
- status.py: systemctl is-active (timeout=5), launchctl list (timeout=5)

Each call site now catches subprocess.TimeoutExpired and treats it as
a failure, consistent with how non-zero return codes are already handled.

Add AST-based regression test that verifies every subprocess.run() call
in CLI modules specifies a timeout keyword argument.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant