fix: use 'gateway run' on Windows, not 'gateway start' (fixes port-spiral bug) #623

Closed
a2735444 wants to merge 1 commit into EKKOLearnAI:main from a2735444:fix/windows-gateway-run-mode

@a2735444

Problem

On Windows, hermes gateway start only registers a Task Scheduler entry — it does not actually launch the gateway process or wait for it to be ready. This causes the GatewayManager's health check to always time out, which triggers resolvePort() to increment the port number in config.yaml on each retry (the "port-spiral" bug). The user sees the gateway as "stopped" indefinitely.
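The spiral can be illustrated with a small model of the retry loop. This is only a sketch under stated assumptions: `simulateRetries`, the `HealthCheck` type, and the starting port `8642` are hypothetical names and values chosen for illustration, and the real `resolvePort()` logic may differ. It assumes each failed health check moves the port up by one.

```typescript
// Hypothetical model of the retry loop described above; not the project's actual code.
type HealthCheck = (port: number) => boolean;

function simulateRetries(startPort: number, retries: number, isHealthy: HealthCheck): number {
  let port = startPort;
  for (let attempt = 0; attempt < retries; attempt++) {
    if (isHealthy(port)) {
      return port; // gateway answered, so the configured port is kept
    }
    port += 1; // assumed behavior: a timed-out check is treated as "try the next port"
  }
  return port;
}

// If the gateway process is never launched, every check fails and the port drifts.
const neverStarted: HealthCheck = () => false;
const driftedPort = simulateRetries(8642, 3, neverStarted);

// If the process is actually running, the first check passes and the port is stable.
const alreadyRunning: HealthCheck = () => true;
const stablePort = simulateRetries(8642, 3, alreadyRunning);
```

Under this model the port only stays fixed when something is actually listening, which is why launching the process directly matters more than tuning the retry count.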

Root Cause

detectInitSystem() correctly detects Windows and returns 'windows-service'. But needsRunMode is defined as:

const needsRunMode = !['systemd', 'launchd', 'windows-service'].includes(initSystem)

Because 'windows-service' is in the exclusion list, Windows takes the gateway start code path instead of the gateway run code path.
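The effect of the exclusion list can be checked in isolation. The sketch below wraps the expression from the PR in a helper; `needsRunModeFor`, `EXCLUDED_WITH_WINDOWS`, and `EXCLUDED_WITHOUT_WINDOWS` are hypothetical names introduced here, and the `'unknown'` value is only an illustrative placeholder for an environment with no recognized init system.

```typescript
// Illustrative helper around the expression quoted above; names are assumptions.
const EXCLUDED_WITH_WINDOWS: readonly string[] = ['systemd', 'launchd', 'windows-service'];
const EXCLUDED_WITHOUT_WINDOWS: readonly string[] = ['systemd', 'launchd'];

function needsRunModeFor(initSystem: string, excluded: readonly string[]): boolean {
  // true  -> use the `gateway run` path (detached child process)
  // false -> use the `gateway start` path (service manager)
  return !excluded.includes(initSystem);
}

const windowsWithEntry = needsRunModeFor('windows-service', EXCLUDED_WITH_WINDOWS);       // false
const windowsWithoutEntry = needsRunModeFor('windows-service', EXCLUDED_WITHOUT_WINDOWS); // true
const systemdEitherWay = needsRunModeFor('systemd', EXCLUDED_WITHOUT_WINDOWS);            // false
const unrecognized = needsRunModeFor('unknown', EXCLUDED_WITH_WINDOWS);                   // true
```

Only the `'windows-service'` result differs between the two lists; `systemd` and `launchd` hosts take the same path in both cases.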

Fix

Remove 'windows-service' from the exclusion list so Windows also uses hermes gateway run as a detached child process (spawn + unref), just like WSL and Docker environments. This is safe on Windows because:

  • spawn(..., { detached: true, windowsHide: true }) is fully supported
  • process.kill(pid, 'SIGTERM') on Windows terminates the process unconditionally (via TerminateProcess), so stop still works even though no signal handler runs
  • process.kill(pid, 0) works for liveness checks
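The three points above can be combined into a minimal sketch using Node's `child_process.spawn` and `process.kill`. The helper names `spawnDetached` and `isAlive` are hypothetical and not taken from the project; the sketch also assumes the command is directly spawnable without a shell.

```typescript
import { spawn } from 'node:child_process';

// Liveness probe: signal 0 performs an existence check without affecting the target.
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but this user may not signal it.
    return (err as { code?: string }).code === 'EPERM';
  }
}

// Start a command detached from the parent so it can outlive it.
function spawnDetached(command: string, args: string[]): number | undefined {
  const child = spawn(command, args, {
    detached: true,    // child is not tied to the parent's lifetime
    stdio: 'ignore',   // no inherited pipes keeping the parent alive
    windowsHide: true, // no console window on Windows
  });
  child.on('error', () => { /* e.g. command not found; pid is undefined in that case */ });
  child.unref();       // let the parent's event loop exit without waiting
  return child.pid;
}
```

A caller would then do something like `spawnDetached('hermes', ['gateway', 'run'])`, record the returned pid, and poll `isAlive(pid)` alongside the health check. Whether `hermes` resolves to a directly spawnable executable on a given Windows install is an assumption here.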

Change

Line 131: removed 'windows-service' from the needsRunMode exclusion array.

Testing

  • On Windows: hermes gateway run spawns a detached process, writes gateway.pid, health check passes → port stays stable
  • On macOS/Linux: unchanged behavior, still uses gateway start/stop via systemd/launchd

On Windows, `hermes gateway start` only registers a scheduled task
and does not actually launch the process. This causes the GatewayManager's
health check to always time out, triggering the port-finding spiral bug.

Fix: remove 'windows-service' from the needsRunMode exclusion list so
Windows also uses `hermes gateway run` as a detached child process,
just like WSL and Docker environments.