@faraquet
Contributor

@faraquet faraquet commented Oct 23, 2025

This PR adds a tiny HTTP health-check server that runs as a supervised process.

  • Endpoints:
    • / and /health:
      • Returns 200 OK with body OK when the supervisor and all supervised processes (workers, dispatchers, scheduler, and the health server itself) have fresh heartbeats.
      • Returns 503 Service Unavailable with body Unhealthy if any supervised process (or the supervisor) has a stale heartbeat.
    • Any other path: returns 404 Not Found
  • Configuration: set health_server: in config/queue.yml. Both host and port are required.

Enable and configure via process configuration:

production:
  health_server:
    host: 0.0.0.0
    port: 9393
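
To illustrate the "both host and port are required" rule, here is a minimal sketch of how such a section could be read and validated. The helper name `health_server_options` and the error message are hypothetical and are not taken from this PR; the sketch only assumes the YAML layout shown above.

```ruby
require "yaml"

# Hypothetical helper (not part of Solid Queue): reads the health_server
# section for one environment and checks that host and port are both set.
def health_server_options(yaml_text, env)
  config = YAML.safe_load(yaml_text) || {}
  options = config.dig(env, "health_server")
  return nil if options.nil? # no health server configured for this environment

  missing = %w[host port].reject { |key| options[key] }
  unless missing.empty?
    raise ArgumentError, "health_server is missing: #{missing.join(', ')}"
  end

  { host: options["host"].to_s, port: Integer(options["port"]) }
end

sample = <<~YAML
  production:
    health_server:
      host: 0.0.0.0
      port: 9393
YAML

p health_server_options(sample, "production")
```

Returning `nil` when the section is absent keeps the server opt-in, which matches the description's "enable and configure" wording; raising on a partial section is one possible way to enforce the requirement.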

Note:

  • This runs under the supervisor just like workers/dispatchers.
  • When the Puma plugin is active (plugin :solid_queue in puma.rb), the configured health server is skipped to avoid running multiple HTTP servers in the same process tree. A warning is logged. If you need the health server, run Solid Queue outside Puma (for example, via bin/jobs) or disable the plugin on that instance.
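
The endpoint behaviour described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code in this PR: the name `health_response`, the 60-second `stale_after` default, the 404 body, and treating an empty heartbeat list as unhealthy are all hypothetical choices.

```ruby
# Hypothetical sketch: maps a request path and the last heartbeat time of
# each supervised process (supervisor included) to an HTTP status and body.
def health_response(path, last_heartbeats, now: Time.now, stale_after: 60)
  return [404, "Not Found"] unless ["/", "/health"].include?(path)

  # Assumption: no registered heartbeats at all counts as unhealthy.
  healthy = !last_heartbeats.empty? &&
            last_heartbeats.all? { |beat| now - beat <= stale_after }

  healthy ? [200, "OK"] : [503, "Unhealthy"]
end

now = Time.now
p health_response("/health", [now - 5, now - 10], now: now)  # [200, "OK"]
p health_response("/", [now - 5, now - 600], now: now)       # [503, "Unhealthy"]
p health_response("/metrics", [now - 5], now: now)           # [404, "Not Found"]
```

A single stale heartbeat is enough to flip the response to 503, mirroring the "any supervised process (or the supervisor)" wording in the endpoint list.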

@faraquet
Contributor Author

faraquet commented Oct 23, 2025

@rosa could you please have a look? This is a simple and safe change, but I believe it will be useful for many people, including myself.
I'm sure the failing tests are unrelated, as I've seen the same issue in other branches.

Happy to discuss if needed

@Th3-M4jor
Contributor

Feel free to disagree with me on this, but what about adding some kind of check to ensure that this can't be run while also using the Puma plugin?

@faraquet
Contributor Author

Thanks for the hint, @Th3-M4jor 👍

I agree with you and will add an extra check.

@faraquet force-pushed the aaa/health_server branch 2 times, most recently from d371408 to 177afd4 on October 24, 2025 00:13
@faraquet
Contributor Author

Added ✅

@faraquet
Contributor Author

I've revised the approach: the health server is now launched by the supervisor and set up via the config file, just like workers.

@faraquet force-pushed the aaa/health_server branch 2 times, most recently from 00575cf to 13d3414 on October 27, 2025 21:24
@rosa
Member

rosa commented Oct 28, 2025

Hey @faraquet, thanks for working on this! However, I'm not sure the separate web server process guarantees anything about the health of the workers and dispatchers and the other processes beyond the supervisor being alive, which you can check via the pid 🤔

Another question I have is why the possibility of running multiple health server processes per supervisor.

@faraquet
Contributor Author

Thanks a lot for the feedback, @rosa!

> Another question I have is why the possibility of running multiple health server processes per supervisor.

I was thinking about doing something similar with the workers and reusing their configuration, but I completely agree - there's no reason to run multiple processes, so I changed it to use just one.

> I'm not sure the separate web server process guarantees anything about the health of the workers and dispatchers and the other processes beyond the supervisor being alive, which you can check via the pid

In our container setup, it's quite inconvenient to check via pid, and having an HTTP response would make things much easier.

What do you think about making the server a bit more advanced so that it actually checks the supervisor's state? I've already made a POC version. If you think it looks promising, I can continue developing it and add more tests.

@faraquet changed the title from "Add configurable HTTP health-check server" to "Add HTTP health-check server" on Oct 28, 2025

3 participants