Docker Deployment For Kelvin Evaluator #825

Merged
Kobzol merged 6 commits into mrlvsb:master from JersyJ:docker-evaluator on Feb 23, 2026

Contributor

@JersyJ JersyJ commented Feb 9, 2026

  • Evaluator Docker image (Docker deployment #503)
  • Docker Compose Evaluator Scheduler service definition - configuration with scheduler (1 rq worker)
  • Docker Compose Evaluator CPU Workers service definition - configuration with CPU workers (32 by default, EVALUATOR_CPU_REPLICAS)
  • Docker Compose Evaluator GPU Workers service definition - configuration with GPU workers (32 by default, EVALUATOR_CUDA_REPLICAS)
    EVALUATOR_REDIS__HOST
    EVALUATOR_REDIS__PORT
  • Document undocumented environment variable and add logic for running Evaluator inside container (API_INTERNAL_BASEURL)

Copilot AI review requested due to automatic review settings February 9, 2026 15:03

Copilot AI left a comment

Pull request overview

Adds a Docker-deployable “Kelvin Evaluator” component (image + compose services) and updates backend logic/config so evaluator workers can run inside containers and communicate using an internal base URL.

Changes:

  • Introduces an evaluator Docker image target and an entrypoint that runs evaluator image builds before starting workers.
  • Extends docker-compose.yml with evaluator scheduler/CPU/GPU worker services and a Docker socket TCP proxy.
  • Updates backend utilities for Docker-internal URL building and evaluator job temp directory handling; updates CI to build/push the evaluator image.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| evaluator/evaluator-entrypoint.sh | New evaluator entrypoint that builds evaluator sub-images, then runs manage.py commands. |
| docker-compose.yml | Adds evaluator services and docker socket proxy; adds internal base URL env var; adjusts pull policies. |
| common/utils.py | Adds API_INTERNAL_BASEURL handling with a production safety guard. |
| common/evaluate.py | Disables TLS verification in DEBUG and changes evaluation temp dir base. |
| Dockerfile | Switches build base image approach; adds evaluator target with Docker tooling and entrypoint. |
| .github/workflows/ci.yml | Builds, uploads, loads, and pushes the new kelvin-evaluator image in CI/deploy. |
| .env.example | Documents evaluator-related env vars and API_INTERNAL_BASEURL. |
| .dockerignore | Expands ignore patterns for cleaner Docker build contexts. |

Comments suppressed due to low confidence (1)

docker-compose.yml:15

  • pull_policy: never for the app service prevents pulling prebuilt images from GHCR and can cause prod deployments to use stale images or fail when the image tag isn’t present locally. If the deployment flow relies on ${APP_IMAGE_TAG} (as the comment suggests), this should remain always (or be configurable via an env var) rather than hard-coded to never.
    image: "ghcr.io/mrlvsb/kelvin:${APP_IMAGE_TAG:-latest}" # Interpolation for Deployment Service
    pull_policy: always

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.


Collaborator

@Kobzol Kobzol left a comment

Left some comments.

common/utils.py Outdated
# If the URL is the default Docker-internal one, only use it in DEBUG mode.
# This prevents Production from accidentally using the internal container hostname
# instead of the public domain, unless explicitly forced.
if base_uri == "https://nginx" and not settings.DEBUG:

Collaborator

This needs more explanation somewhere.

Contributor Author

Added description

Collaborator

Ok, I think that I understand the issue now. There are a few things that are not ideal here:

  1. We require HTTPS in nginx in local deployment (this is more related to the s.verify = False thing than to this code, though the two are connected).
  2. We set the default value of the internal API to nginx; it is not clear if that is indeed a good default, especially given that we don't really expect to run the evaluators in the same Docker network as the web in production.
  3. This function looks like it should be used for any general URL generation, but in reality it is only used for evaluators.

So I would suggest this:

  1. Put API_INTERNAL_BASEURL=https://nginx in .env.example, so that it is used by default in local Docker deployment without being specified in docker-compose.yml.
  2. Remove the :-https://nginx default in docker-compose.yml; make the variable required.
  3. Move the variable loading to settings.py; an environment variable shouldn't be accessed randomly in one of Kelvin's functions, but rather be centralized in the website configuration.
  4. Rename the variable, e.g. to EVALUATION_LINK_BASEURL.
  5. Only use the variable when generating a URL for evaluators (including LLM evaluation), not for "normal" public URL links, such as for e-mails. I think that this actually already happens, as the non-evaluator part of Kelvin uses request.build_absolute_uri directly. So it is enough to rename build_absolute_uri to e.g. build_evaluation_download_uri, or something like that, to make it clear what it should (and shouldn't!) be used for.
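
The suggestions above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code that was merged: the variable name EVALUATION_LINK_BASEURL and the function name build_evaluation_download_uri come from the review comment, while load_evaluation_link_baseurl, both signatures, and the error message are assumptions made for the example.

```python
import os
from urllib.parse import urljoin


def load_evaluation_link_baseurl(env=None) -> str:
    """Read the base URL once, the way a central settings module would.

    Hypothetical helper: fails loudly when the variable is missing, which
    mirrors the "make the variable required" suggestion.
    """
    env = os.environ if env is None else env
    value = env.get("EVALUATION_LINK_BASEURL", "").strip()
    if not value:
        raise RuntimeError("EVALUATION_LINK_BASEURL must be set")
    # Normalise to exactly one trailing slash so urljoin keeps any path prefix.
    return value.rstrip("/") + "/"


def build_evaluation_download_uri(base_url: str, path: str) -> str:
    """Build a link that evaluator workers will fetch.

    Not meant for public links such as those sent in e-mails.
    """
    return urljoin(base_url, path.lstrip("/"))


if __name__ == "__main__":
    base = load_evaluation_link_baseurl({"EVALUATION_LINK_BASEURL": "https://nginx"})
    # The path below is made up for the example.
    print(build_evaluation_download_uri(base, "/api/example/download"))
```

Keeping the lookup in one place means the URL-building function never touches the environment itself, and the narrower function name signals that it is intended only for evaluator links.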

Contributor Author

Changed mostly based on your suggestions; please review whether it is acceptable.

- KELVIN__HOST_URL=${KELVIN__HOST_URL}
# - Defaults to 'https://nginx' for local docker development (to fix loopback ref to 127.0.0.1)
# - IGNORED by app if value is 'https://nginx' AND DEBUG=False
- API_INTERNAL_BASEURL=${API_INTERNAL_BASEURL:-https://nginx}

Collaborator

What is this for?

Contributor Author

Added more description
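
For readers unfamiliar with the two mechanisms in the snippet under discussion, here is a rough Python analogue. It assumes that the Compose form `${VAR:-default}` falls back when the variable is unset or empty, and that the application-side guard falls back to the public URL when it ignores the internal one. The function names and the public_url parameter are hypothetical, and the fallback behaviour is inferred from the code comments quoted in this thread, not from the actual implementation.

```python
DOCKER_INTERNAL_DEFAULT = "https://nginx"


def compose_default(env: dict, name: str, default: str) -> str:
    """Analogue of ${NAME:-default}: use the default when NAME is unset or empty."""
    value = env.get(name)
    return value if value else default


def effective_base_url(internal_url: str, public_url: str, debug: bool) -> str:
    """Analogue of the guard: ignore the Docker-internal default outside DEBUG.

    Falling back to public_url is an assumption made for this sketch.
    """
    if internal_url == DOCKER_INTERNAL_DEFAULT and not debug:
        return public_url
    return internal_url


if __name__ == "__main__":
    internal = compose_default({}, "API_INTERNAL_BASEURL", DOCKER_INTERNAL_DEFAULT)
    # kelvin.example is a placeholder public domain.
    print(effective_base_url(internal, "https://kelvin.example", debug=False))
    print(effective_base_url(internal, "https://kelvin.example", debug=True))
```

The sketch shows why the reviewer found the behaviour hard to follow: the value is defaulted in one file and then conditionally discarded in another.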


# Run the image builder to ensure all required images are present
# Skip image build if running as scheduler (detected via --with-scheduler arg)
if [[ "$*" != *"--with-scheduler"* ]]; then

Collaborator

--with-scheduler doesn't tell us anything about whether the evaluator will need the images or not.

Let's just inline the whole command into docker-compose.yml, seems like the simplest solution without doing similar hacks.
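
To make the objection concrete, the shell test in the entrypoint amounts to a substring check over all arguments joined together. Below is a small Python analogue; the function name and the example argument lists are made up for illustration.

```python
def should_build_images(args: list[str]) -> bool:
    """Analogue of: [[ "$*" != *"--with-scheduler"* ]]

    "$*" joins the arguments with a space (the default first IFS character),
    and the pattern then matches the flag anywhere in that string.
    """
    joined = " ".join(args)
    return "--with-scheduler" not in joined


if __name__ == "__main__":
    # Hypothetical argument lists, for illustration only.
    print(should_build_images(["worker", "evaluator"]))
    print(should_build_images(["worker", "--with-scheduler"]))
    # Any argument that merely contains the text also matches.
    print(should_build_images(["worker", "--with-scheduler-extra"]))
```

The function can only answer "was this text passed on the command line", which is a proxy for the real question of whether the process needs the images. Spelling out the full command per service in docker-compose.yml makes that decision explicit.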

Contributor Author

Done

Collaborator

@Kobzol Kobzol left a comment

Thank you, left one comment.

# or solve issues with socket permissions
docker_proxy:
container_name: kelvin_docker_proxy
profiles: [ prod,evaluator_cpu,evaluator_cuda ]

Collaborator

Hmm, this means that the Docker proxy will be running on the main server, even if there will be no evaluators there. Seems safer to just not do that.

Suggested change
- profiles: [ prod,evaluator_cpu,evaluator_cuda ]
+ profiles: [ evaluator_cpu, evaluator_cuda ]

Contributor Author

But it is required for the deployment service :D
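
The trade-off in this thread follows from how Compose profiles select services: a service with no profiles always starts, and a service with profiles starts only when at least one of them is active. Here is a simplified Python model of that rule. The function name is hypothetical, and the model ignores services that are targeted explicitly on the command line.

```python
def service_is_enabled(service_profiles: list[str], active_profiles: list[str]) -> bool:
    """Simplified model of Compose profile selection."""
    if not service_profiles:
        return True  # services without profiles are always enabled
    return bool(set(service_profiles) & set(active_profiles))


if __name__ == "__main__":
    current = ["prod", "evaluator_cpu", "evaluator_cuda"]
    suggested = ["evaluator_cpu", "evaluator_cuda"]
    # With only the prod profile active, the proxy runs under the current
    # definition but not under the suggested one.
    print(service_is_enabled(current, ["prod"]))
    print(service_is_enabled(suggested, ["prod"]))
```

Under the suggested profiles the proxy would not start on a prod-only host, which is the safety the reviewer wanted. It is also why the author objected, since the deployment service on that host depends on the proxy.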

Collaborator

@Kobzol Kobzol left a comment

Ok, let's try. Thank you.

@Kobzol Kobzol added this pull request to the merge queue Feb 23, 2026
Merged via the queue into mrlvsb:master with commit 3ebf97b Feb 23, 2026
10 checks passed