Closed
34 commits
4b1e0c1
Introduce evaluator Docker-in-Docker setup, add SSL certificates, and…
JersyJ Jan 25, 2026
6b80e5b
Refactor Evaluator Docker Images, Introduces simple CI
JersyJ Feb 7, 2026
61fa33e
Remove the evaluator service from docker-compose.yml
JersyJ Feb 7, 2026
bf8ec57
Rename workflow
JersyJ Feb 7, 2026
a0b337e
Formatting, pre-commit
JersyJ Feb 7, 2026
1009628
Build context
JersyJ Feb 7, 2026
25233f1
Update name, and build context
JersyJ Feb 7, 2026
c323f44
Try to use context use default
JersyJ Feb 7, 2026
fe00a11
feat: Inject build contexts for local 'kelvin/' dependencies during i…
JersyJ Feb 7, 2026
a0d9156
Another try
JersyJ Feb 7, 2026
527eaaf
feat: explicitly tag Docker images with `:latest`
JersyJ Feb 7, 2026
a619225
build: Explicitly specify 'latest' tag for base image in Dockerfile.
JersyJ Feb 7, 2026
dfc016e
chore: Remove explicit docker context usage from workflow and strip `…
JersyJ Feb 7, 2026
76b3825
Try again
JersyJ Feb 7, 2026
dd0e1d8
Go back to the "docker" driver
JersyJ Feb 7, 2026
7537da6
Introduce dedicated Docker Compose services for evaluator scheduler, C…
JersyJ Feb 8, 2026
bb9fec4
Migrate to prek
JersyJ Feb 8, 2026
e21081f
Try the default directory
JersyJ Feb 8, 2026
f523239
Improve internal API communication, add debug-mode SSL handling
JersyJ Feb 8, 2026
8faa012
Documentation for env.example
JersyJ Feb 8, 2026
1fe0b1f
Update UV version in CI and Dockerfiles, improve installation documen…
JersyJ Feb 8, 2026
df15c62
Refactor Docker configuration: remove unused network aliases and upda…
JersyJ Feb 8, 2026
8424e37
Implement evaluator image building via an entrypoint script
JersyJ Feb 8, 2026
0e60ace
Healthcheck Docker Status mode for Deployment Service
JersyJ Feb 8, 2026
95a0109
Move evaluator Dockerfile into a multi-stage build, Evaluator builf a…
JersyJ Feb 8, 2026
a8c126d
Fix Mypy issue
JersyJ Feb 8, 2026
254f02a
Simplify DooD socket access via socat and remove DOCKER_GROUP_ID
JersyJ Feb 8, 2026
68eea93
Add fail-fast and unhealthy state to health_check
JersyJ Feb 8, 2026
52f0e86
Add health_check_timeout to settings and deployment request model
JersyJ Feb 9, 2026
c63b6b7
Add health check timeout option to deployment and documentation
JersyJ Feb 9, 2026
0199ceb
Add documentation for evaluator images and update tests and pipeline …
JersyJ Feb 9, 2026
9ef32d5
Merge branch 'evaluator-deployment' into evaluator-refactor-images
JersyJ Feb 9, 2026
7424d46
Rename conclusion job to conclusion-images for clarity in workflow
JersyJ Feb 9, 2026
16f8252
Remove change detection and simplify image build process
JersyJ Feb 9, 2026
28 changes: 28 additions & 0 deletions .dockerignore
@@ -3,3 +3,31 @@ submits/
submit_results/
.venv/
node_modules/

# Python
__pycache__/
*.py[cod]
*.pyd
*.pyo
*.so
.pytest_cache/
.mypy_cache/
.ruff_cache/
.coverage
htmlcov/

# Node
**/dist/
**/.vite/

# VCS / tooling
.git/

# Local data (avoid baking into images)
kelvin_data/
**/*.log

# Editor
.vscode/
.idea/
.DS_Store
20 changes: 18 additions & 2 deletions .env.example
@@ -1,4 +1,5 @@
### Kelvin
# ------------------------------------------------------------------------------

# !!! IMPORTANT: For Production deployments using Deployment Service, all file paths must be specified as absolute due to use of DooD (Docker out of Docker)

@@ -12,6 +13,9 @@ KELVIN__TASKS_PATH=./tasks
KELVIN__SUBMITS_PATH=./submits
# Path where submit results will be stored
KELVIN__SUBMIT_RESULTS_PATH=./submit_results
# (Optional) Internal URL used by workers. Defaults to https://nginx when running locally with Docker;
# otherwise defaults to the request URL in production or non-Docker local environments.
# API_INTERNAL_BASEURL=https://custom-internal-url

### Postgres
DATABASE__HOST=127.0.0.1
@@ -40,9 +44,21 @@ OPENAI__API_KEY=your_openai_api_key_here
OPENAI__API_URL=http://localhost:8080/v1
OPENAI__MODEL=openai/gpt-oss-120b

### Evaluator Workers
# ------------------------------------------------------------------------------
# Number of worker processes
EVALUATOR_CPU_REPLICAS=32
EVALUATOR_CUDA_REPLICAS=32

# Redis Connection for Evaluators
# - If running LOCALLY (same machine as app): Leave these commented out or set to 'redis' and '6379'.
# - If running DISTRIBUTED (on a different machine): Set these to the IP/Host and Port of the main server's Redis.
# EVALUATOR_REDIS__HOST=redis
# EVALUATOR_REDIS__PORT=6379


### Deployment Service
# ID of the docker group on the host machine (get it via `getent group docker | cut -d: -f3`)
DOCKER_GROUP_ID=999
# ------------------------------------------------------------------------------
SECURITY__WEBHOOK_SECRET=yoursecretvalue
SECURITY__ALLOWED_HOSTS=["localhost", "127.0.0.1", "nginx", "kelvin.cs.vsb.cz"]

47 changes: 47 additions & 0 deletions .github/workflows/build-evaluator-images.yml
@@ -0,0 +1,47 @@
name: Evaluator Docker Images

on:
  pull_request:
  merge_group:
  workflow_dispatch:
Collaborator

The main issue that we have with the images is not that they break after we change them, but that they break when something external changes, most often apt repositories. So it would be great to run CI periodically to detect that sooner.

One way of doing that is to always run them in CI, without file change detection. That has the annoying property that it can break CI for unrelated PRs. Another possibility is to set up a cron job that runs this, e.g., once a week. I'd go with the cron for now (in addition to the existing triggers).
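
A weekly cron trigger along those lines could look roughly like the sketch below. The `schedule` block and the chosen time (Mondays at 03:00 UTC) are assumptions for illustration and are not part of this PR; the other three triggers are the ones the workflow already declares.

```yaml
on:
  pull_request:
  merge_group:
  workflow_dispatch:
  # Hypothetical addition: rebuild the images once a week so that breakage
  # caused by external changes (e.g. apt repositories) is noticed early.
  schedule:
    # minute hour day-of-month month day-of-week, evaluated in UTC
    - cron: "0 3 * * 1"
```

Scheduled runs execute against the latest commit on the default branch, so a failure there would not block unrelated PRs. With the `concurrency` expression used in this workflow, a `schedule` event is neither `merge_group` nor `workflow_dispatch`, so such a run would fall into the `github.sha` group.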


concurrency:
  group: ${{ github.workflow }}-${{ (github.event_name == 'merge_group' || github.event_name == 'workflow_dispatch') && 'build' || github.sha }}
  cancel-in-progress: ${{ github.event_name != 'merge_group' }}

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v6

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
        with:
          driver: docker

      - name: Build images
        run: |
          python3 evaluator/images/build.py

  # Summary job to enable easier handling of required status checks.
  # On PRs, we need everything to be green, while deploy jobs are skipped.
  # On master, we need everything to be green.
  # ALL THE PREVIOUS JOBS NEED TO BE ADDED TO THE `needs` SECTION OF THIS JOB!
  conclusion-images:
    needs: [ build ]
    # We need to ensure this job does *not* get skipped if its dependencies fail,
    # because a skipped job is considered a success by GitHub. So we have to
    # overwrite `if:`. We use `!cancelled()` to ensure the job does still not get run
    # when the workflow is canceled manually.
    if: ${{ !cancelled() }}
    runs-on: ubuntu-latest
    steps:
      - name: Conclusion Images
        run: |
          # Print the dependent jobs to see them in the CI log
          jq -C <<< '${{ toJson(needs) }}'
          # Check if all jobs that we depend on (in the needs array)
          # were either successful or skipped.
          jq --exit-status 'all(.result == "success" or .result == "skipped")' <<< '${{ toJson(needs) }}'
79 changes: 63 additions & 16 deletions .github/workflows/ci.yml
@@ -11,7 +11,7 @@ concurrency:
env:
  # Configure a constant location for the uv cache
  UV_CACHE_DIR: /tmp/.uv-cache
  UV_VERSION: "0.9.20"
  UV_VERSION: "0.10.0"


jobs:
@@ -95,6 +95,9 @@ jobs:

  test-deployment-service:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: deployment_service/

    steps:
      - name: Checkout sources
@@ -114,32 +117,26 @@ jobs:
          working-directory: "deployment_service"

      - name: Install dependencies
        working-directory: deployment_service/
        run: |
          uv sync --frozen

      - name: Ruff Linter
        working-directory: deployment_service/
        run: uv run ruff check --output-format=github

      - name: Ruff Formatter
        if: success() || failure()
        working-directory: deployment_service/
        run: uv run ruff format --check

      - name: Check lockfile
        if: success() || failure()
        working-directory: deployment_service/
        run: uv lock --locked

      - name: MyPy
        if: success() || failure()
        working-directory: deployment_service/
        run: |
          uv run mypy --check .

      - name: Run tests
        working-directory: deployment_service/
        run: uv run pytest
        env:
          SECURITY__WEBHOOK_SECRET: "yoursecretvalue"
@@ -166,38 +163,57 @@ jobs:
      - name: Build Kelvin Docker image
        uses: docker/build-push-action@v6
        with:
          cache-from: type=registry,ref=ghcr.io/mrlvsb/kelvin-ci-cache
          target: runtime
          cache-from: type=gha
Collaborator

Why was the switch made?

Contributor Author

Because of the maintenance burden of the registry storage (LRU eviction is applied there automatically) and the readability/visibility of that registry. This is also the official approach recommended by GitHub and Docker.

          # Only write the cache in the master branch or workflow_dispatch builds
          # https://github.com/docker/build-push-action/issues/845#issuecomment-1512619265
          cache-to: ${{ (github.event_name == 'merge_group' || github.event_name == 'workflow_dispatch') && 'type=registry,ref=ghcr.io/mrlvsb/kelvin-ci-cache,compression=zstd' || '' }}
          cache-to: ${{ (github.event_name == 'merge_group' || github.event_name == 'workflow_dispatch') && 'type=gha,mode=max' || '' }}
          tags: ghcr.io/mrlvsb/kelvin:latest,ghcr.io/mrlvsb/kelvin:${{ github.sha }}
          outputs: type=docker,dest=${{ runner.temp }}/kelvin.tar

      - name: Build Deployment_service Docker image
        uses: docker/build-push-action@v6
        with:
          context: "{{defaultContext}}:deployment_service"
          cache-from: type=registry,ref=ghcr.io/mrlvsb/deployment-ci-cache
          cache-from: type=gha
          # Only write the cache in the master branch or workflow_dispatch builds
          # https://github.com/docker/build-push-action/issues/845#issuecomment-1512619265
          cache-to: ${{ (github.event_name == 'merge_group' || github.event_name == 'workflow_dispatch') && 'type=registry,ref=ghcr.io/mrlvsb/deployment-ci-cache,compression=zstd' || '' }}
          cache-to: ${{ (github.event_name == 'merge_group' || github.event_name == 'workflow_dispatch') && 'type=gha,mode=max' || '' }}
          tags: ghcr.io/mrlvsb/deployment:latest,ghcr.io/mrlvsb/deployment:${{ github.sha }}
          outputs: type=docker,dest=${{ runner.temp }}/deployment.tar

      - name: Share built image
      - name: Build Kelvin-Evaluator Docker image
        uses: docker/build-push-action@v6
        with:
          target: evaluator
          cache-from: type=gha
          # Only write the cache in the master branch or workflow_dispatch builds
          # https://github.com/docker/build-push-action/issues/845#issuecomment-1512619265
          cache-to: ${{ (github.event_name == 'merge_group' || github.event_name == 'workflow_dispatch') && 'type=gha,mode=max' || '' }}
          tags: ghcr.io/mrlvsb/kelvin-evaluator:latest,ghcr.io/mrlvsb/kelvin-evaluator:${{ github.sha }}
          outputs: type=docker,dest=${{ runner.temp }}/kelvin-evaluator.tar

      - name: Share Kelvin image
        uses: actions/upload-artifact@v6
        with:
          name: kelvin
          path: ${{ runner.temp }}/kelvin.tar
          retention-days: 1

      - name: Share built image
      - name: Share Deployment_Service image
        uses: actions/upload-artifact@v6
        with:
          name: deployment
          path: ${{ runner.temp }}/deployment.tar
          retention-days: 1

      - name: Share Kelvin-Evaluator image
        uses: actions/upload-artifact@v6
        with:
          name: kelvin-evaluator
          path: ${{ runner.temp }}/kelvin-evaluator.tar
          retention-days: 1

  build-docs:
    runs-on: ubuntu-latest
    steps:
@@ -255,19 +271,25 @@ jobs:
      - name: Set up Docker
        uses: docker/setup-buildx-action@v3

      - name: Download built image
      - name: Download Kelvin image
        uses: actions/download-artifact@v6
        with:
          name: kelvin
          path: ${{ runner.temp }}

      - name: Download Deployment_service image
      - name: Download Deployment_Service image
        if: steps.changed-files-deployment.outputs.any_changed == 'true'
        uses: actions/download-artifact@v6
        with:
          name: deployment
          path: ${{ runner.temp }}

      - name: Download Kelvin-Evaluator image
        uses: actions/download-artifact@v6
        with:
          name: kelvin-evaluator
          path: ${{ runner.temp }}

      - name: Load image
        id: load_image
        run: |
@@ -278,6 +300,12 @@ jobs:
          echo "$LOADED"
          SHA_TAG=$(echo "$LOADED" | grep -v ':latest' | awk '{print $3}')
          echo "app_image_tag=$SHA_TAG" >> $GITHUB_OUTPUT

          LOADED_EVAL=$(docker load --input ${{ runner.temp }}/kelvin-evaluator.tar)
          echo "$LOADED_EVAL"
          SHA_TAG_EVAL=$(echo "$LOADED_EVAL" | grep -v ':latest' | awk '{print $3}')
          echo "evaluator_image_tag=$SHA_TAG_EVAL" >> $GITHUB_OUTPUT

          if [ "${{ steps.changed-files-deployment.outputs.any_changed }}" = "true" ]; then
            docker load --input ${{ runner.temp }}/deployment.tar
          fi
@@ -291,7 +319,9 @@ jobs:
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Push Docker image with SHA tag
        run: docker push ${{ steps.load_image.outputs.app_image_tag }}
        run: |
          docker push ${{ steps.load_image.outputs.app_image_tag }}
          docker push ${{ steps.load_image.outputs.evaluator_image_tag }}

      - name: Trigger on-prem deployment
        run: |
@@ -302,12 +332,23 @@ jobs:
            --commit-sha ${{ github.sha }} \
            --healthcheck-url https://kelvin.cs.vsb.cz/api/v2/health \
            --url https://kelvin.cs.vsb.cz/deployment/

          python3 deployment_service/deploy.py \
            --service-name evaluator_scheduler \
            --container-name kelvin_evaluator_scheduler \
            --image ${{ steps.load_image.outputs.evaluator_image_tag }} \
            --commit-sha ${{ github.sha }} \
            --url https://kelvin.cs.vsb.cz/deployment/ \
            --health-check-timeout 240
        env:
          WEBHOOK_SECRET: ${{ secrets.WEBHOOK_SECRET }}

      - name: Push Kelvin Docker image with latest tag
        run: docker push ghcr.io/mrlvsb/kelvin:latest

      - name: Push Kelvin Evaluator Docker image with latest tag
        run: docker push ghcr.io/mrlvsb/kelvin-evaluator:latest

      - name: Push Deployment_service Docker image with all tags
        if: steps.changed-files-deployment.outputs.any_changed == 'true'
        run: docker push --all-tags ghcr.io/mrlvsb/deployment
@@ -318,6 +359,12 @@ jobs:
          package-type: 'container'
          min-versions-to-keep: 15

      - uses: actions/delete-package-versions@v5
        with:
          package-name: 'kelvin-evaluator'
          package-type: 'container'
          min-versions-to-keep: 15

      - uses: actions/delete-package-versions@v5
        if: steps.changed-files-deployment.outputs.any_changed == 'true'
        with:
9 changes: 3 additions & 6 deletions .pre-commit-config.yaml
@@ -1,6 +1,6 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    rev: v6.0.0
    hooks:
      - id: check-yaml
        args: [--allow-multiple-documents]
@@ -18,11 +18,8 @@ repos:
      - id: mixed-line-ending
        args: [ --fix=lf ]
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.7
    rev: v0.15.0
    hooks:
      - id: ruff-format
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.7
    hooks:
      - id: ruff
      - id: ruff-check
        args: [ --fix ]