K8s integration tests #2

Draft
wants to merge 51 commits into
base: main
4a4af10
Implement health checks and graceful termination
omus Mar 4, 2024
0b702ce
Add GitHub workflows
omus Mar 4, 2024
3d1b783
Rename endpoint functions
omus Mar 4, 2024
b3cf654
Add docstrings to health check functions
omus Mar 4, 2024
8b04c15
Add badges to README
omus Mar 5, 2024
754f117
Add manual
omus Mar 5, 2024
26a9d9b
Add both stable/dev docs badges
omus Mar 5, 2024
48ba856
Specify YAS formatting
omus Mar 5, 2024
e7b16a3
fixup! Rename endpoint functions
omus Mar 5, 2024
8634d3b
Reference localhost more
omus Mar 5, 2024
bfb54ef
Document keyword `set_entrypoint`
omus Mar 5, 2024
e696698
Comment on dead code
omus Mar 5, 2024
7b40f99
Log invalid graceful terminator requests
omus Mar 5, 2024
75d2a18
Add argument documentation to `K8sDeputy.serve!`
omus Mar 5, 2024
507519c
Add quickstart guide
omus Mar 5, 2024
a0e2670
Use default K8sDeputy.jl port in quickstart
omus Mar 5, 2024
c66f6bb
Formatting
omus Mar 5, 2024
74af1f9
Add LICENSE file
omus Mar 5, 2024
84a32de
fixup! Add LICENSE file
omus Mar 5, 2024
674f6ad
Add integration testing framework
omus Mar 6, 2024
8df941e
Initial integration test workflow
omus Mar 6, 2024
4c6d372
Use Julia version from matrix
omus Mar 6, 2024
54c476e
Make Manifest.toml optional
omus Mar 6, 2024
f7041cc
Update Dockerfile to work without Manifest.toml files
omus Mar 6, 2024
2ae0205
Disable Skaffold telemetry
omus Mar 6, 2024
f5478e4
Disable Skaffold survey prompts
omus Mar 6, 2024
74e0c48
Add integration testing scaffolding
omus Mar 6, 2024
67a0da5
Update chart to support pod or deployment
omus Mar 6, 2024
8403d4e
Preliminary working integration tests
omus Mar 7, 2024
f056caf
Refactoring
omus Mar 7, 2024
cc0141e
Refactor
omus Mar 7, 2024
c786006
Integration Docker build
omus Mar 7, 2024
172c26f
Use constants for repo/tag
omus Mar 7, 2024
3488f9e
Add section headers
omus Mar 7, 2024
1f6ca27
fixup! Use constants for repo/tag
omus Mar 7, 2024
2955a4d
Use shared termination grace period
omus Mar 7, 2024
e587b43
Formatting
omus Mar 7, 2024
b578888
Fix kube config permissions
omus Mar 7, 2024
435a121
Documentation corrections
omus Mar 8, 2024
98af4cf
Indicate mutating health functions
omus Mar 8, 2024
351eb9d
Rename `entrypoint_pid(::Integer)` to `set_entrypoint_pid`
omus Mar 8, 2024
b3b9eae
Document custom exit status
omus Mar 8, 2024
59ae13f
fixup! Indicate mutating health functions
omus Mar 8, 2024
edb8e4e
Merge branch 'cv/initial' into cv/k8s-tests
omus Mar 8, 2024
ce5569b
Add Helm chart install timeout
omus Mar 8, 2024
590b2bc
Unique chart name
omus Mar 8, 2024
84a5738
Update entrypoint.jl
omus Mar 8, 2024
3f41678
Add more jobs to matrix
omus Mar 8, 2024
bf24435
Conditionally set precompile timing
omus Mar 8, 2024
2e3763c
Julia 1.6 doesn't show PID
omus Mar 8, 2024
d2fd348
Merge branch 'main' into cv/k8s-tests
omus Mar 8, 2024
63 changes: 62 additions & 1 deletion .github/workflows/CI.yaml
@@ -23,7 +23,7 @@ concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
jobs:
- test:
+ unit-test:
name: Julia ${{ matrix.version }} - ${{ matrix.runs-on }} - ${{ matrix.arch }} - ${{ matrix.threads}} threads
# These permissions are needed to:
# - Delete old caches: https://github.com/julia-actions/cache#cache-retention
@@ -44,6 +44,7 @@ jobs:
threads:
- 1
env:
RUN_TESTS: unit,quality-assurance
JULIA_NUM_THREADS: ${{ matrix.threads }}
steps:
- uses: actions/checkout@v4
@@ -57,3 +58,63 @@ jobs:
- uses: codecov/codecov-action@v3
with:
file: lcov.info

integration-test:
name: Integration Test - Julia ${{ matrix.julia-version }} - K8s ${{ matrix.k8s-version }} - minikube ${{ matrix.minikube-version }}
# These permissions are needed to:
# - Delete old caches: https://github.com/julia-actions/cache#cache-retention
permissions:
actions: write
contents: read
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
julia-version:
- "1.6" # Earliest version of Julia that the package is compatible with
- "1" # Latest Julia release
# Support the latest versions of the supported releases: https://kubernetes.io/releases/.
# These must be full version numbers including the patch.
k8s-version:
- "1.27.11"
- "1.28.7"
- "1.29.2"
# https://github.com/kubernetes/minikube/releases
minikube-version:
- "1.32.0"
env:
RUN_TESTS: integration
steps:
- uses: actions/checkout@v4
- uses: julia-actions/setup-julia@v1
with:
version: ${{ matrix.julia-version }}
- uses: julia-actions/cache@v1
- uses: yokawasa/[email protected]
with:
setup-tools: |-
kubectl
helm
kubectl: "1.29.2" # https://github.com/kubernetes/kubernetes/releases
helm: "3.14.2" # https://github.com/helm/helm/releases
# Factors influencing the setup of the local Kubernetes cluster:
# - Limited resources on GitHub runners only allow running 1 pod at a time with
# the default minikube install (additional pods are stuck as "Pending")
# - minikube restricts max CPUs per node to the number of CPUs on the host
- name: Set up minikube
uses: manusa/[email protected]
with:
minikube version: v${{ matrix.minikube-version }}
kubernetes version: v${{ matrix.k8s-version }}
driver: docker
# start args: --nodes=1 --cni=kindnet
# Fix these warnings: https://github.com/helm/helm/issues/9115
- name: Fix kube config permissions
run: |
chmod go-r ~/.kube/config
helm version
- uses: julia-actions/julia-runtest@v1
- uses: julia-actions/julia-processcoverage@v1
- uses: codecov/codecov-action@v3
with:
file: lcov.info
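
Note on `RUN_TESTS`: the two jobs differ mainly in the value they export (`unit,quality-assurance` for `unit-test`, `integration` for `integration-test`). The test runner that reads this variable is not part of the diff shown here, so the following is only a minimal sketch of how a comma-separated value like this could be parsed. The name `selected_test_groups` and its default value are hypothetical, not taken from the PR.

```julia
# Hypothetical helper (not from the PR): turn a comma-separated RUN_TESTS value
# into a set of group names. `env` can be any dictionary-like object so the
# function can be exercised without modifying the real ENV.
function selected_test_groups(env=ENV; default::AbstractString="unit")
    raw = get(env, "RUN_TESTS", default)
    # Trim whitespace around each entry and drop empty entries (e.g. trailing commas)
    groups = (String(strip(s)) for s in split(raw, ','))
    return Set(g for g in groups if !isempty(g))
end

groups = selected_test_groups(Dict("RUN_TESTS" => "unit,quality-assurance"))
run_unit = "unit" in groups                # true for the unit-test job's value
run_integration = "integration" in groups  # false for the unit-test job's value
```

With a helper of this shape, a runner could guard each `include` with a membership check such as `"integration" in groups`, which keeps the cluster-dependent tests out of the plain unit-test job.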
6 changes: 5 additions & 1 deletion Project.toml
@@ -17,10 +17,14 @@ Mocking = "0.7"
Sockets = "1"
Test = "1"
julia = "1.6"
kubectl_jll = "1.25"

[extras]
Aqua = "4c88cf16-eb10-579e-8560-4a9242c79595"
JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
kubectl_jll = "ed23c2a5-89c4-5d52-b0ca-9d53aadf8c45"

[targets]
- test = ["Aqua", "Test"]
+ test = ["Aqua", "JSON3", "Test", "UUIDs", "kubectl_jll"]
152 changes: 152 additions & 0 deletions test/integration-utils.jl
@@ -0,0 +1,152 @@
using kubectl_jll
using UUIDs
using JSON3: JSON3

###
### kubectl
###

# TODO: Would be great if we could use the UID for all requests
mutable struct Pod
name::String
uid::UUID
logs::IOBuffer
logs_process::Base.Process

function Pod(name::AbstractString)
pod = new(name)
pod.uid = get_uid(pod)
pod.logs = IOBuffer()
pod.logs_process = monitor_logs(pod.logs, pod)
return pod
end
end

Base.print(io::IO, p::Pod) = print(io, "pod/", p.name)

function get_uid(p::Pod)
cmd = `$(kubectl()) get $p -o jsonpath="{.metadata.uid}"`
err = IOBuffer()
uid = readchomp(pipeline(ignorestatus(cmd); stderr=err))

err.size > 0 && error(String(take!(err)))
return parse(UUID, uid)
end

function monitor_logs(io::IO, p::Pod)
cmd = `$(kubectl()) logs -f $p`
return run(pipeline(cmd; stdout=io); wait=false)
end

function get_events(p::Pod)
cmd = `$(kubectl()) get events --field-selector involvedObject.uid=$(p.uid) -o json`
err = IOBuffer()
out = readchomp(pipeline(ignorestatus(cmd); stderr=err))

err.size > 0 && error(String(take!(err)))
return JSON3.read(out)
end

function delete(p::Pod; wait::Bool=true)
cmd = `$(kubectl()) delete $p --wait=$wait`
err = IOBuffer()
run(pipeline(ignorestatus(cmd); stdout=devnull, stderr=err))

err.size > 0 && error(String(take!(err)))
return nothing
end

function Base.wait(p::Pod)
cmd = `$(kubectl()) wait --for=delete $p`
err = IOBuffer()
run(pipeline(ignorestatus(cmd); stdout=devnull, stderr=err))

err.size > 0 && error(String(take!(err)))
return nothing
end

kubectl_context() = readchomp(`$(kubectl()) config current-context`)

###
### Helm
###

function install_chart(name::AbstractString, overrides=Dict(); quiet::Bool=true,
timeout=nothing)
chart = joinpath(@__DIR__(), "integration", "chart", "k8s-deputy")
options = `--set kind=pod`
!isnothing(timeout) && (options = `$options --timeout=$timeout`)
for (k, v) in pairs(overrides)
options = if v isa AbstractArray || v isa AbstractDict || v isa Nothing
`$options --set-json $k=$(JSON3.write(v))`
else
`$options --set-literal $k=$v`
end
end
stdout = quiet ? devnull : Base.stdout
run(pipeline(`helm uninstall $name --ignore-not-found`; stdout))
return run(pipeline(`helm install $name $chart --wait $options`; stdout))
end

function install_chart(body, name::AbstractString, overrides=Dict(); quiet::Bool=true,
timeout=nothing)
local result
stdout = quiet ? devnull : Base.stdout
install_chart(name, overrides; quiet, timeout)
try
result = body()
finally
run(pipeline(`helm uninstall $name`; stdout))
end
return result
end

###
### Docker
###

function docker_build(context_dir; dockerfile=nothing, tag=nothing, build_args=Dict())
options = ``
!isnothing(dockerfile) && (options = `$options -f $dockerfile`)
!isnothing(tag) && (options = `$options --tag $tag`)
for (k, v) in build_args
options = `$options --build-arg $k=$v`
end

build_cmd = `docker build $options $context_dir`

# When using a minikube context we need to build the image within the minikube
# environment otherwise we'll see pods fail with the reason "ErrImageNeverPull".
if kubectl_context() == "minikube" && !haskey(ENV, "MINIKUBE_ACTIVE_DOCKERD")
build_cmd = addenv(build_cmd, Dict(minikube_docker_env()))
end

return run(build_cmd)
end

function minikube_docker_env()
env_vars = Pair{String,String}[]
open(`minikube docker-env`) do f
while !eof(f)
line = readline(f)

if startswith(line, "export")
line = replace(line, r"^export " => "")
key, value = split(line, '='; limit=2)
push!(env_vars, key => unquote(value))
end
end
end

return env_vars
end

isquoted(str::AbstractString) = startswith(str, '"') && endswith(str, '"')

function unquote(str::AbstractString)
if isquoted(str)
return replace(SubString(str, 2, lastindex(str) - 1), "\\\"" => "\"")
else
throw(ArgumentError("Passed in string is not quoted"))
end
end
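
Note on `minikube_docker_env`: the function keeps only the `export KEY="value"` lines printed by `minikube docker-env` and strips the surrounding quotes. The sketch below repeats that logic against an in-memory string so it can be tried without minikube installed. `parse_docker_env` is a hypothetical name, and the sample text is illustrative only; real output varies with the minikube version and driver.

```julia
# Same quoting rules as `isquoted`/`unquote` in the diff, repeated so the sketch runs standalone.
isquoted(str::AbstractString) = startswith(str, '"') && endswith(str, '"')

function unquote(str::AbstractString)
    isquoted(str) || throw(ArgumentError("Passed in string is not quoted"))
    # Drop the outer quotes, then turn escaped quotes (\") back into plain quotes
    return replace(SubString(str, 2, lastindex(str) - 1), "\\\"" => "\"")
end

# Hypothetical variant of `minikube_docker_env` that reads from any IO instead of a process.
function parse_docker_env(io::IO)
    env_vars = Pair{String,String}[]
    for line in eachline(io)
        startswith(line, "export ") || continue  # skip comments and blank lines
        line = replace(line, r"^export " => "")
        key, value = split(line, '='; limit=2)   # the value itself may contain '='
        push!(env_vars, key => unquote(value))
    end
    return env_vars
end

# Illustrative input only (assumed shape of the command's output).
sample = """
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.49.2:2376"
# eval \$(minikube -p minikube docker-env)
"""

env = Dict(parse_docker_env(IOBuffer(sample)))
```

Taking an `IO` rather than running the command directly is a design choice made for this sketch: it separates the parsing from the process call, so the parsing can be checked in a plain unit test.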
136 changes: 136 additions & 0 deletions test/integration.jl
@@ -0,0 +1,136 @@
const K8S_DEPUTY_IMAGE = get(ENV, "K8S_DEPUTY_IMAGE", "k8s-deputy:integration")
const K8S_DEPUTY_IMAGE_REPO = first(split(K8S_DEPUTY_IMAGE, ':'; limit=2))
const K8S_DEPUTY_IMAGE_TAG = last(split(K8S_DEPUTY_IMAGE, ':'; limit=2)) # Includes image digest SHA

const CHART_NAME = "k8s-deputy-integration-$(getpid())"
const TERMINATION_GRACE_PERIOD_SECONDS = 5

# As a convenience we'll automatically build the Docker image when a user uses `Pkg.test()`.
# If the environment variable is set we expect the Docker image has been pre-built.
if !haskey(ENV, "K8S_DEPUTY_IMAGE")
context_dir = joinpath(@__DIR__(), "..")
dockerfile = joinpath("integration", "Dockerfile")

build_args = Dict("JULIA_VERSION" => VERSION)
docker_build(context_dir; dockerfile, build_args, tag=K8S_DEPUTY_IMAGE)
end

# Verify Julia's handling of the `TERM` signal in a K8s environment
@testset "SIGTERM graceful termination" begin
overrides = Dict("image.repository" => K8S_DEPUTY_IMAGE_REPO,
"image.tag" => K8S_DEPUTY_IMAGE_TAG,
"command" => ["julia", "entrypoint.jl"],
"lifecycle" => nothing,
"terminationGracePeriodSeconds" => TERMINATION_GRACE_PERIOD_SECONDS)

local pod, delete_duration
install_chart(CHART_NAME, overrides; timeout="15s") do
pod = Pod(CHART_NAME)
delete_started = time()
delete(pod)
wait(pod)
delete_duration = time() - delete_started
return nothing
end

# # Determine when the "delete" command was received by the server
# event_items = get_events(pod).items
# delete_event = last(filter(event -> event.reason == "Killing", event_items))
# delete_event_timestamp = parse(DateTime, delete_event.lastTimestamp, dateformat"yyyy-mm-dd\THH:MM:SS\Z")

logs = String(take!(pod.logs))
@test delete_duration < TERMINATION_GRACE_PERIOD_SECONDS
@test contains(logs, "signal (15): Terminated\nin expression starting at")
@test !any(event -> event.reason == "FailedPreStopHook", get_events(pod).items)
end

# Verify that the `TERM` signal does not reach Julia when it runs as a child of `/bin/sh`
@testset "Ignore SIGTERM graceful termination" begin
# Child processes don't automatically get forwarded signals
overrides = Dict("image.repository" => K8S_DEPUTY_IMAGE_REPO,
"image.tag" => K8S_DEPUTY_IMAGE_TAG,
"command" => ["/bin/sh", "-c", "julia entrypoint.jl"],
"lifecycle" => nothing,
"terminationGracePeriodSeconds" => TERMINATION_GRACE_PERIOD_SECONDS)

local pod, delete_duration
install_chart(CHART_NAME, overrides; timeout="15s") do
pod = Pod(CHART_NAME)
delete_started = time()
delete(pod)
wait(pod)
delete_duration = time() - delete_started
return nothing
end

logs = String(take!(pod.logs))
@test delete_duration > TERMINATION_GRACE_PERIOD_SECONDS
@test !contains(logs, "signal (15): Terminated")
@test !any(event -> event.reason == "FailedPreStopHook", get_events(pod).items)
end

@testset "Container halts before preStop completes" begin
overrides = Dict("image.repository" => K8S_DEPUTY_IMAGE_REPO,
"image.tag" => K8S_DEPUTY_IMAGE_TAG,
"command" => ["julia", "entrypoint.jl"],
"terminationGracePeriodSeconds" => TERMINATION_GRACE_PERIOD_SECONDS)

local pod, delete_duration
install_chart(CHART_NAME, overrides; timeout="15s") do
pod = Pod(CHART_NAME)
delete_started = time()
delete(pod)
wait(pod)
delete_duration = time() - delete_started
return nothing
end

logs = String(take!(pod.logs))
@test delete_duration < TERMINATION_GRACE_PERIOD_SECONDS
@test !contains(logs, "signal (15): Terminated")
@test any(event -> event.reason == "FailedPreStopHook", get_events(pod).items)
end

@testset "Missing post Julia entrypoint delay" begin
overrides = Dict("image.repository" => K8S_DEPUTY_IMAGE_REPO,
"image.tag" => K8S_DEPUTY_IMAGE_TAG,
"command" => ["/bin/sh", "-c", "julia entrypoint.jl"],
"terminationGracePeriodSeconds" => TERMINATION_GRACE_PERIOD_SECONDS)

local pod, delete_duration
install_chart(CHART_NAME, overrides; timeout="15s") do
pod = Pod(CHART_NAME)
delete_started = time()
delete(pod)
wait(pod)
delete_duration = time() - delete_started
return nothing
end

logs = String(take!(pod.logs))
@test delete_duration < TERMINATION_GRACE_PERIOD_SECONDS
@test !contains(logs, "signal (15): Terminated")
@test any(event -> event.reason == "FailedPreStopHook", get_events(pod).items)
end

@testset "Valid" begin
overrides = Dict("image.repository" => K8S_DEPUTY_IMAGE_REPO,
"image.tag" => K8S_DEPUTY_IMAGE_TAG,
"command" => ["/bin/sh", "-c", "julia entrypoint.jl; sleep 1"],
"terminationGracePeriodSeconds" => TERMINATION_GRACE_PERIOD_SECONDS)

local pod, delete_duration
install_chart(CHART_NAME, overrides; timeout="15s") do
pod = Pod(CHART_NAME)
delete_started = time()
delete(pod)
wait(pod)
delete_duration = time() - delete_started
return nothing
end

logs = String(take!(pod.logs))
@test delete_duration < TERMINATION_GRACE_PERIOD_SECONDS
@test !contains(logs, "signal (15): Terminated")
@test !any(event -> event.reason == "FailedPreStopHook", get_events(pod).items)
end
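
Note on `K8S_DEPUTY_IMAGE_REPO` / `K8S_DEPUTY_IMAGE_TAG`: both constants come from `split(K8S_DEPUTY_IMAGE, ':'; limit=2)`, so everything after the first colon ends up in the tag, which is why the tag can carry a digest. The sketch below shows that behaviour on a few inputs; `split_image` is a hypothetical name used only for illustration, and the digest value is made up.

```julia
# Mirrors the `split(...; limit=2)` pattern used for K8S_DEPUTY_IMAGE in the diff.
# `split_image` is a hypothetical helper, not part of the PR.
function split_image(image::AbstractString)
    parts = split(image, ':'; limit=2)  # split at the first ':' only
    return (repo=String(first(parts)), tag=String(last(parts)))
end

plain = split_image("k8s-deputy:integration")
# plain.repo == "k8s-deputy", plain.tag == "integration"

with_digest = split_image("k8s-deputy:integration@sha256:0123abcd")
# with_digest.tag == "integration@sha256:0123abcd" (the digest stays in the tag)

untagged = split_image("k8s-deputy")
# No colon at all: `first` and `last` return the same element, so repo == tag
```

Two caveats follow from splitting at the first colon: an untagged image yields the repository name as its tag, and a reference with a registry port (for example `localhost:5000/k8s-deputy:integration`) would be split at the port. Neither case arises with the default `k8s-deputy:integration` value.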