Basic pod failure scenario
fixes #363
kami619 authored and ahus1 committed Jun 16, 2023
1 parent 066eb72 commit 586e798
Showing 6 changed files with 229 additions and 13 deletions.
152 changes: 152 additions & 0 deletions .github/workflows/keycloak-chaos-benchmark.yml
@@ -0,0 +1,152 @@
name: Keycloak - Failure benchmark

on:
  workflow_dispatch:
    inputs:
      clusterName:
        description: 'Name of the cluster'
        type: string
        default: 'gh-keycloak'
      scenarioName:
        description: 'Name of the benchmark scenario to run'
        type: choice
        options:
          - 'authentication.ClientSecret'
          - 'authentication.AuthorizationCode'
      numberOfEntitiesInRealm:
        description: 'Number of entities for the scenario'
        type: string
        default: '10000'
      numberOfEntitiesUsedInTest:
        description: 'Number of entities used in test (default: all entities)'
        type: string
      initialUsersPerSecond:
        description: 'Initial users per second'
        type: string
        default: '1'
      skipCreateDeployment:
        description: 'Skip creating Keycloak deployment'
        type: boolean
        default: false
      skipCreateDataset:
        description: 'Skip creating dataset'
        type: boolean
        default: false
      skipDeleteProject:
        description: 'Skip deleting project'
        type: boolean
        default: false

concurrency: cluster_${{ inputs.clusterName || format('gh-{0}', github.repository_owner) }}

env:
  PROJECT_PREFIX: runner- # same as default
  PROJECT: runner-keycloak

jobs:
  run:
    name: Run Benchmark
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Setup ROSA CLI
        uses: ./.github/actions/rosa-cli-setup
        with:
          aws-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-default-region: ${{ vars.AWS_DEFAULT_REGION }}
          rosa-token: ${{ secrets.ROSA_TOKEN }}

      - name: Login to OpenShift cluster
        uses: ./.github/actions/oc-keycloak-login
        with:
          clusterName: ${{ inputs.clusterName || format('gh-{0}', github.repository_owner) }}

      - name: Set up JDK
        uses: actions/setup-java@v3
        with:
          distribution: 'temurin'
          java-version: '11'
          cache: 'maven'

      - name: Cache Maven Wrapper
        uses: actions/cache@v3
        with:
          path: |
            .mvn/wrapper/maven-wrapper.jar
          key: ${{ runner.os }}-maven-wrapper-${{ hashFiles('**/maven-wrapper.properties') }}
          restore-keys: |
            ${{ runner.os }}-maven-wrapper-

      - name: Build with Maven
        run: |
          ./mvnw -B clean install -DskipTests
          tar xfvz benchmark/target/keycloak-benchmark-*.tar.gz
          mv keycloak-benchmark-* keycloak-benchmark

      - name: Create Keycloak deployment
        if: ${{ !inputs.skipCreateDeployment }}
        uses: ./.github/actions/keycloak-create-deployment
        with:
          projectPrefix: ${{ env.PROJECT_PREFIX }}
          disableStickySessions: true

      # Keep the whole condition inside one expression: text outside ${{ }} turns
      # the value into a non-empty string, which always evaluates as true.
      - name: Create Keycloak dataset with "${{ inputs.numberOfEntitiesInRealm }}" clients
        if: ${{ !inputs.skipCreateDataset && inputs.scenarioName == 'authentication.ClientSecret' }}
        uses: ./.github/actions/keycloak-create-dataset
        with:
          project: ${{ env.PROJECT }}
          clients: ${{ inputs.numberOfEntitiesInRealm }}

      - name: Create Keycloak dataset with "${{ inputs.numberOfEntitiesInRealm }}" users
        if: ${{ !inputs.skipCreateDataset && inputs.scenarioName == 'authentication.AuthorizationCode' }}
        uses: ./.github/actions/keycloak-create-dataset
        with:
          project: ${{ env.PROJECT }}
          users: ${{ inputs.numberOfEntitiesInRealm }}

      - name: Get URLs
        uses: ./.github/actions/get-keycloak-url
        with:
          project: ${{ env.PROJECT }}

      - name: Run "authentication.ClientSecret" failure scenario
        if: inputs.scenarioName == 'authentication.ClientSecret'
        run: |
          bin/kcb.sh --scenario=keycloak.scenario."${{ inputs.scenarioName }}" \
            --server-url=${{ env.KEYCLOAK_URL }} \
            --users-per-sec=${{ inputs.initialUsersPerSecond }} \
            --measurement=180 \
            --realm-name=realm-0 \
            --clients-per-realm=${{ inputs.numberOfEntitiesUsedInTest || inputs.numberOfEntitiesInRealm }} &
          timeout 150 bin/kc-chaos.sh &
          wait
        working-directory: keycloak-benchmark

      - name: Run "authentication.AuthorizationCode" failure scenario
        if: inputs.scenarioName == 'authentication.AuthorizationCode'
        run: |
          bin/kcb.sh --scenario=keycloak.scenario."${{ inputs.scenarioName }}" \
            --server-url=${{ env.KEYCLOAK_URL }} \
            --users-per-sec=${{ inputs.initialUsersPerSecond }} \
            --measurement=180 \
            --realm-name=realm-0 \
            --users-per-realm=${{ inputs.numberOfEntitiesUsedInTest || inputs.numberOfEntitiesInRealm }} &
          timeout 150 bin/kc-chaos.sh &
          wait
        working-directory: keycloak-benchmark

      - name: Archive Gatling reports
        uses: actions/upload-artifact@v3
        with:
          name: gatling-results
          path: keycloak-benchmark/results
          retention-days: 5

      - name: Delete Keycloak deployment
        if: ${{ !inputs.skipDeleteProject }}
        uses: ./.github/actions/keycloak-delete-deployment
        with:
          project: ${{ env.PROJECT }}
25 changes: 25 additions & 0 deletions benchmark/src/main/content/bin/kc-chaos.sh
@@ -0,0 +1,25 @@
#!/bin/bash
# Use this for simulating failures of pods when testing Keycloak's capabilities to recover.
set -e

# Defaults apply only when the variable is unset or empty.
: "${INITIAL_DELAY_SECS:=30}"
: "${CHAOS_DELAY_SECS:=60}"
: "${PROJECT:=runner-keycloak}"

echo -e "\033[0;31mINFO:$(date '+%F-%T-%Z') Entering Chaos mode, with an initial delay of ${INITIAL_DELAY_SECS} seconds"
sleep "${INITIAL_DELAY_SECS}"
echo -e "INFO:$(date '+%F-%T-%Z') Running Chaos scenario - Delete random Keycloak pod"
while true; do
  # List the Keycloak pod names one per line, shuffle them, and keep the first.
  RANDOM_KC_POD=$(kubectl \
    -n "${PROJECT}" \
    -o 'jsonpath={.items[*].metadata.name}' \
    get pods -l app=keycloak | \
    tr " " "\n" | \
    shuf | \
    head -n 1)
  echo -e "\033[0;31mINFO:$(date '+%F-%T-%Z') Killing Pod '${RANDOM_KC_POD}' and waiting for ${CHAOS_DELAY_SECS} seconds"
  kubectl delete pod -n "${PROJECT}" "${RANDOM_KC_POD}" --grace-period=1
  sleep "${CHAOS_DELAY_SECS}"
  echo -e "\033[0m"
done

13 changes: 4 additions & 9 deletions doc/kubernetes/modules/ROOT/nav.adoc
@@ -23,13 +23,8 @@
 * xref:error-messages.adoc[]
 * xref:load-behavior.adoc[]
 * xref:utils.adoc[]
-** xref:util/sqlpad.adoc[]
-** xref:util/grafana.adoc[]
-** xref:util/prometheus.adoc[]
-** xref:util/otel.adoc[]
-** xref:util/debugging-keycloak.adoc[]
-** xref:util/custom-image-for-keycloak.adoc[]
-** xref:util/cryostat.adoc[]
-** xref:util/manual-jfr.adoc[]
-** xref:util/task.adoc[]
++
+--
+include::partial$util-nav.adoc[]
+--
 * xref:other.adoc[]
37 changes: 37 additions & 0 deletions doc/kubernetes/modules/ROOT/pages/util/kc-chaos.adoc
@@ -0,0 +1,37 @@
= Simulate failures of Keycloak in Kubernetes
:description: How to automate the simulation of failures of Keycloak Pods in a Kubernetes environment to test the recovery of Keycloak after a failure.

{description}

== Why failure testing

For a general explanation of why chaos testing tools are needed, see https://redhat-chaos.github.io/krkn/#introduction[the introduction to the chaos testing tool krkn].

== Running the failure test from the CLI

=== Preparations

* Extract the `+keycloak-benchmark-${version}.[zip|tar.gz]+` file
* xref:benchmark-guide::preparing-keycloak.adoc[]
* Make sure that you can access the Kubernetes cluster from the machine where you plan to run the failure tests, and that commands such as `kubectl get pods -n keycloak-keycloak` succeed

=== Simulating load

Use the xref:benchmark-guide::run/running-benchmark-cli.adoc[] guide to simulate load against a specific Kubernetes environment.

=== Running the failure tests

Once the Keycloak application hosted on an existing Kubernetes/OpenShift cluster is under sufficient load, run the command below to start deleting random Keycloak pods:

[source,bash]
----
./kc-chaos.sh
----
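
The workflow in this commit starts the load and the chaos script together and stops the chaos loop with `timeout`. A minimal sketch of the same pattern for the CLI might look like the following. The helper name `run_load_with_chaos` is hypothetical and not part of keycloak-benchmark. Unlike the plain `wait` in the workflow, this variant returns the exit status of the load run, which is a design choice of the sketch.

```bash
#!/bin/bash
# Hypothetical helper (not part of keycloak-benchmark): run a load command and a
# time-boxed chaos command side by side.
# Usage: run_load_with_chaos <chaos_timeout_secs> <chaos_cmd> <load_cmd> [load_args...]
run_load_with_chaos() {
  local chaos_timeout="$1" chaos_cmd="$2"
  shift 2

  "$@" &                                       # start the load generator in the background
  local load_pid=$!

  timeout "${chaos_timeout}" "${chaos_cmd}" &  # the chaos loop never ends on its own
  local chaos_pid=$!

  wait "${chaos_pid}" || true                  # timeout exits non-zero when it stops the loop; expected
  wait "${load_pid}"                           # return the exit status of the load run
}

# Example invocation with values borrowed from the workflow in this commit
# (the server URL is a placeholder):
# run_load_with_chaos 150 bin/kc-chaos.sh \
#   bin/kcb.sh --scenario=keycloak.scenario.authentication.ClientSecret \
#   --server-url=https://keycloak.example.com --measurement=180
```

Waiting on each process ID separately lets a failed benchmark surface as a failed command, while the expected termination of the chaos loop is ignored.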

Set the environment variables below to configure how and where the script runs.

`INITIAL_DELAY_SECS`:: Time in seconds the script waits before it triggers the first failure. Defaults to `30`.

`CHAOS_DELAY_SECS`:: Time in seconds the script waits between simulating failures. Defaults to `60`.

`PROJECT`:: Namespace of the Keycloak pods. Defaults to `runner-keycloak`.
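
The script resolves these variables with the shell's `:=` default expansion, so a value set by the caller wins and anything unset or empty falls back to the value in `kc-chaos.sh`. The sketch below reproduces only that idiom so it can be tried without a cluster. The function name `resolve_chaos_settings` and the namespace `my-namespace` are assumptions made for illustration.

```bash
#!/bin/bash
# Sketch of the defaulting idiom used by kc-chaos.sh. It only prints the resolved
# settings and never talks to a cluster. The function name is hypothetical.
resolve_chaos_settings() (
  # ':' is a no-op; ':=' assigns the default only if the variable is unset or empty.
  # The function body runs in a subshell, so the assignments do not leak to the caller.
  : "${INITIAL_DELAY_SECS:=30}"
  : "${CHAOS_DELAY_SECS:=60}"
  : "${PROJECT:=runner-keycloak}"
  echo "initial=${INITIAL_DELAY_SECS} delay=${CHAOS_DELAY_SECS} project=${PROJECT}"
)

# Overriding a subset of the variables for a single invocation of the real script
# would look like this (my-namespace is a placeholder):
# INITIAL_DELAY_SECS=10 PROJECT=my-namespace ./kc-chaos.sh
```

Calling `INITIAL_DELAY_SECS=10 PROJECT=my-namespace resolve_chaos_settings` with `CHAOS_DELAY_SECS` unset prints `initial=10 delay=60 project=my-namespace`.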
5 changes: 1 addition & 4 deletions doc/kubernetes/modules/ROOT/pages/utils.adoc
@@ -5,10 +5,7 @@

 == List of utilities

-* xref:util/sqlpad.adoc[]
-* xref:util/debugging-keycloak.adoc[]
-* xref:util/cryostat.adoc[]
-* xref:util/grafana.adoc[]
+include::partial$util-nav.adoc[]

 // TODO: migrate other utilities
 * xref:other.adoc[Other utilities]
10 changes: 10 additions & 0 deletions doc/kubernetes/modules/ROOT/partials/util-nav.adoc
@@ -0,0 +1,10 @@
** xref:util/sqlpad.adoc[]
** xref:util/grafana.adoc[]
** xref:util/prometheus.adoc[]
** xref:util/otel.adoc[]
** xref:util/debugging-keycloak.adoc[]
** xref:util/custom-image-for-keycloak.adoc[]
** xref:util/cryostat.adoc[]
** xref:util/manual-jfr.adoc[]
** xref:util/task.adoc[]
** xref:util/kc-chaos.adoc[]
