updates for optimizing failure scenario #402
Conversation
@@ -34,6 +34,8 @@ JAVA_OPTS="${JAVA_OPTS} -Xmx1G -XX:+HeapDumpOnOutOfMemoryError"
 DEBUG_MODE="${DEBUG:-false}"
 DEBUG_PORT="${DEBUG_PORT:-8787}"

+CHAOS_MODE="${CHAOS_MODE:-false}"
For distributed runners, this mode is not very usable as we don't want to run the chaos script from all runners, but only once.
We can move the trigger back to the workflow (the log directory can be obtained also there) and then we can print the info from the pod at the end of kcb script from another workflow step. This way we would also get rid of coupling of kcb script with Openshift. Any thoughts @kami619 @ahus1 @tkyjovsk?
Yes, for distributed runners we wouldn't want this to be run from kcb.sh.
If it helps you with the things you're doing at the moment, you could still merge it. The benefit would be that the folder information is passed to the script, including the right timeout parameter.
Eventually I'd like to see it removed for the reasons Michal described, or changed to an environment variable like SUT_DESCRIBE, where the caller could specify any script to do any kind of chaos (not only Kubernetes chaos). But let's see whether we ever do this and how the chaos testing proceeds - we might eventually not use the shell script once we move to something more advanced.
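The environment-variable approach could look something like the sketch below. The names CHAOS_SCRIPT and run_chaos_hook are hypothetical, chosen here only to illustrate the idea of the caller supplying the chaos script:

```shell
# Hypothetical sketch (not from the PR): kcb.sh delegates chaos injection to
# whatever script the caller supplies via CHAOS_SCRIPT, so the benchmark
# itself never needs kubectl or any other provisioning tool.
run_chaos_hook() {
  logs_dir="$1"
  if [ -n "${CHAOS_SCRIPT:-}" ]; then
    # the caller decides what "chaos" means (kill pods, stop containers, ...)
    "$CHAOS_SCRIPT" "$logs_dir"
  fi
}
```

If CHAOS_SCRIPT is unset, the hook is a no-op, which keeps the default benchmark run unchanged.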
I also don't think that kubectl should be called from anywhere in kcb.sh, as the benchmark is supposed to be agnostic of the system under test's provisioning layer. Failure testing, and any other kind of testing which requires a specific provisioning platform (kubectl/oc/docker-compose), should be separated from the benchmark itself, IMO.
I think the same applies to the stress-testing loop which runs multiple iterations of Gatling runs. In the context of distributed execution, this loop would need to live outside the script that runs the benchmark in distributed mode.
In the context of PR #295 I've placed the startup-time measuring scripts inside provisioning/minikube/measure, but I wonder whether we should instead place them in a separate test directory like test/k8s/measure or something like that. Then we could also move the chaos testing scripts to this directory. WDYT?
So I would suggest creating a separate script which would call both kubectl (or whatever else is needed for the test) and kcb.sh from outside, either locally or remotely (via the Ansible layer).
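A minimal sketch of such a wrapper, assuming hypothetical names (run_failure_test, the label selector, the log layout) - only the idea of keeping kubectl outside kcb.sh comes from the discussion:

```shell
# Hypothetical wrapper (not from the PR): provisioning-specific commands
# live here, while kcb.sh itself stays agnostic of the provisioning layer.
run_failure_test() {
  project="$1"
  logs_dir="$2"
  mkdir -p "$logs_dir"
  # platform-specific step stays in the wrapper, not in kcb.sh
  kubectl get pods -n "$project" -l app=keycloak -o wide > "$logs_dir/pods.txt"
  # the benchmark run itself remains platform-agnostic
  ./kcb.sh > "$logs_dir/kcb.log" 2>&1
}
```

The same wrapper could later be invoked remotely on one distributed runner only, which sidesteps the "chaos from every runner" problem described above.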
Thanks for the review comments, everyone. Given that the majority feedback leans towards modularizing this, I shall work with Michal tomorrow to come up with a better design and implement it.
In order to keep the scope intact and not let the design decisions delay this PR further, we are going to merge it as is. I created a ticket to tackle the modularization of the script, and to solve the puzzle with distributed load tests in mind, as the next increment.
See my review below - it is a good step in the right direction. All comments can be addressed in future PRs.
kubectl get pods -n "${PROJECT}" -l app=keycloak -o wide
kubectl logs -f -n "${PROJECT}" "${RANDOM_KC_POD}" > "$LOGS_DIR/${ATTEMPT}-${RANDOM_KC_POD}.log" 2>&1 &
kubectl describe -n "${PROJECT}" pod "${RANDOM_KC_POD}" > "$LOGS_DIR/${ATTEMPT}-${RANDOM_KC_POD}-complete-resource.log" 2>&1
Instead of retrieving the text format, I'd suggest also pulling the JSON format via kubectl get -n ... pod -o json, as we can analyze that better using jq.
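For example, once the pod is stored as JSON, jq can answer structured questions that the describe text makes awkward. The trimmed sample document and the jq filter below are illustrative only - in practice the JSON would come from kubectl get -n "${PROJECT}" pod "${RANDOM_KC_POD}" -o json:

```shell
# Illustrative only: a trimmed pod document of the shape kubectl emits
# with -o json; the field values here are made up for the example.
sample='{"status":{"containerStatuses":[{"name":"keycloak","restartCount":3}]}}'
# jq extracts exactly the field of interest from the stored JSON
restarts=$(printf '%s' "$sample" | jq -r '.status.containerStatuses[0].restartCount')
echo "keycloak restarts: $restarts"
```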
Yup, we had this in mind: parse the information and create a JSON file akin to what you did for the SUT information. But as a first pass we are storing it in the log files. I would suggest doing at least this part in a subsequent PR to speed things up.
Co-authored-by: Alexander Schwartz <[email protected]>
fixes #393
Client Secret Failure run - https://github.com/kami619/keycloak-benchmark/actions/runs/5402389905/jobs/9813602924
Auth Code Failure run - https://github.com/kami619/keycloak-benchmark/actions/runs/5404050859/jobs/9817708034
Co-authored-by: @mhajas