-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: adding testing trigger for running our e2e suite against the NodeAutoProvisioning Preview #178
Conversation
update fix: fix blah test: nap suite consolidation test no need for acr fix: adding preview cli fix: removing make az-perm as we are using nap which should set this up
Pull Request Test Coverage Report for Build 8232173689Details
💛 - Coveralls |
c10c0e9
to
90c2ea4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/test-nap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/test-nap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/nap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/nap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/nap
…because it's always true
See the tests pass for nap https://github.com/Azure/karpenter-provider-azure/actions/runs/8193216788/job/22475958956 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/test
Validating standard tests wtill work and standard workflow triggers still work |
running bash hack/boilerplate.sh is not generating the right configuration for the consolidation header for some reason |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial review. Didn't get to go over the files within .github
yet. Will be completing a second review for that.
Overall looking good, but I'd like a couple small adjustments, the ability to signal running the E2E tests for NAP or not via input params, and I think for cleanliness/focus it makes sense to split this into 3 PRs:
- workflow change, and required supporting
Makefile
//test
changes - new consolidation test suite
- general unrelated cleanup to the e2e test files.
@@ -3,17 +3,19 @@ ifeq ($(CODESPACES),true) | |||
AZURE_RESOURCE_GROUP ?= $(CODESPACE_NAME) | |||
AZURE_ACR_NAME ?= $(subst -,,$(CODESPACE_NAME)) | |||
else | |||
AZURE_RESOURCE_GROUP ?= karpenter | |||
AZURE_RESOURCE_GROUP ?= karpeeer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why changing karpenter
to karpeeer
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not on purpose :O
@@ -36,6 +38,16 @@ az-mkaks: az-mkacr ## Create test AKS cluster (with --vm-set-type AvailabilitySe | |||
az aks get-credentials --name $(AZURE_CLUSTER_NAME) --resource-group $(AZURE_RESOURCE_GROUP) --overwrite-existing | |||
skaffold config set default-repo $(AZURE_ACR_NAME).azurecr.io/karpenter | |||
|
|||
az-mkaks-cilium-nap: | |||
az extension add --name aks-preview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I slightly question auto-adding this. It should be a pre-requirement the user has for running this command in my opinion.
Having the side effect of running this make command be that we add the aks-preview
extension to the users cli
seems questionable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need it because there is no way to registere specific versions of the preview cli in the github action from what I could find. Added a makefile command following the existing pattern used for creating the cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still be wary on adding it to this make command. Separating it out as an az-apply-nap-prerequirements
make command along with the feature registration as I mention in the below comment I think would be good. That way in the action we can call az-apply-nap-prerequirements
, but anyone doing local work, or such can avoid having the prerequirements
side effects.
@@ -36,6 +38,16 @@ az-mkaks: az-mkacr ## Create test AKS cluster (with --vm-set-type AvailabilitySe | |||
az aks get-credentials --name $(AZURE_CLUSTER_NAME) --resource-group $(AZURE_RESOURCE_GROUP) --overwrite-existing | |||
skaffold config set default-repo $(AZURE_ACR_NAME).azurecr.io/karpenter | |||
|
|||
az-mkaks-cilium-nap: | |||
az extension add --name aks-preview | |||
az feature register --namespace "Microsoft.ContainerService" --name "NodeAutoProvisioningPreview" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also highly question having this side effect as well. I think again this is registering NodeAutoProvisioningPreview
feature for the subscription.
A possible solution here would be to have an az-apply-nap-prerequirements
with a note on it that it will update the cli to have the aks-preview
extension, plus register the sub.
This way it can be something that users only have to apply once, but if it comes up in some pipeline, etc can always be called as a prerequirement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sub registration could be ran once, but we need the add of the aks-preview extension to be updated to the latest preview cli
--enable-oidc-issuer --enable-managed-identity --node-provisioning-mode Auto | ||
|
||
az aks get-credentials --name $(AZURE_CLUSTER_NAME) --resource-group $(AZURE_RESOURCE_GROUP) --overwrite-existing | ||
kubectl taint nodes CriticalAddonsOnly=true:NoSchedule --all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't add this taint here. If you are wanting this within the e2e pipeline, put it into the workflows instead.
The makefiles I think should be a place for general commands, and if a user were to create a NAP cluster should they actually be applying this taint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They should ALWAYS taint the systempool. NAP should create all of the userpool nodes imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if we should add something about it to the docs than?
So far tainting has been something we've done within the e2e
test workflow themselves not the make commands.
However, if it is a best practice we want to recommend than I could see adding it to the make command, and/or documentation.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should! Maybe referencing the existing systempool documentation makes sense and explaining a bit more about the role of the system-surge pool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Little mixed on how much to duplicate the workflow files, but I think at least a few of them can definitely be merged with the regular flow:
- approval-comment-nap.yaml (merge with regular)
- cleanup-nap/action.yaml (merge with regular)
I'd try and merge based on some form of flag:
- e2e-matrix-trigger-nap.yaml
- e2e-matrix-nap.yaml
I think that e2e-nap.yaml might be able to be consolidated, but I'm wondering if its right to as it somewhat seems right that a "nap e2e test is different". I think at least through the point of we are selecting the matrix of tests to run, and how to run them seems reasonable.
I think keeping a separate create action for nap does make sense though.
Not certain on all the functionality available for getting some sort of flag for this, and if it'd be too messy in the code, but those are my general thoughts on reducing duplication for now.
@@ -32,4 +32,4 @@ jobs: | |||
secrets: | |||
E2E_CLIENT_ID: ${{ secrets.E2E_CLIENT_ID }} | |||
E2E_TENANT_ID: ${{ secrets.E2E_TENANT_ID }} | |||
E2E_SUBSCRIPTION_ID: ${{ secrets.E2E_SUBSCRIPTION_ID }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this was an accidental change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not, than add it to the general e2e cleanup PR to keep files being touched focused?
run: az account set --subscription ${{ inputs.subscription-id }} | ||
- name: delete cluster ${{ inputs.cluster_name }} | ||
shell: bash | ||
run: az aks delete --name ${{ inputs.cluster_name }} --resource-group ${{ inputs.resource_group }} --yes --no-wait |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just doesn't have the delete acr
? set. Could simply remove that from the general cleanup since resource group delete should be enough. I think that's worth removing the need for an almost duplicate file.
* revert changes to non .github/ files * remove new test suite * Revert "remove new test suite" This reverts commit 0e7f3ab. * Revert "revert changes to non .github/ files" This reverts commit 40a6377. * update nap triggers to be based on same approval comment --------- Co-authored-by: Charlie McBride <[email protected]>
test ping @Azure/aks-karpenter |
Fixes #195
Description
This PR adds additional functionality to our testing toolkit to run the e2e suites on node auto provisioning, alongside the consolidation e2e suite which only works against node auto provisioning and not fully properly with unmanaged karpenter.
How was this change tested?
See Runs:
https://github.com/Azure/karpenter-provider-azure/actions/runs/8193216788/job/22475958956
https://github.com/Azure/karpenter-provider-azure/actions/runs/8150047908/job/22275681008
Then runs validating the standard e2es functionality haven't been tarnished: https://github.com/Azure/karpenter-provider-azure/actions/runs/8232044776
Does this change impact docs?
Release Note