Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added exempted user to guardrails #3319

Merged
merged 4 commits into from
Jan 3, 2024

Conversation

yjst2012
Copy link
Contributor

@yjst2012 yjst2012 commented Dec 8, 2023

Which issue this PR addresses:

Fixes
ARO-4861
ARO-4862
https://redhat-internal.slack.com/archives/C02ULBRS68M/p1702007716321979

What this PR does / why we need it:

find a way to not deploy guardrails onto clusters that already installed gatekeeper
fix the Guardrails problem that turning off managed did not delete the gatekeeper deployment
add one more exempted user for guardrails policy

Test plan for issue:

passed basic tests on my environment

$ go test -v
=== RUN TestGuardRailsReconciler
=== RUN TestGuardRailsReconciler/disabled
=== RUN TestGuardRailsReconciler/managed
=== RUN TestGuardRailsReconciler/managed,no_pullspec&namespace(uses_default)
=== RUN TestGuardRailsReconciler/managed,_GuardRails_does_not_become_ready
=== RUN TestGuardRailsReconciler/managed,CreateOrUpdate()fails
=== RUN TestGuardRailsReconciler/managed=false
(removal)
=== RUN TestGuardRailsReconciler/managed=false
(removal),_Remove()fails
WARN[0000] failed to remove deployer with error failed delete
=== RUN TestGuardRailsReconciler/managed=blank
(no_action)
--- PASS: TestGuardRailsReconciler (0.00s)
--- PASS: TestGuardRailsReconciler/disabled (0.00s)
--- PASS: TestGuardRailsReconciler/managed (0.00s)
--- PASS: TestGuardRailsReconciler/managed,no_pullspec&namespace(uses_default) (0.00s)
--- PASS: TestGuardRailsReconciler/managed,_GuardRails_does_not_become_ready (0.00s)
--- PASS: TestGuardRailsReconciler/managed,CreateOrUpdate()fails (0.00s)
--- PASS: TestGuardRailsReconciler/managed=false
(removal) (0.00s)
--- PASS: TestGuardRailsReconciler/managed=false
(removal),_Remove()fails (0.00s)
--- PASS: TestGuardRailsReconciler/managed=blank
(no_action) (0.00s)
=== RUN TestDeployCreateOrUpdateCorrectKinds
--- PASS: TestDeployCreateOrUpdateCorrectKinds (0.01s)
PASS
ok github.com/Azure/ARO-RP/pkg/operator/controllers/guardrails 0.036s

$ ./scripts/test.sh
expand constraints gkconstraints/aro-cluster-role-deny.yaml
expand constraints gkconstraints/aro-machine-config-deny.yaml
expand constraints gkconstraints/aro-machines-deny.yaml
expand constraints gkconstraints/aro-master-toleration-pod-deny.yaml
expand constraints gkconstraints/aro-privileged-namespace-deny.yaml
expand constraints gkconstraints/aro-pull-secret-deny.yaml
expand constraints gkconstraints/aro-service-account-deny.yaml
[opa test] -> gktemplates-src/library/common.rego gktemplates-src/aro-deny-delete-pull-secret/*.rego
PASS: 4/4
[gator verify] -> gktemplates-src/aro-deny-delete-pull-secret
=== RUN deny-pull-secret-delete-tests
=== RUN allow-create-pull-secret
--- PASS: allow-create-pull-secret (0.005s)
=== RUN allow-update-pull-secret
--- PASS: allow-update-pull-secret (0.003s)
=== RUN not-allow-delete-pull-secret
--- PASS: not-allow-delete-pull-secret (0.005s)
--- PASS: deny-pull-secret-delete-tests (0.017s)
ok gktemplates-src/aro-deny-delete-pull-secret/suite.yaml 0.017s
PASS

[opa test] -> gktemplates-src/library/common.rego gktemplates-src/aro-deny-labels/*.rego
PASS: 3/3
[gator verify] -> gktemplates-src/aro-deny-labels
=== RUN machines-deny
=== RUN master-machine-deny
--- PASS: master-machine-deny (0.003s)
=== RUN worker-machine-allowed
--- PASS: worker-machine-allowed (0.002s)
=== RUN no-label-machine-allowed
--- PASS: no-label-machine-allowed (0.002s)
--- PASS: machines-deny (0.011s)
ok gktemplates-src/aro-deny-labels/suite.yaml 0.011s
PASS

[opa test] -> gktemplates-src/library/common.rego gktemplates-src/aro-deny-machine-config/*.rego
PASS: 5/5
[gator verify] -> gktemplates-src/aro-deny-machine-config
=== RUN deny-cluster-machineconfig-modification-tests
=== RUN not-allow-create-cluster-machine-config
--- PASS: not-allow-create-cluster-machine-config (0.005s)
=== RUN not-allow-delete-cluster-machine-config
--- PASS: not-allow-delete-cluster-machine-config (0.007s)
=== RUN not-allow-update-cluster-machine-config
--- PASS: not-allow-update-cluster-machine-config (0.004s)
=== RUN allow-create-custom-machine-config
--- PASS: allow-create-custom-machine-config (0.003s)
=== RUN allow-delete-custom-machine-config
--- PASS: allow-delete-custom-machine-config (0.008s)
--- PASS: deny-cluster-machineconfig-modification-tests (0.031s)
ok gktemplates-src/aro-deny-machine-config/suite.yaml 0.031s
PASS

[opa test] -> gktemplates-src/library/common.rego gktemplates-src/aro-deny-master-toleration-taints/*.rego
PASS: 5/5
[gator verify] -> gktemplates-src/aro-deny-master-toleration-taints
=== RUN deny-master-toleration-taint-pods-in-nonprivileged-namespaces
=== RUN create-not-allowed-in-nonprivileged-namespaces
--- PASS: create-not-allowed-in-nonprivileged-namespaces (0.005s)
=== RUN create-allowed-in-privileged-namespaces
--- PASS: create-allowed-in-privileged-namespaces (0.004s)
=== RUN update-not-allowed-in-nonprivileged-namespaces
--- PASS: update-not-allowed-in-nonprivileged-namespaces (0.007s)
=== RUN deletion-allowed-in-nonprivileged-namespaces
--- PASS: deletion-allowed-in-nonprivileged-namespaces (0.005s)
--- PASS: deny-master-toleration-taint-pods-in-nonprivileged-namespaces (0.025s)
ok gktemplates-src/aro-deny-master-toleration-taints/suite.yaml 0.025s
PASS

[opa test] -> gktemplates-src/library/common.rego gktemplates-src/aro-deny-privileged-namespace/*.rego
PASS: 16/16
[gator verify] -> gktemplates-src/aro-deny-privileged-namespace
=== RUN privileged-namespace
=== RUN ns-allowed-pod
--- PASS: ns-allowed-pod (0.005s)
=== RUN ns-disallowed-pod
--- PASS: ns-disallowed-pod (0.005s)
=== RUN ns-disallowed-deploy
--- PASS: ns-disallowed-deploy (0.006s)
=== RUN ns-allowed-deploy
--- PASS: ns-allowed-deploy (0.007s)
--- PASS: privileged-namespace (0.028s)
ok gktemplates-src/aro-deny-privileged-namespace/suite.yaml 0.028s
PASS

[opa test] -> gktemplates-src/library/common.rego gktemplates-src/library/*.rego
[gator verify] -> gktemplates-src/library
PASS

Is there any documentation that needs to be updated for this PR?

NA

@yjst2012 yjst2012 added size-small Size small ready-for-review skippy pull requests raised by member of Team Skippy next-release To be included in the next RP release rollout labels Dec 8, 2023
@@ -74,9 +79,8 @@ func (r *Reconciler) Reconcile(ctx context.Context, request ctrl.Request) (ctrl.
// If enabled and managed=false, remove the GuardRails deployment
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's no need to comment how the code block works. The codes already explains what it does.

Copy link
Contributor

@jaitaiwan jaitaiwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have small nits but nothing critical. Just think we should add some retries if they're not already implemented in kubernetescli

Comment on lines +160 to +164
namespaces, err := r.kubernetescli.CoreV1().Namespaces().List(ctx, metav1.ListOptions{})
if err != nil {
r.log.Warnf("Error retrieving namespaces: %v", err)
return "", err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a retry here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this, didn't do that as 1) retry makes code a bit less readable, 2) in case of a failure the reconcile process will be retried.

Comment on lines +171 to +173
deployments, err := r.kubernetescli.AppsV1().Deployments(ns.Name).List(ctx, metav1.ListOptions{
LabelSelector: "gatekeeper.sh/system=yes",
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to have a try here as well

Comment on lines +130 to +140
if !r.cleanupNeeded {
if ns, err := r.getGatekeeperDeployedNs(ctx, instance); err == nil && ns != "" {
// resources were *not* created by guardrails, plus another gatekeeper deployed
//
// guardrails didn't get deployed most likely due to another gatekeeper is deployed by customer
// this is to avoid the accidental deletion of gatekeeper CRDs that were deployed by customer
// the unnamespaced gatekeeper CRDs were possibly created by a customised gatekeeper, hence cannot ramdomly delete them.
r.log.Warn("Skipping cleanup as it is not safe and may destroy customer's gatekeeper resources")
return reconcile.Result{}, nil
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can combine the if statement into 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried that, but end up making code either less readable or have to invoke the cumbersome r.getGatekeeperDeployedNs() unnecessarily

@yjst2012 yjst2012 merged commit 5218c82 into Azure:master Jan 3, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
next-release To be included in the next RP release rollout ready-for-review size-small Size small skippy pull requests raised by member of Team Skippy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants