Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API-1835: Migrate a few existing controllers to ApplyOperatorStatus #698

Merged
merged 10 commits into from
Oct 4, 2024

Conversation

bertinatto
Copy link
Member

@bertinatto bertinatto commented Sep 26, 2024

In this PR I'm focusing on converting UpdateOperatorStatus to ApplyOperatorSTatus and measuring the overhead. There are several usages of Update and UpdateStatus that needs to be looked into, but I'll likely leave that for a second PR.

Verb usage beween e2e-operator job from this PR and the same job from #704:

verb_usage_comparison_auth

@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Sep 26, 2024

@bertinatto: This pull request references API-1835 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.18.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 26, 2024
@bertinatto bertinatto changed the title API-1835: Try apply [WIP] API-1835: Try apply Sep 26, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 26, 2024
@openshift-ci openshift-ci bot requested review from ibihim and liouk September 26, 2024 17:05
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 27, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 27, 2024
@bertinatto bertinatto force-pushed the try-apply branch 3 times, most recently from f56427c to 1530f0f Compare September 27, 2024 18:53
@bertinatto bertinatto marked this pull request as draft September 27, 2024 19:37
@bertinatto
Copy link
Member Author

/test e2e-operator

1 similar comment
@bertinatto
Copy link
Member Author

/test e2e-operator

@bertinatto
Copy link
Member Author

/test e2e-operator

1 similar comment
@bertinatto
Copy link
Member Author

/test e2e-operator

@bertinatto
Copy link
Member Author

/test e2e-operator

1 similar comment
@bertinatto
Copy link
Member Author

/test e2e-operator

@bertinatto
Copy link
Member Author

/test e2e-operator

3 similar comments
@bertinatto
Copy link
Member Author

/test e2e-operator

@bertinatto
Copy link
Member Author

/test e2e-operator

@bertinatto
Copy link
Member Author

/test e2e-operator

@bertinatto
Copy link
Member Author

/retest-required

@bertinatto
Copy link
Member Author

/test e2e-operator

: TestKeycloakAsOIDCPasswordGrantCheckAndGroupSync expand_less 	10m0s
{=== RUN   TestKeycloakAsOIDCPasswordGrantCheckAndG ...  === RUN   TestKeycloakAsOIDCPasswordGrantCheckAndG ...}

Looks like it's happening in 4.17 too: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_cluster-authentication-operator/703/pull-ci-openshift-cluster-authentication-operator-release-4.17-e2e-operator/1841111144366870528

@@ -79,8 +80,8 @@ func ControllerProgressingConditionName(controllerName string) string {
return controllerName + "Progressing"
}

func UpdateControllerConditions(ctx context.Context, operatorClient v1helpers.OperatorClient, allConditionNames sets.String, updatedConditions []operatorv1.OperatorCondition) error {
updateConditionFuncs := []v1helpers.UpdateStatusFunc{}
func UpdateControllerConditions(ctx context.Context, operatorClient v1helpers.OperatorClient, fieldManager string, allConditionNames sets.String, updatedConditions []operatorv1.OperatorCondition) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: wouldn't be better to rename the function to ApplyControllerConditions ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed

}

return nil
return operatorClient.ApplyOperatorStatus(ctx, fieldManager, status)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we skip the call when allConditionNames is empty ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a check to allConditionNames in the beginning of the function.

@@ -53,7 +56,7 @@ func NewIngressNodesAvailableController(
).
WithSync(controller.sync).
ResyncEvery(wait.Jitter(time.Minute, 1.0)).
ToController("IngressNodesAvailableController", eventRecorder)
ToController(instanceName, eventRecorder)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that in library-go we are passing c.controllerInstanceName (applies to the other changes in this PR)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, interesting. I checked ResourceSyncController and DeploymentController and in both cases they use instanceName. It seems like the only exception is APIServiceController.

IIUC instanceName is what we used to call name and it fits better here because the Degraded condition name (set because of WithSyncDegradedOnError) would not change. On the other hand, controllerInstanceName would be mostly for setting the SSA's fieldManager.

Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm fine either way but we should be consistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that for some controller we cannot change the name then maybe we should use instanceName ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my understanding as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I take it back. @deads2k created this m****s and was not willing to take into account feedback given in openshift/library-go#1794 :(.

The instanceName is the name of the operator, for example, openshift-apiserver or openshift-apiserver-operator, and the controllerInstanceName is the instanceName combined with the name of the controller.

Given this, it doesn't make sense to use the instanceName as the controller's name, as this would result in multiple controllers sharing the same name.

For the IngressNodesAvailableController, we can change its name because the previous name was not used to generate a condition. However, for controllers created with WithSyncDegradedOnError, it is not safe to change their name.

Does that make sense ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewCSIDriverNodeServiceController from the library-go PR mentioned earlier use WithSyncDegradedOnError, thus its name could not be changed.

Given how this controller is created, the instanceName will beNFSDriverNodeServiceController and the controllerInstanceName will be NFSDriverNodeServiceController-CSIDriverNodeService

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, @p0lyn0mial.
I'm worried that some variables, like instanceName, have different meanings in different places. For example, in this operator it would be oauth-apiserver, but in storage it would be NFSDriverNodeServiceController, as you pointed out.

@@ -83,7 +87,7 @@ func NewWebhookAuthenticatorController(
).ResyncEvery(wait.Jitter(time.Minute, 1.0)).
WithSync(c.sync).
WithSyncDegradedOnError(operatorClient).
ToController("WebhookAuthenticatorController", recorder.WithComponentSuffix("webhook-authenticator-controller"))
ToController(instanceName, recorder.WithComponentSuffix("webhook-authenticator-controller"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we cannot rename this controller without removing existing condition added by the WithSyncDegradedOnError.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, then please add a comment where you set the name to not change it without clearing the condition.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea; done

@bertinatto
Copy link
Member Author

/retest

@@ -54,7 +57,9 @@ func NewEndpointAccessibleController(
WithSync(c.sync).
ResyncEvery(wait.Jitter(time.Minute, 1.0)).
WithSyncDegradedOnError(operatorClient).
ToController(controllerName, recorder.WithComponentSuffix(name+"endpoint-accessible-controller"))
ToController(
controllerName, // Don't change what is passed here unless you also remove the old FooDegraded condition
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a bit strange. Degraded is prefixed by controllerName, but Available is prefixed by name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we cannot change these names I think.

Copy link
Contributor

@p0lyn0mial p0lyn0mial left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

k8s.io/client-go v0.31.1
k8s.io/component-base v0.31.1
k8s.io/klog/v2 v2.130.1
k8s.io/kube-aggregator v0.31.1
k8s.io/pod-security-admission v0.29.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no 1.31.1 version ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a direct dependency of this module, so it needs to be updated manually.

If you prefer I can add a commit to update it to v0.31.1, but from what I can see, there are no changes, just the version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, then let's leave it on 0.29.0

@@ -54,7 +57,9 @@ func NewEndpointAccessibleController(
WithSync(c.sync).
ResyncEvery(wait.Jitter(time.Minute, 1.0)).
WithSyncDegradedOnError(operatorClient).
ToController(controllerName, recorder.WithComponentSuffix(name+"endpoint-accessible-controller"))
ToController(
controllerName, // Don't change what is passed here unless you also remove the old FooDegraded condition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we cannot change these names I think.

@@ -717,6 +717,7 @@ func prepareOauthAPIServerOperator(ctx context.Context, controllerContext *contr
)

webhookAuthController := webhookauthenticator.NewWebhookAuthenticatorController(
"openshift-authentication",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this name. Thanks.

@bertinatto
Copy link
Member Author

/retest

@p0lyn0mial
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 4, 2024
@deads2k
Copy link
Contributor

deads2k commented Oct 4, 2024

/approve

Copy link
Contributor

openshift-ci bot commented Oct 4, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bertinatto, deads2k, p0lyn0mial

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 4, 2024
Copy link
Contributor

openshift-ci bot commented Oct 4, 2024

@bertinatto: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 7ef902b into openshift:master Oct 4, 2024
11 checks passed
@openshift-bot
Copy link
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-cluster-authentication-operator
This PR has been included in build ose-cluster-authentication-operator-container-v4.18.0-202410041911.p0.g7ef902b.assembly.stream.el9.
All builds following this will include this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants