API-1835: Migrate WorkloadController to SSA #1830

bertinatto · 2024-10-11T13:40:16Z

Proof: openshift/cluster-authentication-operator#715.

Here's a verb usage comparison between e2e-aws-single-node jobs of the PR above and openshift/cluster-authentication-operator#705.

/assign @deads2k @p0lyn0mial

openshift-ci · 2024-10-11T13:41:09Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bertinatto

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/operator/apiserver/OWNERS~~ [bertinatto]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot · 2024-10-11T14:03:56Z

@bertinatto: This pull request references API-1835 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.18.0" version, but no target version was set.

In response to this:

Proof: openshift/cluster-authentication-operator#715

/assign @deads2k @p0lyn0mial

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

bertinatto · 2024-10-11T14:04:12Z

/hold
for results from proof: openshift/cluster-authentication-operator#715

bertinatto · 2024-10-11T16:04:21Z

/retest

bertinatto · 2024-10-12T11:54:26Z

/hold for results from proof: openshift/cluster-authentication-operator#715

/hold cancel

bertinatto · 2024-10-12T16:50:01Z

pkg/operator/apiserver/controller/workload/workload_test.go

+						Type:    fmt.Sprintf("%sWorkloadDegraded", defaultControllerName),
+						Status:  operatorv1.ConditionTrue,
+						Reason:  "NoDeployment",
+						Message: "deployment/: could not be retrieved",


@deads2k @p0lyn0mial this is interesting. Forcing every branch to set the conditions revealed an underlying issue. IMO this condition was set incorrectly, but please take a look.

It looks like you were right, previously this condition was not set when the workload was missing. Setting it doesn't break anything.
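
For readers following along: the reason every branch has to set every condition is, as I understand Server-Side Apply, that an entry a field manager leaves out of its applied configuration is no longer owned by it and can be dropped. Below is a minimal sketch of a check a test could use to catch a branch that forgets a condition. The `Condition` struct, the `missingConditionTypes` helper, and the `Example...` type names are hypothetical and not taken from the PR.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for an operator status condition.
type Condition struct {
	Type   string
	Status string
}

// missingConditionTypes returns the required condition types that do not
// appear in applied, preserving the order of required.
func missingConditionTypes(applied []Condition, required []string) []string {
	seen := make(map[string]bool, len(applied))
	for _, c := range applied {
		seen[c.Type] = true
	}
	var missing []string
	for _, t := range required {
		if !seen[t] {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	// Hypothetical condition type names, used only for this illustration.
	required := []string{
		"ExampleDeploymentAvailable",
		"ExampleDeploymentProgressing",
		"ExampleWorkloadDegraded",
	}

	// A code path that forgets to set the Degraded condition.
	applied := []Condition{
		{Type: "ExampleDeploymentAvailable", Status: "False"},
		{Type: "ExampleDeploymentProgressing", Status: "True"},
	}

	fmt.Println("missing:", missingConditionTypes(applied, required))
}
```

A table-driven test could run a check like this against the conditions produced by each branch and fail when the result is non-empty.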

p0lyn0mial · 2024-10-14T09:00:36Z

pkg/operator/apiserver/controller/workload/workload_test.go

@@ -105,8 +107,8 @@ func TestUpdateOperatorStatus(t *testing.T) {
 					{
 						Type:    fmt.Sprintf("%sWorkloadDegraded", defaultControllerName),
 						Status:  operatorv1.ConditionTrue,
-						Message: "nasty error\n",
-						Reason:  "SyncError",
+						Reason:  "NoDeployment",


We shouldn't overwrite SyncError if one occurred; these errors are important. Does that make sense?

Good point. I added a commit to change that, PTAL.

p0lyn0mial · 2024-10-14T09:07:47Z

pkg/operator/apiserver/controller/workload/workload.go

+
+		workloadDegradedCondition = workloadDegradedCondition.
+			WithStatus(operatorv1.ConditionTrue).
+			WithReason("PreconditionNotFulfilled").


We don't know whether the workload is degraded, but because we now always have to set all conditions, we need to do something about it.

Maybe we should set the status to Unknown?

IMO a pre-condition is a requirement, and if we know a requirement isn't met, we can be certain the workload won't be available. Maybe an older workload (with different pre-conditions) is available, but not the workload tied to this version. So I think at least Available and Progressing could be false and true, respectively.

TBH it feels like it should be Degraded too, but I don't have a strong opinion.
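
To make the two options in this thread concrete, here is a small sketch that encodes the proposal (Available false, Progressing true) and leaves the Degraded status as a parameter, so that both "True" and "Unknown" can be compared. The `Condition` struct, the function name, and the `DeploymentAvailable` / `DeploymentProgressing` type suffixes are assumptions for illustration only; only the `WorkloadDegraded` suffix and the `PreconditionNotFulfilled` reason come from the diff.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for an operator status condition.
type Condition struct {
	Type, Status, Reason, Message string
}

// preconditionNotFulfilled returns the full set of conditions for the case
// where a precondition of the workload is not met. degradedStatus is left
// to the caller because the thread has not settled on "True" or "Unknown".
func preconditionNotFulfilled(prefix, degradedStatus, message string) []Condition {
	const reason = "PreconditionNotFulfilled"
	return []Condition{
		{Type: prefix + "DeploymentAvailable", Status: "False", Reason: reason, Message: message},
		{Type: prefix + "DeploymentProgressing", Status: "True", Reason: reason, Message: message},
		{Type: prefix + "WorkloadDegraded", Status: degradedStatus, Reason: reason, Message: message},
	}
}

func main() {
	for _, c := range preconditionNotFulfilled("Example", "Unknown", "precondition not met") {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
}
```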

p0lyn0mial · 2024-10-14T09:15:04Z

pkg/operator/apiserver/controller/workload/workload.go

@@ -268,18 +289,21 @@ func (c *Controller) updateOperatorStatus(ctx context.Context, previousStatus *o
 	// If the workload is up to date, then we are no longer progressing
 	workloadAtHighestGeneration := workload.ObjectMeta.Generation == workload.Status.ObservedGeneration
 	workloadIsBeingUpdated := workload.Status.UpdatedReplicas < desiredReplicas
-	workloadIsBeingUpdatedTooLong, err := isUpdatingTooLong(previousStatus, deploymentProgressingCondition.Type)
+	workloadIsBeingUpdatedTooLong, err := isUpdatingTooLong(previousStatus, *deploymentProgressingCondition.Type)


I think that *deploymentProgressingCondition.Type will always be set.
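
Some context on why the dereference shows up in this hunk: generated apply configurations typically use pointer fields that are populated through `With...` builder methods, so `Type` is a `*string` there. The sketch below uses a hypothetical `conditionApply` type, not the real generated one, to show that the dereference is safe once `WithType` has been called, alongside a nil-safe accessor for code paths where that cannot be guaranteed.

```go
package main

import "fmt"

// conditionApply is a hypothetical type that mimics the shape of a generated
// apply configuration: optional fields are pointers set via builder methods.
type conditionApply struct {
	Type   *string
	Status *string
}

func newCondition() *conditionApply { return &conditionApply{} }

// WithType sets Type and returns the receiver so calls can be chained.
func (c *conditionApply) WithType(t string) *conditionApply {
	c.Type = &t
	return c
}

// WithStatus sets Status and returns the receiver so calls can be chained.
func (c *conditionApply) WithStatus(s string) *conditionApply {
	c.Status = &s
	return c
}

// conditionType dereferences Type, returning "" when it was never set.
func conditionType(c *conditionApply) string {
	if c == nil || c.Type == nil {
		return ""
	}
	return *c.Type
}

func main() {
	progressing := newCondition().WithType("ExampleDeploymentProgressing").WithStatus("True")

	// Safe: WithType was called when the condition was constructed.
	fmt.Println(*progressing.Type)

	// Nil-safe access for a condition whose Type was never set.
	fmt.Printf("%q\n", conditionType(newCondition()))
}
```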

p0lyn0mial · 2024-10-14T09:17:18Z

pkg/operator/apiserver/controller/workload/workload.go

+			WithResource("deployments").
+			WithNamespace(workload.Namespace).
+			WithName(workload.Name).
+			WithLastGeneration(workload.Generation),


Not setting the generation will set it to 0, right?

Yes, it should be set to the zero value.
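
A small illustration of the zero-value behaviour, assuming the apply side uses an optional pointer field and the stored status uses a plain `int64`. The two struct types and the `decodeApplied` helper are hypothetical; the JSON round trip is only an analogy for what the server does with an omitted field, not a model of the real apply machinery.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// generationApply is a hypothetical apply-side shape: unset fields are nil
// and are omitted from the serialized request.
type generationApply struct {
	Resource       *string `json:"resource,omitempty"`
	Namespace      *string `json:"namespace,omitempty"`
	Name           *string `json:"name,omitempty"`
	LastGeneration *int64  `json:"lastGeneration,omitempty"`
}

// generationStatus is a hypothetical stored shape with non-pointer fields.
type generationStatus struct {
	Resource       string `json:"resource"`
	Namespace      string `json:"namespace"`
	Name           string `json:"name"`
	LastGeneration int64  `json:"lastGeneration"`
}

// decodeApplied serializes the apply-side value and decodes it into the
// stored shape, so omitted fields end up with Go's zero values.
func decodeApplied(a generationApply) (generationStatus, error) {
	var out generationStatus
	raw, err := json.Marshal(a)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(raw, &out)
	return out, err
}

func main() {
	resource := "deployments"
	status, err := decodeApplied(generationApply{Resource: &resource})
	if err != nil {
		panic(err)
	}
	fmt.Println(status.LastGeneration) // prints 0: the field was never set
}
```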

openshift-ci · 2024-10-14T18:57:18Z

@bertinatto: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 11, 2024

openshift-ci bot requested review from p0lyn0mial and tkashem October 11, 2024 13:41

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 11, 2024

bertinatto changed the title ~~WIP: Migrate WorkloadController to SSA~~ API-1835: Migrate WorkloadController to SSA Oct 11, 2024

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 11, 2024

openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Oct 11, 2024

bertinatto force-pushed the try-apply-workload branch from d4ff2af to 076e850 Compare October 11, 2024 14:16

deads2k reviewed Oct 11, 2024

deads2k reviewed Oct 11, 2024

bertinatto force-pushed the try-apply-workload branch from d393851 to a95167d Compare October 12, 2024 11:37

openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 12, 2024

bertinatto force-pushed the try-apply-workload branch 2 times, most recently from 99e8e37 to 9df235d Compare October 12, 2024 15:02

workloadscontroller: migrate to SSA

e9be617

bertinatto force-pushed the try-apply-workload branch from 9df235d to 1135acc Compare October 12, 2024 16:39

bertinatto added 2 commits October 12, 2024 13:45

workloadscontroller: ensure all code paths explicitly set conditions …

4c81d5f

…and their statuses

workloadscontroller: set generations if workload information is avail…

0f89461

…able

bertinatto commented Oct 12, 2024

p0lyn0mial reviewed Oct 14, 2024

bertinatto force-pushed the try-apply-workload branch 2 times, most recently from cc634a2 to 1798c3a Compare October 14, 2024 18:44

workloadscontroller: don't overwrite degraded condition if it's alrea…

cd8cf9a

…dy set

bertinatto force-pushed the try-apply-workload branch from 1798c3a to cd8cf9a Compare October 14, 2024 18:46
