-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cluster: GetPlatform: replace CSV list with OperatorExists calls #1051
cluster: GetPlatform: replace CSV list with OperatorExists calls #1051
Conversation
pkg/cluster/cluster_config.go
Outdated
err := cli.List(context.TODO(), clusterCsvs) | ||
// limit request to own namespace or since the operator is seen in all it can be too big | ||
// ignore error, "" in this case is acceptable for functionality | ||
ns, _ := GetOperatorNamespace() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use OperatorConditions or some other resources to identify the platform?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever works :)
Do you mean get the list of OperatorConditions and check the names for opendatahub-operator or rhods-operator similar way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OperatorExists() which calls OperatorConditions will only tell if it is ODH or RHOAI, does not say much if self-manager or managered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OperatorExists() which calls OperatorConditions will only tell if it is ODH or RHOAI, does not say much if self-manager or managered.
But it will work here in this branch, won't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OperatorExists() which calls OperatorConditions will only tell if it is ODH or RHOAI, does not say much if self-manager or managered.
Thanks for pointing to the function, missed it
Add OLM's[1] OperatorCondition/OperatorConditionList to the external CRDs. Will be used in future patches. [1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/manifests/0.18.3/0000_50_olm_00-operatorconditions.crd.yaml Signed-off-by: Yauheni Kaliuta <[email protected]>
Following patches will change initialization to list the objects. Signed-off-by: Yauheni Kaliuta <[email protected]>
/retest-required |
i've created https://issues.redhat.com/browse/RHOAIENG-8483 which can be used as reference |
Jira: https://issues.redhat.com/browse/RHOAIENG-8483 Depending of the way operators installed CSVs can be seen in all namespace so listing of them causes producing N * M results (where N is number of namespaces and M is number of CSVs) which is not scalable in general and in practice causes timeouts on such large clusters. The function basically checks if ODH or RHOAI operator installed and there is already such function in the package, OperatorExists(). So, reuse it. Signed-off-by: Yauheni Kaliuta <[email protected]>
/retest-required |
pkg/cluster/cluster_config.go
Outdated
// isManagedRHODS checks if CRD add-on exists and contains string ManagedRhods. | ||
func isManagedRHODS(cli client.Client) (Platform, error) { | ||
// detectManagedRHODS checks if CRD add-on exists and contains string ManagedRhods. | ||
func detectManagedRHODS(cli client.Client) (Platform, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we do like this
if cs.Name == string("addon-managed-odh-catalog") {
return ManagedRhods, nil
}
then in
const (
// ManagedRhods defines expected addon catalogsource.
ManagedRhods Platform = "Managed Red Hat OpenShift AI Managed"
// SelfManagedRhods defines display name in csv.
SelfManagedRhods Platform = "Self Managed Red Hat OpenShift AI"
// OpenDataHub defines display name in csv.
OpenDataHub Platform = "Open Data Hub Operator"
// Unknown indicates that operator is not deployed using OLM.
Unknown Platform = ""
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed. But probably it's better to have that changes (last two) as a separate PR(s) due to squashing policy. They are not related to the original issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed. But probably it's better to have that changes (last two) as a separate PR(s) due to squashing policy. They are not related to the original issue.
done on top on this PR #1054
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, just bring this up not necessary be in the same PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: VaishnaviHire The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
261bbab
into
opendatahub-io:incubation
…ndatahub-io#1051) * tests: envtest: Add OperatorCondition CRD Add OLM's[1] OperatorCondition/OperatorConditionList to the external CRDs. Will be used in future patches. [1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/manifests/0.18.3/0000_50_olm_00-operatorconditions.crd.yaml Signed-off-by: Yauheni Kaliuta <[email protected]> * tests: dscinitialization: add OperatorCondition CRD to schema Following patches will change initialization to list the objects. Signed-off-by: Yauheni Kaliuta <[email protected]> * cluster: GetPlatform: replace CSV list with OperatorExists calls Jira: https://issues.redhat.com/browse/RHOAIENG-8483 Depending of the way operators installed CSVs can be seen in all namespace so listing of them causes producing N * M results (where N is number of namespaces and M is number of CSVs) which is not scalable in general and in practice causes timeouts on such large clusters. The function basically checks if ODH or RHOAI operator installed and there is already such function in the package, OperatorExists(). So, reuse it. Signed-off-by: Yauheni Kaliuta <[email protected]> --------- Signed-off-by: Yauheni Kaliuta <[email protected]>
…ndatahub-io#1051) * tests: envtest: Add OperatorCondition CRD Add OLM's[1] OperatorCondition/OperatorConditionList to the external CRDs. Will be used in future patches. [1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/manifests/0.18.3/0000_50_olm_00-operatorconditions.crd.yaml Signed-off-by: Yauheni Kaliuta <[email protected]> * tests: dscinitialization: add OperatorCondition CRD to schema Following patches will change initialization to list the objects. Signed-off-by: Yauheni Kaliuta <[email protected]> * cluster: GetPlatform: replace CSV list with OperatorExists calls Jira: https://issues.redhat.com/browse/RHOAIENG-8483 Depending of the way operators installed CSVs can be seen in all namespace so listing of them causes producing N * M results (where N is number of namespaces and M is number of CSVs) which is not scalable in general and in practice causes timeouts on such large clusters. The function basically checks if ODH or RHOAI operator installed and there is already such function in the package, OperatorExists(). So, reuse it. Signed-off-by: Yauheni Kaliuta <[email protected]> --------- Signed-off-by: Yauheni Kaliuta <[email protected]> (cherry picked from commit 261bbab)
…ndatahub-io#1051) * tests: envtest: Add OperatorCondition CRD Add OLM's[1] OperatorCondition/OperatorConditionList to the external CRDs. Will be used in future patches. [1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/manifests/0.18.3/0000_50_olm_00-operatorconditions.crd.yaml Signed-off-by: Yauheni Kaliuta <[email protected]> * tests: dscinitialization: add OperatorCondition CRD to schema Following patches will change initialization to list the objects. Signed-off-by: Yauheni Kaliuta <[email protected]> * cluster: GetPlatform: replace CSV list with OperatorExists calls Jira: https://issues.redhat.com/browse/RHOAIENG-8483 Depending of the way operators installed CSVs can be seen in all namespace so listing of them causes producing N * M results (where N is number of namespaces and M is number of CSVs) which is not scalable in general and in practice causes timeouts on such large clusters. The function basically checks if ODH or RHOAI operator installed and there is already such function in the package, OperatorExists(). So, reuse it. Signed-off-by: Yauheni Kaliuta <[email protected]> --------- Signed-off-by: Yauheni Kaliuta <[email protected]> (cherry picked from commit 261bbab)
* feat: increase QPS and Burst for client (opendatahub-io#1031) - we might see throttling in some cluster, this is just to uplift the default value Signed-off-by: Wen Zhou <[email protected]> (cherry picked from commit 54ee87d) * kserve: configure servicemesh before deploying manifests (opendatahub-io#1019) * kserve: configure servicemesh before deploying manifests Jira: https://issues.redhat.com/browse/RHOAIENG-7312 kserve depends on odh-model-controller which it starts by deploying manifests of Dependent Operator. The controller behaviour depends of configuration (authorino) which is later deployed by configuring servicemesh features. Here is a race, there are 2 checks in the odh-model-controller for the presence of AuthorizationPolicy (which is deployed by servicemesh configuration): 1) to add a type to the schema 2) to watch the objects of that type. If the object appears in between odh-model-controller complains: ``` 2024-05-16T06:46:03Z ERROR Reconciler error {"controller": "inferenceservice", "controllerGroup": "serving.kserve.io", "controllerKind": "InferenceService", "InferenceService": {"name":"xf","namespace":"single-model-test"}, "namespace": "single-model-test", "name": "xf", "reconcileID": "e6f42f44-1866-45d4-836a-69e4e93edef4", "error": "1 error occurred:\n\t* could not GET authconfig single-model-test/xf. cause no kind is registered for the type v1beta2.AuthConfig in scheme \"pkg/runtime/scheme.go:100\"\n\n"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 ``` Move servicemesh configuration before deploying the manifests to narrow the race window. odh-model-controller will change their check in favor of checking Authorino CRD to avoid the race completely. Checking AuthorizationPolicy existance in operator/kserve would fix it as well, but it's a delay for the reconcile loop (vs odh-model-controller where it's done only on startup), so since odh-model-controller is going to reimplement the check, keep it as it is. Also modelmeshserving component can deploy odh-model-controller (thanks Vedant for pointing) if it is enabled. The order is unspecified by due to implementation it will happen before kserve configuration (order of field in the Components structure). Signed-off-by: Yauheni Kaliuta <[email protected]> * kserve: get rid of extra enabled check for setupKserveConfig() To avoid linter report: components/kserve/kserve.go:97:1: cyclomatic complexity 31 of func `(*Kserve).ReconcileComponent` is high (> 30) (gocyclo) move setupKserveConfig() call under "if enabled" branch below. Signed-off-by: Yauheni Kaliuta <[email protected]> --------- Signed-off-by: Yauheni Kaliuta <[email protected]> * chore: adds godoc to Feature builder (opendatahub-io#1013) As the first step of improving Feature DSL this PR brings godoc explaining purpose of each method in the builder chain. (cherry picked from commit 5ce6306) * fix: wrong path when use devFlag + wrong default value + special name in (opendatahub-io#1024) trustyai Signed-off-by: Wen Zhou <[email protected]> (cherry picked from commit d9e78b4) * chore: moves operator/subscriptions operations to pkg/cluster (opendatahub-io#1027) They do not belong to pkg/deploy as they are about reading/writing cluster resources rather than deploying resources. (cherry picked from commit d90983b) * chore: Open up util functions for context propagation (opendatahub-io#1033) context should be determined by the caller and propagated down the call chain. (cherry picked from commit 105adae) * fix: missing label "opendatahub.io/generated-namespace" on auth (opendatahub-io#1038) Signed-off-by: Wen Zhou <[email protected]> (cherry picked from commit d6b108b) * chore: append ownerRef to resources owned by Features (opendatahub-io#1039) * rename / chg FeatureOwner func * remove unused old ownerref func (cherry picked from commit 6583645) * Revert "chore: append ownerRef to resources owned by Features (opendatahub-io#1039)" This reverts commit 6583645. opendatahub-io#1039 (comment) Signed-off-by: Yauheni Kaliuta <[email protected]> (cherry picked from commit 952795a) * Update Owners-aliases list (opendatahub-io#1040) (cherry picked from commit cc9aecb) * RHOAIENG-5426: Updated pull request template with prerequisites (opendatahub-io#1042) * RHOAIENG-5426: Updated pull request template with prerequisites * PR template changes (cherry picked from commit 244ca13) * chore: renames manifests source to location (opendatahub-io#1050) - Source is already used elsewhere in Feature DSL, so we might want to avoid confusion - Location fits the purpose of this field better (cherry picked from commit 8921839) * Fix trustyai changes * cluster: GetPlatform: replace CSV list with OperatorExists calls (opendatahub-io#1051) * tests: envtest: Add OperatorCondition CRD Add OLM's[1] OperatorCondition/OperatorConditionList to the external CRDs. Will be used in future patches. [1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/deploy/upstream/manifests/0.18.3/0000_50_olm_00-operatorconditions.crd.yaml Signed-off-by: Yauheni Kaliuta <[email protected]> * tests: dscinitialization: add OperatorCondition CRD to schema Following patches will change initialization to list the objects. Signed-off-by: Yauheni Kaliuta <[email protected]> * cluster: GetPlatform: replace CSV list with OperatorExists calls Jira: https://issues.redhat.com/browse/RHOAIENG-8483 Depending of the way operators installed CSVs can be seen in all namespace so listing of them causes producing N * M results (where N is number of namespaces and M is number of CSVs) which is not scalable in general and in practice causes timeouts on such large clusters. The function basically checks if ODH or RHOAI operator installed and there is already such function in the package, OperatorExists(). So, reuse it. Signed-off-by: Yauheni Kaliuta <[email protected]> --------- Signed-off-by: Yauheni Kaliuta <[email protected]> (cherry picked from commit 261bbab) * chore: remove duplicated platform call in each component (opendatahub-io#1055) - get in DSC and pass into compoment Signed-off-by: Wen Zhou <[email protected]> (cherry picked from commit 1b04761) * chore: update toolbox sdk version and remove duplicated addtoschema (opendatahub-io#1061) Signed-off-by: Wen Zhou <[email protected]> (cherry picked from commit 1b23c9f) * chore: funcs to create kustomize plugins (opendatahub-io#1062) The actual use with resMap is inlined and moved to the caller. This way we can construct plugins on demand and use them as building blocks instead. (cherry picked from commit 7bf56a4) * Fix linter errors * Update scheme --------- Signed-off-by: Yauheni Kaliuta <[email protected]> Co-authored-by: Wen Zhou <[email protected]> Co-authored-by: Yauheni Kaliuta <[email protected]> Co-authored-by: Bartosz Majsak <[email protected]> Co-authored-by: Aslak Knutsen <[email protected]> Co-authored-by: Cameron Garrison <[email protected]> Co-authored-by: Saravana Balaji Srinivasan <[email protected]>
Partial application of 261bbab ("cluster: GetPlatform: replace CSV list with OperatorExists calls (opendatahub-io#1051)") Confused, the only usage came with an old patch 099c1a4 ("Add check on dependent operators") Signed-off-by: Yauheni Kaliuta <[email protected]>
Partial application of 261bbab ("cluster: GetPlatform: replace CSV list with OperatorExists calls (opendatahub-io#1051)") Confused, the only usage came with an old patch 099c1a4 ("Add check on dependent operators") Signed-off-by: Yauheni Kaliuta <[email protected]>
Jira: https://issues.redhat.com/browse/RHOAIENG-8483
Depending of the way operators installed CSVs can be seen in all
namespace so listing of them causes producing N * M results (where N
is number of namespaces and M is number of CSVs) which is not
scalable in general and in practice causes timeouts on such large
clusters.
The function basically checks if ODH or RHOAI operator installed and
there is already such function in the package, OperatorExists(). So,
reuse it.
Many thanks for submitting your Pull Request ❤️!
Please make sure that your PR meets the following requirements:
Description
JIRA issue:
How Has This Been Tested?
Screenshot or short clip:
Merge criteria: