
KEP: Multi-cluster workload scheduling & balancing #31

Open
yue9944882 wants to merge 1 commit into main from kep/replicas-scheduling

Conversation

yue9944882 (Member)

openshift-ci bot commented Oct 28, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: yue9944882
To complete the pull request process, please assign deads2k after the PR has been reviewed.
You can assign the PR to them by writing /assign @deads2k in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot requested review from deads2k and qiujian16 on October 28, 2021 06:24
@qiujian16 (Member) left a comment

This is a good idea. But I think we may also consider whether the proportional distribution can be done in the placement API as an alternative, and what the pros and cons are.

For example, we could have a replicas field in the placement API and, in each placementDecision, a field to specify the replicas for a cluster. It seems possible, because in this case placement defines how to put N replicas onto M clusters, and each decision result tells how many replicas should be put on the selected clusters. WDYT?
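A rough sketch of that alternative (the numberOfReplicas field on Placement and the per-decision replicas field are hypothetical, purely to illustrate the idea):

kind: Placement
spec:
  numberOfClusters: 3
  # hypothetical: total replicas to distribute across the decisions
  numberOfReplicas: 10
---
kind: PlacementDecision
status:
  decisions:
  - clusterName: cluster1
    # hypothetical: replicas assigned to this cluster by the decision
    replicas: 4
  - clusterName: cluster2
    replicas: 3
  - clusterName: cluster3
    replicas: 3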

# The target namespace to deploy the workload in the spoke cluster.
spokeNamespace: default
# The content of target workload, supporting:
# - Inline: Embedding a static manifest.
Member:

I think we should limit the types of allowed resources here? For example, it could be restricted to resources that can scale.

Member Author:

You mean clarify the limit in the comment/doc? In the implementation we can check whether a resource has /scale via API discovery; the RESTMapper in the native client library requires only the group-version-kind to verify the precondition, so I guess it's not necessary to assert the resource metadata explicitly in the API spec?
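For instance, the discovery document already advertises /scale for scalable resources; an abridged apps/v1 discovery response looks roughly like the following, and the presence of the deployments/scale entry is what the controller would check before managing a workload kind:

kind: APIResourceList
groupVersion: apps/v1
resources:
- name: deployments
  kind: Deployment
  namespaced: true
- name: deployments/scale
  kind: Scale
  group: autoscaling
  version: v1
  namespaced: true
  verbs: [get, patch, update]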

Member:

Should we have admission control for allowed resources? Or what if a user specifies a resource here that cannot scale?

kind: ElasticWorkload
spec:
# The target namespace to deploy the workload in the spoke cluster.
spokeNamespace: default
Member:

Should we just let this workload be deployed on the spoke in the same namespace as this resource on the hub?

Member Author:

Practically I think that will work for most cases because we are usually managing one namespace per application, but I am not sure whether that will apply to all cases.

# - Even: Filling the min replicas upon every round. i.e. max-min
# - Weighted: Setting a default weight and overriding the weight for a
# few clusters on demand.
distributionStrategy:
Member:

If a cluster is added into or removed from the decision of the related placement, will the distribution be recalculated?

Member Author:

Yes, I think so.

Member:

how can we ensure that limitRange.min is satisfied in this case? I think the API can only ensure how evenly the replicas are distributed.

Member Author:

My original idea is that the final distribution is calculated in two phases: (1) the initial distribution, i.e. distributionStrategy, and (2) a second-round re-distribution, i.e. balanceStrategy. So if the initial result from the even distribution strategy doesn't conform to the requirement of .limitRange.min, the final distributed result will be rounded up to .limitRange.min. Additionally, if .limitRange.min * selectedClusters >= the expected total replicas, the reconcile loop should return without applying any actual changes.
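To make that concrete with illustrative numbers: with totalReplicas: 10 over 4 selected clusters, the even phase yields 3/3/2/2; with limitRange.min: 2 the result already conforms, whereas with limitRange.min: 3 we have 3 * 4 = 12 >= 10, so the reconcile loop would return without applying any changes.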

# few clusters on demand.
distributionStrategy:
totalReplicas: 10
type: [ Even | Propotional ]
Member:

typo: Proportional

How would Proportional be specified? And are Even/Proportional hard or soft limits? Should we mimic the pod spreading policy in kube with a MaxSurge, so MaxSurge=1 actually means an even distribution?

Member Author:

Revised to Weighted. As for MaxSurge, I'm leaning a bit toward Weighted because it looks more intuitive from the user's perspective.

the kubernetes community, our multi-cluster workload controller should not raise any
additional requirement on the managing workload API except for enabling the standard
[scale](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#scale-subresource)
subresource via the CRD. Hence, to scale up or down the local workload, the controller
Member:

I think the issue is that manifestwork is meant to apply "any" resources, and most of them do not scale. To support this, we probably need a field in the manifestwork to override the replica path in the manifests of the manifestwork.

Member Author:

I am not sure what kind of built-in support we want from the manifestwork API in the current phase. A random idea I can think of is to add a new remediation type, e.g. UpdateScaleSubresource, which optionally updates locally delivered resources via the /scale subresource iff the replicas field is the only difference between the existing state and the expectation.
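A purely hypothetical sketch of that idea (neither UpdateScaleSubresource nor the per-manifest configuration shown here exists in the ManifestWork API; the field names are illustrative):

kind: ManifestWork
spec:
  workload:
    manifests:
    - apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: hello
        namespace: default
  manifestConfigs:
  - resourceIdentifier:
      group: apps
      resource: deployments
      namespace: default
      name: hello
    # hypothetical: patch only the /scale subresource when replicas is
    # the sole drift between the desired and the existing state
    updateStrategy:
      type: UpdateScaleSubresource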

@qiujian16 (Member):

@elgnay @suigh you might be interested in this.

yue9944882 force-pushed the kep/replicas-scheduling branch from fc42cb6 to 1ce0011 on October 28, 2021 07:56

# those clusters under the "min" will be the primary choices.
# * "max": the controller will exclude the cluster exceeding the "max"
# from the list of candidates upon re-scheduling.
# - Classful: A classful prioritized rescheduling policy.
Member:

It seems Classful covers all the cases in LimitRange. Why do we need the two options?

Member Author:

I think in the first stage we will leave Classful unimplemented; just None and LimitRange in the alpha API should be sufficient.
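For reference, a sketch of an alpha-stage balanceStrategy limited to those two types, based on the min/max semantics quoted above (the exact field layout is illustrative, not final):

balanceStrategy:
  type: [ None | LimitRange ]
  limitRange:
    # clusters under "min" are the primary choices upon re-scheduling
    min: 2
    # clusters exceeding "max" are excluded from the candidates upon re-scheduling
    max: 5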

# few clusters on demand.
distributionStrategy:
totalReplicas: 10
type: [ Even | Weighted ]

How do we specify the cluster weight, via "Placement.prioritizerPolicy"?

Member Author:

For clarification, "Placement.prioritizerPolicy" only takes effect during cluster selection, while the Weighted distribution indicates how the workload's replicas are distributed. A sample of the weighted distribution would be something like:

spec:
  distributionStrategy:
    totalReplicas: 10
    type: Weighted
    weighted:
      defaultWeight: 10
      overrides:
      - clusterName: xx
        weight: 100
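Presumably the replicas would then be split proportionally to the weights; for example, with three selected clusters the total weight is 10 + 10 + 100 = 120, so cluster xx would receive roughly 10 * 100 / 120 ≈ 8 replicas and the other two clusters about 1 each (the exact rounding rule is an assumption here).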
