multi-cluster autoscaler #94

Open
aii-nozomu-oki opened this issue Apr 18, 2023 · 5 comments

@aii-nozomu-oki

Hi, are there any plans to develop a multi-cluster autoscaler?

Perhaps I can build it myself:

  1. Prepare the metrics provider.
  2. Create a mechanism that increases spec.numberOfClusters of a Placement when a threshold is exceeded (see the sketch below).
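
A minimal sketch of what step 2 could look like, assuming client-go's dynamic client and the cluster.open-cluster-management.io/v1beta1 Placement API; the namespace, Placement name, and the unconditional +1 are placeholders that a real metrics provider would drive:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

// GVR of the OCM Placement API.
var placementGVR = schema.GroupVersionResource{
	Group:    "cluster.open-cluster-management.io",
	Version:  "v1beta1",
	Resource: "placements",
}

// scaleOutPlacement adds delta to spec.numberOfClusters via a merge patch.
func scaleOutPlacement(ctx context.Context, client dynamic.Interface, namespace, name string, delta int64) error {
	placement, err := client.Resource(placementGVR).Namespace(namespace).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	current, _, err := unstructured.NestedInt64(placement.Object, "spec", "numberOfClusters")
	if err != nil {
		return err
	}
	patch := []byte(fmt.Sprintf(`{"spec":{"numberOfClusters":%d}}`, current+delta))
	_, err = client.Resource(placementGVR).Namespace(namespace).Patch(
		ctx, name, types.MergePatchType, patch, metav1.PatchOptions{})
	return err
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := dynamic.NewForConfigOrDie(cfg)
	// In a real autoscaler the decision would come from the metrics provider
	// in step 1; here we simply scale out by one cluster.
	if err := scaleOutPlacement(context.TODO(), client, "default", "my-placement", 1); err != nil {
		panic(err)
	}
}
```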

There are several possible problems:

  1. Conflicts with autoscalers in clusters.
  2. A Placement must be referenced by only one ManifestWork.

I would like to know whether there is any OCM community work on this.

@qiujian16
Member

Hi, this is an interesting idea. Regarding the issues:

Conflicts with autoscalers in clusters.

You could use server-side apply in the ManifestWork, so the autoscaler in the cluster will update the replicas while the work agent will not. See the "Resource Race and Adoption" section here: https://open-cluster-management.io/concepts/manifestwork/
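
A minimal sketch of that pattern, assuming the open-cluster-management.io/api work/v1 Go types: a ManifestWork whose Deployment uses the ServerSideApply update strategy, so the work agent only owns the fields it sets (the namespace/name values are illustrative):

```go
package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	workv1 "open-cluster-management.io/api/work/v1"
)

// exampleWork builds a ManifestWork that applies its Deployment with
// server-side apply, so an in-cluster autoscaler can take ownership of
// spec.replicas without fighting the work agent.
func exampleWork() *workv1.ManifestWork {
	return &workv1.ManifestWork{
		ObjectMeta: metav1.ObjectMeta{Namespace: "cluster1", Name: "hello-work"},
		Spec: workv1.ManifestWorkSpec{
			// Workload.Manifests would carry the Deployment itself; omitted here.
			ManifestConfigs: []workv1.ManifestConfigOption{{
				ResourceIdentifier: workv1.ResourceIdentifier{
					Group:     "apps",
					Resource:  "deployments",
					Namespace: "default",
					Name:      "hello",
				},
				UpdateStrategy: &workv1.UpdateStrategy{
					Type: workv1.UpdateStrategyTypeServerSideApply,
				},
			}},
		},
	}
}

func main() { _ = exampleWork() }
```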

A Placement must be referenced by only one ManifestWork.

We have a new API called ManifestWorkReplicaSet, which is basically a combination of ManifestWork and Placement. It will be included in the coming OCM release: https://github.com/open-cluster-management-io/api/blob/main/work/v1alpha1/types_manifestworkreplicaset.go

We currently do not have ongoing work to provide an autoscaler yet, but it makes a lot of sense to me.

@aii-nozomu-oki
Author

Thanks for your reply.

You could use server-side apply in the ManifestWork, so the autoscaler in the cluster will update the replicas while the work agent will not. See the "Resource Race and Adoption" section here: https://open-cluster-management.io/concepts/manifestwork/

Sorry for not being clear.
Both multi-cluster autoscalers and in-cluster autoscalers (HPA, KEDA, etc.) would increase or decrease resources.
This wouldn't optimize resources, which I described as "conflicts".
I have not come up with a "smart" solution for this.
Perhaps one new scaler should be responsible for both in-cluster and multi-cluster scaling.

We have a new API called ManifestWorkReplicaSet, which is basically a combination of ManifestWork and Placement. It will be included in the coming OCM release: https://github.com/open-cluster-management-io/api/blob/main/work/v1alpha1/types_manifestworkreplicaset.go

Yes, ManifestWorkReplicaSet is great work, but it doesn't solve my problem.
As far as I know, a ManifestWorkReplicaSet refers to Placements, and a Placement has numberOfClusters.
If ManifestWorkReplicaSet A and ManifestWorkReplicaSet B reference the same Placement A, scaling ManifestWorkReplicaSet A will affect ManifestWorkReplicaSet B.
To prevent this, it is necessary to create a Placement for each application, but I don't think that is the originally intended use of Placement.
If scaling is a consideration, ManifestWorkReplicaSet should have its own numberOfClusters.

Another idea is for Placement (or maybe the next version of ManifestWorkReplicaSet) to implement the scale subresource.
That would allow HPA, KEDA, or other scalers to scale ManifestWorks through standard methods.
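
To make the idea concrete, a rough sketch of how the scale subresource could be wired for Placement using the apiextensions/v1 Go types; mapping replicas to spec.numberOfClusters and status.numberOfSelectedClusters is only an assumption, not an existing OCM decision:

```go
package main

import (
	"fmt"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
)

func main() {
	// Hypothetical scale subresource wiring for the Placement CRD:
	// "replicas" maps to spec.numberOfClusters, and the observed count maps
	// to status.numberOfSelectedClusters.
	scale := &apiextensionsv1.CustomResourceSubresourceScale{
		SpecReplicasPath:   ".spec.numberOfClusters",
		StatusReplicasPath: ".status.numberOfSelectedClusters",
		// LabelSelectorPath is optional; what it should point to for a
		// Placement (which selects clusters, not Pods) is an open question.
	}
	fmt.Printf("%+v\n", scale)
}
```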

@qiujian16
Member

< Both multi-cluster autoscalers and in-cluster autoscalers (HPA, KEDA, etc.) would increase or decrease resources.
This wouldn't optimize resources, which I described as "conflicts".

I would think the multi-cluster autoscaler is meant to scale the number of clusters of the related Placement, while HPA scales the actual replicas of the deployment in a given cluster.

< If ManifestWorkReplicaSet A and ManifestWorkReplicaSet B reference the same Placement A, scaling ManifestWorkReplicaSet A will affect ManifestWorkReplicaSet B.

On the other hand, is that a valid case? You could bundle multiple ManifestWorkReplicaSets as one "scaling group", so scaling the Placement would scale the workloads in all related ManifestWorkReplicaSets.

< Another idea is for Placement (or maybe the next version of ManifestWorkReplicaSet)

I agree. I think Placement should have a scale subresource.

@aii-nozomu-oki
Author

I would think the multi-cluster autoscaler is meant to scale the number of clusters of the related Placement, while HPA scales the actual replicas of the deployment in a given cluster.

You're correct, but I want "smart scaling" as in the following scenarios:
(In fact, these may be outside the scope of the multi-cluster autoscaler)

  1. If cluster A is running out of resources, scale to cluster B

We can address this by using the resource-usage-collect-addon to detect resource shortages, or by detecting Pod Pending events and having the multi-cluster autoscaler act after HPA.

  2. First, spread to as many clusters as possible, then scale out within each cluster

This is not a problem I am actually facing, and it may be a "non-existent" problem.
In this case, the multi-cluster autoscaler would need to take precedence over HPA, but I have not come up with a way to do that.

On the other hand, is that a valid case? You could bundle multiple ManifestWorkReplicaSets as one "scaling group", so scaling the Placement would scale the workloads in all related ManifestWorkReplicaSets.

A "scaling group" is a case I hadn't considered. Thinking about it, it is reasonable for Placement to have numberOfClusters and to accept a "Placement per application".

I agree. I think Placement should have a scale subresource.

👍

@aii-nozomu-oki
Author

aii-nozomu-oki commented Apr 26, 2023

I did some further research on the scale subresource.
As far as I know, labelSelectorPath is not required for kubectl scale, but it is required for HPA and KEDA.
And it is unclear whether HPA can be used for resources that are not associated with Pods (see the sketch after the links below).
https://kubernetes.slack.com/archives/C09R1LV8S/p1615326957007600
https://medium.com/@thescott111/autoscaling-kubernetes-custom-resource-using-the-hpa-957d00bb7993
kedacore/keda#3582
https://github.com/orgs/strimzi/discussions/6871
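
For reference, a sketch of what pointing an HPA at a scale-enabled Placement could look like, using the autoscaling/v2 Go types and an external metric (there are no Pod metrics to fall back on); the metric name and target values are made up, and whether HPA actually behaves well against such a resource is exactly the open question above:

```go
package main

import (
	autoscalingv2 "k8s.io/api/autoscaling/v2"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// hypotheticalHPA targets a Placement (assuming it exposed the scale
// subresource) and scales it on an external metric. All names and numbers
// here are placeholders for illustration.
func hypotheticalHPA() autoscalingv2.HorizontalPodAutoscaler {
	minReplicas := int32(1)
	return autoscalingv2.HorizontalPodAutoscaler{
		ObjectMeta: metav1.ObjectMeta{Name: "placement-scaler", Namespace: "default"},
		Spec: autoscalingv2.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv2.CrossVersionObjectReference{
				APIVersion: "cluster.open-cluster-management.io/v1beta1",
				Kind:       "Placement",
				Name:       "my-placement",
			},
			MinReplicas: &minReplicas,
			MaxReplicas: 10,
			Metrics: []autoscalingv2.MetricSpec{{
				Type: autoscalingv2.ExternalMetricSourceType,
				External: &autoscalingv2.ExternalMetricSource{
					// e.g. pending Pods reported by a metrics provider.
					Metric: autoscalingv2.MetricIdentifier{Name: "pending_pods"},
					Target: autoscalingv2.MetricTarget{
						Type:         autoscalingv2.AverageValueMetricType,
						AverageValue: resource.NewQuantity(5, resource.DecimalSI),
					},
				},
			}},
		},
	}
}

func main() { _ = hypotheticalHPA() }
```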
