Create ScalingSchedule collector
This commit adds two new collectors to the adapter:
- ClusterScalingScheduleCollector; and
- ScalingScheduleCollector

It also introduces the required collector plugins, initialization
logic in the server startup, documentation, and a deployment example
(including the Helm chart). A new config flag, `--scaling-schedule`,
enables the collection of these metrics. It is disabled by default.

These collectors provide the logic required to use the CRDs introduced
in pull request #284. They make use of the Kubernetes client-go
implementations of a [Store][0] and a [Reflector][1].

[0]: https://pkg.go.dev/k8s.io/client-go/tools/cache#Store
[1]: https://pkg.go.dev/k8s.io/client-go/tools/cache#Reflector

Signed-off-by: Jonathan Juares Beber <[email protected]>
jonathanbeber committed May 21, 2021
1 parent 7a68304 commit 4644d0d
Showing 17 changed files with 1,835 additions and 5 deletions.
2 changes: 2 additions & 0 deletions Dockerfile
@@ -1,6 +1,8 @@
FROM registry.opensource.zalan.do/library/alpine-3.12:latest
LABEL maintainer="Team Teapot @ Zalando SE <[email protected]>"

RUN apk add --no-cache tzdata

# add binary
ADD build/linux/kube-metrics-adapter /

115 changes: 115 additions & 0 deletions README.md
@@ -671,3 +671,118 @@ metric-config.<metricType>.<metricName>.<collectorType>/interval: "30s"

The default is `60s` but can be reduced to let the adapter collect metrics more
often.

## ScalingSchedule Collectors

The `ScalingSchedule` and `ClusterScalingSchedule` collectors allow
collecting time-based metrics from the respective CRD objects specified
in the HPA.

### Supported metrics

| Metric | Description | Type | K8s Versions |
| ---------- | -------------- | ------- | -- |
| ObjectName | The metric is calculated and stored for each `ScalingSchedule` and `ClusterScalingSchedule` referenced in the HPAs | `ScalingSchedule` and `ClusterScalingSchedule` | `>=1.16` |

### Example

This is an example of using the ScalingSchedule collectors to collect
metrics from a deployed object of one of the CRD kinds. First, the
schedule object:

```yaml
apiVersion: zalando.org/v1
kind: ClusterScalingSchedule
metadata:
  name: "scheduling-event"
spec:
  schedules:
  - type: OneTime
    date: "2021-10-02T08:08:08+02:00"
    durationMinutes: 30
    value: 100
  - type: Repeating
    durationMinutes: 10
    value: 120
    period:
      startTime: "15:45"
      timezone: "Europe/Berlin"
      days:
      - Mon
      - Wed
      - Fri
```

This resource defines a `ClusterScalingSchedule` object named
`scheduling-event` with two schedules.

`ClusterScalingSchedule` objects aren't namespaced, which means they can
be referenced by any HPA in any namespace in the cluster.
`ScalingSchedule` objects have exactly the same fields and behavior, but
can only be referenced by HPAs in the same namespace. Schedules can have
the type `Repeating` or `OneTime`.

This example configuration produces the following result: starting at
`2021-10-02T08:08:08+02:00`, a metric with the value 100 is returned for
30 minutes. Every Monday, Wednesday and Friday, starting at 15:45
(Berlin time), a metric with the value 120 is returned for 10 minutes.
It is not the case in this example, but if multiple schedules overlap in
time, the largest value is returned.

Check the CRD definitions
([ScalingSchedule](./docs/scaling_schedules_crd.yaml),
[ClusterScalingSchedule](./docs/cluster_scaling_schedules_crd.yaml)) for
a better understanding of the possible fields and their behavior.

An HPA can reference the deployed `ClusterScalingSchedule` object as in
this example:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: "myapp-hpa"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 15
  metrics:
  - type: Object
    object:
      describedObject:
        apiVersion: zalando.org/v1
        kind: ClusterScalingSchedule
        name: "scheduling-event"
      metric:
        name: "scheduling-event"
      target:
        type: AverageValue
        averageValue: "10"
```

The name of the metric is equal to the name of the referenced object.
The `target.averageValue` in this example is set to 10. This value will
be used by the HPA controller to define the desired number of pods,
based on the metric obtained (check the [HPA algorithm
details](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details)
for more context). This HPA configuration states explicitly that each
pod of this application supports 10 units of the
`ClusterScalingSchedule` metric. Multiple applications can share the
same `ClusterScalingSchedule` or `ScalingSchedule` object and run a
different number of pods depending on their own `target.averageValue`
configuration.

In this specific example, at `2021-10-02T08:08:08+02:00` the metric has
the value 100, so the application scales to 10 pods (100/10). Every
Monday, Wednesday and Friday, starting at 15:45 (Berlin time), the
application scales to 12 pods (120/10). Note that these numbers consider
only this custom metric. Normal HPA behavior still applies: with
multiple metrics the largest number of pods is used, and the HPA min and
max replica configuration, autoscaling policies, etc. are respected.
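
The arithmetic in this example can be sketched as follows. This is a
simplified illustration of the `AverageValue` calculation for a single
metric, assuming only the ceiling division and the min/max clamp. The
real HPA controller also applies a tolerance, stabilization windows and
scaling policies, and `desiredReplicas` is a hypothetical helper, not
part of any Kubernetes API.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas divides the metric value by the per-pod target, rounds up,
// and clamps the result to the HPA's [minReplicas, maxReplicas] range.
// Hypothetical helper: tolerance and scaling policies are deliberately ignored.
func desiredReplicas(metricValue, targetAverageValue float64, minReplicas, maxReplicas int32) int32 {
	n := int32(math.Ceil(metricValue / targetAverageValue))
	if n < minReplicas {
		return minReplicas
	}
	if n > maxReplicas {
		return maxReplicas
	}
	return n
}

func main() {
	// Values from the manifests above: averageValue 10, minReplicas 1, maxReplicas 15.
	fmt.Println(desiredReplicas(100, 10, 1, 15)) // OneTime schedule active
	fmt.Println(desiredReplicas(120, 10, 1, 15)) // Repeating schedule active
	fmt.Println(desiredReplicas(0, 10, 1, 15))   // metric value 0: falls back to minReplicas
}
```

With the same schedule, an application that sets `averageValue: "20"`
would end up with half as many pods, which is how several applications
can share one schedule object.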

These collectors are disabled by default. Start the server with the
`--scaling-schedule` flag to enable them. Remember to deploy the
`ScalingSchedule` and `ClusterScalingSchedule` CRDs and to allow the
service account used by the server to read, watch and list them.
1 change: 1 addition & 0 deletions docs/deployment.yaml
@@ -28,6 +28,7 @@ spec:
- --prometheus-server=http://prometheus.kube-system.svc.cluster.local
- --skipper-ingress-metrics
- --aws-external-metrics
- --scaling-schedule
env:
- name: AWS_REGION
  value: eu-central-1
2 changes: 1 addition & 1 deletion docs/helm/Chart.yaml
@@ -1,6 +1,6 @@
apiVersion: v2
name: kube-metrics-adapter
-version: 0.1.10
+version: 0.1.11
description: kube-metrics-adapter helm chart
home: https://github.com/zalando-incubator/kube-metrics-adapter
maintainers:
119 changes: 119 additions & 0 deletions docs/helm/templates/cluster_scaling_schedules_crd.yaml
@@ -0,0 +1,119 @@
{{- if .Values.scalingSchedule.enabled }}
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.5.0
  creationTimestamp: null
  name: clusterscalingschedules.zalando.org
spec:
  group: zalando.org
  names:
    kind: ClusterScalingSchedule
    listKind: ClusterScalingScheduleList
    plural: clusterscalingschedules
    singular: clusterscalingschedule
  scope: Cluster
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        description: ClusterScalingSchedule describes a cluster scoped time based
          metric to be used in autoscaling operations.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: ScalingScheduleSpec is the spec part of the ScalingSchedule.
            properties:
              schedules:
                description: Schedules is the list of schedules for this ScalingSchedule
                  resource. All the schedules defined here will result on the value
                  to the same metric. New metrics require a new ScalingSchedule resource.
                items:
                  description: Schedule is the schedule details to be used inside
                    a ScalingSchedule.
                  properties:
                    date:
                      description: Defines the starting date of a OneTime schedule.
                        It has to be a RFC3339 formated date.
                      format: date-time
                      type: string
                    durationMinutes:
                      description: The duration in minutes that the configured value
                        will be returned for the defined schedule.
                      type: integer
                    period:
                      description: Defines the details of a Repeating schedule.
                      properties:
                        days:
                          description: The days that this schedule will be active.
                          items:
                            description: ScheduleDay represents the valid inputs for
                              days in a SchedulePeriod.
                            enum:
                            - Sun
                            - Mon
                            - Tue
                            - Wed
                            - Thu
                            - Fri
                            - Sat
                            type: string
                          type: array
                        startTime:
                          description: The startTime has the format HH:MM
                          pattern: (([0-1][0-9])|([2][0-3])):([0-5][0-9])
                          type: string
                        timezone:
                          description: The location name corresponding to a file in
                            the IANA Time Zone database, like Europe/Berlin.
                          type: string
                      required:
                      - days
                      - startTime
                      - timezone
                      type: object
                    type:
                      description: Defines if the schedule is a OneTime schedule or
                        Repeating one. If OneTime, date has to be defined. If Repeating,
                        Period has to be defined.
                      enum:
                      - OneTime
                      - Repeating
                      type: string
                    value:
                      description: The metric value that will be returned for the
                        defined schedule.
                      type: integer
                  required:
                  - durationMinutes
                  - type
                  - value
                  type: object
                type: array
            required:
            - schedules
            type: object
        required:
        - spec
        type: object
    served: true
    storage: true
status:
  acceptedNames:
    kind: ""
    plural: ""
  conditions: []
  storedVersions: []
{{- end}}
3 changes: 3 additions & 0 deletions docs/helm/templates/deployment.yaml
@@ -182,6 +182,9 @@ spec:
{{- if .Values.zmon.tokenName }}
- --zmon-token-name={{ .Values.zmon.tokenName }}
{{- end}}
{{- if .Values.scalingSchedule.enabled }}
- --scaling-schedule
{{- end}}
resources:
  limits:
    cpu: {{ .Values.resources.limits.cpu }}
11 changes: 11 additions & 0 deletions docs/helm/templates/rbac.yaml
@@ -73,6 +73,17 @@ rules:
  - get
  - list
  - watch
{{- if .Values.scalingSchedule.enabled }}
- apiGroups:
  - zalando.org
  resources:
  - clusterscalingschedules
  - scalingschedules
  verbs:
  - get
  - list
  - watch
{{- end}}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding