Create ScalingSchedule collector

This commit adds two new collectors to the adapter:

- ClusterScalingScheduleCollector; and
- ScalingScheduleCollector

It also introduces the required collector plugins, the initialization logic
in the server startup, documentation, and a deployment example (including
the helm chart).

A new config flag, `--scaling-schedule`, enables or disables the collection
of these metrics. It is disabled by default.

These collectors are the logic required to utilise the CRDs introduced in
pull request #284. They make use of the kubernetes go-client
implementations of a [Store][0] and a [Reflector][1].

[0]: https://pkg.go.dev/k8s.io/client-go/tools/cache#Store
[1]: https://pkg.go.dev/k8s.io/client-go/tools/cache#Reflector

Signed-off-by: Jonathan Juares Beber <[email protected]>
zalando-incubator · May 21, 2021 · 4644d0d
1 parent 7a68304
commit 4644d0d
Showing 17 changed files with 1,835 additions and 5 deletions.
diff --git a/Dockerfile b/Dockerfile
@@ -1,6 +1,8 @@
 FROM registry.opensource.zalan.do/library/alpine-3.12:latest
 LABEL maintainer="Team Teapot @ Zalando SE <[email protected]>"
 
+RUN apk add --no-cache tzdata
+
 # add binary
 ADD build/linux/kube-metrics-adapter /
 

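The `tzdata` package is presumably added because `Repeating` schedules carry an IANA zone name such as `Europe/Berlin`, and Go's `time.LoadLocation` needs a zone database at runtime. The sketch below only illustrates that dependency. It assumes the adapter resolves zones with `time.LoadLocation`, which this excerpt does not show, and `resolveZone` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"time"
)

// resolveZone is a hypothetical helper: it looks up an IANA zone name and
// returns an error when no zone database entry can be found for it.
func resolveZone(name string) (*time.Location, error) {
	return time.LoadLocation(name)
}

func main() {
	loc, err := resolveZone("Europe/Berlin")
	if err != nil {
		// On an image without a zone database this branch is taken.
		fmt.Println("zone lookup failed (is tzdata installed?):", err)
		return
	}
	// 15:45 Berlin time on a Monday, printed in UTC.
	start := time.Date(2021, time.October, 4, 15, 45, 0, 0, loc)
	fmt.Println(start.UTC().Format(time.RFC3339))
}
```

Without a zone database, only names such as `UTC` resolve, so a schedule with `timezone: "Europe/Berlin"` could not be evaluated.
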
diff --git a/README.md b/README.md
@@ -671,3 +671,118 @@ metric-config.<metricType>.<metricName>.<collectorType>/interval: "30s"
 
 The default is `60s` but can be reduced to let the adapter collect metrics more
 often.
+
+## ScalingSchedule Collectors
+
+The `ScalingSchedule` and `ClusterScalingSchedule` collectors allow
+collecting time-based metrics from the respective CRD objects specified
+in the HPA.
+
+### Supported metrics
+
+| Metric | Description | Type | K8s Versions |
+| ---------- | -------------- | ------- | -- |
+| ObjectName | The metric is calculated and stored for each `ScalingSchedule` and `ClusterScalingSchedule` referenced in the HPAs | `ScalingSchedule` and `ClusterScalingSchedule` | `>=1.16` |
+
+### Example
+
+This is an example of using the ScalingSchedule collectors to collect
+metrics from a deployed instance of the CRD. First, the schedule object:
+
+```yaml
+apiVersion: zalando.org/v1
+kind: ClusterScalingSchedule
+metadata:
+  name: "scheduling-event"
+spec:
+  schedules:
+  - type: OneTime
+    date: "2021-10-02T08:08:08+02:00"
+    durationMinutes: 30
+    value: 100
+  - type: Repeating
+    durationMinutes: 10
+    value: 120
+    period:
+      startTime: "15:45"
+      timezone: "Europe/Berlin"
+      days:
+      - Mon
+      - Wed
+      - Fri
+```
+
+This resource, of the kind `ClusterScalingSchedule`, defines a
+scheduling event named `scheduling-event` with two schedules.
+
+`ClusterScalingSchedule` objects aren't namespaced, which means they can
+be referenced by any HPA in any namespace in the cluster.
+`ScalingSchedule` objects have the exact same fields and behavior, but
+can only be referenced by HPAs in the same namespace. The schedules can
+have the type `Repeating` or `OneTime`.
+
+This example configuration will generate the following result: at
+`2021-10-02T08:08:08+02:00`, for 30 minutes, a metric with the value of
+100 will be returned. Every Monday, Wednesday and Friday, starting at
+15:45 (Berlin time), for 10 minutes, a metric with the value of 120 will
+be returned. It's not the case in this example, but if multiple
+schedules overlap in time, the highest value is returned.
+
+Check the CRD definitions
+([ScalingSchedule](./docs/scaling_schedules_crd.yaml),
+[ClusterScalingSchedule](./docs/cluster_scaling_schedules_crd.yaml)) for
+a better understanding of the possible fields and their behavior.
+
+An HPA can reference the deployed `ClusterScalingSchedule` object, as in
+this example:
+
+```yaml
+apiVersion: autoscaling/v2beta2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: "myapp-hpa"
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: myapp
+  minReplicas: 1
+  maxReplicas: 15
+  metrics:
+  - type: Object
+    object:
+      describedObject:
+        apiVersion: zalando.org/v1
+        kind: ClusterScalingSchedule
+        name: "scheduling-event"
+      metric:
+        name: "scheduling-event"
+      target:
+        type: AverageValue
+        averageValue: "10"
+```
+
+The name of the metric is equal to the name of the referenced object.
+The `target.averageValue` in this example is set to 10. This value will
+be used by the HPA controller to define the desired number of pods,
+based on the metric obtained (check the [HPA algorithm
+details](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details)
+for more context). This HPA configuration explicitly says that each pod
+of this application supports 10 units of the `ClusterScalingSchedule`
+metric. Multiple applications can share the same
+`ClusterScalingSchedule` or `ScalingSchedule` event and have a different
+number of pods based on their own `target.averageValue` configuration.
+
+In our specific example, at `2021-10-02T08:08:08+02:00` the metric has
+the value 100, so this application will scale to 10 pods (100/10). Every
+Monday, Wednesday and Friday, starting at 15:45 (Berlin time), the
+application will scale to 12 pods (120/10). Note that these numbers
+consider only this custom metric; the normal HPA behavior still applies:
+with multiple metrics the highest number of pods is used, and the HPA
+min and max replica configuration, autoscaling policies, etc. are
+respected.
+
+These collectors are disabled by default; you have to start the server
+with the `--scaling-schedule` flag to enable them. Remember to deploy the
+CRDs `ScalingSchedule` and `ClusterScalingSchedule` and allow the service
+account used by the server to read, watch and list them.
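
The README text above describes two calculations: the metric is the highest value among the schedules active at a given moment, and the HPA turns it into a pod count by dividing by `target.averageValue`. The following is a minimal sketch of that arithmetic, not the collector's actual code. The `window` type, `mustParse`, `metricAt` and `desiredReplicas` are hypothetical names. Treating the start as inclusive and the end as exclusive, and returning 0 when nothing is active, are assumptions. HPA tolerance, min/max replicas and scaling policies are ignored.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// window is a hypothetical, simplified view of one schedule occurrence.
type window struct {
	start           time.Time
	durationMinutes int64
	value           int64
}

// mustParse parses an RFC3339 timestamp and panics on malformed input.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

// metricAt returns the highest value among the windows active at now.
// Assumptions: start inclusive, end exclusive, 0 when nothing is active.
func metricAt(now time.Time, windows []window) int64 {
	var best int64
	for _, w := range windows {
		end := w.start.Add(time.Duration(w.durationMinutes) * time.Minute)
		if !now.Before(w.start) && now.Before(end) && w.value > best {
			best = w.value
		}
	}
	return best
}

// desiredReplicas computes ceil(metric / averageValue).
func desiredReplicas(metric, averageValue int64) int64 {
	return int64(math.Ceil(float64(metric) / float64(averageValue)))
}

func main() {
	windows := []window{
		{start: mustParse("2021-10-02T08:08:08+02:00"), durationMinutes: 30, value: 100},
	}
	now := mustParse("2021-10-02T08:13:08+02:00")
	m := metricAt(now, windows)
	fmt.Println("metric:", m, "replicas:", desiredReplicas(m, 10))
}
```

With the README's numbers, a metric of 100 and an `averageValue` of 10 give 10 pods, and a metric of 120 gives 12.
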
diff --git a/docs/deployment.yaml b/docs/deployment.yaml
@@ -28,6 +28,7 @@ spec:
         - --prometheus-server=http://prometheus.kube-system.svc.cluster.local
         - --skipper-ingress-metrics
         - --aws-external-metrics
+        - --scaling-schedule
         env:
         - name: AWS_REGION
           value: eu-central-1

diff --git a/docs/helm/Chart.yaml b/docs/helm/Chart.yaml
@@ -1,6 +1,6 @@
 apiVersion: v2
 name: kube-metrics-adapter
-version: 0.1.10
+version: 0.1.11
 description: kube-metrics-adapter helm chart
 home: https://github.com/zalando-incubator/kube-metrics-adapter
 maintainers:

diff --git a/docs/helm/templates/cluster_scaling_schedules_crd.yaml b/docs/helm/templates/cluster_scaling_schedules_crd.yaml
@@ -0,0 +1,119 @@
+{{- if .Values.scalingSchedule.enabled }}
+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  annotations:
+    controller-gen.kubebuilder.io/version: v0.5.0
+  creationTimestamp: null
+  name: clusterscalingschedules.zalando.org
+spec:
+  group: zalando.org
+  names:
+    kind: ClusterScalingSchedule
+    listKind: ClusterScalingScheduleList
+    plural: clusterscalingschedules
+    singular: clusterscalingschedule
+  scope: Cluster
+  versions:
+  - name: v1
+    schema:
+      openAPIV3Schema:
+        description: ClusterScalingSchedule describes a cluster scoped time based
+          metric to be used in autoscaling operations.
+        properties:
+          apiVersion:
+            description: 'APIVersion defines the versioned schema of this representation
+              of an object. Servers should convert recognized schemas to the latest
+              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
+            type: string
+          kind:
+            description: 'Kind is a string value representing the REST resource this
+              object represents. Servers may infer this from the endpoint the client
+              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
+            type: string
+          metadata:
+            type: object
+          spec:
+            description: ScalingScheduleSpec is the spec part of the ScalingSchedule.
+            properties:
+              schedules:
+                description: Schedules is the list of schedules for this ScalingSchedule
+                  resource. All the schedules defined here contribute values to
+                  the same metric. New metrics require a new ScalingSchedule resource.
+                items:
+                  description: Schedule is the schedule details to be used inside
+                    a ScalingSchedule.
+                  properties:
+                    date:
+                      description: Defines the starting date of a OneTime schedule.
+                        It has to be an RFC3339 formatted date.
+                      format: date-time
+                      type: string
+                    durationMinutes:
+                      description: The duration in minutes that the configured value
+                        will be returned for the defined schedule.
+                      type: integer
+                    period:
+                      description: Defines the details of a Repeating schedule.
+                      properties:
+                        days:
+                          description: The days that this schedule will be active.
+                          items:
+                            description: ScheduleDay represents the valid inputs for
+                              days in a SchedulePeriod.
+                            enum:
+                            - Sun
+                            - Mon
+                            - Tue
+                            - Wed
+                            - Thu
+                            - Fri
+                            - Sat
+                            type: string
+                          type: array
+                        startTime:
+                          description: The startTime has the format HH:MM
+                          pattern: (([0-1][0-9])|([2][0-3])):([0-5][0-9])
+                          type: string
+                        timezone:
+                          description: The location name corresponding to a file in
+                            the IANA Time Zone database, like Europe/Berlin.
+                          type: string
+                      required:
+                      - days
+                      - startTime
+                      - timezone
+                      type: object
+                    type:
+                      description: Defines if the schedule is a OneTime schedule or
+                        a Repeating one. If OneTime, date has to be defined. If Repeating,
+                        period has to be defined.
+                      enum:
+                      - OneTime
+                      - Repeating
+                      type: string
+                    value:
+                      description: The metric value that will be returned for the
+                        defined schedule.
+                      type: integer
+                  required:
+                  - durationMinutes
+                  - type
+                  - value
+                  type: object
+                type: array
+            required:
+            - schedules
+            type: object
+        required:
+        - spec
+        type: object
+    served: true
+    storage: true
+status:
+  acceptedNames:
+    kind: ""
+    plural: ""
+  conditions: []
+  storedVersions: []
+{{- end}}
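
The `startTime` field in the CRD above is constrained by a regular expression for the `HH:MM` format. As a rough illustration of what that pattern accepts, the sketch below applies it with Go's `regexp` package. The `^` and `$` anchors are added here as an assumption, because Go's `MatchString` otherwise matches substrings; how the API server applies the pattern is not shown in this commit. `validStartTime` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// startTimePattern copies the CRD pattern; the ^ and $ anchors are an
// addition for this sketch so that the whole string has to match.
var startTimePattern = regexp.MustCompile(`^(([0-1][0-9])|([2][0-3])):([0-5][0-9])$`)

// validStartTime reports whether s is a 24-hour HH:MM time.
func validStartTime(s string) bool {
	return startTimePattern.MatchString(s)
}

func main() {
	for _, s := range []string{"15:45", "24:00", "7:30", "15:451"} {
		fmt.Printf("%q valid=%v\n", s, validStartTime(s))
	}
}
```
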
diff --git a/docs/helm/templates/deployment.yaml b/docs/helm/templates/deployment.yaml
@@ -182,6 +182,9 @@ spec:
             {{- if .Values.zmon.tokenName }}
             - --zmon-token-name={{ .Values.zmon.tokenName }}
             {{- end}}
+            {{- if .Values.scalingSchedule.enabled }}
+            - --scaling-schedule
+            {{- end}}
           resources:
             limits:
               cpu: {{ .Values.resources.limits.cpu }}

diff --git a/docs/helm/templates/rbac.yaml b/docs/helm/templates/rbac.yaml
@@ -73,6 +73,17 @@ rules:
   - get
   - list
   - watch
+{{- if .Values.scalingSchedule.enabled }}
+- apiGroups:
+  - zalando.org
+  resources:
+  - clusterscalingschedules
+  - scalingschedules
+  verbs:
+  - get
+  - list
+  - watch
+{{- end}}
 ---
 apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRoleBinding