Merge pull request #552 from zalando-incubator/hostname-rps-collector
Add hostname RPS metric collector
lucastt committed May 26, 2023
2 parents a103a32 + 35e3fe8 commit c2179a3
Showing 6 changed files with 569 additions and 35 deletions.
58 changes: 58 additions & 0 deletions README.md
@@ -402,6 +402,64 @@ the `backend` label under `matchLabels` for the metric. The ingress annotation
where the backend weights can be obtained can be specified through the flag
`--skipper-backends-annotation`.

## External RPS collector

The External RPS collector, like the Skipper collector, is a simple wrapper around the Prometheus collector that
makes it easy to define an HPA that scales on the RPS measured for a given hostname. When
[skipper](https://github.com/zalando/skipper) is used as the ingress
implementation in your cluster, everything should work automatically. If another reverse proxy is used as ingress, for example [Nginx](https://github.com/kubernetes/ingress-nginx), it is necessary to configure which Prometheus metric should be used through the `--external-rps-metric-name <metric-name>` flag. Assuming `skipper-ingress` is being used, or the appropriate metric name is passed using that flag, this collector provides the correct Prometheus queries out of the
box so users don't have to define them manually.

### Supported metrics

| Metric | Description | Type | Kind | K8s Versions |
| ------------ | -------------- | ------- | -- | -- |
| `requests-per-second` | Scale based on requests per second for a certain hostname. | External | | `>=1.12` |

### Example: External Metric

This is an example of an HPA that scales on `requests-per-second`, using the RPS measured for the hostnames `www.example1.com` and `www.example2.com`, weighted by 42%.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    metric-config.external.example-rps.requests-per-second/hostnames: www.example1.com,www.example2.com
    metric-config.external.example-rps.requests-per-second/weight: "42"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: example-rps
        selector:
          matchLabels:
            type: requests-per-second
      target:
        type: AverageValue
        averageValue: "42"
```
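
The comma-separated hostname list in the annotation above is split and validated before a collector is created. The following is a minimal sketch of that check, not the adapter's actual code path: the helper name `parseHostnames` is hypothetical, while the regular expression is the one compiled in `pkg/collector/external_rps_collector.go` in this commit.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// hostnamePattern uses the same expression that the plugin constructor
// compiles in this commit.
var hostnamePattern = regexp.MustCompile(`^[a-zA-Z0-9.-]+$`)

// parseHostnames is a hypothetical helper: it splits the annotation value on
// commas and rejects any entry that does not match hostnamePattern.
func parseHostnames(value string) ([]string, error) {
	hostnames := strings.Split(value, ",")
	for _, h := range hostnames {
		if !hostnamePattern.MatchString(h) {
			return nil, fmt.Errorf("invalid hostname format: %q", h)
		}
	}
	return hostnames, nil
}

func main() {
	hosts, err := parseHostnames("www.example1.com,www.example2.com")
	fmt.Println(hosts, err)

	// Entries are not trimmed, so a space after the comma is rejected.
	_, err = parseHostnames("www.example1.com, www.example2.com")
	fmt.Println(err)
}
```

Because entries are not trimmed in this sketch (and the collector code in this commit does not trim them either), the annotation value should not contain spaces around the commas.
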
### Multiple hostnames per metric

This metric supports an n:1 relation between hostnames and metrics: the measured RPS is the sum of the RPS rates of all specified hostnames. This value is then scaled by the weight parameter explained below.
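
As a rough sketch of how this summation is expressed, the hostnames are joined into a single regular expression for the `host` label, with dots replaced by underscores, and placed in one `sum(rate(...))` query. The helper `buildQuery` and the metric name `example_requests_total` are illustrative assumptions; the template string is the `ExternalRPSQuery` constant from this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// queryTemplate is copied from the ExternalRPSQuery constant in this commit.
// Its arguments are the metric name, the host regex, and the weight factor.
const queryTemplate = `scalar(sum(rate(%s{host=~"%s"}[1m])) * %.4f)`

// buildQuery is a hypothetical helper that fills the template the same way
// the collector does: hostnames become a regex alternation and every dot is
// replaced by an underscore.
func buildQuery(metricName string, hostnames []string, weight float64) string {
	hostRegex := strings.ReplaceAll(strings.Join(hostnames, "|"), ".", "_")
	return fmt.Sprintf(queryTemplate, metricName, hostRegex, weight)
}

func main() {
	// example_requests_total is a placeholder metric name, not a real default.
	q := buildQuery(
		"example_requests_total",
		[]string{"www.example1.com", "www.example2.com"},
		1.0,
	)
	fmt.Println(q)
}
```

Since all hostnames end up in one label matcher, Prometheus returns a single summed value no matter how many hostnames are listed.
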

### Metric weighting based on backend

Some ingress controllers, such as skipper-ingress, support sending traffic to different backends based on configuration: in skipper's case, annotations
present on the `Ingress` object or weights on the RouteGroup backends. By
default the number of replicas is calculated from the full traffic
served by these components. If only a share of the traffic for the configured
hostname(s) should be used, that share can be specified via the `weight` annotation `metric-config.external.<metric-name>.requests-per-second/weight` for the metric being configured. The weight is a percentage: a value of `42` multiplies the measured RPS by 0.42.
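
The weight arithmetic can be sketched as follows. This is a simplified illustration under stated assumptions: `weightFactor` is a hypothetical helper, and the total of 200 RPS is an invented figure. The parsing and the division by 100 follow the collector code in this commit.

```go
package main

import (
	"fmt"
	"strconv"
)

// weightFactor is a hypothetical helper. It returns 1.0 when no "weight" key
// is configured; otherwise it parses the value as a percentage and converts
// it into a multiplier.
func weightFactor(config map[string]string) (float64, error) {
	w, ok := config["weight"]
	if !ok {
		return 1.0, nil
	}
	num, err := strconv.ParseFloat(w, 64)
	if err != nil {
		return 0, fmt.Errorf("could not parse weight %q: %w", w, err)
	}
	return num / 100.0, nil
}

func main() {
	factor, err := weightFactor(map[string]string{"weight": "42"})
	if err != nil {
		panic(err)
	}
	totalRPS := 200.0 // assumed summed RPS across all configured hostnames
	fmt.Printf("factor=%.4f weighted RPS=%.1f\n", factor, totalRPS*factor)
}
```

With a weight of `"42"` and an assumed total of 200 RPS, the value reported to the HPA would be about 84 RPS.
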


## InfluxDB collector

The InfluxDB collector maps [Flux](https://github.com/influxdata/flux) queries to metrics that can be used for scaling.
128 changes: 128 additions & 0 deletions pkg/collector/external_rps_collector.go
@@ -0,0 +1,128 @@
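package collector

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"time"

	autoscalingv2 "k8s.io/api/autoscaling/v2"
)

const (
	// ExternalRPSMetricType is the metric type handled by this collector.
	ExternalRPSMetricType = "requests-per-second"
	// ExternalRPSQuery is the PromQL template. Its arguments are the metric
	// name, a regex matching the hostnames, and the weight factor.
	ExternalRPSQuery = `scalar(sum(rate(%s{host=~"%s"}[1m])) * %.4f)`
)

// ExternalRPSCollectorPlugin creates collectors that wrap the Prometheus
// collector plugin with a generated per-hostname RPS query.
type ExternalRPSCollectorPlugin struct {
	metricName string
	promPlugin CollectorPlugin
	pattern    *regexp.Regexp
}

// ExternalRPSCollector delegates metric collection to a Prometheus collector.
type ExternalRPSCollector struct {
	interval      time.Duration
	promCollector Collector
}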

// NewExternalRPSCollectorPlugin initializes the plugin with the Prometheus
// plugin to wrap and the name of the metric to query.
func NewExternalRPSCollectorPlugin(
	promPlugin CollectorPlugin,
	metricName string,
) (*ExternalRPSCollectorPlugin, error) {
	if metricName == "" {
		return nil, fmt.Errorf("failed to initialize hostname collector plugin, metric name was not defined")
	}

	p, err := regexp.Compile("^[a-zA-Z0-9.-]+$")
	if err != nil {
		return nil, fmt.Errorf("failed to create regular expression to match hostname format: %w", err)
	}

	return &ExternalRPSCollectorPlugin{
		metricName: metricName,
		promPlugin: promPlugin,
		pattern:    p,
	}, nil
}

// NewCollector initializes a new external RPS collector from the specified HPA.
func (p *ExternalRPSCollectorPlugin) NewCollector(
	hpa *autoscalingv2.HorizontalPodAutoscaler,
	config *MetricConfig,
	interval time.Duration,
) (Collector, error) {
	if config == nil {
		return nil, fmt.Errorf("metric config not present, it is not possible to initialize the collector")
	}
	// Need to copy config and add a promQL query in order to get
	// RPS data from a specific hostname from prometheus. The idea
	// of the copy is to not modify the original config struct.
	confCopy := *config

	if _, ok := config.Config["hostnames"]; !ok {
		return nil, fmt.Errorf("hostname is not specified, unable to create collector")
	}

	hostnames := strings.Split(config.Config["hostnames"], ",")
	if p.pattern == nil {
		return nil, fmt.Errorf("plugin did not specify hostname regex pattern, unable to create collector")
	}
	for _, h := range hostnames {
		if ok := p.pattern.MatchString(h); !ok {
			return nil, fmt.Errorf(
				"invalid hostname format, unable to create collector: %s",
				h,
			)
		}
	}

	// The weight annotation is a percentage; without it the full RPS is used.
	weight := 1.0
	if w, ok := config.Config["weight"]; ok {
		num, err := strconv.ParseFloat(w, 64)
		if err != nil {
			return nil, fmt.Errorf("could not parse weight annotation, unable to create collector: %s", w)
		}
		weight = num / 100.0
	}

	// Join the hostnames into a regex alternation for the host label matcher,
	// replacing every dot with an underscore.
	confCopy.Config = map[string]string{
		"query": fmt.Sprintf(
			ExternalRPSQuery,
			p.metricName,
			strings.ReplaceAll(strings.Join(hostnames, "|"), ".", "_"),
			weight,
		),
	}

	c, err := p.promPlugin.NewCollector(hpa, &confCopy, interval)
	if err != nil {
		return nil, err
	}

	return &ExternalRPSCollector{
		interval:      interval,
		promCollector: c,
	}, nil
}

// GetMetrics gets hostname metrics from Prometheus
func (c *ExternalRPSCollector) GetMetrics() ([]CollectedMetric, error) {
	v, err := c.promCollector.GetMetrics()
	if err != nil {
		return nil, err
	}

	if len(v) != 1 {
		return nil, fmt.Errorf("expected to only get one metric value, got %d", len(v))
	}
	return v, nil
}

// Interval returns the interval at which the collector should run.
func (c *ExternalRPSCollector) Interval() time.Duration {
	return c.interval
}
