This repository has been archived by the owner on Apr 14, 2021. It is now read-only.

Prometheus monitoring

bpicode edited this page Jul 18, 2018 · 2 revisions

The API exposed by fritzctl makes it easy to write an exporter for the Prometheus monitoring system. What follows should be regarded as a proof of concept; the project jayme-github/fritzbox_smarthome_exporter provides a more complete treatment than the one pursued here.

Prerequisites and assumptions

We will need...

  • A FRITZ!Box and a thermostat.
  • docker.
  • docker-compose.
  • A little knowledge of Go and Docker.
  • A little knowledge of Prometheus and Grafana, or at least a working mental model of what they do.

FRITZ!Box exporter

Writing an exporter is fairly easy. Create a file main.go with the following content:

package main

import (
	"log"
	"net/http"
	"strconv"
	"sync"
	"time"

	"github.com/bpicode/fritzctl/fritz"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

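// client bundles the fritzctl home automation API with a mutex, so that
// login refreshes and metric collection never talk to the box concurrently.
type client struct {
	fritz.HomeAuto
	*sync.Mutex
}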

var temperatureGauge = prometheus.NewGauge(
	prometheus.GaugeOpts{
		Namespace: "fritzbox",
		Name:      "temperature",
		Help:      "Temperature measured by my AHA Device",
	},
)

var fritzClient = client{
	HomeAuto: fritz.NewHomeAuto(
		fritz.SkipTLSVerify(),
		fritz.Credentials("", "password"),
	),
	Mutex: &sync.Mutex{},
}

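// fritzCollector implements prometheus.Collector. It queries the FRITZ!Box on
// every scrape instead of serving cached values.
type fritzCollector struct {
}

// Collect fetches the device list and reports the temperature of the first thermostat.
func (fc fritzCollector) Collect(c chan<- prometheus.Metric) {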
	fritzClient.Lock()
	defer fritzClient.Unlock()
	l, err := fritzClient.List()
	if err != nil {
		log.Println("Unable to collect data:", err)
		return
	}
	ts := l.Thermostats()
	if len(ts) == 0 {
		return
	}
	cs := ts[0].Temperature.FmtCelsius()
	tc, err := strconv.ParseFloat(cs, 64)
	if err != nil {
		log.Println("Unable to parse temperature data:", err)
		return
	}
	temperatureGauge.Set(tc)
	c <- temperatureGauge
}

// Describe forwards the descriptor of the temperature gauge.
func (fc fritzCollector) Describe(c chan<- *prometheus.Desc) {
	temperatureGauge.Describe(c)
}

func main() {
	fc := fritzCollector{}
	// Refresh the login session every 10 minutes in the background.
	go func() {
		for {
			fritzClient.Lock()
			err := fritzClient.Login()
			if err != nil {
				log.Println("Login refresh failed:", err)
			}
			fritzClient.Unlock()
			time.Sleep(10 * time.Minute)
		}
	}()

	err := prometheus.Register(fc)
	if err != nil {
		log.Fatalln(err)
	}
	http.Handle("/metrics", promhttp.Handler())

	if err := http.ListenAndServe(":9103", nil); err != nil {
		log.Fatalln(err)
	}
}

This small program does the following:

  • It serves Prometheus metrics at localhost:9103/metrics, using the first HKR device (radiator thermostat) to generate the metric "fritzbox_temperature". Once running, it can be tested with
    curl localhost:9103/metrics
    and should produce output in the form of
    ...
    # HELP fritzbox_temperature Temperature measured by my AHA Device
    # TYPE fritzbox_temperature gauge
    fritzbox_temperature 22.5
    ...
    
  • The login of the exporter is renewed every 10 minutes.
  • The /metrics endpoint returns fresh temperature data by using the AVM Home Automation HTTP Interface.
  • We tacitly assume the default networking configuration; in particular, the FRITZ!Box should respond to requests targeting https://fritz.box:443. This works on most setups; see the fritzctl godoc for other configurations.
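
To check the endpoint programmatically rather than by eye, the sample output shown above can be parsed with a few lines of Go. The following is a minimal sketch, assuming the metric is exported without labels as in this exporter. The helper name gaugeValue is made up for illustration, and a real client would more likely use a proper parser for the Prometheus text format.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// gaugeValue scans Prometheus text exposition output for an unlabeled sample
// of the metric called name and returns its value. The second return value
// reports whether such a sample was found and could be parsed.
func gaugeValue(body, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != name {
			continue // some other metric
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	// Sample taken from the curl output above. In practice the body would
	// come from an HTTP GET against localhost:9103/metrics.
	sample := "# HELP fritzbox_temperature Temperature measured by my AHA Device\n" +
		"# TYPE fritzbox_temperature gauge\n" +
		"fritzbox_temperature 22.5\n"
	if v, ok := gaugeValue(sample, "fritzbox_temperature"); ok {
		fmt.Printf("temperature: %.1f\n", v) // prints "temperature: 22.5"
	}
}
```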

When applying this to your home setup, make sure to replace "password" with your actual FRITZ!Box password.

Containerize the exporter

For convenience, we run the exporter in a Docker container. To this end, create a Dockerfile

FROM golang:1.10.1-alpine

COPY main.go /main.go

RUN apk add --no-cache git
RUN go get -v -u github.com/bpicode/fritzctl/fritz
RUN go get -v -u github.com/prometheus/client_golang/prometheus

EXPOSE 9103

ENTRYPOINT ["go", "run", "/main.go"]

and build the image

docker build -t prom/fritz .

Prepare Prometheus configuration

We provide the Prometheus settings in a file prometheus.yml:

global:
  scrape_interval:     15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'promfritz'
    scrape_interval: 7m
    scrape_timeout: 60s
    static_configs:
      - targets: ['promfritz:9103']

Here we use just one static scrape target. The address promfritz:9103 will be resolvable because we are going to run a dockerized Prometheus and use docker-compose to wire the containers together.

Composing containers

The setup consists of three containers: one for Prometheus, one for Grafana, and one for our promfritz exporter. We need to write a docker-compose.yml:

promfritz:
  image: prom/fritz
  ports:
    - 9103:9103

prometheus:
  image: prom/prometheus
  ports:
    - 9090:9090
  links:
    - promfritz:promfritz
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml

grafana-ui:
  image: grafana/grafana
  ports:
    - 3000:3000
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=secret
  links:
    - prometheus:prometheus

Run

To spin up the containers, run

docker-compose -f docker-compose.yml up

If everything went well, docker ps should display three running instances

docker ps

CONTAINER ID  IMAGE            COMMAND                 CREATED        STATUS        PORTS                   NAMES
3a551d4c2a4b  grafana/grafana  "/run.sh"               5 seconds ago  Up 5 seconds  0.0.0.0:3000->3000/tcp  promfritz_grafana-ui_1
34c6f34935ed  prom/prometheus  "/bin/prometheus -co…"  5 seconds ago  Up 5 seconds  0.0.0.0:9090->9090/tcp  promfritz_prometheus_1
670148a8f080  prom/fritz       "go run /main.go"       5 seconds ago  Up 5 seconds  0.0.0.0:9103->9103/tcp  promfritz_promfritz_1

Also take note of the IP addresses assigned to the containers

docker ps -q | xargs docker inspect --format '{{ .Name }} - {{ .NetworkSettings.IPAddress }}'

/promfritz_grafana-ui_1 - 172.17.0.4
/promfritz_prometheus_1 - 172.17.0.3
/promfritz_promfritz_1 - 172.17.0.2

Configure Grafana

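Visit the Grafana frontend in a browser, in this case 172.17.0.4:3000. Because the compose file publishes port 3000, localhost:3000 on the Docker host should work as well.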

  • The login is 'admin' with password 'secret'.
  • Navigate to "Data Sources" and add a data source of type 'Prometheus' with URL 'http://prometheus:9090' and access 'proxy'.
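
The manual data source step can also be scripted against Grafana's HTTP API, which accepts a POST to /api/datasources. The following is a rough sketch that only builds the request. It assumes Grafana is reachable at http://localhost:3000 and accepts basic auth with the admin credentials from this page. The type dataSource and the function newDataSourceRequest are illustrative names, not part of any library.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// dataSource mirrors the fields entered in the Grafana UI above.
type dataSource struct {
	Name      string `json:"name"`
	Type      string `json:"type"`
	URL       string `json:"url"`
	Access    string `json:"access"`
	IsDefault bool   `json:"isDefault"`
}

// newDataSourceRequest builds (but does not send) the POST request that
// asks Grafana to create the given data source.
func newDataSourceRequest(grafanaURL, user, password string, ds dataSource) (*http.Request, error) {
	body, err := json.Marshal(ds)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, grafanaURL+"/api/datasources", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, password)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	ds := dataSource{
		Name:      "prometheus",
		Type:      "prometheus",
		URL:       "http://prometheus:9090",
		Access:    "proxy",
		IsDefault: true,
	}
	// The base URL is an assumption based on the published port 3000.
	req, err := newDataSourceRequest("http://localhost:3000", "admin", "secret", ds)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	// To actually create the data source, send the request, for example
	// with http.DefaultClient.Do(req), and check the response status.
}
```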

Now we can go on and create fancy dashboards. Here is one example that can be imported:

{
  "__inputs": [
    { "name": "DS_PROMETHEUS", "label": "prometheus", "description": "", "type": "datasource", "pluginId": "prometheus", "pluginName": "Prometheus" }
  ],
  "__requires": [
    { "type": "grafana", "id": "grafana", "name": "Grafana", "version": "5.0.4" },
    { "type": "panel", "id": "graph", "name": "Graph", "version": "5.0.0" },
    { "type": "datasource", "id": "prometheus", "name": "Prometheus", "version": "5.0.0" },
    { "type": "panel", "id": "singlestat", "name": "Singlestat", "version": "5.0.0" }
  ],
  "annotations": { "list": [ { "builtIn": 1, "datasource": "-- Grafana --", "enable": true, "hide": true, "iconColor": "rgba(0, 211, 255, 1)", "name": "Annotations & Alerts", "type": "dashboard" } ] },
  "description": "FRITZ!Box Home Automation",
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": null,
  "links": [],
  "panels": [
    {
      "cacheTimeout": null,
      "colorBackground": false,
      "colorValue": false,
      "colors": [ "#64b0c8", "#7eb26d", "#d44a3a" ],
      "datasource": "${DS_PROMETHEUS}",
      "decimals": 1,
      "description": "Measured Temperature of my HKR Device",
      "format": "celsius",
      "gauge": { "maxValue": 40, "minValue": 0, "show": true, "thresholdLabels": false, "thresholdMarkers": true },
      "gridPos": { "h": 9, "w": 12, "x": 0, "y": 0 },
      "id": 2,
      "interval": null,
      "links": [],
      "mappingType": 1,
      "mappingTypes": [ { "name": "value to text", "value": 1 }, { "name": "range to text", "value": 2 } ],
      "maxDataPoints": 100,
      "nullPointMode": "connected",
      "nullText": null,
      "postfix": "",
      "postfixFontSize": "50%",
      "prefix": "",
      "prefixFontSize": "50%",
      "rangeMaps": [ { "from": "null", "text": "N/A", "to": "null" } ],
      "sparkline": { "fillColor": "rgba(31, 118, 189, 0.18)", "full": false, "lineColor": "rgb(31, 120, 193)", "show": true },
      "tableColumn": "",
      "targets": [ { "expr": "fritzbox_temperature{}", "format": "time_series", "hide": false, "instant": false, "intervalFactor": 1, "legendFormat": "", "refId": "A" } ],
      "thresholds": "15,30",
      "title": "Temperature",
      "transparent": false,
      "type": "singlestat",
      "valueFontSize": "80%",
      "valueMaps": [ { "op": "=", "text": "N/A", "value": "null" } ],
      "valueName": "current"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_PROMETHEUS}",
      "decimals": 2,
      "fill": 1,
      "gridPos": { "h": 9, "w": 12, "x": 12, "y": 0 },
      "id": 4,
      "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "rightSide": false, "show": true, "total": false, "values": true },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [ { "expr": "fritzbox_temperature{}", "format": "time_series", "interval": "", "intervalFactor": 1, "legendFormat": "My Thermostat", "refId": "A" } ],
      "thresholds": [],
      "timeFrom": null,
      "timeShift": null,
      "title": "Temperature over time",
      "tooltip": { "shared": true, "sort": 0, "value_type": "individual" },
      "type": "graph",
      "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] },
      "yaxes": [ { "format": "celsius", "label": "T", "logBase": 1, "max": "30", "min": "0", "show": true }, { "decimals": null, "format": "celsius", "label": "T", "logBase": 1, "max": null, "min": null, "show": true } ]
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_PROMETHEUS}",
      "fill": 1,
      "gridPos": { "h": 9, "w": 24, "x": 0, "y": 9 },
      "id": 6,
      "legend": { "alignAsTable": true, "avg": true, "current": true, "max": true, "min": true, "show": true, "total": false, "values": true },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "percentage": false,
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [ { "expr": "scrape_duration_seconds{instance=\"promfritz:9103\",job=\"promfritz\"}", "format": "time_series", "intervalFactor": 1, "legendFormat": "promfritz", "refId": "A" } ],
      "thresholds": [],
      "timeFrom": null,
      "timeShift": null,
      "title": "Scrape Duration",
      "tooltip": { "shared": true, "sort": 0, "value_type": "individual" },
      "type": "graph",
      "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] },
      "yaxes": [ { "decimals": null, "format": "s", "label": "", "logBase": 1, "max": null, "min": null, "show": true }, { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true } ]
    }
  ],
  "refresh": "5m",
  "schemaVersion": 16,
  "style": "dark",
  "tags": [],
  "templating": { "list": [] },
  "time": { "from": "now-6h", "to": "now" },
  "timepicker": {
    "refresh_intervals": [ "5s", "10s", "30s", "1m", "5m", "15m", "30m", "1h", "2h", "1d" ],
    "time_options": [ "5m", "15m", "1h", "6h", "12h", "24h", "2d", "7d", "30d" ]
  },
  "timezone": "",
  "title": "Home Automation",
  "uid": "8-IIDKMik",
  "version": 18
}

Here is what it looks like after a few hours of operation:

[Screenshot: Grafana dashboard "Home Automation"]

Outlook

This was just a proof of concept; a serious monitoring system would need to extend it:

  • We took several shortcuts in the promfritz exporter.
    • One should generalize it to support multiple devices and serve more metrics. There are a lot of possibilities, see fritzctl godoc.
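    • The hard-coded configuration should be replaced by reading it from a file, ideally with live reloading.
    • Setting the scrape interval to 7 minutes was a best guess. Temperatures did not fluctuate much, so 15 or 30 minutes would probably capture the trend as well. Note, however, that Prometheus by default ignores samples older than 5 minutes when evaluating a query, so long scrape intervals can leave gaps in graphs unless the query lookback delta is raised.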
  • We worked with insecure defaults.
    • No TLS between containers.
    • fritz.SkipTLSVerify() really shouldn't be used.
  • Automate all the things, in particular the datasource and dashboard configuration of Grafana.