mflowd is a small daemon that aggregates custom monitoring metrics collected from Google Dataflow workers that use the metrics-flow library and exposes them to Prometheus.
Google Dataflow workers with metrics-flow plugged in pre-aggregate custom monitoring metrics using pipeline windowing functions and publish
the results to a Google Pub/Sub topic. The daemon polls a subscription to that topic, converts the received metric update events to the Prometheus format, and exposes them through the /metrics endpoint.
+------------+                                                      
| Dataflow 1 +----+                                                 
+------------+    |                                                 
                  |                                                 
+------------+    |                       +------------+            
| Dataflow i |----|->(Google Pub/Sub)---->|   mflowd   |            
+------------+    |                       +------------+            
                  |                              ^                  
+------------+    |                              |                  
| Dataflow N |----+                       +-------------+           
+------------+                            | Prometheus  |           
                                          +-------------+           
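The right-hand side of the diagram (receive metric updates, keep them in memory, render them for Prometheus) can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not mflowd's actual code: the metricUpdate type, the registry type and the "latest value wins" rule are assumptions made for the example, and the Pub/Sub polling loop is replaced by a few hard-coded updates.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"strings"
	"sync"
)

// metricUpdate is a hypothetical, simplified stand-in for one metric update
// event received from Pub/Sub. The real metrics-flow event format differs.
type metricUpdate struct {
	Name  string
	Value float64
}

// registry holds the most recent value seen for each metric name.
// Keeping only the latest value is an assumption made for this sketch.
type registry struct {
	mu     sync.Mutex
	values map[string]float64
}

func newRegistry() *registry {
	return &registry{values: make(map[string]float64)}
}

// apply records a single update, overwriting any previous value.
func (r *registry) apply(u metricUpdate) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.values[u.Name] = u.Value
}

// render returns all metrics as "name value" lines, sorted by name,
// which is the simplest form of the Prometheus text exposition format.
func (r *registry) render() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	names := make([]string, 0, len(r.values))
	for name := range r.values {
		names = append(names, name)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "%s %g\n", name, r.values[name])
	}
	return b.String()
}

// ServeHTTP lets the registry be mounted as the /metrics handler.
func (r *registry) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, r.render())
}

func main() {
	reg := newRegistry()
	// In a real daemon these updates would arrive from the subscription.
	reg.apply(metricUpdate{Name: "records_processed", Value: 40})
	reg.apply(metricUpdate{Name: "jobs_failed", Value: 2})
	reg.apply(metricUpdate{Name: "records_processed", Value: 42})
	fmt.Print(reg.render())
}
```

Running the sketch prints jobs_failed 2 followed by records_processed 42, each on its own line. To serve the same text over HTTP, the registry could be registered with http.Handle("/metrics", reg) before calling http.ListenAndServe on the chosen port.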
% go get github.com/QubitProducts/mflowd
- Make sure you have dep installed (if you don't, see the dep documentation for installation instructions)
- make bootstrap
- make test
- make mflowd
- Create a Pub/Sub topic for publishing metrics from your Dataflow workers (if you don't have one already).
- Create a pull subscription to the topic.
- Make sure you are authorized to use the subscription (if unsure, run gcloud auth login).
- Run the daemon: % ./mflowd [-v] -p <port> -s pubsub <subscription_id>
Where
- port is the port on which the /metrics endpoint will be exposed
- subscription_id is the subscription identifier, which usually looks like projects/<project_name>/subscriptions/<subscription_name>
- the optional -v flag runs the daemon in verbose mode
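A malformed subscription identifier is an easy mistake to make, so it can help to check it before starting the daemon. The helper below is a hypothetical sketch (parseSubscription is not part of mflowd) that accepts only the usual projects/<project_name>/subscriptions/<subscription_name> form described above.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSubscription is a hypothetical helper that splits an identifier of
// the form projects/<project_name>/subscriptions/<subscription_name> into
// its project and subscription names. Anything else is rejected.
func parseSubscription(id string) (project, name string, err error) {
	parts := strings.Split(id, "/")
	if len(parts) != 4 ||
		parts[0] != "projects" || parts[2] != "subscriptions" ||
		parts[1] == "" || parts[3] == "" {
		return "", "", fmt.Errorf("invalid subscription identifier %q", id)
	}
	return parts[1], parts[3], nil
}

func main() {
	// The project and subscription names here are made up for illustration.
	project, name, err := parseSubscription("projects/my-project/subscriptions/mflowd-metrics")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("project=%s subscription=%s\n", project, name)
}
```

The check is deliberately strict; it does not verify that the subscription exists or that you are authorized to use it.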
You can easily build a containerized version of mflowd and run it on Mesos or Kubernetes.
% make docker
% docker images | grep mflowd
mflowd                                             latest                                            e9cbac93f703
...
Before you can run the image, you need to set up a Google Cloud API service account that allows mflowd to use the subscription you have created:
- Create a service account for mflowd.
- Create an empty directory on your host machine (say, % mkdir ~/.mflowd).
- Download the service account key in JSON format and put it in the created directory.
- Finally, run the container: % docker run -e "MFLOWD_SUB=<subscription_id>" -v $HOME/.mflowd:/etc/mflowd 'mflowd:latest'
You can also run both mflowd and prometheus docker images using docker-compose:
% cd ~/go/src/github.com/QubitProducts/mflowd
% mkdir gcp
# download your service account JSON key to gcp directory
% cat > .env
MFLOWD_SUB=<subscription_id>
MFLOWD_VERBOSE=0 # set to 1 to turn verbose mode on
^D
% docker-compose up
Open http://localhost:9090 to reach the Prometheus UI.