Kubernetes Implementation
Maintainer: @lwander
Note - this is meant to serve as a technical guide to the Kubernetes implementation. A more general walkthrough can (soon) be found on spinnaker.io.
The provider specification is as follows:
kubernetes:
  enabled: # boolean indicating whether or not to use kubernetes as a provider
  accounts: # list of kubernetes accounts
    - name: # required unique name for this account
      kubeconfigFile: # optional location of the kube config file
      namespaces: # optional list of namespaces to manage
      user: # optional user to authenticate as that must exist in the provided kube config file
      cluster: # optional cluster to connect to that must exist in the provided kube config file
      dockerRegistries: # required (at least 1) docker registry accounts used as a source of images
        - accountName: # required name of the docker registry account
          namespaces: # optional list of namespaces this docker registry can deploy to
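As a concrete illustration, a filled-in account block might look like the sketch below; the account and registry names are hypothetical.

```yaml
kubernetes:
  enabled: true
  accounts:
    - name: my-k8s-account                 # hypothetical account name
      kubeconfigFile: /home/spinnaker/.kube/config
      namespaces:
        - default
        - staging
      dockerRegistries:
        - accountName: my-docker-registry  # must refer to a configured docker registry account
          namespaces:
            - default                      # limit this registry's image pull secret to "default"
```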
Authentication is handled by the Clouddriver microservice; it was introduced in #214 and refined in clouddriver/pull#335.
The Kubernetes provider authenticates with any valid Kubernetes cluster using details found in a provided kubeconfig file. By default, the kubeconfig file at ~/.kube/config is used, unless the field kubeconfigFile is specified. The user, cluster, and singleton namespace are derived from the current-context field in the kubeconfig file, unless their respective fields are provided. If no namespace is found in either namespaces or in the current-context field of the kubeconfig file, then the value ["default"] is used. Any namespaces that do not exist will be created.
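For reference, a minimal kubeconfig file providing a current-context entry might look like the following sketch; the cluster, user, and context names are illustrative:

```yaml
apiVersion: v1
kind: Config
clusters:
  - name: my-cluster                # hypothetical cluster name
    cluster:
      server: https://10.0.0.1      # API server address
users:
  - name: my-user                   # hypothetical user name
    user:
      token: <redacted>
contexts:
  - name: my-context
    context:
      cluster: my-cluster
      user: my-user
      namespace: default            # used when no namespaces are configured explicitly
current-context: my-context         # user, cluster, and namespace are derived from here
```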
The Docker Registry accounts referred to by the above configuration are also configured inside Clouddriver. The details of that implementation can be found here. The Docker authentication details (username, password, email, endpoint address) are read from each listed Docker Registry account and configured as an image pull secret, implemented in clouddriver/pull#285. The namespaces field of the dockerRegistries subblock defaults to the full list of namespaces, and is used by the Kubernetes provider to determine which namespaces to register the image pull secrets with. Every created pod is given the full list of image pull secrets available to its containing namespace.
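To illustrate, the registered image pull secrets end up referenced by each created pod roughly as in the sketch below; the secret, pod, and image names are hypothetical, while imagePullSecrets is the standard Kubernetes field for this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                  # hypothetical pod created in this namespace
  namespace: default
spec:
  imagePullSecrets:
    - name: my-docker-registry       # one entry per registry secret registered in the namespace
  containers:
    - name: app
      image: my-registry.example.com/app:1.0
```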
The Kubernetes provider will periodically (every 30 seconds) attempt to fetch every provided namespace to see if the cluster is still reachable.
Spinnaker Server Groups are Kubernetes Replication Controllers. This is a straightforward mapping since both represent sets of managed, identical, immutable computing resources. However, there are a few caveats:
- Replication Controllers manage Pods, which, unlike VMs, can house multiple container images with the promise that all images in a Pod will be collocated. Note that the intent here is not to place all of your application's containers into a single Pod, but to collocate containers that form a logical unit and benefit from sharing resources. Design patterns and a more thorough explanation can be found here.
- Each Pod is in charge of managing its own health checks, as opposed to the typical Spinnaker pattern of having health checks performed by Load Balancers. The ability to add these to Replication Controllers was added in clouddriver/pull#359. Both caveats are sketched below.
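As a rough illustration of both caveats, a Replication Controller backing a server group might look like the following; the names, images, and probe endpoint are illustrative rather than what the provider actually generates:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: myapp-prod-v000                  # hypothetical Spinnaker-style server group name
spec:
  replicas: 2
  selector:
    app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: web                      # two collocated containers forming one logical unit
          image: my-registry.example.com/web:1.0
          livenessProbe:                 # health checks belong to the pod, not the load balancer
            httpGet:
              path: /healthz
              port: 8080
        - name: log-forwarder
          image: my-registry.example.com/log-forwarder:1.0
```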
Below are the server group operations and their implementations.
This operation creates a Replication Controller with the specified containers and their respective configurations.
- Clouddriver component: clouddriver/pull#227.
- Deck components:
  - Ad-hoc creation: deck/pull#1881.
  - Pipeline deploy stage: deck/pull#2015.
  - Pipeline find image stage: deck/pull#2025.
This operation takes a source Replication Controller as an argument and creates a copy of it, overriding any attributes with the values provided in the request.
- Clouddriver component: clouddriver/pull#245.
- Deck component: deck/pull#1950.
This stage takes a source Replication Controller and a target size (which can be 0), and attempts to resize the given Replication Controller to that size (see the sketch below).
- Clouddriver component: clouddriver/pull#361.
- Deck components:
  - Ad-hoc & pipeline stage: deck/pull#2058.
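Under the hood, resizing a server group roughly amounts to setting the replicas count on the backing Replication Controller, as in this sketch (the name is hypothetical):

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: myapp-prod-v000   # hypothetical server group being resized
spec:
  replicas: 0             # the target size requested by the resize stage; 0 quiesces the group
```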
These stages take a source Replication Controller and either enable or disable traffic to it through its associated Services. The way the association with Services is maintained is explained in more detail in the Load Balancers section below.
- Clouddriver component: clouddriver/pull#383.
- Deck component: Coming Q1 2016.
Coming Q1 2016
Coming Q2 2016 - Will be implemented as Horizontal Pod Autoscalers.
In Spinnaker, Load Balancers are durable units of infrastructure used as the entry point to a set of instances. The Service resource serves a similar function in a Kubernetes cluster, in addition to providing extra features such as service discovery. For this reason, Kubernetes Services are Spinnaker Load Balancers.
Services forward traffic to any pods that have labels matching their label selector. More information on labels can be found here. Since Spinnaker allows an M:N relationship between instances and load balancers, we roughly assign labels and selectors like so:
service:
  name: service-a
  selectors:
    - load-balancer-service-a: true # bound to pod-x, pod-y

service:
  name: service-b
  selectors:
    - load-balancer-service-b: true # bound to pod-x

pod:
  name: pod-x
  labels:
    - load-balancer-service-a: true # bound to service-a
    - load-balancer-service-b: true # bound to service-b

pod:
  name: pod-y
  labels:
    - load-balancer-service-a: true # bound to service-a

pod:
  name: pod-z
  labels:
    - load-balancer-service-b: false # bound to no services
In the above example, it is clear how an M:N relationship between Services and Pods exists. Furthermore, pod-z may not be serving traffic, but it can be re-enabled by changing the value of its label to true.
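Expressed as actual Kubernetes manifests, the service-a / pod-x pairing above would look roughly like the sketch below; the ports and image are illustrative, and real objects created by the provider carry additional fields:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: service-a
spec:
  selector:
    load-balancer-service-a: "true"   # matches any pod carrying this label
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-x
  labels:
    load-balancer-service-a: "true"   # selected by service-a
    load-balancer-service-b: "true"   # selected by service-b
spec:
  containers:
    - name: app
      image: my-registry.example.com/app:1.0
```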
Below are the load balancer operations and their implementations.
Upsert either creates a new load balancer or updates an existing one.
- Clouddriver component: clouddriver/pull#307.
- Deck component: deck/pull#1986.
Coming Q1 2016
Coming Q1 2016
Clouddriver is in charge of caching all infrastructure for the accounts it manages. In order to scale Clouddriver while sharing the caching work, it is recommended to use the Netflix CATS API.
It allows the consumer of the API to define a number of "caching agents", each identifiable by some ID string. Every 30 seconds, each agent independently tries to acquire a lock under its ID. The agent that gets the lock caches the resources it has declared it is in charge of, and then releases the lock, signifying that no other agent with that ID needs to run until the next cycle.
The agent must also declare, for each resource type (instance, server group, etc.), whether it is INFORMATIVE or AUTHORITATIVE for that type. Agents marked as INFORMATIVE for some type only have visibility over a subset of that type's resources. For example, a load balancer caching agent is only aware of (and only caches) the instances attached to load balancers, so the caching framework should not flush instances from the cache simply because the load balancer caching agent no longer reports them (an instance may have been removed from a load balancer even though it still exists). So while a load balancer caching agent is INFORMATIVE for the instance type, it is AUTHORITATIVE for the load balancer type: it should report every load balancer for the account it is associated with, and any load balancer it no longer reports can be flushed from the cache.
The initial caching work for Kubernetes was implemented here:
- Instances, server groups, applications, clusters: clouddriver/pull#276.
- Load balancers: clouddriver/pull#312.
In order to allow parts of the cache to be updated on demand (it is nice to see a server group show up shortly after creation), additional work was done to support safe on-demand cache updates in clouddriver/pull#290. This alleviates the following race condition:
1. loadData() starts and retrieves the state of resource R at time t0.
2. onDemand(R) starts at time t1 and immediately writes the state of resource R.
3. loadData() finishes and writes the state of resource R as of t0, clobbering the newer onDemand write (bad!).
Since loadData() is in charge of reading and storing far more data than a single onDemand(R) update, this is a pretty common race condition.
Deck is in charge of presenting the data retrieved from the cache, and the relevant work can be found here:
- Instance details: deck/pull#1956.
- Server group details: deck/pull#1942.
- Load balancer details: deck/pull#1986.