Kubernetes Implementation
Maintainer: @lwander
Note - this is meant to serve as a technical guide to the Kubernetes implementation. A more general walkthrough can (soon) be found on spinnaker.io.
The provider specification is as follows:
kubernetes:
  enabled: # boolean indicating whether or not to use kubernetes as a provider
  accounts: # list of kubernetes accounts
    - name: # required unique name for this account
      kubeconfigFile: # optional location of the kube config file
      namespaces: # optional list of namespaces to manage
      user: # optional user to authenticate as that must exist in the provided kube config file
      cluster: # optional cluster to connect to that must exist in the provided kube config file
      dockerRegistries: # required (at least 1) docker registry accounts used as a source of images
        - accountName: # required name of the docker registry account
          namespaces: # optional list of namespaces this docker registry can deploy to
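As a concrete illustration, a filled-in account block might look like the sketch below; the account and registry names are hypothetical.

```yaml
kubernetes:
  enabled: true
  accounts:
    - name: my-k8s-account                 # hypothetical account name
      kubeconfigFile: /home/spinnaker/.kube/config
      namespaces:
        - default
        - staging
      dockerRegistries:
        - accountName: my-docker-registry  # must refer to a configured docker registry account
          namespaces:
            - default                      # limit this registry's image pull secret to "default"
```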
Authentication is handled by the Clouddriver microservice; it was introduced in #214 and refined in clouddriver/pull#335.
The Kubernetes provider authenticates with any valid Kubernetes cluster using details found in a provided kubeconfig file. By default, the kubeconfig file at ~/.kube/config is used, unless the field kubeconfigFile is specified. The user, cluster, and singleton namespace are derived from the current-context field in the kubeconfig file, unless their respective fields are provided. If no namespace is found in either namespaces or in the current-context field of the kubeconfig file, then the value ["default"] is used. Any namespaces that do not exist will be created.
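For reference, a minimal kubeconfig file providing a current-context entry might look like the following sketch; the cluster, user, and context names are illustrative:

```yaml
apiVersion: v1
kind: Config
clusters:
  - name: my-cluster                # hypothetical cluster name
    cluster:
      server: https://10.0.0.1      # API server address
users:
  - name: my-user                   # hypothetical user name
    user:
      token: <redacted>
contexts:
  - name: my-context
    context:
      cluster: my-cluster
      user: my-user
      namespace: default            # used when no namespaces are configured explicitly
current-context: my-context         # user, cluster, and namespace are derived from here
```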
The Docker Registry accounts referred to by the above configuration are also configured inside Clouddriver. The details of that implementation can be found here. The Docker authentication details (username, password, email, endpoint address) are read from each listed Docker Registry account and configured as an image pull secret, implemented in clouddriver/pull#285. The namespaces field of the dockerRegistries subblock defaults to the full list of namespaces, and is used by the Kubernetes provider to determine which namespaces to register the image pull secrets with. Every created pod is given the full list of image pull secrets available to its containing namespace.
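To illustrate, the registered image pull secrets end up referenced by each created pod roughly as in the sketch below; the secret, pod, and image names are hypothetical, while imagePullSecrets is the standard Kubernetes field for this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                  # hypothetical pod created in this namespace
  namespace: default
spec:
  imagePullSecrets:
    - name: my-docker-registry       # one entry per registry secret registered in the namespace
  containers:
    - name: app
      image: my-registry.example.com/app:1.0
```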
The Kubernetes provider will periodically (every 30 seconds) attempt to fetch every provided namespace to see if the cluster is still reachable.
Spinnaker Server Groups are Kubernetes Replication Controllers. This is a straightforward mapping since both represent sets of managed, identical, immutable computing resources. However, there are a few caveats:
- Replication Controllers manage Pods, which, unlike VMs, can house multiple container images with the promise that all images in a Pod will be collocated. Note that the intent here is not to place all of your application's containers into a single Pod, but to collocate containers that form a logical unit and benefit from sharing resources. Design patterns and a more thorough explanation can be found here.
- Each Pod is in charge of managing its own health checks, as opposed to the typical Spinnaker pattern of having health checks performed by Load Balancers. The ability to add these to Replication Controllers was added in clouddriver/pull#359. Both caveats are sketched below.
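As a rough illustration of both caveats, a Replication Controller backing a server group might look like the following; the names, images, and probe endpoint are illustrative rather than what the provider actually generates:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: myapp-prod-v000                  # hypothetical Spinnaker-style server group name
spec:
  replicas: 2
  selector:
    app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: web                      # two collocated containers forming one logical unit
          image: my-registry.example.com/web:1.0
          livenessProbe:                 # health checks belong to the pod, not the load balancer
            httpGet:
              path: /healthz
              port: 8080
        - name: log-forwarder
          image: my-registry.example.com/log-forwarder:1.0
```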
Below are the server group operations and their implementations.
This operation creates a Replication Controller with the specified containers and their respective configurations.
- Clouddriver component: clouddriver/pull#227.
- Deck components:
  - Ad-hoc creation: deck/pull#1881.
  - Pipeline deploy stage: deck/pull#2015.
  - Pipeline find image stage: deck/pull#2025.
This operation takes a source Replication Controller as an argument and creates a copy of it, overriding any attributes with the values provided in the request.
- Clouddriver component: clouddriver/pull#245.
- Deck component: deck/pull#1950.
This stage takes a source Replication Controller and a target size (which can be 0), and attempts to resize the given Replication Controller to that size (see the sketch below).
- Clouddriver component: clouddriver/pull#361.
- Deck components:
  - Ad-hoc & pipeline stage: deck/pull#2058.
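Under the hood, resizing a server group roughly amounts to setting the replicas count on the backing Replication Controller, as in this sketch (the name is hypothetical):

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: myapp-prod-v000   # hypothetical server group being resized
spec:
  replicas: 0             # the target size requested by the resize stage; 0 quiesces the group
```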
These stages take a source Replication Controller and either enable or disable traffic to it through its associated Services. The way the association with Services is maintained is explained in more detail in the Load Balancers section below.
- Clouddriver component: clouddriver/pull#383.
- Deck component: Coming Q1 2016.
Coming Q1 2016
Coming Q2 2016 - Will be implemented as Horizontal Pod Autoscalers.
In Spinnaker, Load Balancers are durable units of infrastructure used as the entry point to a set of instances. The Service resource serves a similar function in a Kubernetes cluster, in addition to providing extra features such as service discovery. For this reason, Kubernetes Services are Spinnaker Load Balancers.
Services forward traffic to any pods that have labels matching their label selector. More information on labels can be found here. Since Spinnaker allows an M:N relationship between instances and load balancers, we roughly assign labels and selectors like so:
service:
  name: service-a
  selectors:
    - load-balancer-service-a: true # bound to pod-x, pod-y

service:
  name: service-b
  selectors:
    - load-balancer-service-b: true # bound to pod-x

pod:
  name: pod-x
  labels:
    - load-balancer-service-a: true # bound to service-a
    - load-balancer-service-b: true # bound to service-b

pod:
  name: pod-y
  labels:
    - load-balancer-service-a: true # bound to service-a

pod:
  name: pod-z
  labels:
    - load-balancer-service-b: false # bound to no services
In the above example, it is clear how an M:N relationship between Services and Pods exists. Furthermore, pod-z may not be serving traffic, but it can be re-enabled by changing the value of its label to true.
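Expressed as actual Kubernetes manifests, the service-a / pod-x pairing above would look roughly like the sketch below; the ports and image are illustrative, and real objects created by the provider carry additional fields:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: service-a
spec:
  selector:
    load-balancer-service-a: "true"   # matches any pod carrying this label
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-x
  labels:
    load-balancer-service-a: "true"   # selected by service-a
    load-balancer-service-b: "true"   # selected by service-b
spec:
  containers:
    - name: app
      image: my-registry.example.com/app:1.0
```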
Below are the load balancer operations and their implementations.
Upsert either creates a new load balancer or updates an existing one.
- Clouddriver component: clouddriver/pull#307.
- Deck component: deck/pull#1986.
Coming Q1 2016
Coming Q1 2016
Clouddriver is in charge of caching all infrastructure for the accounts it manages. In order to scale Clouddriver while sharing the caching work, it is recommended to use the Netflix CATS API.
It allows the consumer of the API to define a number of "caching agents", each identifiable by some ID string. Every 30 seconds, each agent independently tries to acquire a lock under its ID. The agent that gets the lock caches the resources it has declared it is in charge of, and then releases the lock, signifying that no other agent with that ID needs to run until the next cycle.
The agent must also declare, for each resource type (instance, server group, etc.), whether it is INFORMATIVE or AUTHORITATIVE for that type. Agents marked as INFORMATIVE for some type only have visibility over a subset of that type's resources. For example, a load balancer caching agent is only aware of (and only caches) the instances attached to load balancers, so the caching framework should not flush instances from the cache simply because the load balancer caching agent no longer reports them (an instance may have been removed from a load balancer even though it still exists). So while a load balancer caching agent is INFORMATIVE for the instance type, it is AUTHORITATIVE for the load balancer type: it should report every load balancer for the account it is associated with, and any load balancer it no longer reports can be flushed from the cache.
The initial caching work for Kubernetes was implemented here:
- Instances, server groups, applications, clusters: clouddriver/pull#276.
- Load balancers: clouddriver/pull#312.
In order to allow parts of the cache to be updated on demand (it is nice to see a server group show up shortly after creation), additional work was done to support safe on-demand cache updates in clouddriver/pull#290. This alleviates the following race condition:
1. loadData() starts and retrieves the state of resource R at time t0.
2. onDemand(R) starts at time t1 and immediately writes the state of resource R.
3. loadData() finishes and writes the state of resource R as of t0, clobbering the newer onDemand write (bad!).
Since loadData() is in charge of reading and storing far more data than a single onDemand(R) update, this is a pretty common race condition.
Deck is in charge of presenting the data retrieved from the cache, and the relevant work can be found here:
- Instance details: deck/pull#1956.
- Server group details: deck/pull#1942.
- Load balancer details: deck/pull#1986.