Merge branch 'main' into prs/replace-helm-bin
Signed-off-by: Tobias Wolf <[email protected]>
NotTheEvilOne committed Jan 20, 2024
2 parents 9268403 + a04233f commit a8aceaa
Showing 209 changed files with 21,155 additions and 141 deletions.
9 changes: 9 additions & 0 deletions .github/workflows/test.yml
@@ -56,3 +56,12 @@ jobs:
        env:
          GO111MODULE: "on"
        run: make test-unit

      - name: Running integration tests workloadcluster
        env:
          GIT_PROVIDER: github
          GIT_ORG_NAME: SovereignCloudStack
          GIT_REPOSITORY_NAME: cluster-stacks
          GO111MODULE: "on"
          GIT_ACCESS_TOKEN: ${{ secrets.GIT_ACCESS_TOKEN }}
        run: make test-integration-workloadcluster
16 changes: 13 additions & 3 deletions Makefile
@@ -313,17 +313,27 @@ $(WORKER_CLUSTER_KUBECONFIG):

KUBEBUILDER_ASSETS ?= $(shell $(SETUP_ENVTEST) use --use-env --bin-dir $(abspath $(TOOLS_BIN_DIR)) -p path $(KUBEBUILDER_ENVTEST_KUBERNETES_VERSION))

.PHONY: test-integration
test-integration: test-integration-workloadcluster test-integration-github
	echo done

.PHONY: test-unit
test-unit: $(SETUP_ENVTEST) $(GOTESTSUM) ## Run unit
	@mkdir -p $(shell pwd)/.coverage
	KUBEBUILDER_ASSETS="$(KUBEBUILDER_ASSETS)" $(GOTESTSUM) --junitfile=.coverage/junit.xml --format testname -- -mod=vendor \
	CREATE_KIND_CLUSTER=true KUBEBUILDER_ASSETS="$(KUBEBUILDER_ASSETS)" $(GOTESTSUM) --junitfile=.coverage/junit.xml --format testname -- -mod=vendor \
		-covermode=atomic -coverprofile=.coverage/cover.out -p=4 ./internal/controller/...

.PHONY: test-integration-workloadcluster
test-integration-workloadcluster: $(SETUP_ENVTEST) $(GOTESTSUM)
	@mkdir -p $(shell pwd)/.coverage
	CREATE_KIND_CLUSTER=true KUBEBUILDER_ASSETS="$(KUBEBUILDER_ASSETS)" $(GOTESTSUM) --junitfile=.coverage/junit.xml --format testname -- -mod=vendor \
		-covermode=atomic -coverprofile=.coverage/cover.out -p=1 ./internal/test/integration/workloadcluster/...

.PHONY: test-integration-github
test-integration-github: $(SETUP_ENVTEST) $(GOTESTSUM)
	@mkdir -p $(shell pwd)/.coverage
	KUBEBUILDER_ASSETS="$(KUBEBUILDER_ASSETS)" $(GOTESTSUM) --junitfile=../.coverage/junit.xml --format testname -- -mod=vendor \
		-covermode=atomic -coverprofile=../.coverage/cover.out -p=1 ./internal/test/integration/github/...
	CREATE_KIND_CLUSTER=false KUBEBUILDER_ASSETS="$(KUBEBUILDER_ASSETS)" $(GOTESTSUM) --junitfile=.coverage/junit.xml --format testname -- -mod=vendor \
		-covermode=atomic -coverprofile=.coverage/cover.out -p=1 ./internal/test/integration/github/...

##@ Verify
##########
83 changes: 12 additions & 71 deletions README.md
@@ -8,86 +8,27 @@ The operator can be used with any repository that contains releases of cluster s

To try out this operator and cluster stacks, have a look at this [demo](https://github.com/SovereignCloudStack/cluster-stacks-demo).

## What is the Cluster Stack Operator?
## Why Cluster Stacks?

The Cluster Stack Operator automates the manual work that needs to be done to use cluster stacks.
Kubernetes and Cluster API enable self-service Kubernetes. But do they take care of everything? No! Each tool solves one specific purpose perfectly and leaves other tasks out of scope.

There are three components of a cluster stack:
Therefore, a user has to answer questions like these: How do I get node images? How can I manage core cluster components (e.g. CCM, CNI)? How can I safely and efficiently upgrade Kubernetes clusters?

1. Cluster addons: The cluster addons (CNI, CSI, CCM) have to be applied in each workload cluster that the user starts.
2. Cluster API objects: The `ClusterClass` object makes it easier to use Cluster API. The cluster stack contains a `ClusterClass` object and other Cluster API objects that are necessary in order to use the `ClusterClass`. These objects have to be applied in the management cluster.
3. Node images: Node images can be provided to the user in different forms. They are released and tested together with the other two components of the cluster stack.
The Cluster Stacks answer these questions by working hand in hand with Cluster API to facilitate self-service Kubernetes. They provide a framework and tools for managing a fully open-source, self-service Kubernetes infrastructure with ease, and they integrate seamlessly into the Cluster API ecosystem.

The first two are handled by this operator. The node images, on the other hand, have to be handled by separate provider integrations, similar to the ones that [Cluster-API uses](https://cluster-api.sigs.k8s.io/developer/providers/implementers-guide/overview).
The Cluster Stack Operator enables an “Infrastructure as Software” approach for managing Kubernetes clusters in self-service.

## Implementing a provider integration
The Cluster Stacks are very generic and can be adapted to many use cases.

Further information and documentation on how to implement a provider integration will follow soon.
### Are Cluster Stacks relevant to you?

## Developing Cluster Stack Operator
Are you interested in setting up Kubernetes in your company based on open-source software? Would you rather own your Kubernetes clusters than rely on external providers? Do you want to manage Kubernetes clusters for others? Do you plan on using Cluster API?

Developing our operator is quite easy. First, you need to install some base requirements: Docker and Go. Second, you need to configure your environment variables. Then you can start developing with the local Kind cluster and use the Tilt UI to create a pre-configured workload cluster.
In all of these cases, the Cluster Stacks are for you!

## Setting Tilt up
1. Install Docker and Go. We expect you to run on a Linux OS.
2. Create an `.envrc` file and specify the values you need. See the `.envrc.sample` file for details.
They make it easy to build a self-service Kubernetes infrastructure for internal use, as well as to create a Managed Kubernetes offering.

## Developing with Tilt

<p align="center">
<img alt="tilt" src="./docs/pics/tilt.png" width=800px/>
</p>
## Further documentation

Operator development requires a lot of iteration, and the “build, tag, push, update deployment” workflow can be very tedious. Tilt makes this process much simpler by watching for updates and automatically building and deploying them. To build a kind cluster and to start Tilt, run:

```shell
make tilt-up
```
> To access the Tilt UI, go to `http://localhost:10350`.

You should make sure that everything in the UI looks green. If it doesn't, e.g. if the clusterstack has not been synced, you can trigger the Tilt workflow again. For the clusterstack button this might be necessary, as the resource cannot be applied right after cluster startup and fails at first; unfortunately, Tilt does not include a waiting period.

If everything is green, you can check on the clusterstack that has been deployed. You can use a tool like k9s to have a look at the management cluster and its custom resources.

If your clusterstack shows that it is ready, you can deploy a workload cluster. This can be done through the Tilt UI by pressing the "Create Workload Cluster" button in the top right corner. This triggers the `make create-workload-cluster-docker` target, which uses the environment variables and the cluster-template.

If you change some code, Tilt triggers on save and updates the container of the operator automatically.

If you want to change something in your ClusterStack or Cluster custom resources, you can have a look at `.cluster.yaml` and `.clusterstack.yaml`, which Tilt uses.

To tear down the workload cluster, press the "Delete Workload Cluster" button. After a few minutes the resources should be deleted.

To tear down the kind cluster, use:

```shell
make delete-bootstrap-cluster
```

If you have trouble finding the right command, use `make help` to get a list of all available make targets.

## Troubleshooting

Check the latest events:

```shell
kubectl get events -A --sort-by=.lastTimestamp
```

Check the conditions:

```shell
go run github.com/guettli/check-conditions@latest all
```

Check with `clusterctl`:

```shell
clusterctl describe cluster -n cluster my-cluster
```

Check the logs. List the logs of all deployments, showing the last ten minutes:

```shell
kubectl get deployment -A --no-headers | while read -r ns d _; do echo; echo "====== $ns $d"; kubectl logs --since=10m -n "$ns" "deployment/$d"; done
```
Please have a look at our [docs](docs/README.md) to find more information about the architecture, how to get started, how to develop this operator or provider integrations, and much more.
2 changes: 1 addition & 1 deletion api/v1alpha1/clusterstack_types.go
@@ -39,7 +39,7 @@ type ClusterStackSpec struct {

	// Channel specifies the release channel of the cluster stack. Defaults to 'stable'.
	// +kubebuilder:default:=stable
	// +kubebuilder:validation:enum=stable;alpha;beta;rc
	// +kubebuilder:validation:Enum=stable;custom
	Channel version.Channel `json:"channel,omitempty"`

	// Versions is a list of versions of the cluster stack that should be available in the management cluster.
3 changes: 3 additions & 0 deletions config/crd/bases/clusterstack.x-k8s.io_clusterstacks.yaml
@@ -75,6 +75,9 @@ spec:
                default: stable
                description: Channel specifies the release channel of the cluster
                  stack. Defaults to 'stable'.
                enum:
                - stable
                - custom
                type: string
              kubernetesVersion:
                description: KubernetesVersion is the Kubernetes version in the format
2 changes: 1 addition & 1 deletion config/cso/cluster.yaml
@@ -11,7 +11,7 @@ spec:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: "cluster.local"
  topology:
    class: docker-ferrol-1-27-v1
    class: docker-ferrol-1-27-v2
    controlPlane:
      metadata: {}
      replicas: 1
2 changes: 1 addition & 1 deletion config/cso/clusterstack.yaml
@@ -11,4 +11,4 @@ spec:
  autoSubscribe: false
  noProvider: true
  versions:
  - v1
  - v2
31 changes: 31 additions & 0 deletions docs/README.md
@@ -0,0 +1,31 @@
# Documentation Index

## General
- [Concept](concept.md)
- [Terminology](terminology.md)

## Quickstart
- [Quickstart](topics/quickstart.md)
- [Cluster API quick start](https://cluster-api.sigs.k8s.io/user/quick-start.html)

## Architecture
- [Overview](architecture/overview.md)
- [User flow](architecture/user-flow.md)
- [Workflow - Node images](architecture/node-image-flow.md)
- [Workflow - Management Cluster](architecture/mgt-cluster-flow.md)
- [Workflow - Workload Cluster](architecture/workload-cluster-flow.md)

## Topics
- [Managing ClusterStack resources](topics/managing-clusterstacks.md)
- [Upgrade flow](topics/upgrade-flow.md)
- [Troubleshooting](topics/troubleshoot.md)

## Developing
- [Development guide](develop/develop.md)
- [Develop provider integrations](develop/provider-integration.md)

## Reference
- [General](reference/README.md)
- [ClusterStack](reference/clusterstack.md)
- [ClusterStackRelease](reference/clusterstackrelease.md)
- [ClusterAddon](reference/clusteraddon.md)
11 changes: 11 additions & 0 deletions docs/architecture/mgt-cluster-flow.md
@@ -0,0 +1,11 @@
# Management Cluster flow

In a Cluster API management cluster, the Cluster API operators run. In our management cluster, the Cluster Stack operators run as well.

The user controls workload clusters via custom resources. As the Cluster Stack approach uses `ClusterClasses`, the user has to create only a `Cluster` object and refer to a `ClusterClass`.

However, for this to work, the `ClusterClass` has to be applied, as well as all other Cluster API objects referenced by it, such as `MachineTemplates`.

These Cluster API objects are packaged in a Helm chart that is part of every cluster stack. The clusterstackrelease-controller is responsible for applying this Helm chart, which is done by first calling `helm template` and then the "apply" method of the Kubernetes go-client.

The main resource is always the `ClusterClass`. It follows a very specific naming pattern and is named exactly like the `ClusterStackRelease` object that manages it, for example `docker-ferrol-1-27-v1`, which encodes all defining properties of a specific release of a cluster stack for a certain provider.
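
As an illustration, the shared name might appear like this (a sketch; the naming pattern and the `ClusterStackRelease` kind come from this page, while the `apiVersion` is an assumption based on the CRD group shown above):

```yaml
# The ClusterStackRelease and the ClusterClass it manages share one name.
apiVersion: clusterstack.x-k8s.io/v1alpha1   # assumption: same group/version as ClusterStack
kind: ClusterStackRelease
metadata:
  name: docker-ferrol-1-27-v1   # <provider>-<cluster stack>-<k8s minor version>-<release>
```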
4 changes: 4 additions & 0 deletions docs/architecture/node-image-flow.md
@@ -0,0 +1,4 @@
# Node image flow

The node image flow depends on each provider. There are various ways in which providers allow the use of custom images. We have documented the options in the [cluster stacks repo](https://github.com/SovereignCloudStack/cluster-stacks#film_strip-node-images).

65 changes: 65 additions & 0 deletions docs/architecture/overview.md
@@ -0,0 +1,65 @@
# Architecture

![Cluster Stacks](../pics/syself-cluster-stacks-web.png)

## Cluster stacks

The cluster stacks are opinionated templates of clusters in which all configuration and all core components are defined. They can be implemented on any provider.

There can be multiple cluster stacks, acknowledging the many ways in which a cluster can be set up. There is no right or wrong here, and cluster stacks make sure that this flexibility is not lost.

At the same time, they offer ready-made templates for users, who do not have to spend much thought on how to build clusters so that everything works well together.

Cluster stacks are implemented by two Helm charts. The first one contains all Cluster API objects and is applied in the management cluster. The second Helm chart contains the cluster addons, i.e. the core components every cluster needs, and is installed in the workload clusters.

Furthermore, there are node images that can look quite different depending on the provider.

To sum up, there are three components of a cluster stack:

1. Cluster addons: The cluster addons (CNI, CSI, CCM) have to be applied in each workload cluster that the user starts.
2. Cluster API objects: The `ClusterClass` object makes it easier to use Cluster API. The cluster stack contains a `ClusterClass` object and other Cluster API objects that are necessary in order to use the `ClusterClass`. These objects have to be applied in the management cluster.
3. Node images: Node images can be provided to the user in different forms. They are released and tested together with the other two components of the cluster stack.

More information about cluster stacks and their three parts can be found at https://github.com/SovereignCloudStack/cluster-stacks/blob/main/README.md.

## Cluster Stack Operator

The Cluster Stack Operator takes care of all steps that have to be done in order to use a certain cluster stack implementation.

It has to be installed in the management cluster and can be interacted with by applying custom resources. It extends the functionality of the Cluster API operators.

The Cluster Stack Operator mainly applies the two Helm charts from a cluster stack implementation. It is also able to automatically fetch a remote GitHub repository to check whether there are new releases of a certain cluster stack.
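
Based on the `autoSubscribe` field that appears in this commit's `config/cso/clusterstack.yaml` (where it is set to `false`), enabling this automatic fetching presumably looks like the following sketch:

```yaml
spec:
  autoSubscribe: true   # assumption: the operator then watches the release repository for new versions
```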

The first and second components of a cluster stack are handled by the Cluster Stack Operator.

The node images, on the other hand, have to be handled by separate provider integrations, similar to the ones that [Cluster-API uses](https://cluster-api.sigs.k8s.io/developer/providers/implementers-guide/overview).

## Cluster Stack Provider Integrations

The Cluster Stack Operator is accompanied by Cluster Stack Provider Integrations. A provider integration is also an operator that works together with the Cluster Stack Operator in a specific way, which is described in the docs about building [provider integrations](../develop/provider-integration.md).

A provider integration makes sure that the node images are taken care of and made available to the user.

If there is no work to be done for node images, the Cluster Stack Operator can work in `noProvider` mode, and the Cluster Stack Provider Integration can be omitted.
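
This mode is enabled on the `ClusterStack` resource itself; this commit's `config/cso/clusterstack.yaml` uses exactly this flag:

```yaml
spec:
  noProvider: true   # no provider integration is required for node images
```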

## Steps to make cluster stacks ready to use

Several steps are needed to make cluster stacks ready to use. To understand the full flow and to get an idea of how much work is involved and how many personas take part, we will give an overview of how to start from scratch with a new cluster stack and provider.

We will assume that this operator exists, but that you want to use a new cluster stack and provider.

### Defining a cluster stack

First, you need to define your cluster stack. Which cluster addons do you need? What do your node images look like? You need to make these decisions and write them down.

### Implementing a cluster stack

The next step is to implement your cluster stack for your provider. You can take existing implementations as a reference, but you need to consider how the provider-specific custom resources are named and how the respective Cluster API provider integration works.

### Implementing a Cluster Stack Provider Integration

We assume that you need to do some manual tasks to make node images accessible on your provider. These steps should be implemented in a Cluster Stack Provider Integration, which of course has to fit the details of how you implemented your cluster stack.

### Using everything

Finally, you can use the new cluster stack you defined and implemented on the infrastructure of your provider. Enjoy!
44 changes: 44 additions & 0 deletions docs/architecture/user-flow.md
@@ -0,0 +1,44 @@
# Deep dive: User flow

It is essential to understand the flow of what you have to do as a user and what happens in the background.

The [Quickstart guide](../topics/quickstart.md) goes over all the small steps you have to do. If you are just interested in getting started, have a look there.

In the following, we will not go into the details of every command, but will focus on a high-level view of what you have to do and of what happens in the background.

## Steps to create a workload cluster

### Get the right cluster stacks

The first step is to make sure that the cluster stacks you want to use are implemented. Usually, you will use cluster stacks that others have implemented for the provider you want to use. However, you can also build your own cluster stacks.

### Apply cluster stack resource

If you have everything available, you can start your management cluster / bootstrap cluster. In this cluster, you have to apply the `ClusterStack` custom resource with your desired configuration.
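
A `ClusterStack` resource might look like the following sketch, modeled on this commit's `config/cso/clusterstack.yaml` and the CRD excerpt above; the `provider`, `name`, and `kubernetesVersion` values are illustrative assumptions, as the full spec is not shown on this page:

```yaml
apiVersion: clusterstack.x-k8s.io/v1alpha1
kind: ClusterStack
metadata:
  name: clusterstack
  namespace: cluster
spec:
  provider: docker          # assumption: the provider of the cluster stack implementation
  name: ferrol              # assumption: the name of the cluster stack
  kubernetesVersion: "1.27" # assumption: format per the (truncated) CRD description
  channel: stable           # per this commit, one of: stable, custom
  autoSubscribe: false
  noProvider: true
  versions:
    - v2
```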

Depending on your configuration, you will have to wait until all steps are done in the background.

The operator will perform all necessary steps to provide you with node images. Once all node images are ready, it will apply the required Cluster API resources.

In the end, you will have node images and Cluster API objects ready to use. Only one more step is needed to create a cluster.

### Use the ClusterClasses

You can see that the previous step is done in the status of the `ClusterStack` object. However, you can also just check whether certain `ClusterClass` objects exist. The `ClusterClass` objects are applied by the Cluster Stack Operator as well. They follow a certain naming pattern: if you have the cluster stack "ferrol" for the docker provider and Kubernetes version 1.27 in version "v1", you'll see a `ClusterClass` with the name "docker-ferrol-1-27-v1".

You can use this `ClusterClass` by referencing it in a `Cluster` object, as sketched below. For details, check out the official Cluster API documentation.
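
A minimal sketch, mirroring this commit's `config/cso/cluster.yaml`; the metadata and the `version` field are illustrative assumptions:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster   # hypothetical name
  namespace: cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
    serviceDomain: "cluster.local"
  topology:
    class: docker-ferrol-1-27-v1   # the ClusterClass applied by the operator
    version: v1.27.3               # assumption: a Kubernetes 1.27 patch release
    controlPlane:
      replicas: 1
```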

### Wait until cluster addons are ready

If you created a workload cluster by applying a `Cluster` object, the cluster addons will be applied automatically. You just have to wait until everything is ready, e.g. until the CCM and CNI are installed.

## Recap - how do Cluster API and Cluster Stacks work together?

The user triggers the flow by configuring and applying a `ClusterStack` custom resource. This triggers some work in the background to make node images and Cluster API objects ready to use.

This process is completed when a `ClusterClass` with a certain name is created. This `ClusterClass` resource can be used to create as many clusters as you want, all following the template specified in the `ClusterClass`.

Upgrades of clusters are done by changing the reference to a new `ClusterClass`, e.g. from `docker-ferrol-1-27-v1` to `docker-ferrol-1-27-v2`.
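
In manifest terms, an upgrade is a one-line change to the `Cluster` object; this is exactly the bump this commit makes in `config/cso/cluster.yaml`:

```yaml
spec:
  topology:
    class: docker-ferrol-1-27-v2   # was: docker-ferrol-1-27-v1
```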

To sum up: the Cluster Stack Operator takes care of steps that you would otherwise have to do manually. It does not change anything in the normal Cluster API flow, except that it enforces the use of `ClusterClasses`.
