Commit a261242

Proofreading of Configuration as code. (#233)
Signed-off-by: Michael Greaves <[email protected]>
1 parent 8a63156 commit a261242

Lines changed: 134 additions & 122 deletions
@@ -1,156 +1,168 @@
11
---
2-
title: "Configuration as Data"
2+
title: "Configuration as Data (CaD)"
33
type: docs
44
weight: 1
55
description:
66
---
77

8-
## Why
8+
This document provides the background context for Package Orchestration, which is further
9+
elaborated in a dedicated [document](package-orchestration.md).
910

10-
This document provides background context for Package Orchestration, which is further elaborated in a dedicated
11-
[document](package-orchestration.md).
11+
## Configuration as data (CaD)
1212

13-
## Configuration as Data
13+
CaD is an approach to the management of configuration. It includes the configuration of
14+
infrastructure, policy, services, applications, and so on. CaD performs the following actions:
1415

15-
Configuration as Data is an approach to management of configuration (incl.
16-
configuration of infrastructure, policy, services, applications, etc.) which:
17-
18-
* makes configuration data the source of truth, stored separately from the live
19-
state
20-
* uses a uniform, serializable data model to represent configuration
21-
* separates code that acts on the configuration from the data and from packages
22-
/ bundles of the data
23-
* abstracts configuration file structure and storage from operations that act
24-
upon the configuration data; clients manipulating configuration data don’t
25-
need to directly interact with storage (git, container images)
16+
* Making configuration data the source of truth, stored separately from the live state.
17+
* Using a uniform, serializable data model to represent the configuration.
18+
* Separating the code that acts on the configuration from the data and from packages/bundles of
19+
data.
20+
* Abstracting the configuration file structure and storage from the operations that act on the
21+
configuration data. Clients manipulating the configuration data do not need to interact directly
22+
with the storage (such as git, container images, and so on).
2623

2724
![CaD Overview](/static/images/porch/CaD-Overview.svg)
2825

29-
## Key Principles
26+
## Key principles
3027

3128
A system based on CaD should observe the following key principles:
3229

33-
* secrets should be stored separately, in a secret-focused storage
34-
system ([example](https://cert-manager.io/))
35-
* stores a versioned history of configuration changes by change sets to bundles
36-
of related configuration data
37-
* relies on uniformity and consistency of the configuration format, including
38-
type metadata, to enable pattern-based operations on the configuration data,
39-
along the lines of duck typing
40-
* separates schemas for the configuration data from the data, and relies on
41-
schema information for strongly typed operations and to disambiguate data
42-
structures and other variations within the model
43-
* decouples abstractions of configuration from collections of configuration data
44-
* represents abstractions of configuration generators as data with schemas, like
45-
other configuration data
46-
* finds, filters / queries / selects, and/or validates configuration data that
47-
can be operated on by given code (functions)
48-
* finds and/or filters / queries / selects code (functions) that can operate on
49-
resource types contained within a body of configuration data
50-
* actuation (reconciliation of configuration data with live state) is separate
51-
from transformation of configuration data, and is driven by the declarative
52-
data model
53-
* transformations, particularly value propagation, are preferable to wholesale
54-
configuration generation except when the expansion is dramatic (say, >10x)
55-
* transformation input generation should usually be decoupled from propagation
56-
* deployment context inputs should be taken from well defined “provider context”
57-
objects
58-
* identifiers and references should be declarative
59-
* live state should be linked back to sources of truth (configuration)
60-
61-
## KRM CaD
30+
* Separate handling of secrets, in a secret-focused storage system
31+
([example](https://cert-manager.io/)).
32+
* Storage of a versioned history of configuration changes by change sets to bundles of related
33+
configuration data.
34+
* Reliance on the uniformity and consistency of the configuration format, including type metadata,
35+
to enable pattern-based operations on the configuration data, along the lines of duck typing.
36+
* Separation of the configuration data from its schemas, and reliance on the schema information for
37+
strongly typed operations and disambiguation of data structures and other variations within the
38+
model.
39+
* Decoupling of abstractions of configuration from collections of configuration data.
40+
* Representation of abstractions of configuration generators as data with schemas, as with other
41+
configuration data.
42+
* Finding, filtering, querying, selecting, and/or validating of configuration data that can be
43+
operated on by given code (functions).
44+
* Finding and/or filtering, querying, and selecting of code (functions) that can operate on
45+
resource types contained within a body of configuration data.
46+
* Actuation (reconciliation of configuration data with live state) that is separate from the
47+
transformation of the configuration data, and is driven by the declarative data model.
48+
* Transformations: these, particularly value propagation, are preferable to wholesale
49+
configuration generation, except when the expansion is dramatic (for example, >10x).
50+
* Transformation input generation: this should usually be decoupled from propagation.
51+
* Deployment context inputs: these should be taken from well-defined “provider context” objects.
52+
* Identifiers and references: these should be declarative.
53+
* Live state: this should be linked back to sources of truth (configuration).
54+
55+
## Kubernetes Resource Model configuration as data (KRM CaD)
6256

6357
Our implementation of the Configuration as Data approach (
6458
[kpt](https://kpt.dev),
6559
[Config Sync](https://cloud.google.com/anthos-config-management/docs/config-sync-overview),
6660
and [Package Orchestration](https://github.com/nephio-project/porch))
67-
is built on the foundation of
61+
is built on the foundation of the
6862
[Kubernetes Resource Model](https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/resource-management.md)
6963
(KRM).
7064

7165
{{% alert title="Note" color="primary" %}}
7266

73-
Even though KRM is not a requirement of Config as Data (just like
74-
Python or Go templates or Jinja are not specifically
75-
requirements for [IaC](https://en.wikipedia.org/wiki/Infrastructure_as_code)), the choice of
76-
another foundational config representation format would necessitate
77-
implementing adapters for all types of infrastructure and applications
78-
configured, including Kubernetes, CRDs, GCP resources and more. Likewise, choice
79-
of another configuration format would require redesign of a number of the
80-
configuration management mechanisms that have already been designed for KRM,
81-
such as 3-way merge, structural merge patch, schema descriptions, resource
82-
metadata, references, status conventions, etc.
67+
Even though KRM is not a requirement of CaD (just as Python or Go templates, or Jinja, are not
68+
specifically requirements for [IaC](https://en.wikipedia.org/wiki/Infrastructure_as_code)), the
69+
choice of another foundational configuration representation format would necessitate the
70+
implementation of adapters for all types of infrastructure and applications configured, including
71+
Kubernetes, CRDs, GCP resources, and more. Likewise, choosing another configuration format would
72+
require the redesign of several of the configuration management mechanisms that have already been
73+
designed for KRM, such as three-way merge, structural merge patch, schema descriptions, resource
74+
metadata, references, status conventions, and so on.
8375

8476
{{% /alert %}}
8577

8678

87-
**KRM CaD** is therefore a specific approach to implementing *Configuration as Data* which:
88-
89-
* uses [KRM](https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/resource-management.md)
90-
as the configuration serialization data model
91-
* uses [Kptfile](https://kpt.dev/reference/schema/kptfile/) to store package metadata
92-
* uses [ResourceList](https://kpt.dev/reference/schema/resource-list/) as a serialized package wire-format
93-
* uses a function `ResourceList → ResultList` (*kpt* function) as the foundational, composable unit of
94-
package-manipulation code (note that other forms of code can manipulate packages as well, i.e. UIs, custom algorithms
95-
not necessarily packaged and used as kpt functions)
96-
97-
and provides the following basic functionality:
98-
99-
* load a serialized package from a repository (as ResourceList) (examples of repository may be one or more of: local
100-
HDD, Git repository, OCI, Cloud Storage, etc.)
101-
* save a serialized package (as ResourceList) to a package repository
102-
* evaluate a function on a serialized package (ResourceList)
103-
* [render](https://kpt.dev/book/04-using-functions/01-declarative-function-execution) a package (evaluate functions
104-
declared within the package itself)
105-
* create a new (empty) package
106-
* fork (or clone) an existing package from one package repository (called upstream) to another (called downstream)
107-
* delete a package from a repository
108-
* associate a version with the package; guarantee immutability of packages with an assigned version
109-
* incorporate changes from the new version of an upstream package into a new version of a downstream package (3 way merge)
110-
* revert to a prior version of a package
111-
112-
## Value
113-
114-
The Config as Data approach enables some key value which is available in other
115-
configuration management approaches to a lesser extent or is not available
116-
at all.
117-
118-
* simplified authoring of configuration using a variety of methods and sources
119-
* WYSIWYG interaction with configuration using a simple data serialization formation rather than a code-like format
120-
* layering of interoperable interface surfaces (notably GUI) over declarative configuration mechanisms rather than
121-
forcing choices between exclusive alternatives (exclusively UI/CLI or IaC initially followed by exclusively
122-
UI/CLI or exclusively IaC)
123-
* the ability to apply UX techniques to simplify configuration authoring and viewing
124-
* compared to imperative tools (e.g., UI, CLI) that directly modify the live state via APIs, CaD enables versioning,
125-
undo, audits of configuration history, review/approval, pre-deployment preview, validation, safety checks,
126-
constraint-based policy enforcement, and disaster recovery
127-
* bulk changes to configuration data in their sources of truth
128-
* injection of configuration to address horizontal concerns
129-
* merging of multiple sources of truth
130-
* state export to reusable blueprints without manual templatization
131-
* cooperative editing of configuration by humans and automation, such as for security remediation (which is usually
132-
implemented against live-state APIs)
133-
* reusability of configuration transformation code across multiple bodies of configuration data containing the same
134-
resource types, amortizing the effort of writing, testing, documenting the code
135-
* combination of independent configuration transformations
136-
* implementation of config transformations using the languages of choice, including both programming and scripting
137-
approaches
138-
* reducing the frequency of changes to existing transformation code
139-
* separation of roles between developer and non-developer configuration users
140-
* defragmenting the configuration transformation ecosystem
141-
* admission control and invariant enforcement on sources of truth
142-
* maintaining variants of configuration blueprints without one-size-fits-all full struct-constructor-style
143-
parameterization and without manually constructing and maintaining patches
144-
* drift detection and remediation for most of the desired state via continuous reconciliation using apply and/or for
145-
specific attributes via targeted mutation of the sources of truth
146-
147-
## Related Articles
148-
149-
For more information about Configuration as Data and Kubernetes Resource Model,
150-
visit the following links:
79+
**KRM CaD** is, therefore, a specific approach to implementing *Configuration as Data* which uses
80+
the following:
81+
82+
* [KRM](https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/resource-management.md)
83+
as the configuration serialization data model.
84+
* [Kptfile](https://kpt.dev/reference/schema/kptfile/) to store package metadata.
85+
* [ResourceList](https://kpt.dev/reference/schema/resource-list/) as a serialized package wire
86+
format.
87+
* A function `ResourceList → ResultList` (*kpt* function) as the foundational, composable unit of
88+
package manipulation code.
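
The idea of a function over a serialized package as the composable unit can be sketched in a few lines. This is a minimal illustration in Python, not the kpt SDK: the helper names `set_label` and `compose` are hypothetical, and the field names (`items`, `results`, `resourceRef`) are assumptions modeled loosely on the ResourceList schema linked above.

```python
import copy


def set_label(resource_list, key, value):
    """Return a new ResourceList-shaped dict with a label set on every item.

    Hypothetical helper: the input is treated as plain data and is never mutated.
    """
    out = copy.deepcopy(resource_list)
    results = []
    for item in out.get("items", []):
        metadata = item.setdefault("metadata", {})
        labels = metadata.setdefault("labels", {})
        if labels.get(key) != value:
            labels[key] = value
            # Record what was changed, in a shape assumed to resemble a result entry.
            results.append({
                "message": f"set label {key}={value}",
                "severity": "info",
                "resourceRef": {"kind": item.get("kind"), "name": metadata.get("name")},
            })
    # Accumulate results so that chained functions do not overwrite each other.
    out["results"] = out.get("results", []) + results
    return out


def compose(*functions):
    """Chain functions so that the output of one becomes the input of the next."""
    def pipeline(resource_list):
        for fn in functions:
            resource_list = fn(resource_list)
        return resource_list
    return pipeline


package = {
    "apiVersion": "config.kubernetes.io/v1",
    "kind": "ResourceList",
    "items": [
        {"apiVersion": "v1", "kind": "ConfigMap", "metadata": {"name": "app-config"}},
    ],
}

pipeline = compose(
    lambda rl: set_label(rl, "env", "dev"),
    lambda rl: set_label(rl, "team", "core"),
)
rendered = pipeline(package)
```

Because each step takes data in and returns data out, steps can be reordered, reused across packages, or replaced by other code (a UI, for example) without changing the package format.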
89+
90+
{{% alert title="Note" color="primary" %}}
91+
92+
Other forms of code can also manipulate packages, such as UIs and custom algorithms not
93+
necessarily packaged and used as kpt functions.
94+
95+
{{% /alert %}}
96+
97+
98+
**KRM CaD** provides the following basic functionalities:
99+
100+
* Loading a serialized package from a repository (as a ResourceList). Examples of a repository may
101+
be one or more of the following:
102+
* Local HDD
103+
* Git repository
104+
* OCI
105+
* Cloud storage
106+
* Saving a serialized package (as a ResourceList) to a package repository.
107+
* Evaluating a function on a serialized package (ResourceList).
108+
* [Rendering](https://kpt.dev/book/04-using-functions/01-declarative-function-execution) a package
109+
(evaluating the functions declared within the package itself).
110+
* Creating a new (empty) package.
111+
* Forking (or cloning) an existing package from one package repository (called upstream) to another
112+
(called downstream).
113+
* Deleting a package from a repository.
114+
* Associating a version with the package and guaranteeing the immutability of packages with an
115+
assigned version.
116+
* Incorporating changes from the new version of an upstream package into a new version of a
117+
downstream package (three-way merge).
118+
* Reverting to a prior version of a package.
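
The three-way merge used when incorporating upstream changes can be illustrated with a simplified sketch. It assumes flat key/value configuration and a hypothetical `three_way_merge` helper. Tools such as kpt operate on structured KRM resources, so this shows only the underlying idea, not the real merge algorithm.

```python
_MISSING = object()  # sentinel for "key is absent in this version"


def three_way_merge(base, upstream, local):
    """Merge upstream and local changes relative to their common base.

    Returns (merged, conflicts). On a conflict the local value is kept,
    which is an arbitrary policy chosen for this sketch.
    """
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(upstream) | set(local)):
        b = base.get(key, _MISSING)
        u = upstream.get(key, _MISSING)
        l = local.get(key, _MISSING)
        if u == l:
            chosen = l          # both sides agree (or both removed the key)
        elif l == b:
            chosen = u          # only upstream changed this key
        elif u == b:
            chosen = l          # only the downstream copy changed this key
        else:
            conflicts.append(key)
            chosen = l          # both changed it differently
        if chosen is not _MISSING:
            merged[key] = chosen
    return merged, conflicts


base = {"replicas": 2, "image": "app:1.0", "logLevel": "info"}
upstream = {"replicas": 2, "image": "app:1.1", "logLevel": "info"}
local = {"replicas": 5, "image": "app:1.0", "logLevel": "info", "region": "eu"}

merged, conflicts = three_way_merge(base, upstream, local)
```

In this example the downstream package keeps its own `replicas` and `region` values while picking up the new upstream `image`, and no conflicts are reported.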
119+
120+
## Configuration values
121+
122+
The configuration as data approach enables some key values which are available in other
123+
configuration management approaches to a lesser extent or not at all.
124+
125+
The values enabled by the configuration as data approach are as follows:
126+
127+
* Simplified authoring of the configuration using a variety of methods and sources.
128+
* What-you-see-is-what-you-get (WYSIWYG) interaction with the configuration using a simple data
129+
serialization format, rather than a code-like format.
130+
* Layering of interoperable interface surfaces (notably GUIs) over declarative configuration
131+
mechanisms, rather than forcing choices between exclusive alternatives (exclusively UI/CLI or
132+
IaC initially, followed by exclusively UI/CLI or exclusively IaC).
133+
* The ability to apply UX techniques to simplify configuration authoring and viewing.
134+
* Compared to imperative tools, such as UI and CLI, that directly modify the live state via APIs,
135+
CaD enables versioning, undo, audits of configuration history, review/approval, predeployment
136+
preview, validation, safety checks, constraint-based policy enforcement, and disaster recovery.
137+
* Bulk changes to configuration data in their sources of truth.
138+
* Injection of configuration to address horizontal concerns.
139+
* Merging of multiple sources of truth.
140+
* State export to reusable blueprints without manual templatization.
141+
* Cooperative editing of configurations by humans and automation, such as for security remediation,
142+
which is usually implemented against live-state APIs.
143+
* Reusability of the configuration transformation code across multiple bodies of configuration data
144+
containing the same resource types, amortizing the effort of writing, testing, and documenting
145+
the code.
146+
* Combination of independent configuration transformations.
147+
* Implementation of configuration transformations using the languages of choice, including both
148+
programming and scripting approaches.
149+
* Reducing the frequency of changes to the existing transformation code.
150+
* Separation of roles between developer and non-developer configuration users.
151+
* Defragmenting the configuration transformation ecosystem.
152+
* Admission control and invariant enforcement on sources of truth.
153+
* Maintaining variants of configuration blueprints without one-size-fits-all full
154+
struct-constructor-style parameterization and without manually constructing and maintaining
155+
patches.
156+
* Drift detection and remediation for most of the desired state via continuous reconciliation,
157+
using apply and/or for specific attributes via a targeted mutation of the sources of truth.
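
Drift detection and apply-style remediation can be sketched as a comparison between the source of truth and the live state. This is a minimal sketch under stated assumptions: resources are modeled as a mapping from name to spec, the helper names `plan_reconcile` and `apply_plan` are hypothetical, and deleting resources that are absent from the source of truth (pruning) is a policy chosen for the example.

```python
def plan_reconcile(desired, live):
    """Compare desired state (source of truth) with live state.

    Returns a list of (verb, name) actions that would remove the drift.
    """
    actions = []
    for name in sorted(set(desired) | set(live)):
        if name not in live:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("delete", name))   # pruning: an assumed policy
        elif desired[name] != live[name]:
            actions.append(("update", name))
    return actions


def apply_plan(desired, live, actions):
    """Return the live state that results from carrying out the planned actions."""
    result = dict(live)
    for verb, name in actions:
        if verb == "delete":
            result.pop(name)
        else:
            result[name] = desired[name]
    return result


desired = {"web": {"replicas": 3}, "db": {"replicas": 1}}
live = {"web": {"replicas": 5}, "cache": {"replicas": 1}}

actions = plan_reconcile(desired, live)
reconciled = apply_plan(desired, live, actions)
```

Running the planning step again on the reconciled state yields no actions, which is the property that continuous reconciliation relies on.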
158+
159+
## Related articles
160+
161+
For more information about configuration as data and the Kubernetes Resource Model, visit the
162+
following links:
151163

152164
* [Rationale for kpt](https://kpt.dev/guides/rationale)
153165
* [Understanding Configuration as Data](https://cloud.google.com/blog/products/containers-kubernetes/understanding-configuration-as-data-in-kubernetes)
154-
blog post.
166+
blog post
155167
* [Kubernetes Resource Model](https://cloud.google.com/blog/topics/developers-practitioners/build-platform-krm-part-1-whats-platform)
156168
blog post series
