|
1 | 1 | ---
|
2 |
| -title: "Configuration as Data" |
| 2 | +title: "Configuration as Data (CaD)" |
3 | 3 | type: docs
|
4 | 4 | weight: 1
|
5 | 5 | description:
|
6 | 6 | ---
|
7 | 7 |
|
8 |
| -## Why |
| 8 | +This document provides the background context for Package Orchestration, which is further |
| 9 | +elaborated in a dedicated [document](package-orchestration.md). |
9 | 10 |
|
10 |
| -This document provides background context for Package Orchestration, which is further elaborated in a dedicated |
11 |
| -[document](package-orchestration.md). |
| 11 | +## Configuration as data (CaD) |
12 | 12 |
|
13 |
| -## Configuration as Data |
| 13 | +CaD is an approach to the management of configuration. It includes the configuration of |
| 14 | +infrastructure, policy, services, applications, and so on. CaD performs the following actions: |
14 | 15 |
|
15 |
| -Configuration as Data is an approach to management of configuration (incl. |
16 |
| -configuration of infrastructure, policy, services, applications, etc.) which: |
17 |
| - |
18 |
| -* makes configuration data the source of truth, stored separately from the live |
19 |
| - state |
20 |
| -* uses a uniform, serializable data model to represent configuration |
21 |
| -* separates code that acts on the configuration from the data and from packages |
22 |
| - / bundles of the data |
23 |
| -* abstracts configuration file structure and storage from operations that act |
24 |
| - upon the configuration data; clients manipulating configuration data don’t |
25 |
| - need to directly interact with storage (git, container images) |
| 16 | +* Making configuration data the source of truth, stored separately from the live state. |
| 17 | +* Using a uniform, serializable data model to represent the configuration. |
| 18 | +* Separating the code that acts on the configuration from the data and from packages/bundles of |
| 19 | + data. |
| 20 | +* Abstracting the configuration file structure and storage from the operations that act on the |
| 21 | + configuration data. Clients manipulating the configuration data do not need to interact directly |
| 22 | + with the storage (such as git, container images, and so on). |
26 | 23 |
|
27 | 24 | 
|
28 | 25 |
|
29 |
| -## Key Principles |
| 26 | +## Key principles |
30 | 27 |
|
31 | 28 | A system based on CaD should observe the following key principles:
|
32 | 29 |
|
33 |
| -* secrets should be stored separately, in a secret-focused storage |
34 |
| -system ([example](https://cert-manager.io/)) |
35 |
| -* stores a versioned history of configuration changes by change sets to bundles |
36 |
| - of related configuration data |
37 |
| -* relies on uniformity and consistency of the configuration format, including |
38 |
| - type metadata, to enable pattern-based operations on the configuration data, |
39 |
| - along the lines of duck typing |
40 |
| -* separates schemas for the configuration data from the data, and relies on |
41 |
| - schema information for strongly typed operations and to disambiguate data |
42 |
| - structures and other variations within the model |
43 |
| -* decouples abstractions of configuration from collections of configuration data |
44 |
| -* represents abstractions of configuration generators as data with schemas, like |
45 |
| - other configuration data |
46 |
| -* finds, filters / queries / selects, and/or validates configuration data that |
47 |
| - can be operated on by given code (functions) |
48 |
| -* finds and/or filters / queries / selects code (functions) that can operate on |
49 |
| - resource types contained within a body of configuration data |
50 |
| -* actuation (reconciliation of configuration data with live state) is separate |
51 |
| - from transformation of configuration data, and is driven by the declarative |
52 |
| - data model |
53 |
| -* transformations, particularly value propagation, are preferable to wholesale |
54 |
| - configuration generation except when the expansion is dramatic (say, >10x) |
55 |
| -* transformation input generation should usually be decoupled from propagation |
56 |
| -* deployment context inputs should be taken from well defined “provider context” |
57 |
| - objects |
58 |
| -* identifiers and references should be declarative |
59 |
| -* live state should be linked back to sources of truth (configuration) |
60 |
| - |
61 |
| -## KRM CaD |
| 30 | +* Separate storage of secrets, in a secret-focused storage system |
| 31 | + ([example](https://cert-manager.io/)). |
| 32 | +* Storage of a versioned history of configuration changes by change sets to bundles of related |
| 33 | + configuration data. |
| 34 | +* Reliance on the uniformity and consistency of the configuration format, including type metadata, |
| 35 | + to enable pattern-based operations on the configuration data, along the lines of duck typing. |
| 36 | +* Separation of the configuration data from its schemas, and reliance on the schema information for |
| 37 | + strongly typed operations and disambiguation of data structures and other variations within the |
| 38 | + model. |
| 39 | +* Decoupling of abstractions of configuration from collections of configuration data. |
| 40 | +* Representation of abstractions of configuration generators as data with schemas, as with other |
| 41 | + configuration data. |
| 42 | +* Finding, filtering, querying, selecting, and/or validating of configuration data that can be |
| 43 | + operated on by given code (functions). |
| 44 | +* Finding and/or filtering, querying, and selecting of code (functions) that can operate on |
| 45 | + resource types contained within a body of configuration data. |
| 46 | +* Actuation (reconciliation of configuration data with live state) that is separate from the |
| 47 | + transformation of the configuration data, and is driven by the declarative data model. |
| 48 | +* Transformations: these, particularly value propagation, are preferable to wholesale |
| 49 | + configuration generation, except when the expansion is dramatic (for example, >10x). |
| 50 | +* Transformation input generation: this should usually be decoupled from propagation. |
| 51 | +* Deployment context inputs: these should be taken from well-defined “provider context” objects. |
| 52 | +* Identifiers and references: these should be declarative. |
| 53 | +* Live state: this should be linked back to sources of truth (configuration). |
| 54 | + |
| 55 | +## Kubernetes Resource Model configuration as data (KRM CaD) |
62 | 56 |
|
63 | 57 | Our implementation of the Configuration as Data approach (
|
64 | 58 | [kpt](https://kpt.dev),
|
65 | 59 | [Config Sync](https://cloud.google.com/anthos-config-management/docs/config-sync-overview),
|
66 | 60 | and [Package Orchestration](https://github.com/nephio-project/porch))
|
67 |
| -is built on the foundation of |
| 61 | +is built on the foundation of the |
68 | 62 | [Kubernetes Resource Model](https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/resource-management.md)
|
69 | 63 | (KRM).
|
70 | 64 |
|
71 | 65 | {{% alert title="Note" color="primary" %}}
|
72 | 66 |
|
73 |
| -Even though KRM is not a requirement of Config as Data (just like |
74 |
| -Python or Go templates or Jinja are not specifically |
75 |
| -requirements for [IaC](https://en.wikipedia.org/wiki/Infrastructure_as_code)), the choice of |
76 |
| -another foundational config representation format would necessitate |
77 |
| -implementing adapters for all types of infrastructure and applications |
78 |
| -configured, including Kubernetes, CRDs, GCP resources and more. Likewise, choice |
79 |
| -of another configuration format would require redesign of a number of the |
80 |
| -configuration management mechanisms that have already been designed for KRM, |
81 |
| -such as 3-way merge, structural merge patch, schema descriptions, resource |
82 |
| -metadata, references, status conventions, etc. |
| 67 | +Even though KRM is not a requirement of CaD (just as Python or Go templates, or Jinja, are not |
| 68 | +specifically requirements for [IaC](https://en.wikipedia.org/wiki/Infrastructure_as_code)), the |
| 69 | +choice of another foundational configuration representation format would necessitate the |
| 70 | +implementation of adapters for all types of infrastructure and applications configured, including |
| 71 | +Kubernetes, CRDs, GCP resources, and more. Likewise, choosing another configuration format would |
| 72 | +require the redesign of several of the configuration management mechanisms that have already been |
| 73 | +designed for KRM, such as three-way merge, structural merge patch, schema descriptions, resource |
| 74 | +metadata, references, status conventions, and so on. |
83 | 75 |
|
84 | 76 | {{% /alert %}}
|
85 | 77 |
|
86 | 78 |
|
87 |
| -**KRM CaD** is therefore a specific approach to implementing *Configuration as Data* which: |
88 |
| - |
89 |
| -* uses [KRM](https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/resource-management.md) |
90 |
| - as the configuration serialization data model |
91 |
| -* uses [Kptfile](https://kpt.dev/reference/schema/kptfile/) to store package metadata |
92 |
| -* uses [ResourceList](https://kpt.dev/reference/schema/resource-list/) as a serialized package wire-format |
93 |
| -* uses a function `ResourceList → ResultList` (*kpt* function) as the foundational, composable unit of |
94 |
| - package-manipulation code (note that other forms of code can manipulate packages as well, i.e. UIs, custom algorithms |
95 |
| - not necessarily packaged and used as kpt functions) |
96 |
| - |
97 |
| -and provides the following basic functionality: |
98 |
| - |
99 |
| -* load a serialized package from a repository (as ResourceList) (examples of repository may be one or more of: local |
100 |
| - HDD, Git repository, OCI, Cloud Storage, etc.) |
101 |
| -* save a serialized package (as ResourceList) to a package repository |
102 |
| -* evaluate a function on a serialized package (ResourceList) |
103 |
| -* [render](https://kpt.dev/book/04-using-functions/01-declarative-function-execution) a package (evaluate functions |
104 |
| - declared within the package itself) |
105 |
| -* create a new (empty) package |
106 |
| -* fork (or clone) an existing package from one package repository (called upstream) to another (called downstream) |
107 |
| -* delete a package from a repository |
108 |
| -* associate a version with the package; guarantee immutability of packages with an assigned version |
109 |
| -* incorporate changes from the new version of an upstream package into a new version of a downstream package (3 way merge) |
110 |
| -* revert to a prior version of a package |
111 |
| - |
112 |
| -## Value |
113 |
| - |
114 |
| -The Config as Data approach enables some key value which is available in other |
115 |
| -configuration management approaches to a lesser extent or is not available |
116 |
| -at all. |
117 |
| - |
118 |
| -* simplified authoring of configuration using a variety of methods and sources |
119 |
| -* WYSIWYG interaction with configuration using a simple data serialization formation rather than a code-like format |
120 |
| -* layering of interoperable interface surfaces (notably GUI) over declarative configuration mechanisms rather than |
121 |
| - forcing choices between exclusive alternatives (exclusively UI/CLI or IaC initially followed by exclusively |
122 |
| - UI/CLI or exclusively IaC) |
123 |
| -* the ability to apply UX techniques to simplify configuration authoring and viewing |
124 |
| -* compared to imperative tools (e.g., UI, CLI) that directly modify the live state via APIs, CaD enables versioning, |
125 |
| - undo, audits of configuration history, review/approval, pre-deployment preview, validation, safety checks, |
126 |
| - constraint-based policy enforcement, and disaster recovery |
127 |
| -* bulk changes to configuration data in their sources of truth |
128 |
| -* injection of configuration to address horizontal concerns |
129 |
| -* merging of multiple sources of truth |
130 |
| -* state export to reusable blueprints without manual templatization |
131 |
| -* cooperative editing of configuration by humans and automation, such as for security remediation (which is usually |
132 |
| - implemented against live-state APIs) |
133 |
| -* reusability of configuration transformation code across multiple bodies of configuration data containing the same |
134 |
| - resource types, amortizing the effort of writing, testing, documenting the code |
135 |
| -* combination of independent configuration transformations |
136 |
| -* implementation of config transformations using the languages of choice, including both programming and scripting |
137 |
| - approaches |
138 |
| -* reducing the frequency of changes to existing transformation code |
139 |
| -* separation of roles between developer and non-developer configuration users |
140 |
| -* defragmenting the configuration transformation ecosystem |
141 |
| -* admission control and invariant enforcement on sources of truth |
142 |
| -* maintaining variants of configuration blueprints without one-size-fits-all full struct-constructor-style |
143 |
| - parameterization and without manually constructing and maintaining patches |
144 |
| -* drift detection and remediation for most of the desired state via continuous reconciliation using apply and/or for |
145 |
| - specific attributes via targeted mutation of the sources of truth |
146 |
| - |
147 |
| -## Related Articles |
148 |
| - |
149 |
| -For more information about Configuration as Data and Kubernetes Resource Model, |
150 |
| -visit the following links: |
| 79 | +**KRM CaD** is, therefore, a specific approach to implementing *Configuration as Data* which uses |
| 80 | +the following: |
| 81 | + |
| 82 | +* [KRM](https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/resource-management.md) |
| 83 | + as the configuration serialization data model. |
| 84 | +* [Kptfile](https://kpt.dev/reference/schema/kptfile/) to store package metadata. |
| 85 | +* [ResourceList](https://kpt.dev/reference/schema/resource-list/) as a serialized package wire |
| 86 | + format. |
| 87 | +* A function `ResourceList → ResultList` (*kpt* function) as the foundational, composable unit of |
| 88 | + package manipulation code. |
| 89 | + |
| 90 | + {{% alert title="Note" color="primary" %}} |
| 91 | + |
| 92 | + Other forms of code can also manipulate packages, such as UIs and custom algorithms not |
| 93 | + necessarily packaged and used as kpt functions. |
| 94 | + |
| 95 | + {{% /alert %}} |
| 96 | + |
| 97 | + |
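The `ResourceList → ResultList` unit described above can be pictured as a plain function over serialized data. The following is a minimal sketch, not kpt's implementation: the function name `set_labels` and the sample package are hypothetical, and the ResourceList is held as a Python dict, whereas kpt functions typically exchange it as YAML on standard input and output. The `items`, `functionConfig`, and `results` field names follow the ResourceList schema linked above.

```python
import copy
import json


def set_labels(resource_list: dict) -> dict:
    """Hypothetical function: copy functionConfig.data onto every item as labels."""
    output = copy.deepcopy(resource_list)  # leave the caller's package untouched
    labels = output.get("functionConfig", {}).get("data", {})
    items = output.get("items", [])
    for item in items:
        metadata = item.setdefault("metadata", {})
        metadata.setdefault("labels", {}).update(labels)
    # Report what was done as structured results, alongside the transformed items.
    output.setdefault("results", []).append(
        {
            "message": f"set {len(labels)} label(s) on {len(items)} resource(s)",
            "severity": "info",
        }
    )
    return output


# Illustrative input only; the resource names and values are assumptions.
package = {
    "apiVersion": "config.kubernetes.io/v1",
    "kind": "ResourceList",
    "functionConfig": {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "label-config"},
        "data": {"env": "dev"},
    },
    "items": [
        {
            "apiVersion": "v1",
            "kind": "ConfigMap",
            "metadata": {"name": "app-config"},
            "data": {"key": "value"},
        }
    ],
}

rendered = set_labels(package)
print(json.dumps(rendered, indent=2))
```

Because both the input and the output are data in the same uniform model, functions of this shape can be chained, which is what makes them composable.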
| 98 | +**KRM CaD** provides the following basic functionalities: |
| 99 | + |
| 100 | +* Loading a serialized package from a repository (as a ResourceList). Examples of a repository may |
| 101 | + be one or more of the following: |
| 102 | + * Local HDD |
| 103 | + * Git repository |
| 104 | + * OCI |
| 105 | + * Cloud storage |
| 106 | +* Saving a serialized package (as a ResourceList) to a package repository. |
| 107 | +* Evaluating a function on a serialized package (ResourceList). |
| 108 | +* [Rendering](https://kpt.dev/book/04-using-functions/01-declarative-function-execution) a package |
| 109 | + (evaluating the functions declared within the package itself). |
| 110 | +* Creating a new (empty) package. |
| 111 | +* Forking (or cloning) an existing package from one package repository (called upstream) to another |
| 112 | + (called downstream). |
| 113 | +* Deleting a package from a repository. |
| 114 | +* Associating a version with the package and guaranteeing the immutability of packages with an |
| 115 | + assigned version. |
| 116 | +* Incorporating changes from the new version of an upstream package into a new version of a |
| 117 | + downstream package (three-way merge). |
| 118 | +* Reverting to a prior version of a package. |
| 119 | + |
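The three-way merge mentioned in the list above can be illustrated on a deliberately simplified model. This sketch assumes that a package is a flat dictionary of key-value pairs. It is not the merge algorithm used by kpt or Porch, which operate on structured KRM resources. The names `three_way_merge`, `base`, `upstream`, and `local` are hypothetical, and keeping the downstream value on conflict is an arbitrary choice made for the example.

```python
_MISSING = object()  # sentinel for "key not present in this version"


def three_way_merge(base: dict, upstream: dict, local: dict):
    """Merge upstream and local changes made relative to a common base version."""
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(upstream) | set(local)):
        b = base.get(key, _MISSING)
        u = upstream.get(key, _MISSING)
        l = local.get(key, _MISSING)
        if u == l:
            value = u  # both sides agree (or both deleted the key)
        elif u == b:
            value = l  # only the downstream side changed this key
        elif l == b:
            value = u  # only the upstream side changed this key
        else:
            conflicts.append(key)  # both sides changed it differently
            value = l  # assumption: keep the downstream value and report it
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts


# Illustrative versions of one package; all values are assumptions.
base = {"replicas": 1, "image": "app:1.0", "debug": "false"}
upstream = {"replicas": 1, "image": "app:1.1"}  # new upstream version
local = {"replicas": 3, "image": "app:1.0", "debug": "false"}  # downstream edits

merged, conflicts = three_way_merge(base, upstream, local)
print(merged, conflicts)
```

In this example the downstream change to `replicas` and the upstream changes to `image` and `debug` are all preserved, because each key was changed on one side only.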
| 120 | +## Configuration values |
| 121 | + |
| 122 | +The configuration as data approach enables some key values which are available in other |
| 123 | +configuration management approaches to a lesser extent or not at all. |
| 124 | + |
| 125 | +The values enabled by the configuration as data approach are as follows: |
| 126 | + |
| 127 | +* Simplified authoring of the configuration using a variety of methods and sources. |
| 128 | +* What-you-see-is-what-you-get (WYSIWYG) interaction with the configuration using a simple data |
| 129 | + serialization format, rather than a code-like format. |
| 130 | +* Layering of interoperable interface surfaces (notably GUIs) over declarative configuration |
| 131 | + mechanisms, rather than forcing choices between exclusive alternatives (exclusively UI/CLI or |
| 132 | + IaC initially, followed by exclusively UI/CLI or exclusively IaC). |
| 133 | +* The ability to apply UX techniques to simplify configuration authoring and viewing. |
| 134 | +* Compared to imperative tools, such as UI and CLI, that directly modify the live state via APIs, |
| 135 | + CaD enables versioning, undo, audits of configuration history, review/approval, predeployment |
| 136 | + preview, validation, safety checks, constraint-based policy enforcement, and disaster recovery. |
| 137 | +* Bulk changes to configuration data in their sources of truth. |
| 138 | +* Injection of configuration to address horizontal concerns. |
| 139 | +* Merging of multiple sources of truth. |
| 140 | +* State export to reusable blueprints without manual templatization. |
| 141 | +* Cooperative editing of configurations by humans and automation, such as for security remediation, |
| 142 | + which is usually implemented against live-state APIs. |
| 143 | +* Reusability of the configuration transformation code across multiple bodies of configuration data |
| 144 | + containing the same resource types, amortizing the effort of writing, testing, and documenting |
| 145 | + the code. |
| 146 | +* Combination of independent configuration transformations. |
| 147 | +* Implementation of configuration transformations using the languages of choice, including both |
| 148 | + programming and scripting approaches. |
| 149 | +* Reducing the frequency of changes to the existing transformation code. |
| 150 | +* Separation of roles between developer and non-developer configuration users. |
| 151 | +* Defragmenting the configuration transformation ecosystem. |
| 152 | +* Admission control and invariant enforcement on sources of truth. |
| 153 | +* Maintaining variants of configuration blueprints without one-size-fits-all full |
| 154 | + struct-constructor-style parameterization and without manually constructing and maintaining |
| 155 | + patches. |
| 156 | +* Drift detection and remediation: for most of the desired state, via continuous reconciliation |
| 157 | + using apply; and/or, for specific attributes, via targeted mutation of the sources of truth. |
| 158 | + |
| 159 | +## Related articles |
| 160 | + |
| 161 | +For more information about configuration as data and the Kubernetes Resource Model, visit the |
| 162 | +following links: |
151 | 163 |
|
152 | 164 | * [Rationale for kpt](https://kpt.dev/guides/rationale)
|
153 | 165 | * [Understanding Configuration as Data](https://cloud.google.com/blog/products/containers-kubernetes/understanding-configuration-as-data-in-kubernetes)
|
154 |
| - blog post. |
| 166 | + blog post |
155 | 167 | * [Kubernetes Resource Model](https://cloud.google.com/blog/topics/developers-practitioners/build-platform-krm-part-1-whats-platform)
|
156 | 168 | blog post series
|