diff --git a/docs/docs/getting-started/index.xml b/docs/docs/getting-started/index.xml index 42a34b174..1a36365e8 100644 --- a/docs/docs/getting-started/index.xml +++ b/docs/docs/getting-started/index.xml @@ -59,7 +59,7 @@ the link between CD and CT to provide Level 2 of the <a href="https://cloud.g <li><a href="../../reference/run-completion">Run Completion Eventsource</a></li> </ul> <h2 id="architecture-overview">Architecture Overview</h2> -<p>To do.</p>Docs: Installationhttps://sky-uk.github.io/kfp-operator/docs/getting-started/installation/Mon, 01 Jan 0001 00:00:00 +0000https://sky-uk.github.io/kfp-operator/docs/getting-started/installation/ +<p><img src="https://sky-uk.github.io/kfp-operator/images/architecture.svg" alt="architecture.svg"></p>Docs: Installationhttps://sky-uk.github.io/kfp-operator/docs/getting-started/installation/Mon, 01 Jan 0001 00:00:00 +0000https://sky-uk.github.io/kfp-operator/docs/getting-started/installation/ <p>We recommend the installation using Helm as it allows a declarative approach to managing Kubernetes resources.</p> <p>This guide assumes you are familiar with <a href="https://helm.sh/">Helm</a>.</p> <h2 id="prerequisites">Prerequisites</h2> diff --git a/docs/docs/getting-started/overview/index.html b/docs/docs/getting-started/overview/index.html index 6e4d62800..3a28f21f7 100644 --- a/docs/docs/getting-started/overview/index.html +++ b/docs/docs/getting-started/overview/index.html @@ -1,3 +1,4 @@ +<<<<<<< HEAD Overview | KFP-Operator Create project issue

Overview

The Kubeflow Pipelines Operator (KFP Operator) provides a declarative API for managing and running ML pipelines with Resource Definitions on multiple providers. A provider is a runtime environment for managing and executing ML pipelines and related resources.

Why KFP Operator

We started this project to promote the best engineering practices in the Machine Learning process, while reducing the operational overhead associated with deploying, running and maintaining training pipelines. We wanted to move away from a manual, opaque, copy-and-paste style deployment and closer to a declarative, traceable, and self-serve approach.

By configuring simple Kubernetes resources, machine learning practitioners can run their desired training pipelines in each environment on the path to production in a repeatable, testable and scalable way. When linked with serving components, this provides a fully testable path to production for machine learning systems.

cd-ct

Through separating training code from infrastructure, KFP Operator provides the link between CD and CT to provide Level 2 of the MLOps Maturity model.

mlops maturity level

+======= +Overview | KFP-Operator

Overview

The Kubeflow Pipelines Operator provides a declarative API for managing and running ML pipelines with Resource Definitions on multiple providers. +A provider is a runtime environment for managing and executing ML pipelines and related resources.

Compatibility

The operator currently supports:

  • TFX Pipelines with Python 3.7 and 3.9 - pipelines created using the KFP DSL are not supported yet
  • KFP standalone (a full KFP installation is not supported yet) and Vertex AI

TFX Pipelines and Components

Unlike imperative Kubeflow Pipelines deployments, the operator takes care of providing all environment-specific configuration and setup for the pipelines. Pipeline creators therefore don’t have to provide DAG runners, metadata configs, serving directories, etc. Furthermore, a Pusher is not required, as the operator can extend the pipeline with this highly environment-specific component itself.

For running a pipeline using the operator, only the list of TFX components needs to be returned. Everything else is done by the operator. See the penguin pipeline for an example.
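The pattern can be sketched as follows. This is a minimal stand-in, not real TFX code: `Component`, `CsvExampleGen` and `Trainer` are illustrative placeholders here, while the actual penguin pipeline returns real TFX components. The key point is that the pipeline function returns only the component list and nothing else.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    """Stand-in for a TFX component; illustrative only."""
    name: str


def create_components() -> List[Component]:
    # The pipeline code returns only the list of components. The operator
    # supplies the DAG runner, metadata config, serving directory and pusher.
    example_gen = Component("CsvExampleGen")
    trainer = Component("Trainer")
    return [example_gen, trainer]
```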

Lifecycle phases and Parameter types

TFX Pipelines go through certain lifecycle phases that are unique to this technology. It is helpful to understand where these differ and where they are executed.

Development: Creating the components definition as code.

Compilation: Applying compile-time parameters and defining the execution runtime (aka DAG runner) for the pipeline to be compiled into a deployable artifact.

Deployment: Creating a pipeline representation in the target environment.

Running: Instantiating the pipeline, applying runtime parameters and running all pipeline steps involved to completion.

Note: Local runners usually skip compilation and deployment and run the pipeline straight away.

TFX allows the parameterization of Pipelines in most lifecycle stages:

  Parameter type           Description                                                    Example
  Named Constants          Code constants                                                 ANN layer size
  Compile-time parameter   Parameters that are unlikely to change between pipeline        BigQuery dataset
                           runs, supplied as environment variables to the pipeline
                           function
  Runtime parameter        Parameters exposed as TFX RuntimeParameter, which can be       Number of training runs
                           overridden at runtime to allow simplified experimentation
                           without recompiling the pipeline

The pipeline operator supports the application of compile-time and runtime parameters through its custom resources. We strongly encourage the use of both parameter types to speed up development and experimentation lifecycles. Note that runtime parameters can be initialised to default values from both constants and compile-time parameters.
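The distinction between the two parameter types can be illustrated with the following sketch. This is not real TFX code: in an actual pipeline, the runtime parameter would be a tfx.v1.dsl.experimental.RuntimeParameter, and the environment variable name used here is hypothetical.

```python
import os


def pipeline_parameters():
    # Compile-time parameter: read from an environment variable while the
    # pipeline function is evaluated during compilation, then baked into
    # the compiled artifact. "BIGQUERY_DATASET" is a made-up name.
    dataset = os.environ.get("BIGQUERY_DATASET", "default_dataset")

    # Runtime parameter: a named placeholder with a default value, resolved
    # per run -- it can be overridden without recompiling the pipeline.
    num_training_runs = {"name": "num_training_runs", "default": 1}

    return dataset, num_training_runs
```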

Eventing Support

The Kubeflow Pipelines operator can optionally be installed with Argo-Events eventsources, which let users react to events.
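As an illustration, a consumer of a run-completion event might look like the following sketch. The payload fields `status` and `pipelineName` are assumptions made for this example only; consult the Run Completion Eventsource reference for the actual event schema.

```python
import json


def on_run_completion(raw_event: bytes) -> str:
    """Decide how to react to a (hypothetical) run-completion payload."""
    event = json.loads(raw_event)
    # Assumed fields for illustration: "status" and "pipelineName".
    if event.get("status") == "succeeded":
        return f"deploy model trained by {event.get('pipelineName')}"
    return "ignore"
```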

Currently, we support the following eventsources:

  • Run Completion Eventsource

Architecture Overview

architecture.svg

+>>>>>>> 3da041c (Add autogenerated website files) \ No newline at end of file diff --git a/docs/docs/index.xml b/docs/docs/index.xml index b328f1667..6541b91cb 100644 --- a/docs/docs/index.xml +++ b/docs/docs/index.xml @@ -115,6 +115,7 @@ We suggest an exponential backoff with min and max backoff set to at least 10 se <p><!-- raw HTML omitted -->*<!-- raw HTML omitted --> fields only needed if the operator is installed with <a href="../../getting-started/overview/#eventing-support">eventing support</a></p>
Docs: Overviewhttps://sky-uk.github.io/kfp-operator/docs/getting-started/overview/Mon, 01 Jan 0001 00:00:00 +0000https://sky-uk.github.io/kfp-operator/docs/getting-started/overview/ <p>The Kubeflow Pipelines Operator (KFP Operator) provides a declarative API for managing and running ML pipelines with Resource Definitions on multiple providers. A provider is a runtime environment for managing and executing ML pipelines and related resources.</p> +<<<<<<< HEAD <h3 id="why-kfp-operator">Why KFP Operator</h3> <p>We started this project to promote the best engineering practices in the Machine Learning process, while reducing the operational overhead associated with deploying, running and maintaining training pipelines. We wanted to move away from a manual, opaque, copy-and-paste style deployment and closer to a declarative, traceable, and self-serve approach.</p> <p>By configuring simple Kubernetes resources, machine learning practitioners can run their desired training pipelines in each environment on the path to production in a repeatable, testable and scalable way. 
When linked with serving components, this provides a fully testable path to production for machine learning systems.</p> @@ -122,6 +123,60 @@ A provider is a runtime environment for managing and executing ML pipelines and <p>Through separating training code from infrastructure, KFP Operator provides the link between CD and CT to provide Level 2 of the <a href="https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning#mlops_level_2_cicd_pipeline_automation">MLOps Maturity model</a>.</p> <p><img src="https://sky-uk.github.io/kfp-operator/images/mlops-maturity.svg" alt="mlops maturity level"></p>Docs: Pipelinehttps://sky-uk.github.io/kfp-operator/docs/reference/resources/pipeline/Mon, 01 Jan 0001 00:00:00 +0000https://sky-uk.github.io/kfp-operator/docs/reference/resources/pipeline/ +======= +<h2 id="compatibility">Compatibility</h2> +<p>The operator currently supports</p> +<ul> +<li>TFX Pipelines with Python 3.7 and 3.9 - pipelines created using the KFP DSL are not supported yet</li> +<li>KFP standalone (a full KFP installation is not supported yet) and Vertex AI</li> +</ul> +<h2 id="tfx-pipelines-and-components">TFX Pipelines and Components</h2> +<p>Unlike imperative Kubeflow Pipelines deployments, the operator takes care of providing all environment-specific configuration and setup for the pipelines. Pipeline creators therefore don&rsquo;t have to provide DAG runners, metadata configs, serving directories, etc. Furthermore, pusher is not required and the operator can extend the pipeline with this very environment-specific component.</p> +<p>For running a pipeline using the operator, only the list of TFX components needs to be returned. Everything else is done by the operator. 
See the <a href="https://github.com/sky-uk/kfp-operator/blob/master/docs-gen/includes/quickstart/penguin_pipeline/pipeline.py">penguin pipeline</a> for an example.</p> +<h3 id="lifecycle-phases-and-parameter-types">Lifecycle phases and Parameter types</h3> +<p>TFX Pipelines go through certain lifecycle phases that are unique to this technology. It is helpful to understand where these differ and where they are executed.</p> +<p><strong>Development:</strong> Creating the components definition as code.</p> +<p><strong>Compilation:</strong> Applying compile-time parameters and defining the execution runtime (aka DAG runner) for the pipeline to be compiled into a deployable artifact.</p> +<p><strong>Deployment:</strong> Creating a pipeline representation in the target environment.</p> +<p><strong>Running:</strong> Instantiating the pipeline, applying runtime parameters and running all pipeline steps involved to completion.</p> +<p><em>Note:</em> Local runners usually skip compilation and deployment and run the pipeline straight away.</p> +<p>TFX allows the parameterization of Pipelines in most lifecycle stages:</p> +<table> +<thead> +<tr> +<th>Parameter type</th> +<th>Description</th> +<th>Example</th> +</tr> +</thead> +<tbody> +<tr> +<td>Named Constants</td> +<td>Code constants</td> +<td>ANN layer size</td> +</tr> +<tr> +<td>Compile-time parameter</td> +<td>Parameters that are unlikely to change between pipeline runs, supplied as environment variables to the pipeline function</td> +<td>BigQuery dataset</td> +</tr> +<tr> +<td>Runtime parameter</td> +<td>Parameters exposed as TFX <a href="https://www.tensorflow.org/tfx/api_docs/python/tfx/v1/dsl/experimental/RuntimeParameter?hl=en">RuntimeParameter</a>, which can be overridden at runtime to allow simplified experimentation without having to recompile the pipeline</td> +<td>Number of training runs</td> +</tr> +</tbody> +</table> +<p>The pipeline operator supports the application of compile-time and runtime parameters through
its custom resources. We strongly encourage the use of both of these parameter types to speed up development and experimentation lifecycles. Note that runtime parameters can be initialised to default values from both constants and compile-time parameters.</p> +<h2 id="eventing-support">Eventing Support</h2> +<p>The Kubeflow Pipelines operator can optionally be installed with <a href="https://argoproj.github.io/argo-events/">Argo-Events</a> eventsources, which let users react to events.</p> +<p>Currently, we support the following eventsources:</p> +<ul> +<li><a href="../../reference/run-completion">Run Completion Eventsource</a></li> +</ul> +<h2 id="architecture-overview">Architecture Overview</h2> +<p><img src="https://sky-uk.github.io/kfp-operator/images/architecture.svg" alt="architecture.svg"></p>Docs: Pipelinehttps://sky-uk.github.io/kfp-operator/docs/reference/resources/pipeline/Mon, 01 Jan 0001 00:00:00 +0000https://sky-uk.github.io/kfp-operator/docs/reference/resources/pipeline/ +>>>>>>> 3da041c (Add autogenerated website files) <p>The Pipeline resource represents the lifecycle of ML pipelines. Pipelines can be created, updated and deleted via this resource. The operator compiles the pipeline into a deployable artifact while providing compile time parameters as environment variables.