add some documentation
hawkowl committed Oct 11, 2024
1 parent efc24b4 commit 3176e49
Showing 6 changed files with 56 additions and 1 deletion.
1 change: 0 additions & 1 deletion docs/mimo.md

This file was deleted.

22 changes: 22 additions & 0 deletions docs/mimo/README.md
@@ -0,0 +1,22 @@
# MIMO Documentation

The Managed Infrastructure Maintenance Operator, or MIMO, is a component of the Azure Red Hat OpenShift Resource Provider (ARO-RP) which is responsible for automated maintenance of clusters provisioned by the platform.
MIMO specifically focuses on "managed infrastructure", the parts of ARO that are deployed and maintained by the RP and ARO Operator instead of by OCP (in-cluster) or Hive (out-of-cluster).

MIMO consists of two main components, the [Actuator](./actuator.md) and the [Scheduler](./scheduler.md). It is primarily operated through the [Admin API](./admin-api.md).

## A Primer On MIMO

The smallest thing that you can tell MIMO to run is a **Task** (see [`pkg/mimo/tasks/`](../../pkg/mimo/tasks/)).
A Task is composed of reusable **Steps** (see [`pkg/mimo/steps/`](../../pkg/mimo/steps/)), reusing the framework utilised by the AdminUpdate/Update/Install methods in `pkg/cluster/`.
A Task only runs in the scope of a single cluster.
Steps are run in sequence and can return either **Terminal** errors (which cause the running Task to fail and not be retried) or **Transient** errors (which indicate that the Task can be retried later).
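
The Terminal/Transient distinction can be illustrated with a minimal sketch. It is written in Python for brevity and is not the MIMO implementation; every name in it (`TerminalError`, `TransientError`, `Outcome`, `run_task`, and the two example Steps) is hypothetical and chosen only for illustration.

```python
from enum import Enum
from typing import Callable, Dict, Iterable


class TerminalError(Exception):
    """The Task has failed and must not be retried."""


class TransientError(Exception):
    """The Task could not finish now but may be retried later."""


class Outcome(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"            # a Step raised TerminalError
    RETRY_LATER = "retry-later"  # a Step raised TransientError


# A Step is modelled here as a callable acting on one cluster's data (an assumption).
Step = Callable[[Dict[str, object]], None]


def run_task(steps: Iterable[Step], cluster: Dict[str, object]) -> Outcome:
    """Run Steps in order against a single cluster, stopping at the first error."""
    for step in steps:
        try:
            step(cluster)
        except TerminalError:
            return Outcome.FAILED
        except TransientError:
            return Outcome.RETRY_LATER
    return Outcome.SUCCEEDED


# Two hypothetical Steps used only to exercise run_task.
def mark_checked(cluster: Dict[str, object]) -> None:
    cluster["checked"] = True


def api_unavailable(cluster: Dict[str, object]) -> None:
    raise TransientError("API server not reachable yet")


print(run_task([mark_checked, api_unavailable], {}))  # prints "Outcome.RETRY_LATER"
```

Any exception other than the two error types is deliberately left to propagate in this sketch, since the text above only describes Terminal and Transient errors.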

Tasks are executed by the **Actuator** in response to the creation of a **Maintenance Manifest**.
This Manifest is created with the cluster ID (which is elided from the cluster-scoped Admin APIs), the Task ID (which is currently a UUID), and an optional priority, "start after" time, and "start before" time, each of which is filled in with a default if not provided.
The Actuator treats these Maintenance Manifests as a work queue, taking those which are past their "start after" time and executing them in order of earliest "start after" time and priority.
After each Task is run, its result is written into the Manifest as a state (with optional free-form status text).
Manifests past their "start before" times are marked with a "timed out" state and are not run.
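
The work-queue behaviour described above can be sketched as follows. This is an illustration in Python, not the Actuator's code. The default values, the field names, the choice that a lower priority number runs first, and the treatment of a Manifest whose "start after" time equals the current time as due are all assumptions, since the text does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

DEFAULT_PRIORITY = 0                # assumed default; not stated in the document
DEFAULT_WINDOW = timedelta(days=7)  # assumed gap between "start after" and "start before"


@dataclass
class Manifest:
    cluster_id: str
    task_id: str
    priority: int
    start_after: datetime
    start_before: datetime
    state: str = "Pending"


def new_manifest(cluster_id: str, task_id: str, now: datetime,
                 priority: Optional[int] = None,
                 start_after: Optional[datetime] = None,
                 start_before: Optional[datetime] = None) -> Manifest:
    """Build a Manifest, filling in defaults for any optional field left unset."""
    if priority is None:
        priority = DEFAULT_PRIORITY
    if start_after is None:
        start_after = now
    if start_before is None:
        start_before = start_after + DEFAULT_WINDOW
    return Manifest(cluster_id, task_id, priority, start_after, start_before)


def select_runnable(manifests: List[Manifest], now: datetime) -> List[Manifest]:
    """Time out expired Manifests and return the ones that are due, in run order."""
    due = []
    for m in manifests:
        if m.state != "Pending":
            continue
        if now > m.start_before:
            m.state = "TimedOut"   # past "start before": never run
        elif now >= m.start_after:
            due.append(m)          # past "start after": eligible now
    # Earliest "start after" first; ties broken by lower priority number (assumed).
    due.sort(key=lambda m: (m.start_after, m.priority))
    return due


now = datetime(2024, 10, 11, 12, 0, tzinfo=timezone.utc)
queue = [
    new_manifest("cluster-a", "task-later", now, start_after=now + timedelta(hours=1)),
    new_manifest("cluster-a", "task-now", now),
]
runnable = select_runnable(queue, now)  # contains only the "task-now" Manifest
```

Manifests that are not yet due stay in the "Pending" state, so a later pass picks them up once their "start after" time has passed.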

Currently, Manifests are created by the Admin API.
In the future, the Scheduler will create some of these Manifests depending on cluster state/version and wall-clock time, providing the ability to perform tasks like rotations of secrets autonomously.
30 changes: 30 additions & 0 deletions docs/mimo/actuator.md
@@ -0,0 +1,30 @@
# Managed Infrastructure Maintenance Operator: Actuator

The Actuator is the MIMO component that executes tasks.
The process of running tasks looks like this:

```mermaid
graph TD;
START((Start))-->QUERY;
QUERY[Fetch all State = Pending] -->SORT;
SORT[Sort tasks by RUNAFTER and PRIORITY]-->ITERATE[Iterate over tasks];
ITERATE-- Per Task -->ISEXPIRED;
subgraph PerTask[ ]
ISEXPIRED{{Is now > RUNBEFORE?}}-- Yes --> STATETIMEDOUT([State = TimedOut]) --> CONTINUE[Continue];
ISEXPIRED-- No --> DEQUEUECLUSTER;
DEQUEUECLUSTER[Claim lease on OpenShiftClusterDocument] --> DEQUEUE;
DEQUEUE[Actuator dequeues task]--> ISRETRYLIMIT;
ISRETRYLIMIT{{Have we retried the task too many times?}} -- Yes --> STATETIMEDOUT;
ISRETRYLIMIT -- No -->STATEINPROGRESS;
STATEINPROGRESS([State = InProgress]) -->RUN[[Task is run]];
RUN -- Success --> SUCCESS
RUN-- Terminal Error-->TERMINALERROR;
RUN-- Transient Error-->TRANSIENTERROR;
SUCCESS([State = Completed])-->DELEASECLUSTER
TERMINALERROR([State = Failed])-->DELEASECLUSTER;
TRANSIENTERROR([State = Pending])-->DELEASECLUSTER;
DELEASECLUSTER[Release Lease on OpenShiftClusterDocument] -->CONTINUE;
end
CONTINUE-->ITERATE;
ITERATE-- Finished -->END;
```
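
The flowchart can be read as the following loop. This is a minimal Python sketch of the chart only, not the Actuator's code. `MAX_ATTEMPTS`, the `Task` fields, `cluster_lease`, and `actuate` are hypothetical names; the retry limit value and the point at which an attempt is counted are assumptions. The sketch releases the lease on every path out of the leased section, including the retry-limit path, which the chart draws as going straight to Continue.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterator, List

MAX_ATTEMPTS = 3  # hypothetical retry limit; the document does not state one


class TerminalError(Exception):
    """The task failed and must not be retried."""


class TransientError(Exception):
    """The task may succeed if retried later."""


@dataclass
class Task:
    name: str
    run: Callable[[], None]
    run_after: datetime
    run_before: datetime
    priority: int = 0
    attempts: int = 0
    state: str = "Pending"


@contextmanager
def cluster_lease(events: List[str]) -> Iterator[None]:
    """Hypothetical stand-in for the lease on the OpenShiftClusterDocument."""
    events.append("claim")
    try:
        yield
    finally:
        events.append("release")  # released on every exit path


def actuate(tasks: List[Task], now: datetime, events: List[str]) -> None:
    """One pass over the pending tasks, following the flowchart's branches."""
    pending = [t for t in tasks if t.state == "Pending"]
    pending.sort(key=lambda t: (t.run_after, t.priority))
    for task in pending:
        if now > task.run_before:          # expired before it could run
            task.state = "TimedOut"
            continue
        with cluster_lease(events):
            if task.attempts >= MAX_ATTEMPTS:
                task.state = "TimedOut"    # the chart reuses TimedOut here
                continue
            task.attempts += 1             # assumed: counted when dequeued
            task.state = "InProgress"
            try:
                task.run()
            except TerminalError:
                task.state = "Failed"
            except TransientError:
                task.state = "Pending"     # picked up again on a later pass
            else:
                task.state = "Completed"


# Example pass with a single task that succeeds.
start = datetime(2024, 10, 11, tzinfo=timezone.utc)
deadline = datetime(2024, 10, 13, tzinfo=timezone.utc)
demo = Task("noop", lambda: None, start, deadline)
demo_events: List[str] = []
actuate([demo], datetime(2024, 10, 12, tzinfo=timezone.utc), demo_events)
```

After the example pass, `demo.state` is `"Completed"` and `demo_events` records one claim followed by one release.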
Empty file added docs/mimo/admin-api.md
Empty file.
3 changes: 3 additions & 0 deletions docs/mimo/scheduler.md
@@ -0,0 +1,3 @@
# MIMO Scheduler

The MIMO Scheduler is a planned component, but is not yet implemented.
1 change: 1 addition & 0 deletions docs/mimo/writing-tasks.md
@@ -0,0 +1 @@
# Writing MIMO Tasks
