Skip to content

Commit

Permalink
docs: Add glossary to documentation (#1127)
Browse files Browse the repository at this point in the history
  • Loading branch information
aneojgurhem authored Jun 14, 2023
2 parents 017599f + 84bbd86 commit 19e2555
Showing 1 changed file with 79 additions and 0 deletions.
79 changes: 79 additions & 0 deletions .docs/content/0.armonik/1.glossary.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
# Glossary

## Client

User-developed software that communicates with the ArmoniK Control Plane to submit a list of tasks to be executed (by one or several workers) and retrieves results and error.

## Compute Plane

Agreed term designating the set of compute pods, i.e., the pods running a Scheduling Agent + Worker pair within the Kubernetes cluster.

## Container

A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. (see Docker [documentation](https://www.docker.com/resources/what-container/))

## Control Plane

Agreed term designating two different things depending on the context. Kubernetes and ArmoniK each have a Control Plane.
* The Kubernetes Control Plane is a set of components that manage global decision-making for the cluster, as well as the detection and management of cluster events.
* the ArmoniK Control Plane is a set of applications that act as a bridge between the *client* and the Compute Plane. In particular, it handles communications between certain components such as the queue, Redis and MongoDB.

## Data dependency

Input data for a given task that depends on another unique task. Data dependencies formalize dependencies between tasks.

## Data Plane

Expression designating the set of software components running the various storage and database systems within ArmoniK.

## Kubernetes

Kubernetes is an open source container orchestration engine for automating deployment, scaling and management of containerized applications. The open source project is hosted by the Cloud Native Computing Foundation ([CNCF](https://www.cncf.io/about)) (see Kubernetes [documentation](https://kubernetes.io/docs/home/#:~:text=Kubernetes%20is%20an%20open%20source,and%20management%20of%20containerized%20applications.)).

## MongoDB

MongoDB is a document database designed for ease of application development and scaling (see MongoDB [documentation](https://www.mongodb.com/docs/manual/)).

## Node

In the context of Kubernetes, a node is a machine that runs containerized workloads as part of a Kubernetes cluster. A node can be a physical or virtual machine hosted in the cloud or on-premise.

## Partition

Logical segmentation of the Kubernetes cluster's pool of machines to distribute workloads according to usage. This feature is provided and handled by ArmoniK.

## Payload

Input data for a task that does not depend on any other task.

## Pod

Pods are the smallest deployable units of computing that one can create and manage in Kubernetes. A Pod is a group of one or more containers, with shared storage and network resources and a specification for how to run the containers. A Pod's contents are always co-located and co-scheduled and run in a shared context. A Pod models an application-specific "logical host": it contains one or more application containers which are relatively tightly coupled (see Kubernetes [documentation](https://kubernetes.io/docs/concepts/workloads/pods/)).

## Polling agent

Former term for scheduling agent.

## Redis

Redis is an open source (BSD licensed), in-memory data structure store used as a database, cache, message broker and streaming engine. Redis provides data structures such as strings, hashes, lists, sets and sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster (see [Redis documentation](https://redis.io/docs/about/)). In ArmoniK, Redis is used as a key-value cache for task data (such as payloads and results).

## Scheduling agent

Containerized software cohabiting with a worker within a pod, running a specific algorithm to determine which tasks "its" worker (the one with which it shares the pod) should perform. It also manages all interactions between the worker and the databases (retrieving/saving data, creating new tasks, etc.), as well as managing worker errors and retrying/resubmitting failed tasks when necessary. A scheduling agent, like a worker, exists within a single partition.

## Session

A session is a logical container for tasks and associated data (task statut, results, errors, etc). Every task is submitted within a session. An existing session can be resumed to retrieve data or submit new tasks. When a session is cancelled, all associated executions still in progress are interrupted.

## Submitter

Containerized software in charge of submitting tasks, i.e., writing the corresponding data to the various databases (queue, Redis and MongoDB).

## Task

Atomic computation taking one or several input data and outputting one or several results. A task is launched by a client and processed by a worker. In ArmoniK, tasks cannot communicate with each other. They can, however, depend on each other via their input/output data, known as data dependency.

## Worker

User-developed containerized software capable of performing one or several tasks depending on its implementation. A worker can simply take input data and perform calculations on it to return a result. A worker can also submit new tasks that will be self-performed, or by different workers, other instances of itself.

0 comments on commit 19e2555

Please sign in to comment.