Kubernetes Jobs Manager Operator

Description

This operator manages the lifecycle of complex workflows that consist of multiple jobs. It keeps them easy to manage without dozens of YAML files or manual tricks to enforce ordering.

Features

Workflow orchestration - Define complex job workflows with dependency management
Parameter inheritance - DRY configuration with params merged from workflow → group → job levels
Parallel execution - Run jobs in parallel within groups or sequentially with dependencies
Prometheus metrics - Built-in observability with metrics for jobs created/succeeded/failed, active jobs count, and reconciliation duration
Resource limits - Configure CPU and memory limits for job containers
Finalizers - Proper cleanup of child jobs when ManagedJob resources are deleted
kubectl plugin - Visualize job dependency graphs directly from the command line
O(1) dependency lookup - Optimized performance for large workflows
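
The O(1) dependency lookup listed above can be pictured as an index keyed by job name. The Python sketch below is only an illustration of that idea, not the operator's Go code; `build_status_index` and `dependencies_met` are hypothetical names, and the real data structures may differ.

```python
# Illustrative sketch only: hypothetical helpers, not the operator's Go code.

def build_status_index(jobs):
    """Map each full job name to its status so lookups avoid scanning the list."""
    return {job["name"]: job["status"] for job in jobs}


def dependencies_met(index, dependencies):
    """Return True when every named dependency has succeeded.

    Each check is a single dictionary lookup (constant time on average),
    regardless of how many jobs the workflow contains.
    """
    return all(index.get(name) == "succeeded" for name in dependencies)


jobs = [
    {"name": "managedjob-sample-first-group-first-job", "status": "succeeded"},
    {"name": "managedjob-sample-first-group-second-job", "status": "running"},
]
index = build_status_index(jobs)

print(dependencies_met(index, ["managedjob-sample-first-group-first-job"]))
print(dependencies_met(index, [
    "managedjob-sample-first-group-first-job",
    "managedjob-sample-first-group-second-job",
]))
```

Building the index once per reconciliation keeps each dependency check independent of workflow size, which is presumably what matters for large workflows.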

Getting Started

Installation with helm

helm repo add raczylo https://lukaszraczylo.github.io/helm-charts/
helm repo update raczylo
helm install jobs-manager raczylo/jobs-manager

Prerequisites for local runs

go v1.16+
kustomize v3.5.4+
docker v19.03.8+
kubectl v1.18.2+

Jobs configuration

apiVersion: jobsmanager.raczylo.com/v1beta1
kind: ManagedJob
metadata:
  labels:
  name: managedjob-sample
spec:
  retries: 3
  params:
    env:
      - name: "FOO"
        value: "bar"
      - name: "QUE"
        value: "pasa"

  # Job groups definitions
  groups:
    - name: "first-group"
      parallel: true
      params:
        env:
          - name: "FEE"
            value: "bee"
      jobs:
        - name: "first-job"
          image: "busybox"
          args:
            - "echo"
            - "Hello world!"
          params:
            env:
              - name: "POO"
                value: "paz"

        - name: "second-job"
          image: "busybox"
          args:
            - "sleep"
            - "10"
        - name: "second-half-job"
          image: "busybox"
          args:
            - "sleep"
            - "10"

    - name: "second-group"
      parallel: true
      jobs:
        - name: "third-job"
          image: "busybox"
          args:
            - "echo"
            - "Hello world!"
          parallel: true

        - name: "fourth-job"
          image: "busybox"
          args:
            - "sleep"
            - "10"
          parallel: false

    - name: "third-group"
      parallel: false
      jobs:
        - name: "fifth-job"
          image: "busybox"
          args:
            - "echo"
            - "Hello world!"
          parallel: true

Examples

More example manifests are available in the config/samples/ directory:

Example	Description
Quick Start	Simple "Hello World" with two sequential jobs
Parallel Processing	Fan-out/fan-in pattern with parallel job execution
Comprehensive Demo	Full-featured example showcasing all capabilities: ConfigMaps, Secrets, volumes, resource limits, multi-group dependencies, and parameter inheritance

How does it look in practice?

managedjob-sample
├── first-group
│   ├── first-job
│   ├── second-job
│   │   └── Depends on: managedjob-sample-first-group-first-job
│   └── second-half-job
│       ├── Depends on: managedjob-sample-first-group-first-job
│       └── Depends on: managedjob-sample-first-group-second-job
├── second-group
│   ├── third-job
│   └── fourth-job
│       └── Depends on: managedjob-sample-second-group-third-job
└── third-group
    ├── fifth-job
    ├── Depends on group: first-group
    └── Depends on group: second-group

If a dependency exists at the group level, the group will not run until all of the groups it depends on have finished successfully. If a dependency exists at the job level, the job will not run until all of the jobs it depends on have finished successfully. Remember that order matters: in the tree above, every dependency points at a group or job listed earlier in the manifest.
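
The tree above is consistent with a simple rule: an item with `parallel: false` waits for everything listed before it at the same level, while an item with `parallel: true` waits for nothing. The Python sketch below applies that rule to the sample manifest. The rule is inferred from the sample output rather than taken from the operator's source, and `derive_dependencies` is a hypothetical helper.

```python
# Illustrative sketch only. The rule below is inferred from the sample tree,
# not taken from the operator's source.

def derive_dependencies(workflow_name, groups):
    """Return (job_deps, group_deps) for a list of group definitions.

    Assumed rule: an item with parallel=False waits for every item listed
    before it at the same level; an item with parallel=True waits for nothing.
    A missing `parallel` key is treated as False.
    """
    job_deps, group_deps = {}, {}
    earlier_groups = []
    for group in groups:
        is_parallel = group.get("parallel", False)
        group_deps[group["name"]] = [] if is_parallel else list(earlier_groups)
        earlier_groups.append(group["name"])

        earlier_jobs = []
        for job in group["jobs"]:
            # Full job names follow the <workflow>-<group>-<job> pattern from the tree.
            full_name = f"{workflow_name}-{group['name']}-{job['name']}"
            job_deps[full_name] = [] if job.get("parallel", False) else list(earlier_jobs)
            earlier_jobs.append(full_name)
    return job_deps, group_deps


groups = [
    {"name": "first-group", "parallel": True, "jobs": [
        {"name": "first-job"}, {"name": "second-job"}, {"name": "second-half-job"}]},
    {"name": "second-group", "parallel": True, "jobs": [
        {"name": "third-job", "parallel": True},
        {"name": "fourth-job", "parallel": False}]},
    {"name": "third-group", "parallel": False, "jobs": [
        {"name": "fifth-job", "parallel": True}]},
]

job_deps, group_deps = derive_dependencies("managedjob-sample", groups)
print(group_deps["third-group"])
print(job_deps["managedjob-sample-first-group-second-half-job"])
```

Under this assumed rule, reordering jobs or groups in the manifest changes the resulting dependencies, which is why the listing order matters.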

Things to remember

Parameters (params) are always merged downwards, from workflow to group to job, to keep your definitions DRY. In this case, the result for the first job looks like this:

    - jobs:
      - args:
        - echo
        - Hello world!
        compiledParams:
          env:
          - name: POO
            value: paz
          - name: FEE
            value: bee
          - name: FOO
            value: bar
          - name: QUE
            value: pasa
        image: busybox
        name: first-job
        parallel: false
        status: succeeded
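
The ordering of `compiledParams.env` above (job entries first, then group, then workflow) can be reproduced with a small merge. The Python sketch below is a guess at the behaviour, not the operator's implementation: `merge_env` is a hypothetical helper, and letting the most specific level win when two levels define the same name is an assumption that the sample output neither confirms nor rules out.

```python
# Illustrative sketch only: `merge_env` is a hypothetical helper, and the
# "most specific level wins" rule for clashing names is an assumption.

def merge_env(job_env, group_env, workflow_env):
    """Merge env lists, listing job entries first, then group, then workflow."""
    merged, seen = [], set()
    for level in (job_env, group_env, workflow_env):
        for entry in level:
            if entry["name"] in seen:
                continue  # a more specific level already set this name
            seen.add(entry["name"])
            merged.append(entry)
    return merged


workflow_env = [{"name": "FOO", "value": "bar"}, {"name": "QUE", "value": "pasa"}]
group_env = [{"name": "FEE", "value": "bee"}]
job_env = [{"name": "POO", "value": "paz"}]

compiled = merge_env(job_env, group_env, workflow_env)
print([entry["name"] for entry in compiled])
```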

Available params

Parameters are quite flexible. On every level where params are allowed, you can define:

params:
  fromEnv:
    - configMapRef:
        name: "configmap-name"
      key: "key-name"
  env:
    - name: "FOO"
      value: "bar"
  volumes:
    - name: secrets-store-api
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: api-secrets-provider
  volumeMount:
    - name: secrets-store-api
      mountPath: "/mnt/secrets-api"
      readOnly: true
  serviceAccount: "service-account-name"
  restartPolicy: "Never"
  imagePullSecrets:
    - "ghcr-token"
  imagePullPolicy:
    - "Always"
  labels:
    this/works: "true"
  annotations:
    this/works/aswell: "true"
  resources:
    limits:
      cpu: "500m"
      memory: "256Mi"
    requests:
      cpu: "100m"
      memory: "128Mi"

Resource limits

You can configure resource limits for job containers at any level (workflow, group, or job). These follow the standard Kubernetes resource requirements:

params:
  resources:
    limits:
      cpu: "1"
      memory: "512Mi"
    requests:
      cpu: "250m"
      memory: "128Mi"

Kustomization and references

If you run into issues with configMapGenerator or secretGenerator, add the following to your kustomization.yaml:

configurations:
  - crd-name-reference.yaml

Then create a crd-name-reference.yaml file with the following content:

---
nameReference:
  - kind: 'ConfigMap'
    fieldSpecs:
      - kind: 'ManagedJob'
        path: 'spec/params/fromEnv[]/configMapRef/name'
      - kind: 'ManagedJob'
        path: 'spec/params/env[]/configMapRef/name'

This instructs kustomize to rewrite ConfigMap references inside ManagedJob resources to the generated (hash-suffixed) names whenever those ConfigMaps are managed by generators.

Running on the cluster

Manual installation

Install instances of the custom resources:

kubectl apply -f config/samples/

Build and push your image to the location specified by IMG:

make docker-build docker-push IMG=ghcr.io/lukaszraczylo/jobs-manager-operator:tag

Deploy the controller to the cluster with the image specified by IMG:

make deploy IMG=ghcr.io/lukaszraczylo/jobs-manager-operator:tag

Manually uninstall CRDs

To delete the CRDs from the cluster:

make uninstall

Manually undeploy controller

Undeploy the controller from the cluster:

make undeploy

How it works

This project aims to follow the Kubernetes Operator pattern.

It uses Controllers, which provide a reconcile function responsible for synchronizing resources until the desired state is reached on the cluster.
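
As a rough picture of what a reconcile function does, the Python sketch below compares a desired list of jobs with what already exists and creates one missing job per call until nothing is left to do. It is a toy model of the pattern, not the operator's controller: `reconcile`, the job names, and the in-memory `cluster` set are all stand-ins.

```python
# Illustrative sketch only: a toy reconcile step, not the operator's controller.

def reconcile(desired_jobs, existing_jobs):
    """Create one missing job per call and report whether more work remains.

    `desired_jobs` is the ordered list of job names the workflow declares;
    `existing_jobs` is the set of job names already present on the cluster.
    Returns True when the caller should requeue and reconcile again.
    """
    for name in desired_jobs:
        if name not in existing_jobs:
            existing_jobs.add(name)  # stands in for a create call to the API server
            return True
    return False  # actual state matches desired state


desired = ["first-job", "second-job", "second-half-job"]
cluster = {"first-job"}

passes = 0
while reconcile(desired, cluster):
    passes += 1

print(passes, sorted(cluster))
```

Each call is idempotent: once the cluster matches the desired state, further calls change nothing and report that no requeue is needed.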

kubectl Plugin

The operator includes a kubectl plugin for visualizing job dependency graphs. Install it using the provided script:

curl -sSL https://raw.githubusercontent.com/lukaszraczylo/jobs-manager-operator/main/scripts/install-plugin.sh | bash

Once installed, you can visualize ManagedJob workflows:

# Visualize a ManagedJob workflow as an ASCII tree with status colors
kubectl managedjob visualize <managedjob-name> -n <namespace>

# Watch mode - continuously update the visualization
kubectl managedjob visualize <managedjob-name> -w

# Show all ManagedJobs in a namespace
kubectl managedjob list -n <namespace>

# Show status summary for a specific workflow
kubectl managedjob status <managedjob-name> -n <namespace>

Status colors in the visualization:

Green: succeeded
Yellow: running
Red: failed
Gray: pending
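
For readers curious how such a legend maps onto terminal output, the Python sketch below wraps a job name in an ANSI color code chosen by status. The specific escape codes and the `colorize` helper are assumptions for illustration; the plugin's actual rendering code may use different codes or a library.

```python
# Illustrative sketch only: the ANSI codes and `colorize` helper are assumptions,
# not the plugin's actual rendering code.

ANSI_COLORS = {
    "succeeded": "\033[32m",  # green
    "running": "\033[33m",    # yellow
    "failed": "\033[31m",     # red
    "pending": "\033[90m",    # gray (bright black)
}
ANSI_RESET = "\033[0m"


def colorize(name, status):
    """Wrap a job name in the color for its status; unknown statuses stay plain."""
    color = ANSI_COLORS.get(status)
    return f"{color}{name}{ANSI_RESET}" if color else name


for name, status in [("first-job", "succeeded"), ("second-job", "running")]:
    print(colorize(name, status))
```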

Observability

Prometheus Metrics

The operator exposes the following Prometheus metrics:

Metric	Type	Labels	Description
`managedjob_jobs_created_total`	Counter	namespace, workflow, group	Total number of jobs created
`managedjob_jobs_succeeded_total`	Counter	namespace, workflow, group	Total number of jobs that succeeded
`managedjob_jobs_failed_total`	Counter	namespace, workflow, group	Total number of jobs that failed
`managedjob_active_jobs`	Gauge	namespace, workflow	Current number of active jobs
`managedjob_reconciliation_duration_seconds`	Histogram	namespace, workflow	Time taken for reconciliation

ServiceMonitor

If you're using the Prometheus Operator, a ServiceMonitor is included in the Helm chart. Enable it in your values:

serviceMonitor:
  enabled: true
  interval: 30s
  labels: {}

Configuration

The operator supports the following configuration options:

Flag	Environment Variable	Description	Default
`--leader-election-id`	-	Custom leader election ID	`jobs-manager-operator`
`--dev-mode`	-	Enable development logging mode	`false`
-	`LOG_LEVEL`	Logging level (debug, info, warn, error)	`info`

Example Helm values for configuration:

controllerManager:
  manager:
    leaderElectionId: "my-custom-id"
    devMode: true
    env:
      LOG_LEVEL: "debug"

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
