Kubernetes operator for managing heterogeneous AI agent runtimes. One CRD, any runtime, production-ready from day one.
k8s4claw lets you deploy, connect, and operate AI agents on Kubernetes the same way you manage any other workload — declaratively, with built-in persistence, auto-updates, and inter-agent communication.
kind cluster → deploy agent → kubectl get pods → WebSocket chat
Running AI agents in production means solving the same problems over and over: secret management, persistent storage, graceful updates, inter-service communication, and observability. k8s4claw wraps all of this into a single Claw CRD so you can focus on what your agent does, not how it runs.
graph TB
subgraph "Kubernetes Cluster"
OP[k8s4claw Operator]
subgraph "Claw Pod"
INIT["claw-init<br/>(config merge)"]
RT["Runtime Container<br/>(OpenClaw / NanoClaw / ...)"]
IPC["IPC Bus Sidecar<br/>(WAL + DLQ + backpressure)"]
CH["Channel Sidecar<br/>(Slack / Webhook / ...)"]
ARC["Archive Sidecar<br/>(S3 upload)"]
end
STS[StatefulSet]
SVC[Service]
CM[ConfigMap]
SA[ServiceAccount]
PDB[PodDisruptionBudget]
PVC[(PVCs<br/>session / output / workspace)]
SEC[/Secrets/]
OP -->|manages| STS
OP -->|manages| SVC
OP -->|manages| CM
OP -->|manages| SA
OP -->|manages| PDB
STS -->|creates| PVC
STS -.->|runs| INIT
STS -.->|runs| RT
STS -.->|runs| IPC
STS -.->|runs| CH
STS -.->|runs| ARC
CH <-->|"UDS<br/>bus.sock"| IPC
IPC <-->|"WS / TCP / UDS / SSE"| RT
RT -->|reads| CM
RT -->|reads| SEC
ARC -->|mounts| PVC
end
REG[(OCI Registry)]
OBJ[(S3 / MinIO)]
EXT["External Service<br/>(Slack API, etc.)"]
OP -.->|"polls tags<br/>(auto-update)"| REG
ARC -->|uploads| OBJ
CH <-->|API calls| EXT
The IPC Bus is a native sidecar that routes JSON messages between channel sidecars and the AI runtime:
Channel Sidecar ──UDS──► IPC Bus ──Bridge──► Runtime Container
│ WAL │
│ DLQ │
│ Ring │
│Buffer │
└───────┘
- WAL — append-only write-ahead log for at-least-once delivery
- DLQ — BoltDB dead letter queue for messages exceeding retry limits
- Backpressure — ring buffer with high/low watermark flow control
- Bridge protocols — WebSocket (OpenClaw), TCP (PicoClaw), UDS (NanoClaw), SSE (ZeroClaw)
| Runtime | Language | Use Case | Gateway | Probe |
|---|---|---|---|---|
| OpenClaw | TypeScript/Node.js | Full-featured AI assistant platform | 18900 | HTTP |
| NanoClaw | TypeScript/Node.js | Lightweight secure personal assistant | 19000 | TCP |
| ZeroClaw | Rust | High-performance agent runtime | 3000 | HTTP |
| PicoClaw | — | Ultra-minimal serverless agent | 8080 | TCP |
| IronClaw | Rust + WASM | Security/privacy-focused AI assistant | 3001 | HTTP |
| Custom | Any | Bring your own runtime | — | — |
- Kubernetes cluster (v1.28+, or kind / minikube for local dev)
- kubectl configured
- Go 1.23+ (for building from source)
git clone https://github.com/Prismer-AI/k8s4claw.git
cd k8s4claw
# Install CRDs into the cluster
make install
# Run operator locally (or deploy with `make deploy`)
make runkubectl create secret generic llm-api-keys \
--from-literal=ANTHROPIC_API_KEY=sk-ant-xxx \
--from-literal=OPENAI_API_KEY=sk-xxx# my-agent.yaml
apiVersion: claw.prismer.ai/v1alpha1
kind: Claw
metadata:
name: my-agent
spec:
runtime: openclaw
config:
model: "claude-sonnet-4"
credentials:
secretRef:
name: llm-api-keys
persistence:
session:
enabled: true
size: 2Gi
mountPath: /data/session
workspace:
enabled: true
size: 10Gi
mountPath: /workspacekubectl apply -f my-agent.yaml
# Watch it come up
kubectl get claw my-agent -wapiVersion: claw.prismer.ai/v1alpha1
kind: ClawChannel
metadata:
name: slack-team
spec:
type: slack
mode: bidirectional
credentials:
secretRef:
name: slack-bot-token
config:
appId: "A0123456789"Then reference it in your Claw:
spec:
channels:
- name: slack-team
mode: bidirectional- StatefulSet-based with PVC lifecycle, PDB, NetworkPolicy, Ingress
- Per-runtime resource defaults, probes, and graceful shutdown tuning
- Webhook validation: credential requirements, PVC immutability, runtime type lock
- OCI registry polling on cron schedule
- Semver constraint filtering (
^1.x,~2.0.0) - Health-verified rollouts with configurable timeout
- Automatic rollback + circuit breaker after N failures
- Session, output, and workspace PVCs via StatefulSet volumeClaimTemplates
- CSI VolumeSnapshot on cron schedule with retention pruning
- S3-compatible archival sidecar (S3, MinIO, GCS, R2)
import "github.com/Prismer-AI/k8s4claw/sdk"
client, err := sdk.NewClient()
if err != nil {
log.Fatal(err)
}
claw, err := client.Create(ctx, &sdk.ClawSpec{
Runtime: sdk.OpenClaw,
Config: &sdk.RuntimeConfig{
Environment: map[string]string{"MODEL": "claude-sonnet-4"},
},
})make build # Build operator binary
make build-ipcbus # Build IPC Bus binary
make test # Run tests (requires setup-envtest)
make lint # Lint
make vet # Run go vet
make fmt # Run gofmt + goimports
make manifests # Generate CRD YAML
make generate # Generate deepcopy
make docker-build # Build container imageSee CONTRIBUTING.md for the full development guide.
Apache-2.0