Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
183 changes: 183 additions & 0 deletions patterns/cross-cycle-consensus-relay.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,183 @@
---
title: Cross-Cycle Consensus Relay
status: emerging
authors: ["Nikita Dmitrieff (@NikitaDmitrieff)"]
based_on: ["auto-co autonomous AI company framework"]
category: Orchestration & Control
source: "https://github.com/NikitaDmitrieff/auto-co-meta"
tags: [multi-agent, state-management, persistence, long-running-tasks, orchestration, autonomous-loops]
---

## Problem

Autonomous multi-agent loops that run across many cycles (minutes, hours, or days) need a way to reliably transfer context, decisions, and next actions between cycles. In-memory state is lost on crash or restart. Generic checkpoint files don't encode the *structured reasoning* — what was decided, why, and what comes next — that agents need to make good decisions in subsequent cycles.

Without a structured relay mechanism, autonomous loops suffer from:
- **Drift**: each cycle restarts without awareness of prior decisions
- **Repetition**: agents re-debate already-settled questions
- **Stalls**: no convergence signal — agents loop indefinitely without shipping

## Solution

Each agent cycle reads a **consensus relay document** at the start, executes its work, then writes an updated consensus document at the end. The document is not just a checkpoint — it's a structured handoff that encodes current phase, completed actions, active decisions, open questions, and the single most important **next action** for the following cycle.

The relay document is written to a temp file and atomically renamed to prevent partial-write corruption. Every field is a deliberate relay signal, not just a log.

**Core pattern:**

```bash
# auto-loop.sh — the minimal autonomous loop
while true; do
# Each cycle reads the relay baton, executes, writes a new one
claude -p "$(cat PROMPT.md)" \
--context "$(cat memories/consensus.md)"
sleep $CYCLE_INTERVAL
done
```

**Relay document structure:**

```markdown
# Relay Consensus

## Last Updated

2026-03-07T03:00:00Z

## Current Phase

Integration and validation in progress

## What We Did This Cycle

- Completed the deployment checklist for the worker service
- Validated the retry path against the staging environment

## Key Decisions Made

- Standardized on a single relay schema for all cycles
- Deferred load testing until the integration path is stable

## Active Projects

- worker-service: deployed to staging — next: add latency instrumentation
- api-integration: in progress — next: finalize auth handoff

## Metrics

- Successful runs: 42
- Failed runs: 3
- Mean cycle time: 11m

## Next Action

Finish auth handoff between the scheduler and the API gateway

## Open Questions

- Should stalled-cycle detection trigger after 2 repeats or 3?
```

**Atomic write protocol:**

```bash
# Write to temp first, then rename — prevents partial-write corruption
cat > memories/.consensus.tmp << EOF
$(generate_updated_consensus)
EOF
mv memories/.consensus.tmp memories/consensus.md
```

## How to use it

**Best for:**

- Long-running autonomous agent loops (multi-cycle, multi-day execution)
- Multi-agent systems where context and decisions must survive restarts
- Teams of agents where shared understanding of "what we've decided" matters
- Any system where you want convergence toward shipping rather than endless discussion

**Implementation steps:**

1. **Define your relay document schema** — every field should serve the *next* cycle, not just log the current one:

```markdown
## Current Phase # Where are we in the journey?
## What We Did # What happened? (recent history)
## Key Decisions # What was decided and why? (institutional memory)
## Active Projects # What's in flight? (state)
## Next Action # Single most important thing next cycle (directive)
## Open Questions # What's unresolved? (forward pressure)
```

2. **Add convergence detection** — if the same Next Action appears for N consecutive cycles, the loop is stalled and must change direction:

```bash
LAST_ACTION=$(grep "## Next Action" -A1 memories/consensus.md | tail -1)
PREV_ACTION=$(grep "## Next Action" -A1 memories/prev-consensus.md | tail -1 2>/dev/null || echo "")

if [ "$LAST_ACTION" = "$PREV_ACTION" ]; then
echo "CONVERGENCE RULE TRIGGERED: same next action twice — forcing direction change"
# Inject this signal into the prompt for the next cycle
fi
```

3. **Encode the relay read explicitly in the prompt** — the agent must know this is a relay document, not just context:

```
Read memories/consensus.md. This is your relay baton — it tells you what
was decided before you, and what you must do next. After completing your
work, update it with what you did, what you decided, and the next action
for the cycle after you.
```

4. **Separate ephemeral logs from relay state** — don't put everything in the consensus. Per-cycle logs go in `logs/`, per-agent outputs go in `docs/<role>/`. The consensus is the relay, not the archive.

**Convergence rules (optional but recommended):**

```
- Cycle 1: Brainstorm. End with top 3 ranked options.
- Cycle 2: Pick #1. Run pre-mortem + validation. GO / NO-GO.
- Cycle 3+: GO → ship artifacts. Discussion FORBIDDEN.
- Same Next Action 2 consecutive cycles → stalled. Change direction now.
- Every cycle after Cycle 2 MUST produce artifacts (files, deployments).
```

## Trade-offs

**Pros:**

- Agents resume with full context even after crash, restart, or long pause
- Structured relay prevents context drift across cycles
- Open questions create forward pressure — cycles don't stall in comfort
- Convergence rules can be encoded directly in the relay document
- Human-readable: anyone can inspect the relay and understand the workflow state
- Git-friendly: diffs show exactly what changed between cycles

**Cons:**

- Schema discipline required — a poorly structured relay document degrades agent reasoning
- Relay grows over time; needs periodic summarization / archival of old decisions
- Single-file relay is a bottleneck for high-frequency loops (>1 cycle/minute)
- Context window limits constrain how large the relay can grow
- Sensitive state (API keys, credentials) must never be written to the relay file

**Operational considerations:**

- Keep the relay document under ~2000 tokens to leave room for agent reasoning
- Archive old cycles to `memories/archive/` when the relay grows unwieldy
- Use atomic writes (write to temp, rename) to prevent partial-write corruption
- Commit the relay to git after each cycle for full audit trail
- Define what "done" looks like in the relay — open-ended phases create drift

## Known Implementations

- [auto-co framework](https://github.com/NikitaDmitrieff/auto-co-meta) — open-source autonomous loop framework using a relay document across repeated cycles

## References

- [Anthropic Engineering: Effective harnesses for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)
- [BabyAGI](https://github.com/yoheinakajima/babyagi) — early task-loop example with persistent task artifacts
- Related: [Filesystem-Based Agent State](filesystem-based-agent-state.md) — for checkpointing within a single cycle
- Related: [Proactive Agent State Externalization](proactive-agent-state-externalization.md) — for agents that self-initiate state writes
- Related: [Initializer-Maintainer Dual Agent Architecture](initializer-maintainer-dual-agent.md) — for session handoff artifacts across repeated work cycles
Loading