| 1 | +# ADR-017: Pod-Level Defaults and Service Self-Description |
| 2 | + |
| 3 | +**Date:** 2026-03-22 |
| 4 | +**Status:** Accepted |
| 5 | +**Depends on:** ADR-004 (Service Surface Skills), ADR-013 (Context Feeds) |
| 6 | +**Implementation:** docs/plans/2026-03-22-pod-defaults-and-service-self-description.md |
| 7 | + |
| 8 | +## Context |
| 9 | + |
| 10 | +Clawdapus treats agents as untrusted workloads governed by operator-authored pod contracts. The pod file (`claw-pod.yml`) is the deployment source of truth — inspectable, diffable, deterministic. `claw up` compiles it into runtime artifacts. |
| 11 | + |
| 12 | +Two problems have emerged as pods grow: |
| 13 | + |
| 14 | +1. **Operator repetition.** Every claw in a pod repeats the same `cllama`, `cllama-env`, `surfaces`, `feeds`, and `skills` blocks even when they're identical. Tiverton-house (5 claws, 1 infra service, 1 governor) has four services sharing identical cllama, feeds, and surfaces stanzas. YAML anchors mitigate visual noise but not structural duplication. |
| 15 | + |
| 16 | +2. **Service knowledge is in the wrong place.** Feeds are declared by consumers, not providers. A claw that wants market data must know the trading API's endpoint path, TTL, and auth scheme. Service skills are either a single extracted markdown file (`claw.skill.emit`) or a generic hostname+ports stub. Services cannot advertise their capabilities in a structured way that the pod compiler can consume. |
| 17 | + |
| 18 | +Both problems share a root cause: the pod surface lacks inheritance and the compilation pipeline lacks a service descriptor contract. |
| 19 | + |
| 20 | +## Decision |
| 21 | + |
| 22 | +### 1. Pod-Level Defaults |
| 23 | + |
| 24 | +Pod-level `x-claw` gains four new default fields alongside the existing `handles-defaults`: |
| 25 | + |
| 26 | +- `cllama-defaults` — proxy type and provider env keys |
| 27 | +- `surfaces-defaults` — surface list |
| 28 | +- `feeds-defaults` — feed list |
| 29 | +- `skills-defaults` — skill file list |
| 30 | + |
| 31 | +Every claw-managed service inherits these unless it declares its own value for that field. |
| 32 | + |
| 33 | +### 2. Replace-on-Declare with Spread |
| 34 | + |
| 35 | +Override semantics follow one rule: **if a service declares a list field, it replaces the defaults entirely.** |
| 36 | + |
| 37 | +To extend defaults rather than replace them, the service uses a `...` spread token in the list: |
| 38 | + |
| 39 | +```yaml |
| 40 | +skills: |
| 41 | + - ... # defaults expand here |
| 42 | + - ./policy/escalation.md # then this is appended |
| 43 | +``` |
| 44 | + |
| 45 | +- No `...` → full replacement |
| 46 | +- `...` present → defaults splice at that position |
| 47 | +- At most one `...` per list |
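The three rules above can be sketched as a small resolver. This is a minimal illustration, not the `claw up` implementation: the function name `resolve_list` is hypothetical, and treating an explicitly empty list as "replace with nothing" is an assumption the ADR does not spell out.

```python
SPREAD = "..."  # the spread token described in this ADR


def resolve_list(defaults, declared):
    """Apply replace-on-declare with an optional spread token.

    `declared` is None when the service omits the field entirely.
    """
    if declared is None:
        # Field not declared: inherit the pod-level defaults unchanged.
        return list(defaults)
    if declared.count(SPREAD) > 1:
        raise ValueError("at most one '...' is allowed per list")
    if SPREAD not in declared:
        # Declared without a spread: full replacement (assumed to hold for [] too).
        return list(declared)
    # Splice the defaults in at the position of the spread token.
    i = declared.index(SPREAD)
    return declared[:i] + list(defaults) + declared[i + 1:]
```

With defaults `["a.md", "b.md"]`, a declared list of `["...", "./policy/escalation.md"]` would resolve to the two defaults followed by the escalation policy, while `["./policy/escalation.md"]` alone would drop the defaults.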
| 48 | + |
| 49 | +This is more expressive than YAML's built-in reuse mechanisms (the `<<` merge key applies only to mappings, and an alias can only substitute a whole list) while remaining unambiguous. `cllama-defaults.env` is a map and merges additively (service keys win on collision), matching the existing `handles-defaults` pattern. |
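The additive map merge for `cllama-defaults.env` might look like the following sketch. The helper name `merge_env` and the environment keys in the usage note are hypothetical, chosen only for illustration.

```python
def merge_env(default_env, service_env):
    """Merge pod-level env defaults with a service's own env map.

    Keys from both maps are kept; on collision the service value wins.
    """
    merged = dict(default_env)          # start from the pod-level defaults
    merged.update(service_env or {})    # service keys override on collision
    return merged
```

For example, defaults of `{"PROVIDER_KEY": "pod", "REGION": "us"}` merged with a service map of `{"PROVIDER_KEY": "svc"}` would keep `REGION` from the pod and take `PROVIDER_KEY` from the service.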
| 50 | + |
| 51 | +### 3. Service Self-Description (`claw.describe`) |
| 52 | + |
| 53 | +Services declare a structured JSON descriptor via image label: |
| 54 | + |
| 55 | +```dockerfile |
| 56 | +LABEL claw.describe=/app/.claw-describe.json |
| 57 | +``` |
| 58 | + |
| 59 | +The descriptor advertises feeds provided, auth requirements, a human-readable description, and an optional skill file path. `claw up` extracts it from the image (same mechanism as `claw.skill.emit`) and compiles it into the pod manifest. |
| 60 | + |
| 61 | +The descriptor does not contain a service name — deployment identity comes from the pod YAML, not the image. One image can back multiple services. |
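To make the shape concrete, here is a rough sketch of loading such a descriptor. The ADR names only the kinds of content a descriptor carries, so every field name below (`description`, `feeds`, `auth`, `skill`) is an assumption, as is the choice to reject a `name` key outright rather than ignore it. The service names in the example are also hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Descriptor:
    # Hypothetical field names; the ADR lists only the categories of content.
    description: str = ""
    feeds: list = field(default_factory=list)
    auth: dict = field(default_factory=dict)
    skill: Optional[str] = None  # optional skill file path inside the image


def load_descriptor(raw_json: str) -> Descriptor:
    data = json.loads(raw_json)
    if "name" in data:
        # Assumed policy: identity comes from the pod YAML, so refuse a name here.
        raise ValueError("descriptor must not carry a service name")
    return Descriptor(
        description=data.get("description", ""),
        feeds=data.get("feeds", []),
        auth=data.get("auth", {}),
        skill=data.get("skill"),
    )


# One image descriptor, reused by two services named in the pod file.
RAW = ('{"description": "Trading API", '
       '"feeds": [{"name": "market-context", "path": "/feeds/market", "ttl": 60}]}')
shared = load_descriptor(RAW)
by_service = {name: shared for name in ("trader-a", "trader-b")}
```

Because the descriptor carries no identity of its own, the same parsed object can be attached to any number of services that use the image.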
| 62 | + |
| 63 | +### 4. Provider-Owned Feeds with Consumer Subscription |
| 64 | + |
| 65 | +Feeds move from consumer-declared to provider-declared. A service's descriptor advertises its feeds. Consumers subscribe by name: |
| 66 | + |
| 67 | +```yaml |
| 68 | +feeds: [market-context] |
| 69 | +``` |
| 70 | + |
| 71 | +`claw up` resolves the name against a feed registry built from service descriptors. Explicit feed declarations (source + path + ttl) bypass the registry and work as before. |
| 72 | + |
| 73 | +Resolution happens in `claw up` after image inspection, not in the parser. The parser stores unresolved feed names; `claw up` resolves them once the registry exists. |
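The two phases can be sketched as follows, assuming the parser hands over feed entries as either bare names (strings) or explicit declarations (mappings). The function names, the `UnresolvedFeedError` class, the descriptor feed keys, and the rule that two providers may not advertise the same feed name are all illustrative assumptions, not part of the ADR.

```python
class UnresolvedFeedError(Exception):
    """Raised when a subscribed feed name has no provider in the registry."""


def build_registry(feeds_by_service):
    """Build a name -> feed map from {service_name: [descriptor feed dicts]}."""
    registry = {}
    for service, feeds in feeds_by_service.items():
        for feed in feeds:
            name = feed["name"]
            if name in registry:
                # Assumed policy: duplicate feed names are an error.
                raise ValueError(f"feed {name!r} has more than one provider")
            registry[name] = {"source": service, "path": feed["path"], "ttl": feed["ttl"]}
    return registry


def resolve_feeds(declared, registry):
    """Resolve a service's feed list once the registry exists."""
    resolved = []
    for entry in declared:
        if isinstance(entry, dict):
            # Explicit declaration (source + path + ttl): bypass the registry.
            resolved.append(entry)
        elif entry in registry:
            resolved.append(registry[entry])
        else:
            raise UnresolvedFeedError(f"no service provides feed {entry!r}")
    return resolved
```

Keeping `resolve_feeds` separate from parsing mirrors the split described above: the parser only records names, and resolution runs after image inspection has produced the registry.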
| 74 | + |
| 75 | +### 5. Unified Context Document |
| 76 | + |
| 77 | +Generated surface and handle skill files are collapsed into `CLAWDAPUS.md`, giving one generated context document per agent instead of N files. Service descriptions retain their contractual weight: they are still injected into `AGENTS.generated.md` as guide content, but sourced from `CLAWDAPUS.md` sections rather than separate files. |
| 78 | + |
| 79 | +Operator-authored skills (policy files, includes with `mode: reference`) remain as separate mounted files. |
| 80 | + |
| 81 | +### 6. Compile-Time Only |
| 82 | + |
| 83 | +All registration and description happen during `claw up`; there are no runtime self-registration endpoints. The generated compose file and runtime artifacts remain the single source of truth for what's deployed. This preserves the inspectable, diffable deployment model that is Clawdapus's core value proposition. |
| 84 | + |
| 85 | +## Rationale |
| 86 | + |
| 87 | +**Why not runtime self-registration?** Clawdapus's value is deterministic, auditable deployment. If services register at boot, the running state can diverge from what the pod file declares. The compile-time alternative that keeps this property is image self-description, extracted and compiled by `claw up`. |
| 88 | + |
| 89 | +**Why replace-on-declare instead of always-merge?** List merging is inherently ambiguous (append? prepend? deduplicate by what?). Tools that have to reconcile lists across layers (Helm, Kustomize, Ansible) tend to end up with surprising edge cases. Replace is the simplest default. The `...` spread provides controlled extension when needed. |
| 90 | + |
| 91 | +**Why no `name` in the descriptor?** A single image can back multiple compose services. Tiverton-house uses one hermes base image for all traders. Binding the descriptor to a service name would break image reuse. |
| 92 | + |
| 93 | +**Why two-phase feed resolution?** The parser has no image knowledge. Descriptors are extracted from images during `claw up`. Trying to resolve feed names in the parser would require passing image inspection state into the YAML parser, coupling two independent phases. |
| 94 | + |
| 95 | +## Consequences |
| 96 | + |
| 97 | +**Positive:** |
| 98 | +- Pod files shrink: Tiverton-house's per-service `x-claw` blocks reduce to identity plus overrides only. |
| 99 | +- Services self-describe their feeds, auth, and capabilities. Consumers subscribe by name. |
| 100 | +- One generated context document per agent instead of N redundant files. |
| 101 | +- The `...` spread convention is simple and covers list extension, which YAML's `<<` merge key (mappings only) does not. |
| 102 | +- RailsTrail (and future framework adapters) can generate descriptors from introspection, closing the loop between app code and pod contracts. |
| 103 | + |
| 104 | +**Negative:** |
| 105 | +- Breaking change to pod YAML surface. Tiverton-house pod must be rewritten. (Acceptable — it's the only production pod.) |
| 106 | +- Two-phase feed resolution adds a step to `claw up`. Unresolved feeds are a new error category. |
| 107 | +- `claw.skill.emit` becomes redundant once `claw.describe` with a `skill` field is live. Deprecation timeline TBD. |
| 108 | +- The `...` spread is a custom convention. Operators must learn it. (Mitigated by simplicity — one token, one rule.) |