epic: platform foundation roadmap — P0 through P3

## Overview

Platform-level epic tracking all foundation and application work identified in the cross-repo coherence audit and Gastown gap analysis. Organised by phase: each phase has a gate condition that must be met before work moves on to the next phase. P0 and P1 address correctness and scale; P2 addresses production quality; P3 expands capability.

Effort: **S** = 2–5 days · **M** = 1 week · **L** = 2–3 weeks · **XL** = months
Priority within phase: items are listed in recommended execution order.

---

## P0 — Wiring (do first — breaks immediately with multiple agents)

**Gate:** Normative layer functional end-to-end. Trust accumulates from real agent behaviour. Same actor identity across all repos.

| # | Item | Repo | Complexity | Effort | Issue | Untracked deps |
|---|------|------|-----------|--------|-------|----------------|
| 1 | Commitment outcomes → LedgerAttestation (FULFILLED→SOUND, FAILED→FLAGGED) | quarkus-qhorus | Low | S | [qhorus#123](https://github.com/casehubio/quarkus-qhorus/issues/123) | — |
| 2 | ActorTypeResolver utility — unified ActorType derivation across all consumers | quarkus-ledger | Low | S | [ledger#47](https://github.com/casehubio/quarkus-ledger/issues/47) | — |
| 3 | InstanceActorIdProvider SPI — map Qhorus instanceId → ledger actorId (persona format) | quarkus-qhorus | Medium | S | [qhorus#124](https://github.com/casehubio/quarkus-qhorus/issues/124) | ledger#47 done first |
| 4 | Normative→prescriptive wiring — CaseHub work assignments send Qhorus COMMAND | casehub-engine | High | L | [engine#186](https://github.com/casehubio/engine/issues/186) | qhorus#123 done first |

**Why this order:** 1 and 2 are standalone, have no inter-dependencies, and offer the highest leverage. 3 depends conceptually on the identity model from 2. 4 is last because it depends on the commitment lifecycle being functional (1) and is the most invasive change: it touches CaseContextChangedEventHandler and WorkOrchestrator, and requires a WorkerResponseHandler.

**Risk in P0:** Item 4 (engine#186) requires `CaseLedgerEntry` to be on main for the ledger side to work. If the branch merge (P1.1) is not done first, item 4 must be implemented without ledger integration and revisited. Consider pulling P1.1 into P0 if the branch is close to mergeable.
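
To make the scope of item 1 concrete, the outcome mapping named in the table (FULFILLED→SOUND, FAILED→FLAGGED) could look roughly like the sketch below. The class, enum, and method names are hypothetical and are not taken from quarkus-qhorus or quarkus-ledger; the `OPEN` state is an assumption added to show that non-terminal commitments produce no attestation.

```java
import java.util.Optional;

/** Hypothetical sketch only: none of these names are the real qhorus or ledger API. */
final class CommitmentAttestationSketch {

    /** FULFILLED and FAILED come from P0 item 1; OPEN is an assumed non-terminal state. */
    enum CommitmentOutcome { OPEN, FULFILLED, FAILED }

    /** Attestation verdicts named in P0 item 1. */
    enum AttestationVerdict { SOUND, FLAGGED }

    /** Maps an outcome to a verdict; empty means there is nothing to attest yet. */
    static Optional<AttestationVerdict> toVerdict(CommitmentOutcome outcome) {
        switch (outcome) {
            case FULFILLED:
                return Optional.of(AttestationVerdict.SOUND);
            case FAILED:
                return Optional.of(AttestationVerdict.FLAGGED);
            default:
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Print the mapping for every outcome, including the non-terminal one.
        for (CommitmentOutcome outcome : CommitmentOutcome.values()) {
            System.out.println(outcome + " -> " + toVerdict(outcome));
        }
    }
}
```

Returning an empty `Optional` for non-terminal states keeps the caller from writing an attestation before the commitment has actually resolved.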

---

## P1 — Scale (breaks at 10+ concurrent cases/agents)

**Gate:** Can run 10+ simultaneous cases without manual intervention, API exhaustion, or stuck agents. Trust scores drive routing decisions.

| # | Item | Repo | Complexity | Effort | Issue | Notes |
|---|------|------|-----------|--------|-------|-------|
| 1 | Merge CaseLedgerEntry branch (`feat/casehub-ledger-integration`) | casehub-engine | Medium | M | *(not tracked)* | Resolve merge conflicts; verify OTel propagation via @EntityListeners inheritance; add invariant test |
| 2 | Agent concurrency throttling — SpawnThrottle in ClaudonyConfig (global + per-case ceiling, back-pressure queue) | claudony | Medium | M | *(not tracked)* | No inter-dependencies; pure Claudony addition |
| 3 | RecoveryPolicy SPI — detect stalled workers and take action (REPROVISION / ESCALATE / CANCEL / WAIT) | casehub-engine + claudony | Medium | M | *(not tracked)* | SPI in engine api/spi/; ReprovisioningRecoveryPolicy in claudony-casehub |
| 4 | Trust routing wired — WorkerSelectionStrategy injectable in CaseContextChangedEventHandler + TrustWeightedSelectionStrategy | casehub-engine | Medium | M | *(not tracked)* | Depends on P0.1 (trust scores must be computed before routing on them) |

**Why this order:** 1 unblocks the compliance story and should be done first. 2 and 3 are independent and can run in parallel. 4 depends on P0.1 being complete — routing by trust is pointless if trust scores are never updated from behaviour.
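
The dependency of item 4 on P0.1 can be illustrated with a minimal sketch of an injectable selection strategy. Everything here is an assumption for illustration: the real `WorkerSelectionStrategy` signature is not given in this epic, worker IDs are modelled as plain strings, and the strategy simply picks the highest-scoring candidate rather than sampling in proportion to trust.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch only: the SPI shape and the score source are assumptions. */
final class TrustRoutingSketch {

    /** Assumed SPI shape: choose one worker from the eligible candidates. */
    interface WorkerSelectionStrategy {
        Optional<String> select(List<String> candidateWorkerIds);
    }

    /** Picks the candidate with the highest trust score; ties go to the smallest ID. */
    static final class TrustWeightedSelectionStrategy implements WorkerSelectionStrategy {
        private final Map<String, Double> trustScores;
        private final double defaultScore;

        TrustWeightedSelectionStrategy(Map<String, Double> trustScores, double defaultScore) {
            this.trustScores = Map.copyOf(trustScores);
            this.defaultScore = defaultScore; // used for workers with no recorded behaviour
        }

        @Override
        public Optional<String> select(List<String> candidateWorkerIds) {
            return candidateWorkerIds.stream()
                    .max(Comparator
                            .comparingDouble((String id) -> trustScores.getOrDefault(id, defaultScore))
                            .thenComparing(Comparator.<String>reverseOrder()));
        }
    }

    public static void main(String[] args) {
        WorkerSelectionStrategy warm =
                new TrustWeightedSelectionStrategy(Map.of("agent-a", 0.9, "agent-b", 0.4), 0.5);
        WorkerSelectionStrategy cold =
                new TrustWeightedSelectionStrategy(Map.<String, Double>of(), 0.5);
        List<String> candidates = List.of("agent-b", "agent-a");
        System.out.println("with scores:    " + warm.select(candidates));
        System.out.println("without scores: " + cold.select(candidates));
    }
}
```

With no recorded scores every candidate falls back to the same default, so selection degenerates to the tie-break. That is the practical reason trust routing only pays off once P0.1 is feeding real outcomes into the scores.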

**Risk in P1:** Item 3 (RecoveryPolicy) requires careful design — what constitutes "stalled" at the casehub-engine level vs the qhorus Watchdog level needs to be clearly defined to avoid double-recovery. The three tiers (qhorus Watchdog → casehub-engine WorkerStatusListener → claudony fleet health) need coordinated stall detection thresholds.
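
One possible way to avoid double-recovery is to give each tier a strictly increasing silence threshold, so that at any moment at most one tier owns a stalled worker. The sketch below illustrates that idea only; it is not the agreed design. The SPI signature, the tier enum, the threshold values, and the "reprovision once, then escalate" rule are all assumptions.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical sketch only: SPI shape, tier names, and thresholds are assumptions. */
final class RecoverySketch {

    /** Actions listed for the RecoveryPolicy SPI in P1 item 3. */
    enum RecoveryAction { REPROVISION, ESCALATE, CANCEL, WAIT }

    /** The three tiers from the risk note, ordered innermost to outermost. */
    enum StallTier { QHORUS_WATCHDOG, ENGINE_LISTENER, CLAUDONY_FLEET }

    /** Assumed SPI shape: decide what to do about a worker that has gone silent. */
    interface RecoveryPolicy {
        RecoveryAction onStalled(String workerId, Duration silentFor, int priorAttempts);
    }

    /** Coordinated thresholds: each outer tier waits strictly longer than the inner one. */
    static final class StallThresholds {
        private final Duration watchdog;
        private final Duration engine;
        private final Duration fleet;

        StallThresholds(Duration watchdog, Duration engine, Duration fleet) {
            if (watchdog.compareTo(engine) >= 0 || engine.compareTo(fleet) >= 0) {
                throw new IllegalArgumentException("need watchdog < engine < fleet");
            }
            this.watchdog = watchdog;
            this.engine = engine;
            this.fleet = fleet;
        }

        /** Returns the single tier that owns recovery at this silence, or empty if not stalled. */
        Optional<StallTier> owningTier(Duration silentFor) {
            if (silentFor.compareTo(fleet) >= 0) return Optional.of(StallTier.CLAUDONY_FLEET);
            if (silentFor.compareTo(engine) >= 0) return Optional.of(StallTier.ENGINE_LISTENER);
            if (silentFor.compareTo(watchdog) >= 0) return Optional.of(StallTier.QHORUS_WATCHDOG);
            return Optional.empty();
        }
    }

    /** Engine-tier policy: acts only while the engine tier owns the stall. */
    static final class EngineRecoveryPolicy implements RecoveryPolicy {
        private final StallThresholds thresholds;

        EngineRecoveryPolicy(StallThresholds thresholds) {
            this.thresholds = thresholds;
        }

        @Override
        public RecoveryAction onStalled(String workerId, Duration silentFor, int priorAttempts) {
            Optional<StallTier> owner = thresholds.owningTier(silentFor);
            if (!owner.isPresent() || owner.get() != StallTier.ENGINE_LISTENER) {
                return RecoveryAction.WAIT; // another tier (or nobody) owns this stall
            }
            // Assumed rule: try reprovisioning once, then hand the problem to a human.
            return priorAttempts == 0 ? RecoveryAction.REPROVISION : RecoveryAction.ESCALATE;
        }
    }

    public static void main(String[] args) {
        StallThresholds thresholds = new StallThresholds(
                Duration.ofSeconds(30), Duration.ofMinutes(2), Duration.ofMinutes(10));
        RecoveryPolicy policy = new EngineRecoveryPolicy(thresholds);
        System.out.println(policy.onStalled("worker-1", Duration.ofSeconds(45), 0));
        System.out.println(policy.onStalled("worker-1", Duration.ofMinutes(3), 0));
        System.out.println(policy.onStalled("worker-1", Duration.ofMinutes(3), 1));
    }
}
```

Validating the ordering in the constructor turns a misconfigured deployment into a startup failure instead of two tiers racing to recover the same worker.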

---

## P2 — Production quality (full observability, audit trail, cross-deployment trust)

**Gate:** Full audit trail complete. Case spans correlatable in Jaeger/Grafana. Compliance story holds end-to-end.

| # | Item | Repo | Complexity | Effort | Issue | Notes |
|---|------|------|-----------|--------|-------|-------|
| 1 | OTel trace alignment — PropagationContext.traceId from LedgerTraceIdProvider at case creation | casehub-engine | Low | S | [engine#185](https://github.com/casehubio/engine/issues/185) | One-line change + fallback; quick win |
| 2 | Cross-deployment trust federation — TrustExportService / TrustImportService SPIs | quarkus-ledger | Medium | L | *(not tracked)* | Canonical format design is the hard part; transport (webhook/Kafka) is pluggable |
| 3 | Cross-repo causal chain — causedByEntryId at provisioning; CaseLineageQuery JPA implementation | claudony | High | L | [claudony#94](https://github.com/casehubio/claudony/issues/94) | CaseLineageQuery JPA is non-trivial; requires casehub datasource configured in claudony |

**Why this order:** 1 is a quick win with no dependencies. 2 and 3 are both high-value but complex — run in parallel if capacity allows. 3 depends on CaseLedgerEntry being merged (P1.1).

---

## P3 — Capability expansion (new capabilities on a solid foundation)

**Gate:** P0 and P1 complete. Foundation is solid. Team has capacity for new work.

| # | Item | Repo | Complexity | Effort | Issue | Notes |
|---|------|------|-----------|--------|-------|-------|
| 1 | Notification consolidation — quarkus-work-notifications delegates Slack/Teams to casehub-connectors | quarkus-work + casehub-connectors | Medium | M | [parent#5](https://github.com/casehubio/casehub-parent/issues/5) | Unblocks P3.3 and P3.5 |
| 2 | SLA propagation — case budget bounds child WorkItem and Commitment deadlines | casehub-engine + quarkus-work | Medium | M | [parent#6](https://github.com/casehubio/casehub-parent/issues/6) | Adapter-level change; no foundation changes needed |
| 3 | Critical event notifications — stalled obligations, case faults, escalations → casehub-connectors | qhorus + engine + work + connectors | Medium | M | *(not tracked)* | Depends on P3.1 (unified delivery pipeline first) |
| 4 | Human-in-the-loop end-to-end — casehub-work-adapter: WorkItem COMPLETED → CaseHubReactor.signal() → case continues | casehub-engine + quarkus-work | High | L | *(not tracked)* | Most important HITL integration; currently blocked on engine stability |
| 5 | casehub-assisteddev — AI-assisted development application (merge queue, code review orchestration) | new repo | Very High | XL | *(not tracked — needs its own epic)* | Separate repo; uses foundation primitives; needs domain design first |

---

## Hypothesis test (parallel track — not a blocker for P0-P3)

| # | Item | Repo | Complexity | Effort | Issue | Notes |
|---|------|------|-----------|--------|-------|-------|
| — | Normative layer interoperability experiment — LangChain4j vs CaseHub on production incident scenario | casehub-engine | High | L | [engine#189](https://github.com/casehubio/engine/issues/189) | Can proceed once P0.1 (qhorus#123) is done; generates external evidence for normative layer claims |

---

## Untracked issues to create (P1–P3)

The following items are specified in the roadmap but not yet tracked as GitHub issues:

| Item | Recommended repo | Notes |
|------|-----------------|-------|
| Merge CaseLedgerEntry branch | casehubio/engine | May warrant a PR rather than an issue |
| Agent concurrency throttling (SpawnThrottle) | casehubio/claudony | |
| RecoveryPolicy SPI | casehubio/engine | |
| Trust routing wired (injectable WorkerSelectionStrategy) | casehubio/engine | |
| Cross-deployment trust federation | casehubio/quarkus-ledger | |
| Critical event notifications | casehubio/casehub-parent (cross-repo) | |
| HITL end-to-end (casehub-work-adapter completion) | casehubio/engine | |
| casehub-assisteddev | new repo | Needs its own epic |

---

## Summary

| Phase | Items | Estimated total effort | Gate condition |
|-------|-------|----------------------|----------------|
| **P0 — Wiring** | 4 items | ~3–4 weeks | Normative layer functional; trust accumulates |
| **P1 — Scale** | 4 items | ~4–5 weeks | 10+ agents; no manual intervention needed |
| **P2 — Quality** | 3 items | ~4–6 weeks | Full audit trail; Jaeger correlation |
| **P3 — Expand** | 5 items | ~3 months + XL | New capabilities; casehub-assisteddev is a separate product epic |

**Total to production-quality foundation: ~3–4 months of focused engineering.**
**casehub-assisteddev is a separate product investment beyond the foundation.**

---

## References

- Platform architecture: [casehub-parent PLATFORM.md](https://github.com/casehubio/casehub-parent/blob/main/docs/PLATFORM.md)
- Implementation roadmap: [gastown-casehub-analysis-v2.md §11.4](https://github.com/casehubio/casehub-parent/blob/main/docs/gastown-casehub-analysis-v2.md)
- Platform coherence audit: [casehub-parent#4](https://github.com/casehubio/casehub-parent/issues/4)
- Normative layer body of work: [quarkus-qhorus/docs/normative-framework.md](https://github.com/casehubio/quarkus-qhorus/blob/main/docs/normative-framework.md)
