Problem
Multi-session, multi-hop questions like "How many properties did I view before making an offer on the Brookside townhouse?" fail with current retrieval. The answer spans 4+ separate sessions with no semantic overlap to the query. Vector search returns results about Brookside itself, missing the earlier property viewings entirely.
Current architecture is read-optimized: store raw memories, try to be clever at query time with graph expansion and temporal filtering. But multi-hop temporal reasoning requires a query planner / agentic RAG layer on top — which defeats the purpose of a memory system.
Proposed Solution: Episode Memories
Flip the paradigm: write-time reasoning instead of read-time reasoning.
At ingest time, detect when memories belong to an ongoing episode (an activity/goal spanning multiple sessions) and auto-generate progressive summary memories that capture the narrative state.
How it works
- EpisodeDetector — At ingest, classify whether a new memory belongs to an existing episode
  - Embedding similarity to existing episode summaries
  - Or LLM-based classification (more accurate, higher cost)
  - Episodes represent activities/goals: "house hunting", "planning a trip", "job search"
- EpisodeSummarizer — When a memory joins an episode, regenerate the summary
  - Progressive: each new memory triggers an update, not a full recompute
  - Example output: "User is house hunting. Properties viewed: bungalow (rejected — kitchen renovation needed), Cedar Creek (rejected — over budget), 1-bed condo (rejected — highway noise), 2-bed condo (rejected — outbid). Offer made on Brookside townhouse $340k, accepted."
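A progressive update can be framed as a single LLM call that folds one new memory into the existing summary, instead of re-summarizing every constituent memory. A rough sketch, where `call_llm` is a placeholder for whatever completion client is used:

```python
# Hypothetical sketch of a progressive summary update. The prompt wording
# and the call_llm callable are illustrative assumptions, not a real API.

def build_update_prompt(current_summary: str, new_memory: str) -> str:
    """Fold one new memory into the running episode summary."""
    return (
        "You maintain a running summary of an ongoing user activity.\n"
        f"Current summary:\n{current_summary}\n\n"
        f"New memory:\n{new_memory}\n\n"
        "Rewrite the summary to incorporate the new memory. "
        "Preserve the narrative state (what was tried, rejected, decided) "
        "and keep it under the configured length limit."
    )

def update_episode_summary(episode: dict, new_memory: str, call_llm) -> str:
    """Replace the episode's summary with the LLM-updated version."""
    episode["summary"] = call_llm(build_update_prompt(episode["summary"], new_memory))
    return episode["summary"]
```

Because the input is one summary plus one memory, the update cost stays roughly constant as the episode grows.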
- Episode summaries as first-class memories
  - Stored with source_type: "episode_summary"
  - Directly searchable via vector search — no multi-hop needed
  - Graph links to constituent memories for drill-down
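Storing the summary plus its drill-down edges could look something like this. The `store` dict is an in-memory stand-in for the real memory and graph stores; all names are illustrative:

```python
# Hypothetical sketch: persist an episode summary as a regular memory,
# tagged with source_type "episode_summary", plus one graph edge per
# constituent memory so recall can drill down to the raw evidence.

def store_episode_summary(store: dict, episode_id: str,
                          summary: str, member_ids: list[str]) -> None:
    store["memories"][episode_id] = {
        "text": summary,
        "source_type": "episode_summary",  # marks it as derived, not raw
    }
    # drill-down edges: episode_summary -> constituent memory
    store["edges"].extend((episode_id, mid) for mid in member_ids)
```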
Why this is novel
- Most memory systems optimize the READ path (embeddings, reranking, graph traversal)
- This optimizes the WRITE path — invest compute at ingest to make retrieval trivially easy
- Mirrors how human memory works (schemas/narratives, not isolated facts)
- More token-efficient at query time (one summary vs. 10 raw messages)
- Solves multi-hop without needing agentic retrieval
Implementation Sketch
New components
- EpisodeDetector service — episode matching/creation at ingest time
- EpisodeSummarizer service — progressive summary generation
- EpisodeStore — persistence layer (could extend GraphStore or be standalone)
Integration points
- Hook into remember() flow after memory storage
- Episode summaries written back via remember() with special source_type
- Graph edges: episode_summary → constituent memories
- Recall: episode summaries naturally surface via existing vector search
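The integration points above can be shown end to end with stubbed detection (an explicit topic key) and stubbed summarization (concatenation). Every name here is an illustrative assumption about the surrounding system, not the project's actual API:

```python
# Minimal end-to-end sketch of the remember() hook. Detection and
# summarization are stubbed; the point is where episode handling sits
# relative to raw memory storage and the write-back of the summary.
import itertools

_ids = itertools.count()

def remember(state: dict, text: str, topic: str) -> str:
    """Store a raw memory, then fold it into its episode."""
    mid = f"mem-{next(_ids)}"
    state["memories"][mid] = {"text": text, "source_type": "raw"}  # existing path
    # stub detector: episode keyed by topic instead of embedding/LLM matching
    ep = state["episodes"].setdefault(topic, {"members": [], "summary": ""})
    ep["members"].append(mid)
    ep["summary"] = (ep["summary"] + " " + text).strip()  # stub summarizer
    # summary written back as a first-class, searchable memory
    state["memories"][f"ep-{topic}"] = {
        "text": ep["summary"],
        "source_type": "episode_summary",
    }
    return mid
```

Because the summary re-enters the store through the normal write path, recall needs no changes: existing vector search surfaces it like any other memory.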
Configuration
episodes.enabled: true/false
episodes.detector: "embedding" | "llm"
episodes.summarizer_model: "gpt-4o-mini" (or local)
episodes.similarity_threshold: 0.7
episodes.max_summary_length: 500
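If the project uses a YAML config file, the settings above might be laid out like this (a hypothetical layout, shown only to make the key grouping concrete):

```yaml
episodes:
  enabled: true
  detector: embedding        # or "llm" for higher accuracy at higher cost
  summarizer_model: gpt-4o-mini   # or a local model
  similarity_threshold: 0.7
  max_summary_length: 500
```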
Success Criteria
- LongMemEval multi-session questions that currently score 0% should achieve >60% accuracy
- Episode detection precision >80% (memories correctly assigned to episodes)
- Write latency increase <2x (episode detection + summary update)
- No degradation on single-hop queries (existing LoCoMo benchmark stays at 100%)
Discovery: LongMemEval Benchmark (2026-02-10)
Found during v0.7.3 LongMemEval benchmark run. Q1 ("How many properties before Brookside?") returned 10 results all about Brookside itself — 0% retrieval quality on multi-hop questions. Temporal reasoning and graph expansion exist but have no query planner to decompose multi-step questions. Episode memories solve this at the source.