137 changes: 137 additions & 0 deletions .specs/agent-memory-ontology/design.md
@@ -0,0 +1,137 @@
# Design Document: Agent Memory Ontology (Graphiti Service)

## Overview

Copilot Chat and Codex both ingest conversation history into Graphiti so agents can recall durable context (preferences, terminology, ownership, project state) across sessions and workspaces. Today, each client has to “decide” what schema to use (entity types, relation types) and how to format episodes. This leads to drift and makes it hard to evolve the memory graph consistently.

This feature adds a **Graphiti-side ontology registry** so clients can select a named schema (starting with `agent_memory_v1`) and get consistent extraction behavior. It also hardens the Graphiti service’s async ingestion so queued jobs run reliably without blocking the request path.

### Goals

- Provide a **single source of truth** for the “agent memory” schema (entity types + relation types) on the Graphiti service.
- Allow clients to opt into the schema via an explicit `schema_id`, and default automatically for `<graphiti_episode …>` payloads.
- Keep ingestion **non-blocking** (fast `202 Accepted`) and resilient (job failures don’t stop the worker).
- Keep overhead low: schema should not require per-edge/per-node attribute extraction by default.

### Non-goals

- Changing Graphiti core extraction prompts or algorithms (`graphiti_core/*`).
- Enforcing authorization / ACLs based on ownership facts (ownership is modeled, not enforced).
- Mandating a client-side message format beyond the existing `<graphiti_episode …>` convention.

## Current Architecture

### Graphiti service ingest (today)

- `POST /messages` accepts `group_id` and a list of message DTOs.
- Each message is enqueued into an in-process async worker queue.
- Worker calls `graphiti.add_episode(...)` which runs Graphiti core extraction and writes nodes/edges into Neo4j.

### Pain points

- **Schema drift:** no service-level mechanism to select/standardize `entity_types`, `edge_types`, or `edge_type_map`.
- **Async ingest reliability risk:** background jobs must not depend on per-request resources that can be closed once the HTTP request finishes.

## Proposed Architecture

### Ontology registry + schema selection

Introduce a `graph_service.ontologies` module with:

- `schema_id` strings (start with `agent_memory_v1`)
- `entity_types`, `edge_types`, `edge_type_map`, `excluded_entity_types` for each schema
- a resolver: `resolve_ontology(schema_id: str | None, message_content: str) -> Ontology | None`

Ingest routing (sketched below):

- Extend `AddMessagesRequest` with optional `schema_id`.
- For each message:
- If `request.schema_id` is present, use it.
- Else, if `message.content` contains `<graphiti_episode`, default to `agent_memory_v1`.
- Else, use default Graphiti behavior (no custom schema).
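
A minimal sketch of the registry and resolver, following the routing rules above. The `Ontology` container, the `ONTOLOGIES` mapping, and the stubbed `AGENT_MEMORY_V1` value are illustrative names for this sketch, not the confirmed module API:

```python
# Illustrative registry + resolver; the real types live in agent_memory_v1.py.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ontology:
    schema_id: str
    entity_types: dict[str, type]  # name -> fieldless Pydantic model
    edge_types: dict[str, type]
    edge_type_map: dict[tuple[str, str], list[str]]
    excluded_entity_types: list[str] = field(default_factory=list)


# Empty mappings keep this sketch self-contained.
AGENT_MEMORY_V1 = Ontology("agent_memory_v1", {}, {}, {})

ONTOLOGIES: dict[str, Ontology] = {AGENT_MEMORY_V1.schema_id: AGENT_MEMORY_V1}

EPISODE_MARKER = "<graphiti_episode"


def resolve_ontology(schema_id: str | None, message_content: str) -> Ontology | None:
    """Return the ontology to apply, or None for default Graphiti behavior."""
    if schema_id is not None:
        if schema_id not in ONTOLOGIES:
            # The ingest router should surface this as a 400/422 to the client.
            raise KeyError(f"unknown schema_id: {schema_id}")
        return ONTOLOGIES[schema_id]
    if EPISODE_MARKER in message_content:
        return ONTOLOGIES["agent_memory_v1"]
    return None
```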

### Reliable background ingest lifecycle

Run the async worker for the lifetime of the FastAPI app and keep a single `ZepGraphiti` instance in `app.state`:

- Create Graphiti client once during app startup (lifespan).
- Start the ingest worker at startup, stop it at shutdown.
- Ensure worker errors do not crash the loop and that queue accounting is correct (`task_done`).

This keeps the API responsive while ensuring background jobs can safely use Graphiti resources.
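
A sketch of the lifespan wiring under these assumptions; `make_graphiti_client` and the stub class are placeholders for however `ZepGraphiti` is actually constructed:

```python
import asyncio
import contextlib
import logging

from fastapi import FastAPI

logger = logging.getLogger(__name__)


class StubGraphiti:
    """Stands in for ZepGraphiti so the sketch is self-contained."""

    async def close(self) -> None:
        pass


async def make_graphiti_client() -> StubGraphiti:
    return StubGraphiti()


async def ingest_worker(queue: asyncio.Queue) -> None:
    while True:
        job = await queue.get()
        try:
            await job()  # e.g. a closure around graphiti.add_episode(...)
        except Exception:
            logger.exception("ingest job failed; continuing with next job")
        finally:
            queue.task_done()  # keep queue accounting correct either way


@contextlib.asynccontextmanager
async def lifespan(app: FastAPI):
    # Create once per process; background jobs share this instance.
    app.state.graphiti = await make_graphiti_client()
    app.state.ingest_queue = asyncio.Queue()
    worker = asyncio.create_task(ingest_worker(app.state.ingest_queue))
    try:
        yield
    finally:
        await app.state.ingest_queue.join()  # let queued jobs drain first
        worker.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await worker
        await app.state.graphiti.close()  # safe: no jobs depend on it now


app = FastAPI(lifespan=lifespan)
```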

### Shared identity and canonical group ids (cross-client memory)

To share durable memory across multiple agent clients (e.g. Copilot Chat + Codex), all clients must write/read from the same “user scope” group (and ideally the same “workspace scope” group). Relying on each client to implement its own hashing/normalization is error-prone: implementations in different languages drift apart.

Add a small Graphiti-service endpoint that resolves a canonical `group_id` given a `(scope, key)` pair:

- `scope`: `user | workspace | session`
- `key`: a stable identifier for that scope

The service computes `group_id` deterministically (hash + prefix) and returns it. Clients can cache the result and use it for ingest + recall, enabling shared memory.
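
A sketch of the derivation, assuming SHA-256 with a scope prefix; the exact hash function, truncation length, and prefix format are illustrative choices, not the confirmed wire format:

```python
import hashlib

VALID_SCOPES = {"user", "workspace", "session"}


def resolve_group_id(scope: str, key: str) -> str:
    """Derive a stable group id from (scope, key); pure function, no state."""
    if scope not in VALID_SCOPES:
        raise ValueError(f"unsupported scope: {scope}")
    digest = hashlib.sha256(f"{scope}:{key}".encode()).hexdigest()[:16]
    return f"{scope}-{digest}"


# Deterministic across requests and deployments:
assert resolve_group_id("user", "github_login:octocat") == resolve_group_id(
    "user", "github_login:octocat"
)
```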

**Identity key recommendation:** GitHub login is typically stable across tools (VS Code GitHub auth, `gh auth status`), and is available without relying on private email addresses. Prefer keys like `github_login:<login>`.

## Components

- `server/graph_service/ontologies/agent_memory_v1.py`
- Defines `agent_memory_v1` schema: entity types + edge types (docstring-driven, no fields; illustrated after this list).
- `server/graph_service/ontologies/registry.py`
- Central registry + resolver helpers.
- `server/graph_service/group_ids.py`
- Canonical group id hashing + resolver for `(scope, key)`.
- `server/graph_service/routers/groups.py`
- `POST /groups/resolve` endpoint returning canonical `group_id`.
- `server/graph_service/dto/ingest.py`
- Adds `schema_id` to `AddMessagesRequest`.
- `server/graph_service/routers/ingest.py`
- Selects ontology per message and passes types/maps into `graphiti.add_episode(...)`.
- Worker resiliency.
- `server/graph_service/main.py` / `server/graph_service/zep_graphiti.py`
- App-scoped Graphiti initialization and dependency injection.
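
As referenced in the list above, a sketch of what docstring-driven, fieldless type models could look like in `agent_memory_v1.py`. The concrete type names here are examples in the spirit of the spec (ownership, preferences), not the definitive list shipped by the service:

```python
from pydantic import BaseModel


class Person(BaseModel):
    """A human collaborator: the user, a teammate, or a code owner."""


class Project(BaseModel):
    """A repository, workspace, or named initiative."""


class Owns(BaseModel):
    """The source entity is responsible for (owns) the target entity."""


class Prefers(BaseModel):
    """The source entity holds a durable preference described by the fact."""


ENTITY_TYPES = {"Person": Person, "Project": Project}
EDGE_TYPES = {"OWNS": Owns, "PREFERS": Prefers}

# Relation signatures: which edge types may connect which entity labels
# ("Entity" is Graphiti's catch-all label).
EDGE_TYPE_MAP = {
    ("Person", "Project"): ["OWNS"],
    ("Person", "Entity"): ["PREFERS"],
}
```

Keeping the models fieldless means extraction only has to classify nodes and edges; it never triggers the per-attribute LLM calls that populated fields would require.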

## Data & Control Flow

```
Copilot Chat / Codex
└─ POST /messages (group_id, messages[], schema_id?)
└─ enqueue jobs (202 Accepted)
└─ async worker executes sequentially
└─ Graphiti.add_episode(..., entity_types/edge_types/edge_type_map?)
└─ extract nodes + edges (LLM)
└─ write to Neo4j + build embeddings
└─ POST /search (group_ids[], query)
└─ returns relevant edges (“facts”) for recall
```
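
Building on the resolver sketched earlier, the per-message step inside the worker might look as follows; the surrounding `add_episode` arguments shown here are representative, not the exact server call:

```python
from datetime import datetime, timezone


async def ingest_message(graphiti, schema_id, group_id, message):
    """Resolve the ontology per message and forward types to extraction."""
    ontology = resolve_ontology(schema_id, message.content)
    kwargs = {}
    if ontology is not None:
        kwargs = {
            "entity_types": ontology.entity_types,
            "edge_types": ontology.edge_types,
            "edge_type_map": ontology.edge_type_map,
            "excluded_entity_types": ontology.excluded_entity_types,
        }
    await graphiti.add_episode(
        name="message",
        episode_body=message.content,
        source_description=message.source_description,
        reference_time=datetime.now(timezone.utc),
        group_id=group_id,
        **kwargs,  # omitted entirely -> default Graphiti behavior
    )
```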

## Integration Points

- **Clients** can remain unchanged if they already wrap durable/structured memory as `<graphiti_episode …>…</graphiti_episode>`.
- **Explicit opt-in**: clients may set `schema_id=agent_memory_v1` in `POST /messages` for deterministic behavior.
- **Shared group ids**: clients can call `POST /groups/resolve` once per session/workspace/user identity and cache the returned group ids (see the sketch after this list).
- Docker compose: pass through `OPENAI_BASE_URL`, `MODEL_NAME`, `EMBEDDING_MODEL_NAME` so service behavior matches client/test environments.
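
A hypothetical end-to-end client flow combining these integration points; the request/response field names for `POST /groups/resolve` (`scope`, `key`, `group_id`) are assumptions for this sketch:

```python
import requests

BASE = "http://localhost:8000"

# 1. Resolve the canonical user-scope group once, then cache it client-side.
resp = requests.post(
    f"{BASE}/groups/resolve",
    json={"scope": "user", "key": "github_login:octocat"},
)
group_id = resp.json()["group_id"]

# 2. Ingest with the shared group id and explicit schema selection.
requests.post(
    f"{BASE}/messages",
    json={
        "group_id": group_id,
        "schema_id": "agent_memory_v1",
        "messages": [
            {
                "role_type": "user",
                "role": "user",
                "content": (
                    '<graphiti_episode kind="memory_directive">'
                    "preference (user): Keep diffs small and focused."
                    "</graphiti_episode>"
                ),
                "source_description": "integration example",
            }
        ],
    },
)
```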

## Migration / Rollout Strategy

- Backward compatible: `schema_id` is optional.
- Safe default: schema only auto-applies when `<graphiti_episode` is present, minimizing unintended changes for generic ingestion callers.
- Versioning: evolve via new schema ids (`agent_memory_v2`) rather than mutating `agent_memory_v1` semantics.

## Performance / Reliability / Security / UX Considerations

- **Performance:** keep schema types fieldless to avoid extra attribute-extraction LLM calls; ingestion remains async.
- **Reliability:** app-scoped Graphiti client ensures background jobs don’t fail due to closed drivers; worker continues after exceptions.
- **Security:** schema encourages modeling stable identifiers (hashed ids) and avoids requiring PII; clients should continue to redact secrets before promotion.

## Risks and Mitigations

- **Extraction quality depends on model/prompting:** keep the ontology small and descriptive; version the schema if changes are needed.
- **Schema drift across deployments:** include schema id in docs/examples; keep registry centralized.
- **Operational misconfiguration:** document required env vars and health endpoints (`/healthcheck`).

## Future Enhancements

- Structured JSON episodes (`EpisodeType.json`) for deterministic ingestion of ownership/metadata without LLM parsing.
- MCP server parity: expose the same schema ids/ontology selection through MCP tools.
- Organization/team scope groups and cross-group linking policies.
77 changes: 77 additions & 0 deletions .specs/agent-memory-ontology/requirements.md
@@ -0,0 +1,77 @@
# Requirements Document

## Introduction

This feature standardizes “agent memory” extraction for multiple clients by defining a Graphiti service-side schema (`agent_memory_v1`) and enabling schema selection during ingestion, while keeping ingestion asynchronous and reliable.

### Goals

- Centralize agent-memory schema (relations/entities) in Graphiti service.
- Keep ingestion fast and resilient.
- Preserve backwards compatibility for existing ingestion clients.

### Non-goals

- Authorization / ACL enforcement based on ownership facts.
- Changes to Graphiti core extraction algorithms.

## Glossary

- **Schema ID**: A stable string that selects an ontology definition for ingestion (e.g. `agent_memory_v1`).
- **Ontology**: A bundle of entity type definitions, relation type definitions, and relation signature rules used by Graphiti extraction.
- **Episode**: A single ingested message/event processed into nodes and edges.
- **Fact**: An extracted edge returned by Graphiti search for use as recall memory.

## Requirements

### Requirement 1: Schema Selection on Ingest

**User Story:** As an integrator, I want to select a named schema when ingesting messages, so that extracted facts are consistent across clients.

#### Acceptance Criteria

1.1 THE Graphiti Service SHALL accept an optional `schema_id` field on `POST /messages`.
1.2 WHEN `schema_id` is provided, THE Graphiti Service SHALL apply the corresponding ontology when calling `graphiti.add_episode(...)`.
1.3 WHEN `schema_id` is not provided AND a message's content contains `<graphiti_episode`, THE Graphiti Service SHALL apply `agent_memory_v1`.
1.4 WHEN `schema_id` is not provided AND a message's content does not contain `<graphiti_episode`, THE Graphiti Service SHALL ingest using default Graphiti behavior.
1.5 WHEN an unknown `schema_id` is provided, THE Graphiti Service SHALL return a `422` validation error or a `400` client error with a clear message.

### Requirement 2: Durable Agent Memory Ontology v1

**User Story:** As an agent developer, I want the service to recognize common agent-memory relations (ownership, preferences, terminology, tasks), so that recall is more accurate and structured.

#### Acceptance Criteria

2.1 THE Graphiti Service SHALL define an `agent_memory_v1` ontology in a service-owned registry.
2.2 THE `agent_memory_v1` ontology SHALL define a bounded set of relation types intended for generic coding + project workflows.
2.3 THE `agent_memory_v1` ontology SHALL avoid mandatory attribute extraction (fieldless type models) to limit LLM overhead by default.

### Requirement 3: Reliable Asynchronous Ingestion

**User Story:** As an operator, I want background ingestion jobs to run reliably after `POST /messages` returns, so that clients can enqueue memory without slowing the agent loop.

#### Acceptance Criteria

3.1 THE Graphiti Service SHALL start exactly one background worker during app startup and stop it on shutdown.
3.2 THE Graphiti Service SHALL NOT close Graphiti resources that background jobs depend on before queued jobs finish.
3.3 WHEN a background job raises an exception, THE Graphiti Service SHALL log the error and continue processing subsequent jobs.

### Requirement 4: Documentation and Examples

**User Story:** As a developer, I want clear docs and a small demo showing schema-enabled ingestion and recall, so that I can validate deployments quickly.

#### Acceptance Criteria

4.1 THE repository SHALL document the health endpoint (`/healthcheck`) and basic redeploy steps.
4.2 THE repository SHALL include a minimal demo/example that uses `agent_memory_v1` ingestion and shows how to retrieve facts via `/search`.

### Requirement 5: Canonical Group IDs for Shared Memory

**User Story:** As an agent user, I want Copilot Chat and Codex to share the same durable memory, so that preferences and terminology carry across tools.

#### Acceptance Criteria

5.1 THE Graphiti Service SHALL expose an endpoint that resolves a canonical `group_id` from a `(scope, key)` pair.
5.2 THE endpoint SHALL support at least `user`, `workspace`, and `session` scopes.
5.3 WHEN given the same `(scope, key)`, THE Graphiti Service SHALL return the same `group_id` across requests and deployments.
5.4 THE repository SHALL document recommended `key` derivation for the `user` scope using GitHub login (e.g. `github_login:<login>`).
29 changes: 29 additions & 0 deletions .specs/agent-memory-ontology/tasks.md
@@ -0,0 +1,29 @@
# Implementation Plan

- [x] 1. Add ontology specs _Requirements: 1, 2, 3, 4_
- [x] 2. Stabilize async ingest lifecycle _Requirements: 3.1, 3.2, 3.3_
- [x] 2.1 Keep a single Graphiti instance in `app.state` _Requirements: 3.2_
- [x] 2.2 Start/stop worker in app lifespan _Requirements: 3.1_
- [x] 2.3 Ensure worker continues after failures _Requirements: 3.3_
- [x] 3. Implement `agent_memory_v1` ontology registry _Requirements: 2.1, 2.2, 2.3_
- [x] 3.1 Add `graph_service/ontologies/agent_memory_v1.py` _Requirements: 2.1_
- [x] 3.2 Add `graph_service/ontologies/registry.py` resolver _Requirements: 1.2, 1.3, 1.4_
- [x] 4. Extend ingest API for schema selection _Requirements: 1.1, 1.2, 1.3, 1.4, 1.5_
- [x] 4.1 Add `schema_id` to `AddMessagesRequest` _Requirements: 1.1_
- [x] 4.2 Apply schema per message in `POST /messages` _Requirements: 1.2, 1.3, 1.4_
- [x] 4.3 Validate unknown schema ids _Requirements: 1.5_
- [x] 5. Add server tests and docs/demo _Requirements: 3.3, 4.1, 4.2_
- [x] 5.1 Add/adjust pytest discovery so root tests stay isolated _Requirements: 3.3_
- [x] 5.2 Add demo under `examples/` or `server/README.md` _Requirements: 4.1, 4.2_
- [x] 6. Verify, commit, push, PR _Requirements: 1, 2, 3, 4_
- [x] 7. Add canonical group id resolver _Requirements: 5.1, 5.2, 5.3_
- [x] 7.1 Add group id hashing helper in `graph_service` _Requirements: 5.3_
- [x] 7.2 Add `POST /groups/resolve` endpoint _Requirements: 5.1, 5.2_
- [x] 8. Document identity key recommendations _Requirements: 5.4_
- [x] 8.1 Document GitHub-login based keys in `server/README.md` _Requirements: 5.4_
- [x] 9. Verify, commit, push, PR (origin) _Requirements: 5_

## Current Status Summary

- Phase: implementation (complete; PR open)
- Next: merge PR and redeploy Graphiti service.
6 changes: 6 additions & 0 deletions docker-compose.yml
@@ -21,6 +21,9 @@ services:
condition: service_healthy
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY}
- OPENAI_BASE_URL=${OPENAI_BASE_URL}
- MODEL_NAME=${MODEL_NAME}
- EMBEDDING_MODEL_NAME=${EMBEDDING_MODEL_NAME}
- NEO4J_URI=bolt://neo4j:${NEO4J_PORT:-7687}
- NEO4J_USER=${NEO4J_USER:-neo4j}
- NEO4J_PASSWORD=${NEO4J_PASSWORD:-password}
@@ -80,6 +83,9 @@ services:
retries: 3
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY}
- OPENAI_BASE_URL=${OPENAI_BASE_URL}
- MODEL_NAME=${MODEL_NAME}
- EMBEDDING_MODEL_NAME=${EMBEDDING_MODEL_NAME}
- FALKORDB_HOST=falkordb
- FALKORDB_PORT=6379
- FALKORDB_DATABASE=default_db
49 changes: 49 additions & 0 deletions examples/agent_memory_ontology/README.md
@@ -0,0 +1,49 @@
# Agent Memory Ontology Demo (`agent_memory_v1`)

This demo shows how to ingest agent memory (preferences/terminology/etc.) into the Graphiti service using the service-owned `agent_memory_v1` schema, and then retrieve facts via `/search`.

## Prerequisites

- Graphiti service running (from repo root):

```bash
docker compose up -d --build graph neo4j
```

- Verify health:

```bash
curl -sS http://localhost:8000/healthcheck
```

## Ingest a memory directive

```bash
curl -sS http://localhost:8000/messages \
-H 'content-type: application/json' \
-d '{
"group_id": "workspace-demo",
"schema_id": "agent_memory_v1",
"messages": [{
"role_type": "user",
"role": "user",
"content": "<graphiti_episode kind=\"memory_directive\">preference (workspace): Keep diffs small and focused.</graphiti_episode>",
"source_description": "demo"
}]
}'
```

Ingestion is asynchronous; depending on your LLM/embedding backend, extraction may take several seconds or more before facts appear in search results.

## Retrieve facts

```bash
curl -sS http://localhost:8000/search \
-H 'content-type: application/json' \
-d '{
"group_ids": ["workspace-demo"],
"query": "What are my preferences for diffs?",
"max_facts": 5
}'
```
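
For scripted validation, a Python sketch of the retrieval step that polls `/search` until facts appear; the `facts` response field is assumed from the service's search response shape:

```python
import time

import requests

BASE = "http://localhost:8000"
payload = {
    "group_ids": ["workspace-demo"],
    "query": "What are my preferences for diffs?",
    "max_facts": 5,
}

# Poll because ingestion is asynchronous; extraction time depends on the
# LLM/embedding backend.
for _ in range(30):
    facts = requests.post(f"{BASE}/search", json=payload).json().get("facts", [])
    if facts:
        for fact in facts:
            print(fact)
        break
    time.sleep(2)
else:
    print("no facts yet; check the service logs")
```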

1 change: 1 addition & 0 deletions pytest.ini
@@ -1,4 +1,5 @@
[pytest]
testpaths = tests
markers =
integration: marks tests as integration tests
asyncio_default_fixture_loop_scope = function