# Synapto

> ⚠️ **Research Prototype.** This project is under active development and not production-ready. APIs may change without notice.

Synapto is a research prototype exploring whether reinforcement-learning-based memory decisions can outperform heuristic approaches for AI agents. It provides learned policies for what, where, and when to store and retrieve memories.

## Status
| Component | Status |
|---|---|
| Memory Stores (Redis, PostgreSQL, pgvector) | ✅ Implemented |
| Dueling DQN with Prioritized Replay | ✅ Implemented |
| MCP Server Integration | ✅ Implemented |
| Online RL Training | 🚧 In Progress |
| Benchmark Framework | 🚧 In Progress |
| GNN Path Optimizer | ❌ Not Started |
| Decision Graph Hot Paths | ❌ Not Started |
| Multi-tenant Support | ❌ Not Started |
## Features
- RL Decision Controller: Dueling DQN with prioritized experience replay for memory routing decisions
- 3-Tier Memory Architecture:
- Working Memory (Redis): Sub-millisecond access, session-scoped, TTL-based expiration
- Episodic Memory (PostgreSQL): Timestamped events, timeline queries
- Semantic Memory (pgvector): Vector similarity search, entity tagging
- MCP Server: Integration with Claude Code via Model Context Protocol
- Configurable Embeddings: Local (sentence-transformers) or OpenAI API
- Benchmark Framework: Compare RL vs heuristic baselines
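The prioritized replay mentioned above samples transitions in proportion to their priority (typically the TD error). A minimal, framework-free sketch of proportional sampling with importance-sampling weights; this is an illustration, not the project's actual `replay_buffer.py`:

```python
import random

def sample_prioritized(priorities, batch_size, alpha=0.6, beta=0.4):
    """Sample indices with probability proportional to priority**alpha,
    and return importance-sampling weights that correct the induced bias
    (normalized by the max weight in the batch for stability)."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]
    indices = random.choices(range(len(priorities)), weights=probs, k=batch_size)
    n = len(priorities)
    weights = [(1.0 / (n * probs[i])) ** beta for i in indices]
    max_w = max(weights)
    return indices, [w / max_w for w in weights]
```

Setting `alpha=0` recovers uniform sampling; `beta` is usually annealed toward 1 over training.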
## Quick Start

### Prerequisites

- Python 3.11+
- Docker and Docker Compose
- (Optional) OpenAI API key for OpenAI embeddings
### Setup

```bash
# Clone the repository
cd /Users/arjun/Personal/synapto

# Start infrastructure
docker-compose up -d redis postgres

# Install dependencies
pip install -e ".[dev]"
```

### CLI Usage

```bash
# Interactive mode
synapto interactive

# Store a memory
synapto store "I prefer Python for data science" --importance 0.8 --tags "preference,python"

# Retrieve memories
synapto retrieve "programming preferences" --k 5

# View stats
synapto stats
```

## Claude Code Integration (MCP)

Add the following to `~/.claude/claude_code_config.json`:
```json
{
  "mcpServers": {
    "synapto-memory": {
      "command": "python",
      "args": ["-m", "synapto.mcp.server"],
      "cwd": "/Users/arjun/Personal/synapto",
      "env": {
        "SYNAPTO_REDIS_URL": "redis://localhost:6379",
        "SYNAPTO_DATABASE_URL": "postgresql://synapto:synapto_dev@localhost:5432/synapto",
        "SYNAPTO_EMBEDDING_PROVIDER": "local"
      }
    }
  }
}
```

Then in Claude Code:

```text
Use synapto_store to remember that I prefer vim keybindings
What are my editor preferences?
```
## Architecture

```text
┌────────────────────────────────────────────────────────────────┐
│                          SYNAPTO MVP                           │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   MCP SERVER (FastMCP)                   │  │
│  │ Tools: synapto_store, synapto_retrieve, synapto_feedback,│  │
│  │        synapto_context, synapto_stats                    │  │
│  └─────────────────────────┬────────────────────────────────┘  │
│                            │                                   │
│  ┌─────────────────────────▼────────────────────────────────┐  │
│  │                  RL DECISION CONTROLLER                  │  │
│  │  • Dueling DQN with Double DQN updates                   │  │
│  │  • Prioritized Experience Replay                         │  │
│  │  • 14 discrete actions (store/retrieve/maintenance)      │  │
│  │  • Multi-objective reward function                       │  │
│  └─────────────────────────┬────────────────────────────────┘  │
│                            │                                   │
│  ┌─────────────────────────▼────────────────────────────────┐  │
│  │                      MEMORY STORES                       │  │
│  │  ┌──────────┐   ┌───────────┐   ┌──────────────┐         │  │
│  │  │ WORKING  │   │ EPISODIC  │   │   SEMANTIC   │         │  │
│  │  │ (Redis)  │   │ (Postgres)│   │  (pgvector)  │         │  │
│  │  │   <1ms   │   │   ~10ms   │   │     ~20ms    │         │  │
│  │  └──────────┘   └───────────┘   └──────────────┘         │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                    EMBEDDING SERVICE                     │  │
│  │  Local: sentence-transformers (bge-base-en-v1.5, 768d)   │  │
│  │  API:   OpenAI text-embedding-3-small (1536d)            │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```
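The dueling head combines a state value V(s) with per-action advantages A(s, a), and Double DQN decouples action selection from evaluation. A toy, framework-free illustration of both ideas (the actual `agent.py` presumably uses a neural network; these scalar helpers only show the arithmetic):

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage makes V and A identifiable."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def double_dqn_target(reward, gamma, online_next_q, target_next_q, done):
    """Double DQN target: the online net *picks* the next action,
    the target net *scores* it, reducing overestimation bias."""
    if done:
        return reward
    best = max(range(len(online_next_q)), key=lambda a: online_next_q[a])
    return reward + gamma * target_next_q[best]
```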
## Action Space

The agent selects from 14 discrete actions:
| Category | Actions |
|---|---|
| Store | STORE_WORKING, STORE_EPISODIC, STORE_SEMANTIC, STORE_SKIP |
| Retrieve | RETRIEVE_WORKING, RETRIEVE_EPISODIC, RETRIEVE_SEMANTIC, RETRIEVE_ALL |
| Maintenance | CONSOLIDATE, PROMOTE, DEMOTE, FORGET |
| Meta | PRELOAD, REINDEX |
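For reference, the 14 actions above fit in a single enum (names taken from the table; the project's actual `actions.py` may define them differently):

```python
from enum import Enum, auto

class MemoryAction(Enum):
    # Store decisions
    STORE_WORKING = auto()
    STORE_EPISODIC = auto()
    STORE_SEMANTIC = auto()
    STORE_SKIP = auto()
    # Retrieve decisions
    RETRIEVE_WORKING = auto()
    RETRIEVE_EPISODIC = auto()
    RETRIEVE_SEMANTIC = auto()
    RETRIEVE_ALL = auto()
    # Maintenance
    CONSOLIDATE = auto()
    PROMOTE = auto()
    DEMOTE = auto()
    FORGET = auto()
    # Meta
    PRELOAD = auto()
    REINDEX = auto()
```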
## Reward Function

Multi-objective reward with tunable weights:

```
R = 0.6 × task_success + 0.2 × precision + 0.1 × latency_bonus + 0.1 × efficiency
```
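A direct transcription of the formula with the weights made configurable (the dictionary keys are illustrative; see `rewards.py` for the real implementation):

```python
DEFAULT_WEIGHTS = {
    "task_success": 0.6,
    "precision": 0.2,
    "latency_bonus": 0.1,
    "efficiency": 0.1,
}

def compute_reward(signals, weights=DEFAULT_WEIGHTS):
    """Weighted sum of per-objective signals, each expected in [0, 1].
    Missing signals default to 0."""
    return sum(w * signals.get(k, 0.0) for k, w in weights.items())
```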
## Configuration

Environment variables:

| Variable | Default | Description |
|---|---|---|
| `SYNAPTO_REDIS_URL` | `redis://localhost:6379` | Redis connection URL |
| `SYNAPTO_DATABASE_URL` | `postgresql://synapto:synapto_dev@localhost:5432/synapto` | PostgreSQL URL |
| `SYNAPTO_EMBEDDING_PROVIDER` | `local` | `local` or `openai` |
| `OPENAI_API_KEY` | - | Required if using OpenAI embeddings |
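The variables above can be read with plain `os.environ` lookups and the documented defaults (a sketch only; the project's `config.py` may use a settings library instead):

```python
import os

def load_config(env=os.environ):
    """Build a config dict from environment variables, falling back
    to the documented defaults."""
    cfg = {
        "redis_url": env.get("SYNAPTO_REDIS_URL", "redis://localhost:6379"),
        "database_url": env.get(
            "SYNAPTO_DATABASE_URL",
            "postgresql://synapto:synapto_dev@localhost:5432/synapto",
        ),
        "embedding_provider": env.get("SYNAPTO_EMBEDDING_PROVIDER", "local"),
    }
    # OPENAI_API_KEY is only required for the OpenAI embedding provider
    if cfg["embedding_provider"] == "openai" and not env.get("OPENAI_API_KEY"):
        raise RuntimeError(
            "OPENAI_API_KEY is required when SYNAPTO_EMBEDDING_PROVIDER=openai"
        )
    return cfg
```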
## Testing

```bash
# Start test infrastructure
docker-compose up -d redis postgres

# Run tests
pytest tests/ -v

# With coverage
pytest tests/ --cov=synapto --cov-report=html
```

## Benchmarks

Compare the RL policy against heuristic baselines:

```bash
# Run benchmark
python benchmarks/run_benchmark.py \
    --policies rl,random,recency,semantic \
    --episodes 100 \
    --scenario coding \
    --output benchmarks/results/
```
## Training

```bash
# Generate synthetic training data
python scripts/generate_data.py --output data/synthetic_scenarios.json

# Pre-train on synthetic data
python scripts/train_offline.py --data data/synthetic_scenarios.json --output models/pretrained.pt
```

## Project Structure

```text
synapto/
├── synapto/
│   ├── config.py          # Configuration management
│   ├── engine.py          # SynaptoEngine orchestrator
│   ├── cli.py             # Command-line interface
│   ├── mcp/               # MCP server
│   │   ├── server.py
│   │   └── tools.py
│   ├── rl/                # RL components
│   │   ├── agent.py       # Dueling DQN
│   │   ├── state.py       # State representation
│   │   ├── actions.py     # Action definitions
│   │   ├── rewards.py     # Reward function
│   │   ├── replay_buffer.py
│   │   └── trainer.py
│   └── memory/            # Memory stores
│       ├── base.py
│       ├── working.py     # Redis
│       ├── episodic.py    # PostgreSQL
│       ├── semantic.py    # pgvector
│       └── embeddings.py
├── tests/
├── benchmarks/
├── scripts/
├── docker-compose.yml
└── pyproject.toml
```
## Known Issues

| Issue | Description | Mitigation |
|---|---|---|
| Cold Start Problem | RL agent starts with random policy, poor initial performance | Pre-trained model included, but may not generalize to all use cases |
| Training Instability | DQN training can diverge with small sample sizes | Fallback to heuristic policies when RL confidence is low |
| Reward Design | Current reward function is hand-tuned, may not capture all objectives | Configurable weights, but optimal values are task-dependent |
| Exploration vs Exploitation | Agent may over-explore or under-explore | Epsilon decay schedule needs tuning per deployment |
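The exploration issue above is governed by the epsilon schedule. A common exponential-decay form with a floor, as a sketch (the hyperparameter names and defaults here are illustrative, not the project's actual values):

```python
import math

def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Exponentially anneal exploration rate from eps_start toward eps_end.
    Larger decay_steps means slower decay (more exploration)."""
    return eps_end + (eps_start - eps_end) * math.exp(-step / decay_steps)
```

Tuning `decay_steps` per deployment controls how long the agent keeps exploring before settling into its learned policy.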
## Limitations

- Single-node only: No distributed deployment support yet
- No authentication/authorization: Memory is not user-isolated
- No encryption at rest: Sensitive data stored in plaintext
- Limited temporal reasoning: Episodic memory queries are basic compared to Graphiti/Zep
- No graph traversal: Semantic memory uses vector similarity only, missing multi-hop reasoning
- No hot path caching: Every query hits the database (target <10ms not achieved)
## Performance

| Metric | Target | Current Status |
|---|---|---|
| Working memory retrieval | <5ms (p95) | ~2-5ms ✅ |
| Semantic retrieval | <50ms (p95) | ~30-80ms |
| RL vs random improvement | >20% | Not validated ❌ |
| Memory capacity | 10k+ | Not stress-tested ❌ |
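The p95 figures above can be reproduced from raw latency samples with the standard library alone (no external profiler assumed):

```python
from statistics import quantiles

def p95(latencies_ms):
    """95th percentile of a list of latency samples, via 100 inclusive
    quantile cut points (index 94 is the 95th)."""
    return quantiles(latencies_ms, n=100, method="inclusive")[94]
```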
Operational caveats:

- Embedding model loading: First request is slow (~5-10s) while loading sentence-transformers
- PostgreSQL connection pool: May exhaust connections under high load
- Redis TTL race conditions: Memories may expire during active use
- No graceful degradation: If Redis/PostgreSQL is down, entire system fails
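One mitigation for the "no graceful degradation" point is a fallback wrapper that degrades to a best-effort in-process store when the backend errors, instead of failing the whole request. This is a sketch of the pattern, not part of the current codebase:

```python
class FallbackStore:
    """Serve reads/writes from `primary`; on any backend error, fall back
    to an in-memory dict so the agent keeps working (with reduced
    durability) while Redis/PostgreSQL is down."""

    def __init__(self, primary):
        self.primary = primary
        self._local = {}

    def put(self, key, value):
        try:
            self.primary.put(key, value)
        except Exception:
            self._local[key] = value  # degraded mode: keep it locally

    def get(self, key):
        try:
            return self.primary.get(key)
        except Exception:
            return self._local.get(key)
```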
## Roadmap

From the research design, these features are not yet implemented:
- Decision Graph Engine: GNN-based path optimization for <10ms retrieval
- Hot Path Cache: LRU cache for frequent query patterns
- Proactive Memory Loading: Predictive prefetching based on context
- Memory Consolidation: Automatic merging of related memories
- Bi-temporal Queries: Tracking both system time and real-world valid time
- Multi-tool Integration: Currently only Claude Code via MCP
Success criteria (not yet validated):
- RL policy outperforms random by >20%
- RL policy matches or beats best heuristic
- Working memory retrieval < 5ms (p95)
- Semantic retrieval < 50ms (p95)
- Works with 10k+ memories
## License

MIT
## Contributing

This is an early-stage research prototype. Contributions are welcome in the following areas:
- Additional heuristic baselines for comparison
- Benchmark scenarios (coding, research, multi-session)
- RL algorithm improvements (PPO, SAC alternatives)
- Memory store optimizations
- Decision Graph / GNN path optimizer implementation
- Bug fixes and documentation
Note: The codebase is evolving rapidly. Please open an issue before starting major work to avoid conflicts.