
Synapto: RL-Native Memory System for AI Agents

⚠️ Research Prototype - This project is under active development and not production-ready. APIs may change without notice.

Synapto is a research prototype exploring whether reinforcement learning-based memory decisions can outperform heuristic approaches for AI agents. It provides learned policies for what, where, and when to store and retrieve memories.

Project Status

| Component | Status |
| --- | --- |
| Memory Stores (Redis, PostgreSQL, pgvector) | ✅ Implemented |
| Dueling DQN with Prioritized Replay | ✅ Implemented |
| MCP Server Integration | ✅ Implemented |
| Online RL Training | 🚧 In Progress |
| Benchmark Framework | 🚧 In Progress |
| GNN Path Optimizer | ❌ Not Started |
| Decision Graph Hot Paths | ❌ Not Started |
| Multi-tenant Support | ❌ Not Started |

Features

  • RL Decision Controller: Dueling DQN with prioritized experience replay for memory routing decisions
  • 3-Tier Memory Architecture:
    • Working Memory (Redis): Sub-millisecond access, session-scoped, TTL-based expiration
    • Episodic Memory (PostgreSQL): Timestamped events, timeline queries
    • Semantic Memory (pgvector): Vector similarity search, entity tagging
  • MCP Server: Integration with Claude Code via Model Context Protocol
  • Configurable Embeddings: Local (sentence-transformers) or OpenAI API
  • Benchmark Framework: Compare RL vs heuristic baselines
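As an illustration of the dueling architecture mentioned above, the network splits into a state-value stream V(s) and an advantage stream A(s, a), recombined as Q(s, a) = V(s) + A(s, a) − mean(A). A minimal sketch of that aggregation step (illustrative only; the project's actual network lives in `synapto/rl/agent.py`):

```python
# Sketch of the dueling Q-value aggregation used by a Dueling DQN.
# Illustrative only; not the project's actual agent implementation.

def dueling_q_values(value: float, advantages: list[float]) -> list[float]:
    """Combine a state-value stream and an advantage stream into Q-values.

    Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a))
    Subtracting the mean advantage keeps the two streams identifiable.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

q = dueling_q_values(value=1.0, advantages=[0.5, -0.5, 0.0])
# mean advantage is 0.0, so q == [1.5, 0.5, 1.0]
```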

Quick Start

Prerequisites

  • Python 3.11+
  • Docker and Docker Compose
  • (Optional) OpenAI API key for OpenAI embeddings

Installation

# Clone the repository
git clone https://github.com/Arjxm/Synapto.git
cd Synapto

# Start infrastructure
docker-compose up -d redis postgres

# Install dependencies
pip install -e ".[dev]"

Basic Usage

# Interactive mode
synapto interactive

# Store a memory
synapto store "I prefer Python for data science" --importance 0.8 --tags "preference,python"

# Retrieve memories
synapto retrieve "programming preferences" --k 5

# View stats
synapto stats

Using with Claude Code

Add the following to ~/.claude/claude_code_config.json, adjusting cwd to your local checkout:

{
  "mcpServers": {
    "synapto-memory": {
      "command": "python",
      "args": ["-m", "synapto.mcp.server"],
      "cwd": "/Users/arjun/Personal/synapto",
      "env": {
        "SYNAPTO_REDIS_URL": "redis://localhost:6379",
        "SYNAPTO_DATABASE_URL": "postgresql://synapto:synapto_dev@localhost:5432/synapto",
        "SYNAPTO_EMBEDDING_PROVIDER": "local"
      }
    }
  }
}

Then in Claude Code:

Use synapto_store to remember that I prefer vim keybindings
What are my editor preferences?

Architecture

┌────────────────────────────────────────────────────────────────┐
│                          SYNAPTO MVP                           │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   MCP SERVER (FastMCP)                   │  │
│  │  Tools: synapto_store, synapto_retrieve,                 │  │
│  │         synapto_feedback, synapto_context, synapto_stats │  │
│  └─────────────────────────┬────────────────────────────────┘  │
│                            │                                   │
│  ┌─────────────────────────▼────────────────────────────────┐  │
│  │                  RL DECISION CONTROLLER                  │  │
│  │  • Dueling DQN with Double DQN updates                   │  │
│  │  • Prioritized Experience Replay                         │  │
│  │  • 14 discrete actions (store/retrieve/maintenance)      │  │
│  │  • Multi-objective reward function                       │  │
│  └─────────────────────────┬────────────────────────────────┘  │
│                            │                                   │
│  ┌─────────────────────────▼────────────────────────────────┐  │
│  │                      MEMORY STORES                       │  │
│  │  ┌──────────┐  ┌───────────┐  ┌──────────────┐           │  │
│  │  │ WORKING  │  │ EPISODIC  │  │   SEMANTIC   │           │  │
│  │  │ (Redis)  │  │ (Postgres)│  │  (pgvector)  │           │  │
│  │  │  <1ms    │  │   ~10ms   │  │    ~20ms     │           │  │
│  │  └──────────┘  └───────────┘  └──────────────┘           │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                    EMBEDDING SERVICE                     │  │
│  │  Local: sentence-transformers (bge-base-en-v1.5, 768d)   │  │
│  │  API: OpenAI text-embedding-3-small (1536d)              │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
└────────────────────────────────────────────────────────────────┘

RL Action Space

The agent selects from 14 discrete actions:

| Category | Actions |
| --- | --- |
| Store | STORE_WORKING, STORE_EPISODIC, STORE_SEMANTIC, STORE_SKIP |
| Retrieve | RETRIEVE_WORKING, RETRIEVE_EPISODIC, RETRIEVE_SEMANTIC, RETRIEVE_ALL |
| Maintenance | CONSOLIDATE, PROMOTE, DEMOTE, FORGET |
| Meta | PRELOAD, REINDEX |
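The 14 actions above map naturally onto a discrete action index, as a DQN head expects. A sketch of how they might be enumerated (illustrative; the project's real definitions live in `synapto/rl/actions.py` and may differ in ordering):

```python
from enum import IntEnum

class MemoryAction(IntEnum):
    """The 14 discrete actions listed above (illustrative sketch)."""
    # Store
    STORE_WORKING = 0
    STORE_EPISODIC = 1
    STORE_SEMANTIC = 2
    STORE_SKIP = 3
    # Retrieve
    RETRIEVE_WORKING = 4
    RETRIEVE_EPISODIC = 5
    RETRIEVE_SEMANTIC = 6
    RETRIEVE_ALL = 7
    # Maintenance
    CONSOLIDATE = 8
    PROMOTE = 9
    DEMOTE = 10
    FORGET = 11
    # Meta
    PRELOAD = 12
    REINDEX = 13

# The enum size matches the DQN's output dimension:
assert len(MemoryAction) == 14
```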

Reward Function

Multi-objective reward with tunable weights:

R = 0.6 × task_success + 0.2 × precision + 0.1 × latency_bonus + 0.1 × efficiency
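In code form, the weighted sum above looks roughly like the following (the weights come from the formula; the actual implementation is in `synapto/rl/rewards.py`):

```python
# Sketch of the multi-objective reward with tunable weights; defaults
# mirror the formula above. Not the project's exact implementation.

def compute_reward(
    task_success: float,
    precision: float,
    latency_bonus: float,
    efficiency: float,
    weights: tuple[float, float, float, float] = (0.6, 0.2, 0.1, 0.1),
) -> float:
    w_task, w_prec, w_lat, w_eff = weights
    return (
        w_task * task_success
        + w_prec * precision
        + w_lat * latency_bonus
        + w_eff * efficiency
    )

# A perfect step scores 1.0 when every component is 1.0:
assert abs(compute_reward(1.0, 1.0, 1.0, 1.0) - 1.0) < 1e-9
```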

Configuration

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| SYNAPTO_REDIS_URL | redis://localhost:6379 | Redis connection URL |
| SYNAPTO_DATABASE_URL | postgresql://synapto:synapto_dev@localhost:5432/synapto | PostgreSQL URL |
| SYNAPTO_EMBEDDING_PROVIDER | local | `local` or `openai` |
| OPENAI_API_KEY | - | Required if using OpenAI embeddings |

Running Tests

# Start test infrastructure
docker-compose up -d redis postgres

# Run tests
pytest tests/ -v

# With coverage
pytest tests/ --cov=synapto --cov-report=html

Benchmarking

Compare RL policy against heuristic baselines:

# Run benchmark
python benchmarks/run_benchmark.py \
  --policies rl,random,recency,semantic \
  --episodes 100 \
  --scenario coding \
  --output benchmarks/results/

# Generate synthetic training data
python scripts/generate_data.py --output data/synthetic_scenarios.json

# Pre-train on synthetic data
python scripts/train_offline.py --data data/synthetic_scenarios.json --output models/pretrained.pt

Project Structure

synapto/
├── synapto/
│   ├── config.py           # Configuration management
│   ├── engine.py           # SynaptoEngine orchestrator
│   ├── cli.py              # Command-line interface
│   ├── mcp/                # MCP server
│   │   ├── server.py
│   │   └── tools.py
│   ├── rl/                 # RL components
│   │   ├── agent.py        # Dueling DQN
│   │   ├── state.py        # State representation
│   │   ├── actions.py      # Action definitions
│   │   ├── rewards.py      # Reward function
│   │   ├── replay_buffer.py
│   │   └── trainer.py
│   └── memory/             # Memory stores
│       ├── base.py
│       ├── working.py      # Redis
│       ├── episodic.py     # PostgreSQL
│       ├── semantic.py     # pgvector
│       └── embeddings.py
├── tests/
├── benchmarks/
├── scripts/
├── docker-compose.yml
└── pyproject.toml

Current Limitations & Drawbacks

RL Training Challenges

| Issue | Description | Mitigation |
| --- | --- | --- |
| Cold start | RL agent starts with a random policy, so initial performance is poor | Pre-trained model included, but it may not generalize to all use cases |
| Training instability | DQN training can diverge with small sample sizes | Fall back to heuristic policies when RL confidence is low |
| Reward design | Current reward function is hand-tuned and may not capture all objectives | Weights are configurable, but optimal values are task-dependent |
| Exploration vs. exploitation | Agent may over-explore or under-explore | Epsilon decay schedule needs tuning per deployment |
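The epsilon decay schedule mentioned above is typically a simple exponential anneal. A sketch with example constants (these values are illustrative and, per the table, need tuning per deployment):

```python
import math

# Illustrative exponential epsilon-decay schedule for DQN exploration.
# The start/end/decay_steps constants are example values, not Synapto's.

def epsilon(step: int, start: float = 1.0, end: float = 0.05,
            decay_steps: float = 10_000.0) -> float:
    """Anneal the exploration rate from `start` toward the `end` floor."""
    return end + (start - end) * math.exp(-step / decay_steps)

# Exploration begins near 1.0 and decays monotonically toward 0.05:
assert abs(epsilon(0) - 1.0) < 1e-12
assert epsilon(10_000) < epsilon(1_000)
```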

Architectural Limitations

  • Single-node only: No distributed deployment support yet
  • No authentication/authorization: Memory is not user-isolated
  • No encryption at rest: Sensitive data stored in plaintext
  • Limited temporal reasoning: Episodic memory queries are basic compared to Graphiti/Zep
  • No graph traversal: Semantic memory uses vector similarity only, missing multi-hop reasoning
  • No hot path caching: Every query hits the database (target <10ms not achieved)

Performance Gaps vs Research Goals

| Metric | Target | Current | Status |
| --- | --- | --- | --- |
| Working memory retrieval | <5 ms (p95) | ~2-5 ms | ✅ |
| Semantic retrieval | <50 ms (p95) | ~30-80 ms | ⚠️ |
| RL vs. random improvement | >20% | Not validated | ❌ |
| Memory capacity | 10k+ memories | Not stress-tested | ❌ |

Known Issues

  1. Embedding model loading: the first request is slow (~5-10 s) while the sentence-transformers model loads
  2. PostgreSQL connection pool: may exhaust connections under high load
  3. Redis TTL race conditions: memories may expire while still in active use
  4. No graceful degradation: if Redis or PostgreSQL is down, the entire system fails
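Until graceful degradation lands, callers can guard against a down backend themselves. A minimal sketch of the pattern (the `store` callables here are hypothetical stand-ins, not Synapto's API):

```python
# Sketch: route around a failed primary store instead of crashing.
# The two store functions below are hypothetical stand-ins.

def store_with_fallback(primary_store, fallback_store, item):
    """Try the primary (e.g. Redis-backed) store; on a connection
    failure, write to a secondary store instead of failing hard."""
    try:
        return primary_store(item)
    except ConnectionError:
        return fallback_store(item)

def redis_store(item):       # hypothetical: simulates Redis being down
    raise ConnectionError("redis unavailable")

def in_memory_store(item):   # hypothetical in-process fallback
    return f"buffered:{item}"

result = store_with_fallback(redis_store, in_memory_store, "note")
# result == "buffered:note"
```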

What's Missing vs Full Vision

From the research design, these features are not yet implemented:

  • Decision Graph Engine: GNN-based path optimization for <10ms retrieval
  • Hot Path Cache: LRU cache for frequent query patterns
  • Proactive Memory Loading: Predictive prefetching based on context
  • Memory Consolidation: Automatic merging of related memories
  • Bi-temporal Queries: Tracking both system time and real-world valid time
  • Multi-tool Integration: Currently only Claude Code via MCP

Research Validation

Success criteria (not yet validated):

  • RL policy outperforms random by >20%
  • RL policy matches or beats best heuristic
  • Working memory retrieval < 5ms (p95)
  • Semantic retrieval < 50ms (p95)
  • Works with 10k+ memories

License

MIT

Contributing

This is an early-stage research prototype. Contributions welcome for:

  • Additional heuristic baselines for comparison
  • Benchmark scenarios (coding, research, multi-session)
  • RL algorithm improvements (PPO, SAC alternatives)
  • Memory store optimizations
  • Decision Graph / GNN path optimizer implementation
  • Bug fixes and documentation

Note: The codebase is evolving rapidly. Please open an issue before starting major work to avoid conflicts.
