diff --git a/.github/workflows/auto-assign.yml b/.github/workflows/auto-assign.yml index 439d2fd3..8c1b6eee 100644 --- a/.github/workflows/auto-assign.yml +++ b/.github/workflows/auto-assign.yml @@ -20,7 +20,7 @@ jobs: numOfAssignee: 1 assign-prs: - if: github.event_name == 'pull_request' + if: github.event_name == 'pull_request' && github.event.pull_request.head.repo.fork == false runs-on: ubuntu-latest permissions: pull-requests: write diff --git a/.github/workflows/claude-code-review.yml b/.github/workflows/claude-code-review.yml index b5e8cfd4..15e7e8f0 100644 --- a/.github/workflows/claude-code-review.yml +++ b/.github/workflows/claude-code-review.yml @@ -12,6 +12,10 @@ on: jobs: claude-review: + # Skip fork PRs: Claude Code review needs secrets/OIDC that are not available + # on fork-based pull_request events. Same-repo PRs still run normally. + if: ${{ github.event.pull_request.head.repo.fork == false }} + # Optional: Filter by PR author # if: | # github.event.pull_request.user.login == 'external-contributor' || diff --git a/.gitignore b/.gitignore index a74c241d..4d85a2f0 100644 --- a/.gitignore +++ b/.gitignore @@ -1,5 +1,13 @@ node_modules/ .venv-skill-tools +.memory-lancedb-pro/ +.DS_Store memory-plugin-feature-dev memory-plugin-host-validation memory-plugin-release-consistency +maintain-memory-lancedb-pro/ +skills/handle-memory-lancedb-pro-issue/ +skills/maintain-memory-lancedb-pro/ +skills/validate-memory-lancedb-pro-pr/ +test/addressing-identity-regression.mjs +validate-memory-lancedb-pro-pr/ diff --git a/README.md b/README.md index 9863ebab..51adce7c 100644 --- a/README.md +++ b/README.md @@ -2,90 +2,105 @@ # 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin -**The production-grade long-term memory plugin for [OpenClaw](https://github.com/openclaw/openclaw)** +**AI Memory Assistant for [OpenClaw](https://github.com/openclaw/openclaw) Agents** *Give your AI agent a brain that actually remembers — across sessions, across agents, across time.* +A LanceDB-backed 
OpenClaw memory plugin that stores preferences, decisions, and project context, then auto-recalls them in future sessions. + [![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![OpenClaw 2026.3+](https://img.shields.io/badge/OpenClaw-2026.3%2B-brightgreen)](https://github.com/openclaw/openclaw) [![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) [![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) -**English** | [简体中文](README_CN.md) +

v1.1.0-beta.10 — OpenClaw 2026.3+ Hook Adaptation

+ +

+ ✅ Fully adapted to the new OpenClaw 2026.3+ plugin architecture
+ 🔄 Uses before_prompt_build hooks (replacing the deprecated before_agent_start hook)
+ 🩺 Run openclaw doctor --fix after upgrading +

+ +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) --- -## ✨ Why memory-lancedb-pro? +## Why memory-lancedb-pro? -Most AI agents have amnesia. They forget everything the moment you start a new chat. This plugin fixes that. It gives your OpenClaw agent **persistent, intelligent long-term memory** — without you managing any of it. +Most AI agents have amnesia. They forget everything the moment you start a new chat. -| | What you get | -|---|---| -| 🔍 **Hybrid Retrieval** | Vector + BM25 full-text search, fused with cross-encoder reranking | -| 🧠 **Smart Extraction** | LLM-powered 6-category memory extraction — no manual `memory_store` needed | -| ⏳ **Memory Lifecycle** | Weibull decay + 3-tier promotion — important memories surface, stale ones fade | -| 🔒 **Multi-Scope Isolation** | Per-agent, per-user, per-project memory boundaries | -| 🔌 **Any Embedding Provider** | OpenAI, Jina, Gemini, Ollama, or any OpenAI-compatible API | -| 🛠️ **Full Operations Toolkit** | CLI, backup, migration, upgrade, export/import — not a toy | +**memory-lancedb-pro** is a production-grade long-term memory plugin for OpenClaw that turns your agent into an **AI Memory Assistant** — it automatically captures what matters, lets noise naturally fade, and retrieves the right memory at the right time. No manual tagging, no configuration headaches. 
---- +### Your AI Memory Assistant in Action -## 🆚 Compared to Built-in `memory-lancedb` +**Without memory — every session starts from zero:** -| Feature | Built-in `memory-lancedb` | **memory-lancedb-pro** | -| --- | :---: | :---: | -| Vector search | ✅ | ✅ | -| BM25 full-text search | ❌ | ✅ | -| Hybrid fusion (Vector + BM25) | ❌ | ✅ | -| Cross-encoder rerank (Jina / custom) | ❌ | ✅ | -| Recency boost & time decay | ❌ | ✅ | -| Length normalization | ❌ | ✅ | -| MMR diversity | ❌ | ✅ | -| Multi-scope isolation | ❌ | ✅ | -| Noise filtering | ❌ | ✅ | -| Adaptive retrieval | ❌ | ✅ | -| Management CLI | ❌ | ✅ | -| Session memory | ❌ | ✅ | -| Task-aware embeddings | ❌ | ✅ | -| **LLM Smart Extraction (6-category)** | ❌ | ✅ (v1.1.0) | -| **Weibull Decay + Tier Promotion** | ❌ | ✅ (v1.1.0) | -| **Legacy Memory Upgrade** | ❌ | ✅ (v1.1.0) | -| Any OpenAI-compatible embedding | Limited | ✅ | +> **You:** "Use tabs for indentation, always add error handling." +> *(next session)* +> **You:** "I already told you — tabs, not spaces!" 😤 +> *(next session)* +> **You:** "...seriously, tabs. And error handling. Again." ---- +**With memory-lancedb-pro — your agent learns and remembers:** -## 📺 Video Tutorial +> **You:** "Use tabs for indentation, always add error handling." +> *(next session — agent auto-recalls your preferences)* +> **Agent:** *(silently applies tabs + error handling)* ✅ +> **You:** "Why did we pick PostgreSQL over MongoDB last month?" +> **Agent:** "Based on our discussion on Feb 12, the main reasons were..." ✅ -> Full walkthrough: installation, configuration, and hybrid retrieval internals. +That's the difference an **AI Memory Assistant** makes — it learns your style, recalls past decisions, and delivers personalized responses without you repeating yourself. -[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) -🔗 **https://youtu.be/MtukF1C8epQ** +### What else can it do? 
-[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) -🔗 **https://www.bilibili.com/video/BV1zUf2BGEgn/** +| | What you get | +|---|---| +| **Auto-Capture** | Your agent learns from every conversation — no manual `memory_store` needed | +| **Smart Extraction** | LLM-powered 6-category classification: profiles, preferences, entities, events, cases, patterns | +| **Intelligent Forgetting** | Weibull decay model — important memories stay, noise naturally fades away | +| **Hybrid Retrieval** | Vector + BM25 full-text search, fused with cross-encoder reranking | +| **Context Injection** | Relevant memories automatically surface before each reply | +| **Multi-Scope Isolation** | Per-agent, per-user, per-project memory boundaries | +| **Any Provider** | OpenAI, Jina, Gemini, Ollama, or any OpenAI-compatible API | +| **Full Toolkit** | CLI, backup, migration, upgrade, export/import — production-ready | --- -## 🚀 Quick Start (30 seconds) +## Quick Start + +### Option A: One-Click Install Script (Recommended) -### 1. Install +The community-maintained **[setup script](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** handles install, upgrade, and repair in one command: ```bash -npm i memory-lancedb-pro@beta +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> See [Ecosystem](#ecosystem) below for the full list of scenarios the script covers and other community tools. + +### Option B: Manual Install + +**Via OpenClaw CLI (recommended):** +```bash +openclaw plugins install memory-lancedb-pro@beta ``` -### 2. Configure +**Or via npm:** +```bash +npm i memory-lancedb-pro@beta +``` +> If using npm, you will also need to add the plugin's install directory as an **absolute** path in `plugins.load.paths` in your `openclaw.json`. 
This is the most common setup issue. Add to your `openclaw.json`: ```json { "plugins": { - "slots": { - "memory": "memory-lancedb-pro" - }, + "slots": { "memory": "memory-lancedb-pro" }, "entries": { "memory-lancedb-pro": { "enabled": true, @@ -100,9 +115,7 @@ Add to your `openclaw.json`: "smartExtraction": true, "extractMinMessages": 2, "extractMaxChars": 8000, - "sessionMemory": { - "enabled": false - } + "sessionMemory": { "enabled": false } } } } @@ -114,31 +127,57 @@ Add to your `openclaw.json`: - `autoCapture` + `smartExtraction` → your agent learns from every conversation automatically - `autoRecall` → relevant memories are injected before each reply - `extractMinMessages: 2` → extraction triggers in normal two-turn chats -- `sessionMemory: false` → avoids polluting retrieval with session summaries on day one +- `sessionMemory.enabled: false` → avoids polluting retrieval with session summaries on day one -### 3. Validate & restart +Validate & restart: ```bash openclaw config validate openclaw gateway restart -openclaw logs --follow --plain | rg "memory-lancedb-pro" +openclaw logs --follow --plain | grep "memory-lancedb-pro" ``` You should see: - `memory-lancedb-pro: smart extraction enabled` - `memory-lancedb-pro@...: plugin registered` -🎉 **Done!** Your agent now has long-term memory. +Done! Your agent now has long-term memory. + +
+More installation paths (existing users, upgrades) + +**Already using OpenClaw?** + +1. Add the plugin with an **absolute** `plugins.load.paths` entry +2. Bind the memory slot: `plugins.slots.memory = "memory-lancedb-pro"` +3. Verify: `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**Upgrading from pre-v1.1.0?** + +```bash +# 1) Backup +openclaw memory-pro export --scope global --output memories-backup.json +# 2) Dry run +openclaw memory-pro upgrade --dry-run +# 3) Run upgrade +openclaw memory-pro upgrade +# 4) Verify +openclaw memory-pro stats +``` + +See `CHANGELOG-v1.1.0.md` for behavior changes and upgrade rationale. + +
-💬 OpenClaw Quick Import via Telegram Bot (click to expand) +Telegram Bot Quick Import (click to expand) If you are using OpenClaw's Telegram integration, the easiest way is to send an import command directly to the main Bot instead of manually editing config. Send this message: ```text -Help me connect this memory plugin with the best user-experience config: https://github.com/CortexReach/memory-lancedb-pro +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro Requirements: 1. Set it as the only active memory plugin @@ -152,72 +191,77 @@ Requirements: 9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 11. Generate the final openclaw.json config directly, not just an explanation +``` -{ - "embedding": { - "provider": "openai-compatible", - "apiKey": "${JINA_API_KEY}", - "model": "jina-embeddings-v5-text-small", - "baseURL": "https://api.jina.ai/v1", - "dimensions": 1024, - "taskQuery": "retrieval.query", - "taskPassage": "retrieval.passage", - "normalized": true - }, - "dbPath": "~/.openclaw/memory/lancedb-pro", - "autoCapture": true, - "autoRecall": true, - "captureAssistant": false, - "smartExtraction": true, - "extractMinMessages": 2, - "extractMaxChars": 8000, - "sessionMemory": { - "enabled": false - }, - "retrieval": { - "mode": "hybrid", - "vectorWeight": 0.7, - "bm25Weight": 0.3, - "rerank": "cross-encoder", - "rerankProvider": "jina", - "rerankEndpoint": "https://api.jina.ai/v1/rerank", - "rerankModel": "jina-reranker-v3", - "candidatePoolSize": 12, - "minScore": 0.6, - "hardMinScore": 0.62, - "rerankApiKey": "${JINA_API_KEY}" - }, - "llm": { - "apiKey": "${OPENAI_API_KEY}", - "model": "gpt-4o-mini", - "baseURL": "https://api.openai.com/v1" - } -} +
+ +--- + +## Ecosystem + +memory-lancedb-pro is the core plugin. The community has built tools around it to make setup and daily use even smoother: + +### Setup Script — One-Click Install, Upgrade & Repair + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +Not just a simple installer — the script intelligently handles a wide range of real-world scenarios: + +| Your situation | What the script does | +|---|---| +| Never installed | Fresh download → install deps → pick config → write to openclaw.json → restart | +| Installed via `git clone`, stuck on old commit | Auto `git fetch` + `checkout` to latest → reinstall deps → verify | +| Config has invalid fields | Auto-detect via schema filter, remove unsupported fields | +| Installed via `npm` | Skips git update, reminds you to run `npm update` yourself | +| `openclaw` CLI broken due to invalid config | Fallback: read workspace path directly from `openclaw.json` file | +| `extensions/` instead of `plugins/` | Auto-detect plugin location from config or filesystem | +| Already up to date | Run health checks only, no changes | + +```bash +bash setup-memory.sh # Install or upgrade +bash setup-memory.sh --dry-run # Preview only +bash setup-memory.sh --beta # Include pre-release versions +bash setup-memory.sh --uninstall # Revert config and remove plugin ``` -If you already have your own OpenAI-compatible services, just replace the relevant block: +Built-in provider presets: **Jina / DashScope / SiliconFlow / OpenAI / Ollama**, or bring your own OpenAI-compatible API. For full usage (including `--ref`, `--selfcheck-only`, and more), see the [setup script README](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup). 
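The "remove unsupported fields" behavior above comes down to filtering the user's config against the keys the plugin's JSON Schema actually declares. A minimal sketch of that idea (hypothetical function and key names; the real setup script may implement this differently):

```typescript
// Sketch of the "schema filter" idea: keep only config keys the plugin's
// JSON Schema declares, dropping unsupported leftovers from older versions.
// Hypothetical names; not the setup script's actual implementation.
function filterToSchema(
  config: Record<string, unknown>,
  declaredKeys: Set<string>,
): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(config).filter(([key]) => declaredKeys.has(key)),
  );
}

// e.g. a stale `legacyDecay` field is dropped, known fields survive:
const cleaned = filterToSchema(
  { autoCapture: true, legacyDecay: 0.1 },
  new Set(["autoCapture", "autoRecall"]),
);
```

The same pattern applies recursively for nested blocks like `retrieval` and `llm`; the script's advantage is that it reads the declared keys from `openclaw.plugin.json` instead of hard-coding them.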
-- `embedding`: change `apiKey` / `model` / `baseURL` / `dimensions` -- `retrieval`: change `rerankProvider` / `rerankEndpoint` / `rerankModel` / `rerankApiKey` -- `llm`: change `apiKey` / `model` / `baseURL` +### Claude Code / OpenClaw Skill — AI-Guided Configuration -For example, to replace only the LLM: +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** -```json -{ - "llm": { - "apiKey": "${GROQ_API_KEY}", - "model": "openai/gpt-oss-120b", - "baseURL": "https://api.groq.com/openai/v1" - } -} +Install this skill and your AI agent (Claude Code or OpenClaw) gains deep knowledge of every feature in memory-lancedb-pro. Just say **"help me enable the best config"** and get: + +- **Guided 7-step configuration workflow** with 4 deployment plans: + - Full Power (Jina + OpenAI) / Budget (free SiliconFlow reranker) / Simple (OpenAI only) / Fully Local (Ollama, zero API cost) +- **All 9 MCP tools** used correctly: `memory_recall`, `memory_store`, `memory_forget`, `memory_update`, `memory_stats`, `memory_list`, `self_improvement_log`, `self_improvement_extract_skill`, `self_improvement_review` *(full toolset requires `enableManagementTools: true` — the default Quick Start config exposes the 4 core tools)* +- **Common pitfall avoidance**: workspace plugin enablement, `autoRecall` default-false, jiti cache, env vars, scope isolation, and more + +**Install for Claude Code:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro ``` - +**Install for OpenClaw:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## Video Tutorial + +> Full walkthrough: installation, configuration, and hybrid retrieval internals. 
+ +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** --- -## 🏗️ Architecture +## Architecture ``` ┌─────────────────────────────────────────────────────────┐ @@ -241,56 +285,51 @@ For example, to replace only the LLM: └─────────────┘ └──────────┘ ``` -> 📖 For a deep-dive into the full architecture (data flow, lifecycle, storage internals), see [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md). +> For a deep-dive into the full architecture, see [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md).
-📄 File Reference (click to expand) +File Reference (click to expand) | File | Purpose | | --- | --- | -| `index.ts` | Plugin entry point. Registers with OpenClaw Plugin API, parses config, mounts `before_agent_start` (auto-recall), `agent_end` (auto-capture), and `command:new` (session memory) hooks | -| `openclaw.plugin.json` | Plugin metadata + full JSON Schema config declaration (with `uiHints`) | -| `package.json` | NPM package info. Depends on `@lancedb/lancedb`, `openai`, `@sinclair/typebox` | -| `cli.ts` | CLI commands: `memory list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | -| `src/store.ts` | LanceDB storage layer. Table creation / FTS indexing / Vector search / BM25 search / CRUD / bulk delete / stats | -| `src/embedder.ts` | Embedding abstraction. Compatible with any OpenAI-API provider. Supports task-aware embedding (`taskQuery`/`taskPassage`) | -| `src/retriever.ts` | Hybrid retrieval engine. Vector + BM25 → RRF fusion → Rerank → Lifecycle Decay → Length Norm → Hard Min Score → Noise Filter → MMR | -| `src/scopes.ts` | Multi-scope access control: `global`, `agent:`, `custom:`, `project:`, `user:` | +| `index.ts` | Plugin entry point. Registers with OpenClaw Plugin API, parses config, mounts lifecycle hooks via `api.on()` and command hooks via `api.registerHook()` | +| `openclaw.plugin.json` | Plugin metadata + full JSON Schema config declaration | +| `cli.ts` | CLI commands: `memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | LanceDB storage layer. Table creation / FTS indexing / Vector search / BM25 search / CRUD | +| `src/embedder.ts` | Embedding abstraction. Compatible with any OpenAI-compatible API provider | +| `src/retriever.ts` | Hybrid retrieval engine. 
Vector + BM25 → Hybrid Fusion → Rerank → Lifecycle Decay → Filter | +| `src/scopes.ts` | Multi-scope access control | | `src/tools.ts` | Agent tool definitions: `memory_recall`, `memory_store`, `memory_forget`, `memory_update` + management tools | | `src/noise-filter.ts` | Filters out agent refusals, meta-questions, greetings, and low-quality content | | `src/adaptive-retrieval.ts` | Determines whether a query needs memory retrieval | | `src/migrate.ts` | Migration from built-in `memory-lancedb` to Pro | -| `src/smart-extractor.ts` | **(v1.1.0)** LLM-powered 6-category extraction with L0/L1/L2 layered storage and two-stage dedup | -| `src/memory-categories.ts` | **(v1.1.0)** 6-category system: profile, preferences, entities, events, cases, patterns | -| `src/decay-engine.ts` | **(v1.1.0)** Weibull stretched-exponential decay model | -| `src/tier-manager.ts` | **(v1.1.0)** Three-tier promotion/demotion: Peripheral ⟷ Working ⟷ Core | -| `src/memory-upgrader.ts` | **(v1.1.0)** Batch upgrade legacy memories to new smart format | -| `src/llm-client.ts` | **(v1.1.0)** LLM client for structured JSON output | -| `src/extraction-prompts.ts` | **(v1.1.0)** LLM prompt templates for extraction, dedup, and merge | -| `src/smart-metadata.ts` | **(v1.1.0)** Metadata normalization for L0/L1/L2, tier, confidence, access counters, and lifecycle fields | +| `src/smart-extractor.ts` | LLM-powered 6-category extraction with L0/L1/L2 layered storage and two-stage dedup | +| `src/decay-engine.ts` | Weibull stretched-exponential decay model | +| `src/tier-manager.ts` | Three-tier promotion/demotion: Peripheral ↔ Working ↔ Core |
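As a rough illustration of the fusion step in `src/retriever.ts`: the vector score is the base, and BM25 keyword hits receive a relative boost. A sketch under assumed names (the 0.15 default mirrors the previously documented 15% boost; the real retriever may weight things differently):

```typescript
// Illustrative sketch of hybrid fusion: vector similarity is the base score,
// BM25 keyword hits get a relative boost. Hypothetical shape, not the
// actual src/retriever.ts implementation.
interface Candidate {
  id: string;
  vectorScore: number; // cosine similarity from ANN search
  bm25Hit: boolean;    // matched by the LanceDB FTS index
}

function fusedScore(c: Candidate, bm25Boost = 0.15): number {
  return c.bm25Hit ? c.vectorScore * (1 + bm25Boost) : c.vectorScore;
}
```

Candidates are then sorted by this fused score before reranking, decay boosting, and filtering run downstream.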
--- -## 📦 Core Features +## Core Features ### Hybrid Retrieval ``` Query → embedQuery() ─┐ - ├─→ RRF Fusion → Rerank → Lifecycle Decay Boost → Length Norm → Filter + ├─→ Hybrid Fusion → Rerank → Lifecycle Decay Boost → Length Norm → Filter Query → BM25 FTS ─────┘ ``` - **Vector Search** — semantic similarity via LanceDB ANN (cosine distance) - **BM25 Full-Text Search** — exact keyword matching via LanceDB FTS index -- **Fusion** — vector score as base, BM25 hits get a 15% boost (tuned beyond traditional RRF) +- **Hybrid Fusion** — vector score as base, BM25 hits receive a weighted boost (not standard RRF — tuned for real-world recall quality) - **Configurable Weights** — `vectorWeight`, `bm25Weight`, `minScore` ### Cross-Encoder Reranking -- Supports **Jina**, **SiliconFlow**, **Voyage AI**, **Pinecone**, or any compatible endpoint +- Built-in adapters for **Jina**, **SiliconFlow**, **Voyage AI**, and **Pinecone** +- Compatible with any Jina-compatible endpoint (e.g., Hugging Face TEI, DashScope) - Hybrid scoring: 60% cross-encoder + 40% original fused score - Graceful degradation: falls back to cosine similarity on API failure @@ -298,7 +337,7 @@ Query → BM25 FTS ─────┘ | Stage | Effect | | --- | --- | -| **RRF Fusion** | Combines semantic and exact-match recall | +| **Hybrid Fusion** | Combines semantic and exact-match recall | | **Cross-Encoder Rerank** | Promotes semantically precise hits | | **Lifecycle Decay Boost** | Weibull freshness + access frequency + importance × confidence | | **Length Normalization** | Prevents long entries from dominating (anchor: 500 chars) | @@ -315,8 +354,8 @@ Query → BM25 FTS ─────┘ ### Memory Lifecycle Management (v1.1.0) - **Weibull Decay Engine**: composite score = recency + frequency + intrinsic value -- **Decay-Aware Retrieval**: results re-ranked by lifecycle decay -- **Three-Tier Promotion**: `Peripheral ⟷ Working ⟷ Core` with configurable thresholds +- **Three-Tier Promotion**: `Peripheral ↔ Working ↔ Core` with 
configurable thresholds +- **Access Reinforcement**: frequently recalled memories decay slower (spaced-repetition style) - **Importance-Modulated Half-Life**: important memories decay slower ### Multi-Scope Isolation @@ -328,7 +367,9 @@ Query → BM25 FTS ─────┘ ### Auto-Capture & Auto-Recall - **Auto-Capture** (`agent_end`): extracts preference/fact/decision/entity from conversations, deduplicates, stores up to 3 per turn -- **Auto-Recall** (`before_agent_start`): injects `` context (up to 3 entries) +- **Auto-Recall** (`before_prompt_build`): injects `` context (up to 3 entries) + +> **Note (v1.1.0-beta.9+):** Auto-recall now uses the `before_prompt_build` hook instead of the deprecated `before_agent_start`. See [Hook Adaptation](#hook-adaptation-openclaw-20263) below for details. ### Noise Filtering & Adaptive Retrieval @@ -337,15 +378,35 @@ Query → BM25 FTS ─────┘ - Forces retrieval for memory keywords ("remember", "previously", "last time") - CJK-aware thresholds (Chinese: 6 chars vs English: 15 chars) -### Legacy Memory Upgrade (v1.1.0) +--- -- One-command upgrade: `openclaw memory-pro upgrade` -- LLM or no-LLM mode for offline use -- Automatic detection at startup with upgrade suggestion +
+Compared to Built-in memory-lancedb (click to expand) + +| Feature | Built-in `memory-lancedb` | **memory-lancedb-pro** | +| --- | :---: | :---: | +| Vector search | Yes | Yes | +| BM25 full-text search | - | Yes | +| Hybrid fusion (Vector + BM25) | - | Yes | +| Cross-encoder rerank (multi-provider) | - | Yes | +| Recency boost & time decay | - | Yes | +| Length normalization | - | Yes | +| MMR diversity | - | Yes | +| Multi-scope isolation | - | Yes | +| Noise filtering | - | Yes | +| Adaptive retrieval | - | Yes | +| Management CLI | - | Yes | +| Session memory | - | Yes | +| Task-aware embeddings | - | Yes | +| **LLM Smart Extraction (6-category)** | - | Yes (v1.1.0) | +| **Weibull Decay + Tier Promotion** | - | Yes (v1.1.0) | +| Any OpenAI-compatible embedding | Limited | Yes | + +
--- -## ⚙️ Configuration +## Configuration
Full Configuration Example @@ -410,26 +471,20 @@ Query → BM25 FTS ─────┘ } ``` -OpenClaw-specific defaults: - -- `autoCapture`: enabled by default -- `autoRecall`: disabled by default in the plugin schema, but for most new users this README recommends turning it on -- `embedding.chunking`: enabled by default -- `sessionMemory.enabled`: disabled by default; set to `true` explicitly if you want the `/new` session-summary hook -
Embedding Providers -This plugin works with **any OpenAI-compatible embedding API**: +Works with **any OpenAI-compatible embedding API**: | Provider | Model | Base URL | Dimensions | | --- | --- | --- | --- | | **Jina** (recommended) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | | **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | | **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | -| **Ollama** (local) | `nomic-embed-text` | `http://localhost:11434/v1` | _provider-specific_ | +| **Ollama** (local) | `nomic-embed-text` | `http://localhost:11434/v1` | provider-specific |
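As a concrete sketch, a fully local Ollama setup might look like the block below (assumed values: `nomic-embed-text` commonly emits 768-dimension vectors, and Ollama ignores the API key, so any placeholder works; verify the dimensions against your local model):

```json
{
  "embedding": {
    "provider": "openai-compatible",
    "apiKey": "ollama",
    "model": "nomic-embed-text",
    "baseURL": "http://localhost:11434/v1",
    "dimensions": 768
  }
}
```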
@@ -438,66 +493,14 @@ This plugin works with **any OpenAI-compatible embedding API**: Cross-encoder reranking supports multiple providers via `rerankProvider`: -| Provider | `rerankProvider` | Endpoint | Example Model | -| --- | --- | --- | --- | -| **Jina** (default) | `jina` | `https://api.jina.ai/v1/rerank` | `jina-reranker-v3` | -| **SiliconFlow** (free tier available) | `siliconflow` | `https://api.siliconflow.com/v1/rerank` | `BAAI/bge-reranker-v2-m3` | -| **Voyage AI** | `voyage` | `https://api.voyageai.com/v1/rerank` | `rerank-2.5` | -| **Pinecone** | `pinecone` | `https://api.pinecone.io/rerank` | `bge-reranker-v2-m3` | - -
-SiliconFlow config example - -```json -{ - "retrieval": { - "rerank": "cross-encoder", - "rerankProvider": "siliconflow", - "rerankEndpoint": "https://api.siliconflow.com/v1/rerank", - "rerankApiKey": "sk-xxx", - "rerankModel": "BAAI/bge-reranker-v2-m3" - } -} -``` - -
- -
-Voyage config example - -```json -{ - "retrieval": { - "rerank": "cross-encoder", - "rerankProvider": "voyage", - "rerankEndpoint": "https://api.voyageai.com/v1/rerank", - "rerankApiKey": "${VOYAGE_API_KEY}", - "rerankModel": "rerank-2.5" - } -} -``` - -
- -
-Pinecone config example - -```json -{ - "retrieval": { - "rerank": "cross-encoder", - "rerankProvider": "pinecone", - "rerankEndpoint": "https://api.pinecone.io/rerank", - "rerankApiKey": "pcsk_xxx", - "rerankModel": "bge-reranker-v2-m3" - } -} -``` - -
+| Provider | `rerankProvider` | Example Model | +| --- | --- | --- | +| **Jina** (default) | `jina` | `jina-reranker-v3` | +| **SiliconFlow** (free tier available) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | -Notes: -- `voyage` sends `{ model, query, documents }` without `top_n`. Responses are parsed from `data[].relevance_score`. +Any Jina-compatible rerank endpoint also works — set `rerankProvider: "jina"` and point `rerankEndpoint` to your service (e.g., Hugging Face TEI, DashScope `qwen3-rerank`). @@ -509,259 +512,98 @@ When `smartExtraction` is enabled (default: `true`), the plugin uses an LLM to i | Field | Type | Default | Description | |-------|------|---------|-------------| | `smartExtraction` | boolean | `true` | Enable/disable LLM-powered 6-category extraction | +| `llm.auth` | string | `api-key` | `api-key` uses `llm.apiKey` / `embedding.apiKey`; `oauth` uses a plugin-scoped OAuth token file by default | | `llm.apiKey` | string | *(falls back to `embedding.apiKey`)* | API key for the LLM provider | | `llm.model` | string | `openai/gpt-oss-120b` | LLM model name | | `llm.baseURL` | string | *(falls back to `embedding.baseURL`)* | LLM API endpoint | +| `llm.oauthProvider` | string | `openai-codex` | OAuth provider id used when `llm.auth` is `oauth` | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | OAuth token file used when `llm.auth` is `oauth` | +| `llm.timeoutMs` | number | `30000` | LLM request timeout in milliseconds | | `extractMinMessages` | number | `2` | Minimum messages before extraction triggers | | `extractMaxChars` | number | `8000` | Maximum characters sent to the LLM | -Minimal config (reuses embedding API key): -```json -{ - "embedding": { "apiKey": "${OPENAI_API_KEY}", "model": "text-embedding-3-small" }, - "smartExtraction": true -} -``` -Full config (separate LLM endpoint): +OAuth `llm` config (use existing 
Codex / ChatGPT login cache for LLM calls): ```json { - "embedding": { "apiKey": "${OPENAI_API_KEY}", "model": "text-embedding-3-small" }, - "smartExtraction": true, - "llm": { "apiKey": "${OPENAI_API_KEY}", "model": "gpt-4o-mini", "baseURL": "https://api.openai.com/v1" }, - "extractMinMessages": 2, - "extractMaxChars": 8000 + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } } ``` -Disable: `{ "smartExtraction": false }` +Notes for `llm.auth: "oauth"`: + +- `llm.oauthProvider` is currently `openai-codex`. +- OAuth tokens default to `~/.openclaw/.memory-lancedb-pro/oauth.json`. +- You can set `llm.oauthPath` if you want to store that file somewhere else. +- `auth login` snapshots the previous api-key `llm` config next to the OAuth file, and `auth logout` restores that snapshot when available. +- Switching from `api-key` to `oauth` does not automatically carry over `llm.baseURL`. Set it manually in OAuth mode only when you intentionally want a custom ChatGPT/Codex-compatible backend.
Lifecycle Configuration (Decay + Tier) -These settings control freshness ranking and automatic tier transitions. - -| Field | Type | Default | Description | -|-------|------|---------|-------------| -| `decay.recencyHalfLifeDays` | number | `30` | Base half-life for Weibull recency decay | -| `decay.frequencyWeight` | number | `0.3` | Weight of access frequency in composite score | -| `decay.intrinsicWeight` | number | `0.3` | Weight of `importance × confidence` | -| `decay.betaCore` | number | `0.8` | Weibull beta for `core` memories | -| `decay.betaWorking` | number | `1.0` | Weibull beta for `working` memories | -| `decay.betaPeripheral` | number | `1.3` | Weibull beta for `peripheral` memories | -| `tier.coreAccessThreshold` | number | `10` | Min recall count before promoting to `core` | -| `tier.coreCompositeThreshold` | number | `0.7` | Min lifecycle score before promoting to `core` | -| `tier.peripheralCompositeThreshold` | number | `0.15` | Below this score, `working` may demote | -| `tier.peripheralAgeDays` | number | `60` | Age threshold for demoting stale memories | - -```json -{ - "decay": { "recencyHalfLifeDays": 21, "betaCore": 0.7, "betaPeripheral": 1.5 }, - "tier": { "coreAccessThreshold": 8, "peripheralAgeDays": 45 } -} -``` +| Field | Default | Description | +|-------|---------|-------------| +| `decay.recencyHalfLifeDays` | `30` | Base half-life for Weibull recency decay | +| `decay.frequencyWeight` | `0.3` | Weight of access frequency in composite score | +| `decay.intrinsicWeight` | `0.3` | Weight of `importance × confidence` | +| `decay.betaCore` | `0.8` | Weibull beta for `core` memories | +| `decay.betaWorking` | `1.0` | Weibull beta for `working` memories | +| `decay.betaPeripheral` | `1.3` | Weibull beta for `peripheral` memories | +| `tier.coreAccessThreshold` | `10` | Min recall count before promoting to `core` | +| `tier.peripheralAgeDays` | `60` | Age threshold for demoting stale memories |
-Access Reinforcement (1.0.26) +Access Reinforcement Frequently recalled memories decay more slowly (spaced-repetition style). Config keys (under `retrieval`): -- `reinforcementFactor` (0–2, default: `0.5`) — set `0` to disable -- `maxHalfLifeMultiplier` (1–10, default: `3`) — hard cap on effective half-life - -Note: reinforcement is whitelisted to `source: "manual"` only, to avoid auto-recall accidentally strengthening noise. +- `reinforcementFactor` (0-2, default: `0.5`) — set `0` to disable +- `maxHalfLifeMultiplier` (1-10, default: `3`) — hard cap on effective half-life
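The interaction between decay and reinforcement can be sketched in TypeScript. This is an illustrative model, not the plugin's actual implementation: the documented facts are only that retention halves at the half-life and that `maxHalfLifeMultiplier` hard-caps the effect; the logarithmic stretch shape is an assumption.

```typescript
// Weibull-style retention: 1.0 at age 0, 0.5 at the half-life (for any beta).
function weibullRetention(ageDays: number, halfLifeDays: number, beta: number): number {
  return Math.exp(-Math.LN2 * Math.pow(ageDays / halfLifeDays, beta));
}

// Assumed reinforcement shape: recall count stretches the effective half-life
// logarithmically, hard-capped at maxHalfLifeMultiplier.
function effectiveHalfLife(
  baseDays: number,
  reinforcementFactor: number,
  maxHalfLifeMultiplier: number,
  recallCount: number,
): number {
  const stretch = 1 + reinforcementFactor * Math.log1p(recallCount);
  return baseDays * Math.min(stretch, maxHalfLifeMultiplier);
}

// A never-recalled working-tier memory (beta = 1.0) at its 30-day half-life:
const idle = weibullRetention(30, effectiveHalfLife(30, 0.5, 3, 0), 1.0); // 0.5
// The same memory recalled 10 times decays noticeably slower:
const reinforced = weibullRetention(30, effectiveHalfLife(30, 0.5, 3, 10), 1.0); // ≈ 0.73
console.log(idle.toFixed(2), reinforced.toFixed(2));
```

Setting `reinforcementFactor` to `0` collapses `effectiveHalfLife` back to the base half-life, which matches the documented way to disable the feature.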
--- -## 📥 Installation - -
-Path A — New to OpenClaw (recommended) - -1. Clone into your workspace: - -```bash -cd /path/to/your/openclaw/workspace -git clone https://github.com/CortexReach/memory-lancedb-pro.git plugins/memory-lancedb-pro -cd plugins/memory-lancedb-pro -npm install -``` - -2. Add to `openclaw.json` (relative path): - -```json -{ - "plugins": { - "load": { "paths": ["plugins/memory-lancedb-pro"] }, - "entries": { - "memory-lancedb-pro": { - "enabled": true, - "config": { - "embedding": { - "apiKey": "${JINA_API_KEY}", - "model": "jina-embeddings-v5-text-small", - "baseURL": "https://api.jina.ai/v1", - "dimensions": 1024, - "taskQuery": "retrieval.query", - "taskPassage": "retrieval.passage", - "normalized": true - } - } - } - }, - "slots": { "memory": "memory-lancedb-pro" } - } -} -``` - -3. Restart and verify: - -```bash -openclaw config validate -openclaw gateway restart -openclaw plugins info memory-lancedb-pro -openclaw hooks list --json -openclaw memory-pro stats -``` - -4. Smoke test: store one memory → search by keyword → search by natural language. - -
- -
-Path B — Already using OpenClaw, adding this plugin - -1. Keep your existing agents, channels, and models unchanged -2. Add the plugin with an **absolute** `plugins.load.paths` entry: - -```json -{ "plugins": { "load": { "paths": ["/absolute/path/to/memory-lancedb-pro"] } } } -``` - -3. Bind the memory slot: `plugins.slots.memory = "memory-lancedb-pro"` -4. Verify: `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` - -
- -
-Path C — Upgrading from older memory-lancedb-pro (pre-v1.1.0) - -Command boundaries: -- `upgrade` — for **older `memory-lancedb-pro` data** -- `migrate` — only from built-in **`memory-lancedb`** -- `reembed` — only when rebuilding embeddings after model change - -Safe upgrade sequence: - -```bash -# 1) Backup -openclaw memory-pro export --scope global --output memories-backup.json - -# 2) Dry run -openclaw memory-pro upgrade --dry-run - -# 3) Run upgrade -openclaw memory-pro upgrade - -# 4) Verify -openclaw memory-pro stats -openclaw memory-pro search "your known keyword" --scope global --limit 5 -``` - -See `CHANGELOG-v1.1.0.md` for behavior changes and upgrade rationale. - -
- -
-Post-install verification checklist - -```bash -openclaw config validate -openclaw gateway restart -openclaw plugins info memory-lancedb-pro -openclaw hooks list --json -openclaw memory-pro stats -openclaw memory-pro list --scope global --limit 5 -``` - -Then validate: -- ✅ one exact-id search hit -- ✅ one natural-language search hit -- ✅ one `memory_store` → `memory_recall` round trip -- ✅ if session memory is enabled, one real `/new` test - -
- -
-AI-safe install notes (anti-hallucination) - -If you are following this README with an AI assistant, **do not assume defaults**. Always run: - -```bash -openclaw config get agents.defaults.workspace -openclaw config get plugins.load.paths -openclaw config get plugins.slots.memory -openclaw config get plugins.entries.memory-lancedb-pro -``` - -Tips: -- Prefer **absolute paths** in `plugins.load.paths` -- If you use `${JINA_API_KEY}` in config, ensure the **Gateway service process** has that env var -- After changing plugin config, run `openclaw gateway restart` - -
- -
-Jina API keys (embedding + rerank) - -- **Embedding**: set `embedding.apiKey` to your Jina key (use env var `${JINA_API_KEY}` recommended) -- **Rerank** (when `rerankProvider: "jina"`): you can use the **same** Jina key for `retrieval.rerankApiKey` -- Different rerank provider? Use that provider's key for `retrieval.rerankApiKey` - -Key storage: avoid committing secrets into git. When using `${...}` env vars, ensure the Gateway service process has them. - -
- -
-What is the "OpenClaw workspace"? - -The **agent workspace** is the agent's working directory (default: `~/.openclaw/workspace`). Relative paths are resolved against the workspace. - -> Note: OpenClaw config typically lives at `~/.openclaw/openclaw.json` (separate from the workspace). - -**Common mistake:** cloning the plugin elsewhere while keeping a relative path in config. Use an absolute path (Path B) or clone into `/plugins/` (Path A). - -
---

-## 🔧 CLI Commands
+## CLI Commands

```bash
openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json]
openclaw memory-pro search "query" [--scope global] [--limit 10] [--json]
openclaw memory-pro stats [--scope global] [--json]
+openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] [--no-browser]
+openclaw memory-pro auth status
+openclaw memory-pro auth logout
openclaw memory-pro delete
openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run]
openclaw memory-pro export [--scope global] [--output memories.json]
openclaw memory-pro import memories.json [--scope global] [--dry-run]
openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing]
openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE]
-openclaw memory-pro migrate check [--source /path]
-openclaw memory-pro migrate run [--source /path] [--dry-run] [--skip-existing]
-openclaw memory-pro migrate verify [--source /path]
+openclaw memory-pro migrate check|run|verify [--source /path]
```

+OAuth login flow:
+
+1. Run `openclaw memory-pro auth login`
+2. If `--provider` is omitted in an interactive terminal, the CLI shows an OAuth provider picker before opening the browser
+3. The command prints an authorization URL and opens your browser unless `--no-browser` is set
+4. After the callback succeeds, the command saves the plugin OAuth file (default: `~/.openclaw/.memory-lancedb-pro/oauth.json`), snapshots the previous api-key `llm` config for logout, and replaces the plugin `llm` config with OAuth settings (`auth`, `oauthProvider`, `model`, `oauthPath`)
+5. `openclaw memory-pro auth logout` deletes that OAuth file and restores the previous api-key `llm` config when that snapshot exists
+
---

-## 📚 Advanced Topics
+## Advanced Topics
If injected memories show up in replies @@ -779,40 +621,26 @@ Sometimes the model may echo the injected `` block.
-Session Memory +Auto-recall timeout tuning -- Triggered on `/new` command — saves previous session summary to LanceDB -- Disabled by default (OpenClaw already has native `.jsonl` session persistence) -- Configurable message count (default: 15) +Auto-recall has a configurable timeout (default 5s) to prevent stalling agent startup. If you're behind a proxy or using a high-latency embedding API, increase it: -See [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md) for deployment modes and `/new` verification. +```json +{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecallTimeoutMs": 8000 } } } } } +``` + +If auto-recall consistently times out, check your embedding API latency first. The timeout only affects the automatic injection path — manual `memory_recall` tool calls are not affected.
-JSONL Session Distillation (auto-memories from chat logs) - -OpenClaw persists full session transcripts as JSONL: `~/.openclaw/agents//sessions/*.jsonl` - -**Recommended (2026-02+)**: non-blocking `/new` pipeline: -- Trigger: `command:new` → enqueue tiny JSON task (no LLM calls in hook) -- Worker: systemd service runs Gemini Map-Reduce on session JSONL -- Store: writes 0–20 high-signal lessons via `openclaw memory-pro import` -- Keywords: each memory includes `Keywords (zh)` with entity keywords copied verbatim from transcript - -Example files: `examples/new-session-distill/` - -**Legacy option**: hourly distiller cron using `scripts/jsonl_distill.py`: -- Incremental reads (byte-offset cursor), filters noise, uses a dedicated agent to distill -- Stores via `memory_store` into the right scope -- Safe: never modifies session logs +Session Memory -Setup: -1. Create agent: `openclaw agents add memory-distiller --non-interactive --workspace ~/.openclaw/workspace-memory-distiller --model openai-codex/gpt-5.2` -2. Init cursor: `python3 "$PLUGIN_DIR/scripts/jsonl_distill.py" init` -3. Add cron: see full command in the [legacy distillation docs](docs/openclaw-integration-playbook.md) +- Triggered on `/new` command — saves previous session summary to LanceDB +- Disabled by default (OpenClaw already has native `.jsonl` session persistence) +- Configurable message count (default: 15) -Rollback: `openclaw cron disable ` → `openclaw agents delete memory-distiller` → `rm -rf ~/.openclaw/state/jsonl-distill/` +See [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md) for deployment modes and `/new` verification.
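If you do want session summaries, enabling it is a small config change. A sketch — `sessionMemory.enabled` appears in the quick-start config, but the `messageCount` key name for the message count is an assumption, so verify it against the schema in `openclaw.plugin.json`:

```json
{ "sessionMemory": { "enabled": true, "messageCount": 15 } }
```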
@@ -834,35 +662,32 @@ When the user sends `/remember `: 2. Confirm with the stored memory ID ``` -Built-in tools: `memory_store`, `memory_recall`, `memory_forget`, `memory_update` — registered automatically when the plugin loads. -
-Iron Rules for AI Agents (铁律) +Iron Rules for AI Agents > Copy the block below into your `AGENTS.md` so your agent enforces these rules automatically. ```markdown -## Rule 1 — 双层记忆存储(铁律) +## Rule 1 — Dual-layer memory storage Every pitfall/lesson learned → IMMEDIATELY store TWO memories: -- **Technical layer**: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid] - (category: fact, importance ≥ 0.8) -- **Principle layer**: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do] - (category: decision, importance ≥ 0.85) -- After each store, immediately `memory_recall` to verify retrieval. +- Technical layer: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid] + (category: fact, importance >= 0.8) +- Principle layer: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do] + (category: decision, importance >= 0.85) -## Rule 2 — LanceDB 卫生 +## Rule 2 — LanceDB hygiene Entries must be short and atomic (< 500 chars). No raw conversation summaries or duplicates. ## Rule 3 — Recall before retry -On ANY tool failure, ALWAYS `memory_recall` with relevant keywords BEFORE retrying. +On ANY tool failure, ALWAYS memory_recall with relevant keywords BEFORE retrying. -## Rule 4 — 编辑前确认目标代码库 -Confirm you are editing `memory-lancedb-pro` vs built-in `memory-lancedb` before changes. +## Rule 4 — Confirm target codebase +Confirm you are editing memory-lancedb-pro vs built-in memory-lancedb before changes. -## Rule 5 — 插件代码变更必须清 jiti 缓存 -After modifying `.ts` files under `plugins/`, MUST run `rm -rf /tmp/jiti/` BEFORE `openclaw gateway restart`. +## Rule 5 — Clear jiti cache after plugin code changes +After modifying .ts files under plugins/, MUST run rm -rf /tmp/jiti/ BEFORE openclaw gateway restart. ```
@@ -877,14 +702,16 @@ LanceDB table `memories`: | `id` | string (UUID) | Primary key | | `text` | string | Memory text (FTS indexed) | | `vector` | float[] | Embedding vector | -| `category` | string | `preference` / `fact` / `decision` / `entity` / `other` | +| `category` | string | Storage category: `preference` / `fact` / `decision` / `entity` / `reflection` / `other` | | `scope` | string | Scope identifier (e.g., `global`, `agent:main`) | -| `importance` | float | Importance score 0–1 | +| `importance` | float | Importance score 0-1 | | `timestamp` | int64 | Creation timestamp (ms) | | `metadata` | string (JSON) | Extended metadata | Common `metadata` keys in v1.1.0: `l0_abstract`, `l1_overview`, `l2_content`, `memory_category`, `tier`, `access_count`, `confidence`, `last_accessed_at` +> **Note on categories:** The top-level `category` field uses 6 storage categories. The 6-category semantic labels from Smart Extraction (`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`) are stored in `metadata.memory_category`. +
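Because the `metadata` column is a JSON string, lifecycle fields must be parsed before use. A minimal sketch — the row literal is hypothetical, but the key names match the schema table and categories note above:

```typescript
// Hypothetical row mirroring the `memories` table schema above.
const row = {
  id: "00000000-0000-0000-0000-000000000000",
  text: "Prefers tabs over spaces",
  category: "preference", // storage category (top-level field)
  scope: "global",
  importance: 0.8,
  timestamp: 1767225600000,
  metadata: JSON.stringify({
    memory_category: "preferences", // semantic label from Smart Extraction
    tier: "working",
    access_count: 3,
    confidence: 0.9,
  }),
};

// Storage category and semantic category live in different places:
const meta = JSON.parse(row.metadata);
console.log(row.category);         // "preference"
console.log(meta.memory_category); // "preferences"
console.log(meta.tier);            // "working"
```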
@@ -898,31 +725,106 @@ On LanceDB 0.26+, some numeric columns may be returned as `BigInt`. Upgrade to * --- -## 🧪 Beta: Smart Memory v1.1.0 +## Hook Adaptation (OpenClaw 2026.3+) -> Status: Beta — available via `npm i memory-lancedb-pro@beta`. Stable users on `latest` are not affected. +Starting with v1.1.0-beta.9, the plugin's lifecycle hooks have been updated for compatibility with the refactored OpenClaw plugin system. -| Feature | Description | -|---------|-------------| -| **Smart Extraction** | LLM-powered 6-category extraction with L0/L1/L2 metadata. Falls back to regex when disabled. | -| **Lifecycle Scoring** | Weibull decay integrated into retrieval — high-frequency and high-importance memories rank higher. | -| **Tier Management** | Three-tier system (Core → Working → Peripheral) with automatic promotion/demotion. | +### What changed -Feedback: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · Revert: `npm i memory-lancedb-pro@latest` +| Hook | Before | After | Why | +|------|--------|-------|-----| +| Auto-recall | `before_agent_start` | `before_prompt_build` (priority 10) | `before_agent_start` is deprecated; `before_prompt_build` is the recommended hook for prompt mutation | +| Reflection invariants | `before_agent_start` | `before_prompt_build` (priority 12) | Same reason as above | +| Reflection derived focus | `before_prompt_build` | `before_prompt_build` (priority 15) | Unchanged event, added explicit priority | +| All other lifecycle hooks | unchanged | unchanged | `agent_end`, `after_tool_call`, `session_end`, `message_received`, `before_message_write` | + +### Hook API distinction + +OpenClaw exposes two hook registration methods. 
They write to **different registries**: + +| Method | Registry | Dispatch | Use for | +|--------|----------|----------|---------| +| `api.on(event, handler, opts)` | `registry.typedHooks` | Dispatched by the lifecycle hook runner | Lifecycle events: `before_prompt_build`, `agent_end`, `after_tool_call`, `session_end`, `message_received`, `before_message_write` | +| `api.registerHook(event, handler, opts)` | `registry.hooks` | Dispatched by the internal hook system | Command/bootstrap events: `command:new`, `command:reset`, `agent:bootstrap` | + +Using the wrong method causes hooks to register silently without firing. This plugin uses `api.on()` for all lifecycle hooks and `api.registerHook()` for command hooks. + +### Verifying hooks after install + +```bash +openclaw plugins info memory-lancedb-pro +``` + +You should see: + +``` +Legacy before_agent_start: no + +Typed hooks: + agent_end + before_message_write + before_prompt_build (priority 10) + message_received + +Custom hooks: + memory-lancedb-pro-session-memory: command:new +``` + +If `Legacy before_agent_start: yes` appears, you are running an older version of the plugin. + +### Migration from older versions + +If you are upgrading from v1.1.0-beta.8 or earlier: + +1. Replace the plugin files (copy or `openclaw plugins install`) +2. Clear the jiti cache: `rm -rf /tmp/jiti/` +3. Restart the gateway: `openclaw gateway restart` +4. Verify: `openclaw plugins info memory-lancedb-pro` should show `Legacy before_agent_start: no` + +No config changes or data migration required. All existing memories, scopes, and settings are preserved. + +### OpenClaw version requirements + +- **Minimum:** OpenClaw 2026.3.22 +- **Recommended:** OpenClaw latest (2026.3.23+) + +This version uses `before_prompt_build` hooks (replacing the deprecated `before_agent_start`), which requires OpenClaw 2026.3.22 or later. Running `openclaw doctor --fix` after upgrading will automatically migrate plugin config (e.g. 
`minimax-portal-auth` → `minimax`, Brave search as a standalone plugin). + +To upgrade OpenClaw: + +```bash +npm update -g openclaw +openclaw --version # verify >= 2026.3.22 +openclaw doctor --fix # resolve any stale config after upgrade +``` --- -## 📖 Documentation +## Documentation | Document | Description | | --- | --- | -| [OpenClaw Integration Playbook](docs/openclaw-integration-playbook.md) | Deployment modes, `/new` verification, regression matrix | +| [OpenClaw Integration Playbook](docs/openclaw-integration-playbook.md) | Deployment modes, verification, regression matrix | | [Memory Architecture Analysis](docs/memory_architecture_analysis.md) | Full architecture deep-dive | | [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | v1.1.0 behavior changes and upgrade rationale | | [Long-Context Chunking](docs/long-context-chunking.md) | Chunking strategy for long documents | --- +## Beta: Smart Memory v1.1.0 + +> Status: Beta — available via `npm i memory-lancedb-pro@beta`. Stable users on `latest` are not affected. + +| Feature | Description | +|---------|-------------| +| **Smart Extraction** | LLM-powered 6-category extraction with L0/L1/L2 metadata. Falls back to regex when disabled. | +| **Lifecycle Scoring** | Weibull decay integrated into retrieval — high-frequency and high-importance memories rank higher. | +| **Tier Management** | Three-tier system (Core → Working → Peripheral) with automatic promotion/demotion. | + +Feedback: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · Revert: `npm i memory-lancedb-pro@latest` + +--- + ## Dependencies | Package | Purpose | @@ -933,7 +835,7 @@ Feedback: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issu --- -## 🤝 Contributors +## Contributors

@win4r @@ -949,7 +851,7 @@ Feedback: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issu Full list: [Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors) -## ⭐ Star History +## Star History @@ -965,7 +867,6 @@ MIT --- - ## My WeChat QR Code diff --git a/README_CN.md b/README_CN.md index 891f4e89..acbfb92b 100644 --- a/README_CN.md +++ b/README_CN.md @@ -2,90 +2,96 @@ # 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin -**[OpenClaw](https://github.com/openclaw/openclaw) 生产级长期记忆插件** +**[OpenClaw](https://github.com/openclaw/openclaw) 智能体的 AI 记忆助理** -*让你的 AI Agent 真正拥有"记忆"——跨会话、跨 Agent、跨时间。* +*让你的 AI 智能体拥有真正的记忆力——跨会话、跨智能体、跨时间。* + +基于 LanceDB 的 OpenClaw 长期记忆插件,自动存储偏好、决策和项目上下文,在后续会话中自动回忆。 [![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) [![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) [![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) -[English](README.md) | **简体中文** +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) --- -## ✨ 为什么选择 memory-lancedb-pro? +## 为什么选 memory-lancedb-pro? 
-大多数 AI Agent 都有"失忆症"——每次新对话都从零开始。这个插件解决了这个问题。它给你的 OpenClaw Agent 提供**持久化、智能化的长期记忆**——完全自动,无需手动管理。 +大多数 AI 智能体都有"失忆症"——每次新对话,之前聊过的全部清零。 -| | 你能得到什么 | -|---|---| -| 🔍 **混合检索** | 向量 + BM25 全文搜索,搭配跨编码器 Rerank | -| 🧠 **智能提取** | LLM 驱动的 6 类别记忆提取——不用手动调 `memory_store` | -| ⏳ **记忆生命周期** | Weibull 衰减 + 三层晋升——重要记忆浮上来,过时记忆沉下去 | -| 🔒 **多 Scope 隔离** | 按 Agent、用户、项目维度隔离记忆 | -| 🔌 **任意 Embedding 提供商** | OpenAI、Jina、Gemini、Ollama 或任何 OpenAI 兼容 API | -| 🛠️ **完整运维工具链** | CLI、备份、迁移、升级、导入导出——不是玩具 | +**memory-lancedb-pro** 是 OpenClaw 的生产级长期记忆插件,把你的智能体变成一个真正的 **AI 记忆助理**——自动捕捉重要信息,让噪音自然衰减,在恰当的时候回忆起恰当的内容。无需手动标记,无需复杂配置。 ---- +### AI 记忆助理实际效果 -## 🆚 对比内置 `memory-lancedb` +**没有记忆——每次都从零开始:** -| 功能 | 内置 `memory-lancedb` | **memory-lancedb-pro** | -| --- | :---: | :---: | -| 向量搜索 | ✅ | ✅ | -| BM25 全文检索 | ❌ | ✅ | -| 混合融合(Vector + BM25) | ❌ | ✅ | -| 跨编码器 Rerank(Jina / 自定义) | ❌ | ✅ | -| 时效性加成 & 时间衰减 | ❌ | ✅ | -| 长度归一化 | ❌ | ✅ | -| MMR 多样性去重 | ❌ | ✅ | -| 多 Scope 隔离 | ❌ | ✅ | -| 噪声过滤 | ❌ | ✅ | -| 自适应检索 | ❌ | ✅ | -| 管理 CLI | ❌ | ✅ | -| Session 记忆 | ❌ | ✅ | -| Task-aware Embedding | ❌ | ✅ | -| **LLM 智能提取(6 类别)** | ❌ | ✅(v1.1.0) | -| **Weibull 衰减 + 三层晋升** | ❌ | ✅(v1.1.0) | -| **旧记忆一键升级** | ❌ | ✅(v1.1.0) | -| 任意 OpenAI 兼容 Embedding | 有限 | ✅ | +> **你:** "缩进用 tab,所有函数都要加错误处理。" +> *(下一次会话)* +> **你:** "我都说了用 tab 不是空格!" 😤 +> *(再下一次会话)* +> **你:** "……我真的说了第三遍了,tab,还有错误处理。" ---- +**有了 memory-lancedb-pro——你的智能体学会了、记住了:** -## 📺 视频教程 +> **你:** "缩进用 tab,所有函数都要加错误处理。" +> *(下一次会话——智能体自动回忆你的偏好)* +> **智能体:** *(默默改成 tab 缩进,并补上错误处理)* ✅ +> **你:** "上个月我们为什么选了 PostgreSQL 而不是 MongoDB?" +> **智能体:** "根据我们 2 月 12 日的讨论,主要原因是……" ✅ -> 完整演示:安装、配置,以及混合检索的底层原理。 +这就是 **AI 记忆助理** 的价值——学习你的风格,回忆过去的决策,提供个性化的回应,不再让你重复自己。 -[![YouTube Video](https://img.shields.io/badge/YouTube-立即观看-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) -🔗 **https://youtu.be/MtukF1C8epQ** +### 还能做什么? 
-[![Bilibili Video](https://img.shields.io/badge/Bilibili-立即观看-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) -🔗 **https://www.bilibili.com/video/BV1zUf2BGEgn/** +| | 你能得到的 | +|---|---| +| **自动捕捉** | 智能体从每次对话中学习——不需要手动调用 `memory_store` | +| **智能提取** | LLM 驱动的 6 类分类:用户画像、偏好、实体、事件、案例、模式 | +| **智能遗忘** | Weibull 衰减模型——重要记忆留存,噪音自然消退 | +| **混合检索** | 向量 + BM25 全文搜索,融合交叉编码器重排序 | +| **上下文注入** | 相关记忆在每次回复前自动浮现 | +| **多作用域隔离** | 按智能体、按用户、按项目隔离记忆边界 | +| **任意 Provider** | OpenAI、Jina、Gemini、Ollama 或任意 OpenAI 兼容 API | +| **完整工具链** | CLI、备份、迁移、升级、导入导出——生产可用 | --- -## 🚀 30 秒快速接入 +## 快速开始 -### 1. 安装 +### 方式 A:一键安装脚本(推荐) + +社区维护的 **[安装脚本](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** 一条命令搞定安装、升级和修复: ```bash -npm i memory-lancedb-pro@beta +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh ``` -### 2. 配置 +> 脚本覆盖的完整场景和其他社区工具,详见下方 [生态工具](#生态工具)。 -添加到 `openclaw.json`: +### 方式 B:手动安装 + +**通过 OpenClaw CLI(推荐):** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**或通过 npm:** +```bash +npm i memory-lancedb-pro@beta +``` +> 如果用 npm 安装,你还需要在 `openclaw.json` 的 `plugins.load.paths` 中添加插件安装目录的 **绝对路径**。这是最常见的安装问题。 + +在 `openclaw.json` 中添加配置: ```json { "plugins": { - "slots": { - "memory": "memory-lancedb-pro" - }, + "slots": { "memory": "memory-lancedb-pro" }, "entries": { "memory-lancedb-pro": { "enabled": true, @@ -100,9 +106,7 @@ npm i memory-lancedb-pro@beta "smartExtraction": true, "extractMinMessages": 2, "extractMaxChars": 8000, - "sessionMemory": { - "enabled": false - } + "sessionMemory": { "enabled": false } } } } @@ -110,119 +114,150 @@ npm i memory-lancedb-pro@beta } ``` -**为什么这样配?** -- `autoCapture` + `smartExtraction` → Agent 自动从每次对话中学习 -- `autoRecall` → 回复前自动注入最相关的历史记忆 -- `extractMinMessages: 2` → 两轮对话就能触发智能提取 -- `sessionMemory: false` → 避免一开始就让 session summary 污染检索 
+**为什么用这些默认值?** +- `autoCapture` + `smartExtraction` → 智能体自动从每次对话中学习 +- `autoRecall` → 相关记忆在每次回复前自动注入 +- `extractMinMessages: 2` → 正常两轮对话即触发提取 +- `sessionMemory.enabled: false` → 避免会话摘要在初期污染检索结果 -### 3. 校验并重启 +验证并重启: ```bash openclaw config validate openclaw gateway restart -openclaw logs --follow --plain | rg "memory-lancedb-pro" +openclaw logs --follow --plain | grep "memory-lancedb-pro" ``` -你应该看到: +你应该能看到: - `memory-lancedb-pro: smart extraction enabled` - `memory-lancedb-pro@...: plugin registered` -🎉 **搞定!** 你的 Agent 现在有长期记忆了。 +完成!你的智能体现在拥有长期记忆了。

-💬 通过 OpenClaw 的 Telegram Bot 一键导入配置(点击展开) +更多安装路径(已有用户、升级) -如果你在用 OpenClaw 的 Telegram 集成,最便捷的方式不是手动改配置,而是直接对主 Bot 发送一段接入指令。 +**已在使用 OpenClaw?** -可直接发送: +1. 在 `plugins.load.paths` 中添加插件的 **绝对路径** +2. 绑定记忆插槽:`plugins.slots.memory = "memory-lancedb-pro"` +3. 验证:`openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**从 v1.1.0 之前的版本升级?** + +```bash +# 1) 备份 +openclaw memory-pro export --scope global --output memories-backup.json +# 2) 试运行 +openclaw memory-pro upgrade --dry-run +# 3) 执行升级 +openclaw memory-pro upgrade +# 4) 验证 +openclaw memory-pro stats +``` + +详见 `CHANGELOG-v1.1.0.md` 了解行为变更和升级说明。 + +
+ +
+Telegram Bot 快捷导入(点击展开) + +如果你在使用 OpenClaw 的 Telegram 集成,最简单的方式是直接给主 Bot 发消息,而不是手动编辑配置文件。 + +以下为英文原文,方便直接复制发送给 Bot: ```text -帮我接入该记忆库, 用体验最好的配置:https://github.com/CortexReach/memory-lancedb-pro - -要求: -1. 直接接成当前唯一启用的 memory 插件 -2. embedding 用 Jina -3. reranker 用 Jina -4. 智能提取的 llm 用 gpt-4o-mini -5. 开启 autoCapture、autoRecall、smartExtraction +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction 6. extractMinMessages=2 7. sessionMemory.enabled=false 8. captureAssistant=false -9. retrieval 用 hybrid,vectorWeight=0.7,bm25Weight=0.3 -10. rerank=cross-encoder,candidatePoolSize=12,minScore=0.6,hardMinScore=0.62 -11. 生成可直接落到 openclaw.json 的最终配置,不要只给解释 +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. 
Generate the final openclaw.json config directly, not just an explanation +``` -{ - "embedding": { - "provider": "openai-compatible", - "apiKey": "${JINA_API_KEY}", - "model": "jina-embeddings-v5-text-small", - "baseURL": "https://api.jina.ai/v1", - "dimensions": 1024, - "taskQuery": "retrieval.query", - "taskPassage": "retrieval.passage", - "normalized": true - }, - "dbPath": "~/.openclaw/memory/lancedb-pro", - "autoCapture": true, - "autoRecall": true, - "captureAssistant": false, - "smartExtraction": true, - "extractMinMessages": 2, - "extractMaxChars": 8000, - "sessionMemory": { - "enabled": false - }, - "retrieval": { - "mode": "hybrid", - "vectorWeight": 0.7, - "bm25Weight": 0.3, - "rerank": "cross-encoder", - "rerankProvider": "jina", - "rerankEndpoint": "https://api.jina.ai/v1/rerank", - "rerankModel": "jina-reranker-v3", - "candidatePoolSize": 12, - "minScore": 0.6, - "hardMinScore": 0.62, - "rerankApiKey": "${JINA_API_KEY}" - }, - "llm": { - "apiKey": "${OPENAI_API_KEY}", - "model": "gpt-4o-mini", - "baseURL": "https://api.openai.com/v1" - } -} +
+ +--- + +## 生态工具 + +memory-lancedb-pro 是核心插件。社区围绕它构建了配套工具,让安装和日常使用更加顺畅: + +### 安装脚本——一键安装、升级和修复 + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +不只是简单的安装器——脚本能智能处理各种常见场景: + +| 你的情况 | 脚本会做什么 | +|---|---| +| 从未安装 | 全新下载 → 安装依赖 → 选择配置 → 写入 openclaw.json → 重启 | +| 通过 `git clone` 安装,卡在旧版本 | 自动 `git fetch` + `checkout` 到最新 → 重装依赖 → 验证 | +| 配置中有无效字段 | 自动检测并通过 schema 过滤移除不支持的字段 | +| 通过 `npm` 安装 | 跳过 git 更新,提醒你自行运行 `npm update` | +| `openclaw` CLI 因无效配置崩溃 | 降级方案:直接从 `openclaw.json` 文件读取工作目录路径 | +| `extensions/` 而非 `plugins/` | 从配置或文件系统自动检测插件位置 | +| 已是最新版 | 仅执行健康检查,不做改动 | + +```bash +bash setup-memory.sh # 安装或升级 +bash setup-memory.sh --dry-run # 仅预览 +bash setup-memory.sh --beta # 包含预发布版本 +bash setup-memory.sh --uninstall # 还原配置并移除插件 ``` -如果你已经有自己的 OpenAI-compatible 服务,只需替换对应区块: +内置 Provider 预设:**Jina / DashScope / SiliconFlow / OpenAI / Ollama**,或自带任意 OpenAI 兼容 API。完整用法(含 `--ref`、`--selfcheck-only` 等)详见 [安装脚本 README](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)。 -- `embedding`:改 `apiKey` / `model` / `baseURL` / `dimensions` -- `retrieval`:改 `rerankProvider` / `rerankEndpoint` / `rerankModel` / `rerankApiKey` -- `llm`:改 `apiKey` / `model` / `baseURL` +### Claude Code / OpenClaw Skill——AI 引导式配置 -例如只替换 LLM: +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** -```json -{ - "llm": { - "apiKey": "${GROQ_API_KEY}", - "model": "openai/gpt-oss-120b", - "baseURL": "https://api.groq.com/openai/v1" - } -} +安装这个 Skill,你的 AI 智能体(Claude Code 或 OpenClaw)就能深度掌握 memory-lancedb-pro 的所有功能。只需说 **"help me enable the best config"** 即可获得: + +- **7 步引导式配置流程**,提供 4 套部署方案: + - 满血版(Jina + OpenAI)/ 省钱版(免费 SiliconFlow 重排序)/ 简约版(仅 OpenAI)/ 全本地版(Ollama,零 API 成本) +- **全部 9 个 MCP 工具** 
的正确用法:`memory_recall`、`memory_store`、`memory_forget`、`memory_update`、`memory_stats`、`memory_list`、`self_improvement_log`、`self_improvement_extract_skill`、`self_improvement_review` *(完整工具集需要设置 `enableManagementTools: true`——默认快速配置仅暴露 4 个核心工具)* +- **常见坑规避**:workspace 插件启用、`autoRecall` 默认 false、jiti 缓存、环境变量、作用域隔离等 + +**Claude Code 安装:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro ``` -
+**OpenClaw 安装:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` --- -## 🏗️ 架构概览 +## 视频教程 + +> 完整演示:安装、配置、混合检索内部原理。 + +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## 架构 ``` ┌─────────────────────────────────────────────────────────┐ │ index.ts (入口) │ -│ 插件注册 · 配置解析 · 生命周期钩子 · 自动捕获/回忆 │ +│ 插件注册 · 配置解析 · 生命周期钩子 │ └────────┬──────────┬──────────┬──────────┬───────────────┘ │ │ │ │ ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ @@ -237,115 +272,130 @@ openclaw logs --follow --plain | rg "memory-lancedb-pro" └────────────────┘ ┌─────────────┐ ┌──────────┐ │ tools.ts │ │ cli.ts │ - │ (Agent API) │ │ (CLI) │ + │ (智能体 API)│ │ (CLI) │ └─────────────┘ └──────────┘ ``` -> 📖 完整架构深度解析请看 [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md) +> 完整架构解析见 [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md)。
-📄 文件说明(点击展开) +文件说明(点击展开) | 文件 | 用途 | | --- | --- | -| `index.ts` | 插件入口。注册到 OpenClaw Plugin API,解析配置,挂载 `before_agent_start`(自动回忆)、`agent_end`(自动捕获)、`command:new`(Session 记忆)等钩子 | -| `openclaw.plugin.json` | 插件元数据 + 完整 JSON Schema 配置声明(含 `uiHints`) | -| `package.json` | NPM 包信息,依赖 `@lancedb/lancedb`、`openai`、`@sinclair/typebox` | -| `cli.ts` | CLI 命令:`memory list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | -| `src/store.ts` | LanceDB 存储层。表创建 / FTS 索引 / Vector Search / BM25 / CRUD / 批量删除 / 统计 | -| `src/embedder.ts` | Embedding 抽象层。兼容任意 OpenAI API Provider,支持 task-aware embedding | -| `src/retriever.ts` | 混合检索引擎。Vector + BM25 → RRF 融合 → Rerank → 生命周期衰减 → Length Norm → Noise Filter → MMR | -| `src/scopes.ts` | 多 Scope 访问控制:`global`、`agent:`、`custom:`、`project:`、`user:` | -| `src/tools.ts` | Agent 工具:`memory_recall`、`memory_store`、`memory_forget`、`memory_update` + 管理工具 | -| `src/noise-filter.ts` | 过滤 Agent 拒绝回复、Meta 问题、寒暄等低质量记忆 | -| `src/adaptive-retrieval.ts` | 判断 query 是否需要触发记忆检索 | +| `index.ts` | 插件入口,注册 OpenClaw 插件 API、解析配置、挂载生命周期钩子 | +| `openclaw.plugin.json` | 插件元数据 + 完整 JSON Schema 配置声明 | +| `cli.ts` | CLI 命令:`memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | LanceDB 存储层:建表 / 全文索引 / 向量搜索 / BM25 搜索 / CRUD | +| `src/embedder.ts` | Embedding 抽象层,兼容任意 OpenAI 兼容 API | +| `src/retriever.ts` | 混合检索引擎:向量 + BM25 → 混合融合 → 重排序 → 生命周期衰减 → 过滤 | +| `src/scopes.ts` | 多作用域访问控制 | +| `src/tools.ts` | 智能体工具定义:`memory_recall`、`memory_store`、`memory_forget`、`memory_update` + 管理工具 | +| `src/noise-filter.ts` | 过滤智能体拒绝回复、元问题、打招呼等低质量内容 | +| `src/adaptive-retrieval.ts` | 判断查询是否需要记忆检索 | | `src/migrate.ts` | 从内置 `memory-lancedb` 迁移到 Pro | -| `src/smart-extractor.ts` | **(v1.1.0)** LLM 6 类别提取管线,含 L0/L1/L2 分层存储和两阶段去重 | -| `src/memory-categories.ts` | **(v1.1.0)** 6 类别分类系统:profile、preferences、entities、events、cases、patterns | -| `src/decay-engine.ts` | **(v1.1.0)** Weibull 拉伸指数衰减模型 | -| 
`src/tier-manager.ts` | **(v1.1.0)** 三层晋升/降级系统:Peripheral ⟷ Working ⟷ Core | -| `src/memory-upgrader.ts` | **(v1.1.0)** 旧记忆批量升级为新智能格式 | -| `src/llm-client.ts` | **(v1.1.0)** LLM 客户端,结构化 JSON 输出 | -| `src/extraction-prompts.ts` | **(v1.1.0)** 记忆提取、去重、合并的 LLM 提示模板 | -| `src/smart-metadata.ts` | **(v1.1.0)** Metadata 归一化,统一 L0/L1/L2、tier、confidence、access 计数 | +| `src/smart-extractor.ts` | LLM 驱动的 6 类提取,支持 L0/L1/L2 分层存储和两阶段去重 | +| `src/decay-engine.ts` | Weibull 拉伸指数衰减模型 | +| `src/tier-manager.ts` | 三级晋升/降级:外围 ↔ 工作 ↔ 核心 |
--- -## 📦 核心特性 +## 核心功能 ### 混合检索 ``` -Query → embedQuery() ─┐ - ├─→ RRF 融合 → Rerank → 生命周期衰减加权 → 长度归一化 → 过滤 -Query → BM25 FTS ─────┘ +查询 → embedQuery() ─┐ + ├─→ 混合融合 → 重排序 → 生命周期衰减加权 → 长度归一化 → 过滤 +查询 → BM25 全文 ─────┘ ``` -- **向量搜索** — 语义相似度搜索(cosine distance via LanceDB ANN) -- **BM25 全文搜索** — 关键词精确匹配(LanceDB FTS 索引) -- **融合策略** — 向量分数为主,BM25 命中给予 15% 加成(非传统 RRF,经调优) +- **向量搜索** — 基于 LanceDB ANN 的语义相似度(余弦距离) +- **BM25 全文搜索** — 通过 LanceDB FTS 索引进行精确关键词匹配 +- **混合融合** — 以向量分数为基础,BM25 命中结果获得加权提升(非标准 RRF——针对实际召回质量调优) - **可配置权重** — `vectorWeight`、`bm25Weight`、`minScore` -### 跨编码器 Rerank +### 交叉编码器重排序 -- 支持 **Jina**、**SiliconFlow**、**Voyage AI**、**Pinecone** 或任意兼容端点 -- 混合评分:60% cross-encoder + 40% 原始融合分 -- 降级策略:API 失败时回退到 cosine similarity rerank +- 内置 **Jina**、**SiliconFlow**、**Voyage AI** 和 **Pinecone** 适配器 +- 兼容任意 Jina 兼容端点(如 Hugging Face TEI、DashScope) +- 混合打分:60% 交叉编码器 + 40% 原始融合分数 +- 优雅降级:API 失败时回退到余弦相似度 -### 多层评分管线 +### 多阶段评分管线 | 阶段 | 效果 | | --- | --- | -| **RRF 融合** | 同时结合语义召回和关键词召回 | -| **跨编码器重排** | 提升语义更准确的结果 | -| **生命周期衰减加权** | Weibull 新鲜度 + 访问频率 + importance × confidence | -| **长度归一化** | 防止长条目霸占查询结果(锚点 500 字符) | -| **硬最低分** | 低于阈值直接丢弃(默认 0.35) | -| **MMR 多样性** | cosine 相似度 > 0.85 → 降级 | +| **混合融合** | 结合语义召回和精确匹配召回 | +| **交叉编码器重排序** | 提升语义精确命中的排名 | +| **生命周期衰减加权** | Weibull 时效性 + 访问频率 + 重要性 × 置信度 | +| **长度归一化** | 防止长条目主导结果(锚点:500 字符) | +| **硬最低分** | 移除无关结果(默认:0.35) | +| **MMR 多样性** | 余弦相似度 > 0.85 → 降权 | ### 智能记忆提取(v1.1.0) -- **LLM 驱动 6 类别提取**:profile、preferences、entities、events、cases、patterns +- **LLM 驱动的 6 类提取**:用户画像、偏好、实体、事件、案例、模式 - **L0/L1/L2 分层存储**:L0(一句话索引)→ L1(结构化摘要)→ L2(完整叙述) - **两阶段去重**:向量相似度预过滤(≥0.7)→ LLM 语义决策(CREATE/MERGE/SKIP) -- **类别感知合并**:`profile` 始终合并,`events`/`cases` 仅新增 +- **类别感知合并**:`profile` 始终合并,`events`/`cases` 仅追加 ### 记忆生命周期管理(v1.1.0) -- **Weibull 衰减引擎**:复合分数 = 时效 + 频率 + 内在价值 -- **衰减感知检索**:召回结果按生命周期衰减重排 -- **三层晋升系统**:`Peripheral ⟷ Working ⟷ Core`,可配置阈值 +- **Weibull 衰减引擎**:综合分数 = 时效性 + 频率 + 内在价值 +- **三级晋升**:`外围 ↔ 工作 ↔ 核心`,阈值可配置 +- 
**访问强化**:频繁被召回的记忆衰减更慢(类似间隔重复机制) - **重要性调制半衰期**:重要记忆衰减更慢 -### 多 Scope 隔离 +### 多作用域隔离 + +- 内置作用域:`global`、`agent:`、`custom:`、`project:`、`user:` +- 通过 `scopes.agentAccess` 实现智能体级别的访问控制 +- 默认:每个智能体访问 `global` + 自己的 `agent:` 作用域 + +### 自动捕捉与自动回忆 -- 内置 Scope:`global`、`agent:`、`custom:`、`project:`、`user:` -- 通过 `scopes.agentAccess` 配置 Agent 级访问控制 -- 默认:Agent 可访问 `global` + 自己的 `agent:` Scope +- **自动捕捉**(`agent_end`):从对话中提取偏好/事实/决策/实体,去重后每轮最多存储 3 条 +- **自动回忆**(`before_agent_start`):注入 `` 上下文(最多 3 条) -### 自动捕获 & 自动回忆 +### 噪音过滤与自适应检索 -- **Auto-Capture**(`agent_end`):从对话中提取 preference/fact/decision/entity,去重后存储(每次最多 3 条) -- **Auto-Recall**(`before_agent_start`):注入 `` 上下文(最多 3 条) +- 过滤低质量内容:智能体拒绝回复、元问题、打招呼 +- 跳过检索:打招呼、斜杠命令、简单确认、表情符号 +- 强制检索:记忆关键词("记得"、"之前"、"上次") +- 中文感知阈值(中文:6 字符 vs 英文:15 字符) -### 噪声过滤 & 自适应检索 +--- -- 过滤低质量内容:Agent 拒绝回复、Meta 问题、寒暄 -- 跳过问候、slash 命令、简单确认、emoji 的记忆检索 -- 强制检索含记忆关键词的 query("remember"、"之前"、"上次"等) -- CJK 字符更低阈值(中文 6 字符 vs 英文 15 字符) +
+与内置 memory-lancedb 的对比(点击展开) -### 旧记忆一键升级(v1.1.0) +| 功能 | 内置 `memory-lancedb` | **memory-lancedb-pro** | +| --- | :---: | :---: | +| 向量搜索 | 有 | 有 | +| BM25 全文搜索 | - | 有 | +| 混合融合(向量 + BM25) | - | 有 | +| 交叉编码器重排序(多 Provider) | - | 有 | +| 时效性提升和时间衰减 | - | 有 | +| 长度归一化 | - | 有 | +| MMR 多样性 | - | 有 | +| 多作用域隔离 | - | 有 | +| 噪音过滤 | - | 有 | +| 自适应检索 | - | 有 | +| 管理 CLI | - | 有 | +| 会话记忆 | - | 有 | +| 任务感知 Embedding | - | 有 | +| **LLM 智能提取(6 类)** | - | 有(v1.1.0) | +| **Weibull 衰减 + 层级晋升** | - | 有(v1.1.0) | +| 任意 OpenAI 兼容 Embedding | 有限 | 有 | -- 一条命令升级:`openclaw memory-pro upgrade` -- LLM 或无 LLM 模式(离线可用) -- 启动自动检测并提示升级 +
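
上文「多阶段评分管线」的分数组合方式可以用一个简化草图示意(假设性 TypeScript 实现,并非插件源码;其中 15% BM25 加成沿用本文档旧版描述的示例值,60/40 混合比例与 0.35 硬最低分取自上表):

```typescript
// 假设性草图:示意"向量分 + BM25 加成 → 交叉编码器混合 → 硬最低分过滤"的组合方式。
// 常量仅为示例值,并非插件源码。
interface Candidate {
  id: string;
  vectorScore: number;   // 向量余弦相似度(已归一化到 0-1)
  bm25Hit: boolean;      // 是否命中 BM25 关键词召回
  rerankScore?: number;  // 交叉编码器分数(可选,重排序关闭时缺省)
}

const BM25_BOOST = 0.15;     // BM25 命中加成(旧版文档示例值)
const RERANK_WEIGHT = 0.6;   // 60% 交叉编码器 + 40% 原始融合分
const HARD_MIN_SCORE = 0.35; // 硬最低分默认值

function fuse(c: Candidate): number {
  // 以向量分为基础,BM25 命中时乘以 (1 + 加成)
  const base = c.vectorScore * (c.bm25Hit ? 1 + BM25_BOOST : 1);
  return c.rerankScore === undefined
    ? base
    : RERANK_WEIGHT * c.rerankScore + (1 - RERANK_WEIGHT) * base;
}

function rank(candidates: Candidate[]): Candidate[] {
  // 低于硬最低分直接丢弃,其余按融合分降序
  return candidates
    .filter((c) => fuse(c) >= HARD_MIN_SCORE)
    .sort((a, b) => fuse(b) - fuse(a));
}
```

实际管线还包含生命周期衰减、长度归一化与 MMR 等阶段,此处仅示意融合与过滤两步。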
--- -## ⚙️ 配置 +## 配置
完整配置示例 @@ -388,8 +438,8 @@ Query → BM25 FTS ─────┘ "scopes": { "default": "global", "definitions": { - "global": { "description": "共享知识库" }, - "agent:discord-bot": { "description": "Discord 机器人私有" } + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } }, "agentAccess": { "discord-bot": ["global", "agent:discord-bot"] @@ -410,457 +460,210 @@ Query → BM25 FTS ─────┘ } ``` -OpenClaw 默认行为: - -- `autoCapture`:默认开启 -- `autoRecall`:插件 schema 默认关闭,但本 README 建议大多数新用户显式开启 -- `embedding.chunking`:默认开启 -- `sessionMemory.enabled`:默认关闭;需要显式设为 `true` 才注册 `/new` Hook -
-Embedding 提供商 +Embedding 服务商 -本插件支持 **任意 OpenAI 兼容的 Embedding API**: +兼容 **任意 OpenAI 兼容 Embedding API**: -| 提供商 | 模型 | Base URL | 维度 | +| 服务商 | 模型 | Base URL | 维度 | | --- | --- | --- | --- | | **Jina**(推荐) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | | **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | | **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | -| **Ollama**(本地) | `nomic-embed-text` | `http://localhost:11434/v1` | _与本地模型一致_ | - -
- -
-Rerank 提供商 - -通过 `rerankProvider` 配置跨编码器 Rerank: - -| 提供商 | `rerankProvider` | Endpoint | 示例模型 | -| --- | --- | --- | --- | -| **Jina**(默认) | `jina` | `https://api.jina.ai/v1/rerank` | `jina-reranker-v3` | -| **SiliconFlow**(有免费额度) | `siliconflow` | `https://api.siliconflow.com/v1/rerank` | `BAAI/bge-reranker-v2-m3` | -| **Voyage AI** | `voyage` | `https://api.voyageai.com/v1/rerank` | `rerank-2.5` | -| **Pinecone** | `pinecone` | `https://api.pinecone.io/rerank` | `bge-reranker-v2-m3` | - -
-SiliconFlow 配置示例 - -```json -{ - "retrieval": { - "rerank": "cross-encoder", - "rerankProvider": "siliconflow", - "rerankEndpoint": "https://api.siliconflow.com/v1/rerank", - "rerankApiKey": "sk-xxx", - "rerankModel": "BAAI/bge-reranker-v2-m3" - } -} -``` - -
- -
-Voyage 配置示例 - -```json -{ - "retrieval": { - "rerank": "cross-encoder", - "rerankProvider": "voyage", - "rerankEndpoint": "https://api.voyageai.com/v1/rerank", - "rerankApiKey": "${VOYAGE_API_KEY}", - "rerankModel": "rerank-2.5" - } -} -``` +| **Ollama**(本地) | `nomic-embed-text` | `http://localhost:11434/v1` | 取决于模型 |
-Pinecone 配置示例 +重排序服务商 -```json -{ - "retrieval": { - "rerank": "cross-encoder", - "rerankProvider": "pinecone", - "rerankEndpoint": "https://api.pinecone.io/rerank", - "rerankApiKey": "pcsk_xxx", - "rerankModel": "bge-reranker-v2-m3" - } -} -``` +交叉编码器重排序通过 `rerankProvider` 支持多个服务商: -
+| 服务商 | `rerankProvider` | 示例模型 | +| --- | --- | --- | +| **Jina**(默认) | `jina` | `jina-reranker-v3` | +| **SiliconFlow**(有免费额度) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | -说明:`voyage` 发送 `{ model, query, documents }` 格式(不含 `top_n`),响应从 `data[].relevance_score` 解析。 +任何 Jina 兼容的重排序端点也可以使用——设置 `rerankProvider: "jina"` 并将 `rerankEndpoint` 指向你的服务(如 Hugging Face TEI、DashScope `qwen3-rerank`)。
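
例如,把重排序指向一个自托管的 TEI 服务(以下端点与模型仅为示意;字段名与本插件其余 rerank 配置一致):

```json
{
  "retrieval": {
    "rerank": "cross-encoder",
    "rerankProvider": "jina",
    "rerankEndpoint": "http://localhost:8080/rerank",
    "rerankApiKey": "${RERANK_API_KEY}",
    "rerankModel": "BAAI/bge-reranker-v2-m3"
  }
}
```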
-智能提取配置(LLM)— v1.1.0 +智能提取(LLM)— v1.1.0 -启用 `smartExtraction`(默认 `true`)后,插件用 LLM 智能提取和分类记忆,替代正则触发。 +当 `smartExtraction` 启用(默认 `true`)时,插件使用 LLM 智能提取和分类记忆,替代基于正则的触发方式。 | 字段 | 类型 | 默认值 | 说明 | |------|------|--------|------| | `smartExtraction` | boolean | `true` | 是否启用 LLM 智能 6 类别提取 | +| `llm.auth` | string | `api-key` | `api-key` 使用 `llm.apiKey` / `embedding.apiKey`;`oauth` 默认使用 plugin 级 OAuth token 文件 | | `llm.apiKey` | string | *(复用 `embedding.apiKey`)* | LLM 提供商 API Key | | `llm.model` | string | `openai/gpt-oss-120b` | LLM 模型名称 | | `llm.baseURL` | string | *(复用 `embedding.baseURL`)* | LLM API 端点 | -| `extractMinMessages` | number | `2` | 触发提取所需最少消息数 | +| `llm.oauthProvider` | string | `openai-codex` | `llm.auth` 为 `oauth` 时使用的 OAuth provider id | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | `llm.auth` 为 `oauth` 时使用的 OAuth token 文件 | +| `llm.timeoutMs` | number | `30000` | LLM 请求超时(毫秒) | +| `extractMinMessages` | number | `2` | 触发提取的最小消息数 | | `extractMaxChars` | number | `8000` | 发送给 LLM 的最大字符数 | -最简配置(复用 embedding API Key): -```json -{ - "embedding": { "apiKey": "${OPENAI_API_KEY}", "model": "text-embedding-3-small" }, - "smartExtraction": true -} -``` - -完整配置(独立 LLM 端点): -```json -{ - "embedding": { "apiKey": "${OPENAI_API_KEY}", "model": "text-embedding-3-small" }, - "smartExtraction": true, - "llm": { "apiKey": "${OPENAI_API_KEY}", "model": "gpt-4o-mini", "baseURL": "https://api.openai.com/v1" }, - "extractMinMessages": 2, - "extractMaxChars": 8000 -} -``` - -禁用:`{ "smartExtraction": false }` - -
- -
-生命周期配置(Decay + Tier) - -控制记忆新鲜度排序与自动层级迁移。 - -| 字段 | 类型 | 默认值 | 说明 | -|------|------|--------|------| -| `decay.recencyHalfLifeDays` | number | `30` | Weibull 时效衰减基础半衰期 | -| `decay.frequencyWeight` | number | `0.3` | 访问频率在复合分数中的权重 | -| `decay.intrinsicWeight` | number | `0.3` | `importance × confidence` 的权重 | -| `decay.betaCore` | number | `0.8` | `core` 记忆的 Weibull beta | -| `decay.betaWorking` | number | `1.0` | `working` 记忆的 Weibull beta | -| `decay.betaPeripheral` | number | `1.3` | `peripheral` 记忆的 Weibull beta | -| `tier.coreAccessThreshold` | number | `10` | 晋升到 `core` 所需最小 recall 次数 | -| `tier.coreCompositeThreshold` | number | `0.7` | 晋升到 `core` 所需最小生命周期分数 | -| `tier.peripheralCompositeThreshold` | number | `0.15` | 低于此分数的 `working` 可能降级 | -| `tier.peripheralAgeDays` | number | `60` | 陈旧低访问记忆降级年龄阈值 | - -```json -{ - "decay": { "recencyHalfLifeDays": 21, "betaCore": 0.7, "betaPeripheral": 1.5 }, - "tier": { "coreAccessThreshold": 8, "peripheralAgeDays": 45 } -} -``` - -
- -
-访问强化(1.0.26) - -经常被用到的记忆衰减更慢(类似间隔重复)。 - -配置项(位于 `retrieval` 下): -- `reinforcementFactor`(0–2,默认 `0.5`)— 设为 `0` 可关闭 -- `maxHalfLifeMultiplier`(1–10,默认 `3`)— 有效 half-life 硬上限 - -说明:强化逻辑只对 `source: "manual"` 生效,避免 auto-recall 意外"强化"噪声。 - -
- ---- - -## 📥 安装 - -
-路径 A:第一次用 OpenClaw(推荐) - -1. 克隆到 workspace: - -```bash -cd /path/to/your/openclaw/workspace -git clone https://github.com/CortexReach/memory-lancedb-pro.git plugins/memory-lancedb-pro -cd plugins/memory-lancedb-pro -npm install -``` - -2. 添加到 `openclaw.json`(相对路径): +OAuth `llm` 配置(使用现有 Codex / ChatGPT 登录缓存来发送 LLM 请求): ```json { - "plugins": { - "load": { "paths": ["plugins/memory-lancedb-pro"] }, - "entries": { - "memory-lancedb-pro": { - "enabled": true, - "config": { - "embedding": { - "apiKey": "${JINA_API_KEY}", - "model": "jina-embeddings-v5-text-small", - "baseURL": "https://api.jina.ai/v1", - "dimensions": 1024, - "taskQuery": "retrieval.query", - "taskPassage": "retrieval.passage", - "normalized": true - } - } - } - }, - "slots": { "memory": "memory-lancedb-pro" } + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 } } ``` -3. 重启并验证: +`llm.auth: "oauth"` 说明: -```bash -openclaw config validate -openclaw gateway restart -openclaw plugins info memory-lancedb-pro -openclaw hooks list --json -openclaw memory-pro stats -``` - -4. 烟测:写入 1 条记忆 → 关键词搜索 → 自然语言搜索。 +- `llm.oauthProvider` 当前仅支持 `openai-codex`。 +- OAuth token 默认存放在 `~/.openclaw/.memory-lancedb-pro/oauth.json`。 +- 如需自定义路径,可设置 `llm.oauthPath`。 +- `auth login` 会在 OAuth 文件旁边快照原来的 `api-key` 模式 `llm` 配置;`auth logout` 在可用时会恢复这份快照。 +- 从 `api-key` 切到 `oauth` 时不会自动沿用 `llm.baseURL`;只有在你明确需要自定义 ChatGPT/Codex 兼容后端时,才应在 `oauth` 模式下手动设置。
-路径 B:已在用 OpenClaw,现在加入插件 - -1. 保持现有 agents、channels、models 不变 -2. 用**绝对路径**把插件加到 `plugins.load.paths`: - -```json -{ "plugins": { "load": { "paths": ["/absolute/path/to/memory-lancedb-pro"] } } } -``` - -3. 绑定 memory slot:`plugins.slots.memory = "memory-lancedb-pro"` -4. 验证:`openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` +生命周期配置(衰减 + 层级) + +| 字段 | 默认值 | 说明 | +|------|--------|------| +| `decay.recencyHalfLifeDays` | `30` | Weibull 时效性衰减的基础半衰期 | +| `decay.frequencyWeight` | `0.3` | 访问频率在综合分数中的权重 | +| `decay.intrinsicWeight` | `0.3` | `重要性 × 置信度` 的权重 | +| `decay.betaCore` | `0.8` | `核心` 记忆的 Weibull beta | +| `decay.betaWorking` | `1.0` | `工作` 记忆的 Weibull beta | +| `decay.betaPeripheral` | `1.3` | `外围` 记忆的 Weibull beta | +| `tier.coreAccessThreshold` | `10` | 晋升到 `核心` 所需的最小召回次数 | +| `tier.peripheralAgeDays` | `60` | 降级过期记忆的天数阈值 |
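
一个调整示例(取值仅作示意,可按需组合上表字段):

```json
{
  "decay": { "recencyHalfLifeDays": 21, "betaCore": 0.7, "betaPeripheral": 1.5 },
  "tier": { "coreAccessThreshold": 8, "peripheralAgeDays": 45 }
}
```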
-路径 C:从旧版 memory-lancedb-pro 升级(v1.1.0 之前) +访问强化 -命令边界: -- `upgrade` — 用于**旧版 `memory-lancedb-pro` 数据升级** -- `migrate` — 只用于从内置 **`memory-lancedb`** 迁移 -- `reembed` — 只在更换 embedding 模型后重建向量时使用 +频繁被召回的记忆衰减更慢(类似间隔重复机制)。 -推荐安全顺序: - -```bash -# 1) 备份 -openclaw memory-pro export --scope global --output memories-backup.json - -# 2) 先检查 -openclaw memory-pro upgrade --dry-run - -# 3) 正式升级 -openclaw memory-pro upgrade - -# 4) 验证 -openclaw memory-pro stats -openclaw memory-pro search "your known keyword" --scope global --limit 5 -``` - -详见 `docs/CHANGELOG-v1.1.0.md`。 - -
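
对应的配置片段(取值即为默认值):

```json
{
  "retrieval": {
    "reinforcementFactor": 0.5,
    "maxHalfLifeMultiplier": 3
  }
}
```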
- -
-安装后验证清单 - -```bash -openclaw config validate -openclaw gateway restart -openclaw plugins info memory-lancedb-pro -openclaw hooks list --json -openclaw memory-pro stats -openclaw memory-pro list --scope global --limit 5 -``` - -然后验证: -- ✅ 1 个唯一标识符搜索命中 -- ✅ 1 个自然语言搜索命中 -- ✅ 1 轮 `memory_store` → `memory_recall` -- ✅ 如启用 session memory,补 1 轮真实 `/new` - -
- -
-AI 安装指引(防幻觉版) - -如果你是用 AI 按 README 操作,**不要假设任何默认值**。先运行: - -```bash -openclaw config get agents.defaults.workspace -openclaw config get plugins.load.paths -openclaw config get plugins.slots.memory -openclaw config get plugins.entries.memory-lancedb-pro -``` - -建议: -- `plugins.load.paths` 优先用**绝对路径** -- 如果配置里用 `${JINA_API_KEY}`,务必确保 Gateway **服务进程环境**里有该变量 -- 修改插件配置后运行 `openclaw gateway restart` - -
- -
-Jina API Key(Embedding + Rerank) - -- **Embedding**:`embedding.apiKey` 填 Jina key(推荐用 `${JINA_API_KEY}`) -- **Rerank**(`rerankProvider: "jina"`):通常可复用同一个 Jina key -- 其它 rerank provider → 用该 provider 的 key - -Key 存储:不要提交到 git。使用 `${...}` 环境变量时确保 Gateway 服务进程有该变量。 - -
- -
-什么是 "OpenClaw workspace"? - -**Agent workspace** 是 Agent 的工作目录(默认:`~/.openclaw/workspace`)。相对路径以 workspace 为基准解析。 - -> 说明:OpenClaw 配置文件通常在 `~/.openclaw/openclaw.json`,与 workspace 分开。 - -**常见错误:** 把插件 clone 到别的目录,但配置里写相对路径。建议用绝对路径(路径 B)或 clone 到 `/plugins/`(路径 A)。 +配置项(在 `retrieval` 下): +- `reinforcementFactor`(0-2,默认 `0.5`)— 设为 `0` 可禁用 +- `maxHalfLifeMultiplier`(1-10,默认 `3`)— 有效半衰期的硬上限
--- -## 🔧 CLI 命令 +## CLI 命令 ```bash openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json] -openclaw memory-pro search "query" [--scope global] [--limit 10] [--json] +openclaw memory-pro search "查询" [--scope global] [--limit 10] [--json] openclaw memory-pro stats [--scope global] [--json] +openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] +openclaw memory-pro auth status +openclaw memory-pro auth logout openclaw memory-pro delete openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run] openclaw memory-pro export [--scope global] [--output memories.json] openclaw memory-pro import memories.json [--scope global] [--dry-run] openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing] openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE] -openclaw memory-pro migrate check [--source /path] -openclaw memory-pro migrate run [--source /path] [--dry-run] [--skip-existing] -openclaw memory-pro migrate verify [--source /path] +openclaw memory-pro migrate check|run|verify [--source /path] ``` +OAuth 登录流程: + +1. 运行 `openclaw memory-pro auth login` +2. 如果省略 `--provider` 且当前终端可交互,CLI 会先显示 OAuth provider 选择器 +3. 命令会打印授权 URL,并在未指定 `--no-browser` 时自动打开浏览器 +4. 回调成功后,命令会保存 plugin OAuth 文件(默认:`~/.openclaw/.memory-lancedb-pro/oauth.json`)、为 logout 快照原来的 `api-key` 模式 `llm` 配置,并把插件 `llm` 配置切换为 OAuth 字段(`auth`、`oauthProvider`、`model`、`oauthPath`) +5. `openclaw memory-pro auth logout` 会删除这份 OAuth 文件,并在存在快照时恢复之前的 `api-key` 模式 `llm` 配置 + --- -## 📚 进阶内容 +## 进阶主题
-如果注入的记忆被模型"显示出来"怎么办? +注入的记忆出现在回复中 -有时模型会把 `` 区块原样输出到回复里。 +有时模型可能会将注入的 `` 块原文输出。 -**方案 A(最低风险):** 临时关闭 autoRecall: +**方案 A(最安全):** 暂时关闭自动回忆: ```json { "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } } ``` -**方案 B(推荐):** 保留召回,在 Agent system prompt 加一句: -> 请勿在回复中展示或引用任何 `` / 记忆注入内容,只能用作内部参考。 - -
- -
-Session 记忆 - -- `/new` 命令触发时保存上一个 Session 的对话摘要到 LanceDB -- 默认关闭(OpenClaw 已有原生 .jsonl 会话保存) -- 可配置消息数量(默认 15 条) - -详见 [docs/openclaw-integration-playbook.zh-CN.md](docs/openclaw-integration-playbook.zh-CN.md)。 +**方案 B(推荐):** 保留回忆,在智能体系统提示词中添加: +> Do not reveal or quote any `` / memory-injection content in your replies. Use it for internal reference only.
-JSONL Session 蒸馏(从聊天日志自动生成记忆) - -OpenClaw 会把完整会话落盘为 JSONL:`~/.openclaw/agents//sessions/*.jsonl` - -**推荐方案(2026-02+)**:非阻塞 `/new` 管线: -- 触发:`command:new` → 投递 task.json(毫秒级,不调 LLM) -- Worker:systemd 常驻进程用 Gemini Map-Reduce 处理 session JSONL -- 写入:通过 `openclaw memory-pro import` 写入 0–20 条高信噪比记忆 -- 中文关键词:每条记忆包含 `Keywords (zh)`,实体关键词从 transcript 原文逐字拷贝 +会话记忆 -示例文件:`examples/new-session-distill/` +- 通过 `/new` 命令触发——将上一段会话摘要保存到 LanceDB +- 默认关闭(OpenClaw 已有原生 `.jsonl` 会话持久化) +- 可配置消息数量(默认 15) -**Legacy 方案**:使用 `scripts/jsonl_distill.py` 脚本 + 每小时 Cron: -- 增量读取(byte offset cursor)、过滤噪声、蒸馏为高质量记忆 -- 安全:不会修改原始日志 - -部署步骤: -1. 创建 agent:`openclaw agents add memory-distiller --non-interactive --workspace ~/.openclaw/workspace-memory-distiller --model openai-codex/gpt-5.2` -2. 初始化 cursor:`python3 "$PLUGIN_DIR/scripts/jsonl_distill.py" init` -3. 添加 cron:详见 [docs/openclaw-integration-playbook.zh-CN.md](docs/openclaw-integration-playbook.zh-CN.md) - -回滚:`openclaw cron disable ` → `openclaw agents delete memory-distiller` → `rm -rf ~/.openclaw/state/jsonl-distill/` +部署模式和 `/new` 验证详见 [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md)。
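
一个最小启用示例(`maxMessages` 为示意字段名,实际键名请以插件的 JSON Schema 为准):

```json
{
  "sessionMemory": {
    "enabled": true,
    "maxMessages": 15
  }
}
```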
-自定义 Slash 命令(如 /lesson) +自定义斜杠命令(如 /lesson) -添加到你的 `CLAUDE.md`、`AGENTS.md` 或 system prompt: +在你的 `CLAUDE.md`、`AGENTS.md` 或系统提示词中添加: ```markdown ## /lesson 命令 当用户发送 `/lesson <内容>` 时: -1. 用 memory_store 存为 category=fact(原始知识) -2. 用 memory_store 存为 category=decision(可操作的结论) +1. 用 memory_store 保存为 category=fact(原始知识) +2. 用 memory_store 保存为 category=decision(可执行的结论) 3. 确认已保存的内容 ## /remember 命令 当用户发送 `/remember <内容>` 时: -1. 用 memory_store 存储,自动选择合适的 category 和 importance -2. 返回存储的 memory ID +1. 用 memory_store 以合适的 category 和 importance 保存 +2. 返回已存储的记忆 ID 确认 ``` -内置工具:`memory_store`、`memory_recall`、`memory_forget`、`memory_update` — 插件加载时自动注册。 -
-AI Agent 铁律(Iron Rules) +AI 智能体铁律 -> 将下方代码块复制到你的 `AGENTS.md` 中,让 Agent 自动遵守。 +> 将以下内容复制到你的 `AGENTS.md`,让智能体自动遵守这些规则。 ```markdown -## Rule 1 — 双层记忆存储(铁律) -Every pitfall/lesson learned → IMMEDIATELY store TWO memories: -- **Technical layer**: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid] - (category: fact, importance ≥ 0.8) -- **Principle layer**: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do] - (category: decision, importance ≥ 0.85) -- After each store, immediately `memory_recall` to verify retrieval. +## 规则 1 — 双层记忆存储 +每个踩坑/经验教训 → 立即存储两条记忆: +- 技术层:踩坑:[现象]。原因:[根因]。修复:[方案]。预防:[如何避免] + (category: fact, importance >= 0.8) +- 原则层:决策原则 ([标签]):[行为规则]。触发:[何时]。动作:[做什么] + (category: decision, importance >= 0.85) -## Rule 2 — LanceDB 卫生 -Entries must be short and atomic (< 500 chars). No raw conversation summaries or duplicates. +## 规则 2 — LanceDB 数据质量 +条目必须简短且原子化(< 500 字符)。不存储原始对话摘要或重复内容。 -## Rule 3 — Recall before retry -On ANY tool failure, ALWAYS `memory_recall` with relevant keywords BEFORE retrying. +## 规则 3 — 重试前先回忆 +任何工具调用失败时,必须先用 memory_recall 搜索相关关键词,再重试。 -## Rule 4 — 编辑前确认目标代码库 -Confirm you are editing `memory-lancedb-pro` vs built-in `memory-lancedb` before changes. +## 规则 4 — 确认目标代码库 +修改前确认你操作的是 memory-lancedb-pro 还是内置 memory-lancedb。 -## Rule 5 — 插件代码变更必须清 jiti 缓存 -After modifying `.ts` files under `plugins/`, MUST run `rm -rf /tmp/jiti/` BEFORE `openclaw gateway restart`. +## 规则 5 — 修改插件代码后清除 jiti 缓存 +修改 plugins/ 下的 .ts 文件后,必须先执行 rm -rf /tmp/jiti/ 再重启 openclaw gateway。 ```
@@ -873,51 +676,53 @@ LanceDB 表 `memories`: | 字段 | 类型 | 说明 | | --- | --- | --- | | `id` | string (UUID) | 主键 | -| `text` | string | 记忆文本(FTS 索引) | +| `text` | string | 记忆文本(全文索引) | | `vector` | float[] | Embedding 向量 | -| `category` | string | `preference` / `fact` / `decision` / `entity` / `other` | -| `scope` | string | Scope 标识(如 `global`、`agent:main`) | -| `importance` | float | 重要性分数 0–1 | -| `timestamp` | int64 | 创建时间戳 (ms) | +| `category` | string | 存储类别:`preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | 作用域标识(如 `global`、`agent:main`) | +| `importance` | float | 重要性分数 0-1 | +| `timestamp` | int64 | 创建时间戳(毫秒) | | `metadata` | string (JSON) | 扩展元数据 | -v1.1.0 常见 `metadata` 字段:`l0_abstract`、`l1_overview`、`l2_content`、`memory_category`、`tier`、`access_count`、`confidence`、`last_accessed_at` +v1.1.0 常用 `metadata` 字段:`l0_abstract`、`l1_overview`、`l2_content`、`memory_category`、`tier`、`access_count`、`confidence`、`last_accessed_at` + +> **关于分类的说明:** 顶层 `category` 字段使用 6 个存储类别。智能提取的 6 类语义标签(`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`)存储在 `metadata.memory_category` 中。
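
一个示意性的 `metadata` 对象(字段名来自上文列表,具体取值仅为示例):

```json
{
  "l0_abstract": "用户偏好使用 Tabs 缩进",
  "l1_overview": "代码风格偏好:Tabs 缩进,并且始终添加错误处理",
  "l2_content": "用户在会话中明确要求使用 Tabs 而非空格,并希望所有代码默认包含错误处理",
  "memory_category": "preferences",
  "tier": "working",
  "access_count": 3,
  "confidence": 0.9,
  "last_accessed_at": 1760000000000
}
```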
-常见问题 / 排错 +故障排除 ### "Cannot mix BigInt and other types"(LanceDB / Apache Arrow) -在 LanceDB 0.26+ 中,部分数值列可能以 `BigInt` 返回。请升级到 **memory-lancedb-pro >= 1.0.14** — 插件已统一做 `Number(...)` 转换。 +在 LanceDB 0.26+ 上,某些数值列可能以 `BigInt` 形式返回。升级到 **memory-lancedb-pro >= 1.0.14**——插件现在会在运算前使用 `Number(...)` 进行类型转换。
--- -## 🧪 Beta:智能记忆 v1.1.0 +## 文档 -> 状态:Beta 版 — 通过 `npm i memory-lancedb-pro@beta` 安装。`latest` 稳定通道不受影响。 +| 文档 | 说明 | +| --- | --- | +| [OpenClaw 集成手册](docs/openclaw-integration-playbook.md) | 部署模式、验证、回归矩阵 | +| [记忆架构分析](docs/memory_architecture_analysis.md) | 完整架构深度解析 | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | v1.1.0 行为变更和升级说明 | +| [长上下文分块](docs/long-context-chunking.md) | 长文档分块策略 | -| 功能 | 说明 | -|------|------| -| **智能提取** | LLM 驱动的 6 类别提取,含 L0/L1/L2 metadata。禁用时回退到正则。 | -| **生命周期评分** | Weibull 衰减集成到检索中——高频、高重要性的记忆排名更靠前。 | -| **分层管理** | 三层系统(Core → Working → Peripheral),根据访问频率和分数自动晋升/降级。 | +--- -反馈:[GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · 回退:`npm i memory-lancedb-pro@latest` +## Beta:智能记忆 v1.1.0 ---- +> 状态:Beta——通过 `npm i memory-lancedb-pro@beta` 安装。使用 `latest` 的稳定版用户不受影响。 -## 📖 文档 +| 功能 | 说明 | +|------|------| +| **智能提取** | LLM 驱动的 6 类提取,支持 L0/L1/L2 元数据。禁用时回退到正则模式。 | +| **生命周期评分** | Weibull 衰减集成到检索中——高频和高重要性记忆排名更高。 | +| **层级管理** | 三级系统(核心 → 工作 → 外围),自动晋升/降级。 | -| 文档 | 说明 | -| --- | --- | -| [OpenClaw 集成操作手册](docs/openclaw-integration-playbook.zh-CN.md) | 部署模式、`/new` 验证、回归矩阵 | -| [记忆架构分析](docs/memory_architecture_analysis.md) | 完整架构深度解析 | -| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | v1.1.0 行为变化和升级背景 | -| [长上下文分块](docs/long-context-chunking.md) | 长文档分块策略 | +反馈:[GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · 回退:`npm i memory-lancedb-pro@latest` --- @@ -931,7 +736,7 @@ v1.1.0 常见 `metadata` 字段:`l0_abstract`、`l1_overview`、`l2_content` --- -## 🤝 主要贡献者 +## 贡献者

@win4r @@ -947,7 +752,7 @@ v1.1.0 常见 `metadata` 字段:`l0_abstract`、`l1_overview`、`l2_content` 完整列表:[Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors) -## ⭐ Star 趋势 +## Star 趋势 @@ -957,13 +762,12 @@ v1.1.0 常见 `metadata` 字段:`l0_abstract`、`l1_overview`、`l2_content` -## License +## 许可证 MIT --- - -## 微信二维码 +## 我的微信 diff --git a/README_DE.md b/README_DE.md new file mode 100644 index 00000000..104c259b --- /dev/null +++ b/README_DE.md @@ -0,0 +1,773 @@ +

+ +# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin + +**KI-Gedächtnisassistent für [OpenClaw](https://github.com/openclaw/openclaw)-Agenten** + +*Geben Sie Ihrem KI-Agenten ein Gehirn, das sich wirklich erinnert — über Sitzungen, Agenten und Zeit hinweg.* + +Ein LanceDB-basiertes OpenClaw-Langzeitgedächtnis-Plugin, das Präferenzen, Entscheidungen und Projektkontext speichert und in zukünftigen Sitzungen automatisch abruft. + +[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) +[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) +[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) + +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) + +
+ +--- + +## Warum memory-lancedb-pro? + +Die meisten KI-Agenten leiden unter Amnesie. Sie vergessen alles, sobald Sie einen neuen Chat starten. + +**memory-lancedb-pro** ist ein produktionsreifes Langzeitgedächtnis-Plugin für OpenClaw, das Ihren Agenten in einen echten **KI-Gedächtnisassistenten** verwandelt — es erfasst automatisch, was wichtig ist, lässt Rauschen natürlich verblassen und ruft die richtige Erinnerung zum richtigen Zeitpunkt ab. Kein manuelles Taggen, keine Konfigurationsprobleme. + +### Ihr KI-Gedächtnisassistent in Aktion + +**Ohne Gedächtnis — jede Sitzung beginnt bei null:** + +> **Sie:** „Verwende Tabs für die Einrückung, füge immer Fehlerbehandlung hinzu." +> *(nächste Sitzung)* +> **Sie:** „Ich habe es dir schon gesagt — Tabs, nicht Leerzeichen!" 😤 +> *(nächste Sitzung)* +> **Sie:** „…ernsthaft, Tabs. Und Fehlerbehandlung. Schon wieder." + +**Mit memory-lancedb-pro — Ihr Agent lernt und erinnert sich:** + +> **Sie:** „Verwende Tabs für die Einrückung, füge immer Fehlerbehandlung hinzu." +> *(nächste Sitzung — Agent ruft automatisch Ihre Präferenzen ab)* +> **Agent:** *(wendet still Tabs + Fehlerbehandlung an)* ✅ +> **Sie:** „Warum haben wir letzten Monat PostgreSQL statt MongoDB gewählt?" +> **Agent:** „Basierend auf unserer Diskussion am 12. Februar waren die Hauptgründe…" ✅ + +Das ist der Unterschied, den ein **KI-Gedächtnisassistent** macht — er lernt Ihren Stil, erinnert sich an vergangene Entscheidungen und liefert personalisierte Antworten, ohne dass Sie sich wiederholen müssen. + +### Was kann es noch? 
+ +| | Was Sie bekommen | +|---|---| +| **Auto-Capture** | Ihr Agent lernt aus jeder Unterhaltung — kein manuelles `memory_store` nötig | +| **Intelligente Extraktion** | LLM-gestützte 6-Kategorien-Klassifikation: Profile, Präferenzen, Entitäten, Ereignisse, Fälle, Muster | +| **Intelligentes Vergessen** | Weibull-Zerfallsmodell — wichtige Erinnerungen bleiben, Rauschen verblasst natürlich | +| **Hybride Suche** | Vektor + BM25 Volltextsuche, fusioniert mit Cross-Encoder-Reranking | +| **Kontextinjektion** | Relevante Erinnerungen tauchen automatisch vor jeder Antwort auf | +| **Multi-Scope-Isolation** | Gedächtnisgrenzen pro Agent, pro Benutzer, pro Projekt | +| **Jeder Anbieter** | OpenAI, Jina, Gemini, Ollama oder jede OpenAI-kompatible API | +| **Vollständiges Toolkit** | CLI, Backup, Migration, Upgrade, Export/Import — produktionsbereit | + +--- + +## Schnellstart + +### Option A: Ein-Klick-Installationsskript (empfohlen) + +Das community-gepflegte **[Setup-Skript](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** erledigt Installation, Upgrade und Reparatur in einem Befehl: + +```bash +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> Siehe [Ökosystem](#ökosystem) unten für die vollständige Liste der abgedeckten Szenarien und andere Community-Tools. + +### Option B: Manuelle Installation + +**Über OpenClaw CLI (empfohlen):** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**Oder über npm:** +```bash +npm i memory-lancedb-pro@beta +``` +> Bei npm-Installation müssen Sie auch das Plugin-Installationsverzeichnis als **absoluten** Pfad in `plugins.load.paths` Ihrer `openclaw.json` hinzufügen. Dies ist das häufigste Einrichtungsproblem. 
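
Eine minimale Skizze für diesen `plugins.load.paths`-Eintrag (der Pfad ist nur ein Platzhalter; ersetzen Sie ihn durch Ihren tatsächlichen absoluten Installationspfad):

```json
{
  "plugins": {
    "load": { "paths": ["/home/you/node_modules/memory-lancedb-pro"] }
  }
}
```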
+ +Fügen Sie zu Ihrer `openclaw.json` hinzu: + +```json +{ + "plugins": { + "slots": { "memory": "memory-lancedb-pro" }, + "entries": { + "memory-lancedb-pro": { + "enabled": true, + "config": { + "embedding": { + "provider": "openai-compatible", + "apiKey": "${OPENAI_API_KEY}", + "model": "text-embedding-3-small" + }, + "autoCapture": true, + "autoRecall": true, + "smartExtraction": true, + "extractMinMessages": 2, + "extractMaxChars": 8000, + "sessionMemory": { "enabled": false } + } + } + } + } +} +``` + +**Warum diese Standardwerte?** +- `autoCapture` + `smartExtraction` → Ihr Agent lernt automatisch aus jeder Unterhaltung +- `autoRecall` → relevante Erinnerungen werden vor jeder Antwort injiziert +- `extractMinMessages: 2` → Extraktion wird bei normalen Zwei-Runden-Chats ausgelöst +- `sessionMemory.enabled: false` → vermeidet Verschmutzung der Suche durch Sitzungszusammenfassungen am Anfang + +Validieren und neu starten: + +```bash +openclaw config validate +openclaw gateway restart +openclaw logs --follow --plain | grep "memory-lancedb-pro" +``` + +Sie sollten sehen: +- `memory-lancedb-pro: smart extraction enabled` +- `memory-lancedb-pro@...: plugin registered` + +Fertig! Ihr Agent verfügt jetzt über Langzeitgedächtnis. + +
+Weitere Installationswege (bestehende Benutzer, Upgrades) + +**Bereits OpenClaw-Benutzer?** + +1. Fügen Sie das Plugin mit einem **absoluten** `plugins.load.paths`-Eintrag hinzu +2. Binden Sie den Memory-Slot: `plugins.slots.memory = "memory-lancedb-pro"` +3. Überprüfen: `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**Upgrade von vor v1.1.0?** + +```bash +# 1) Backup +openclaw memory-pro export --scope global --output memories-backup.json +# 2) Testlauf +openclaw memory-pro upgrade --dry-run +# 3) Upgrade ausführen +openclaw memory-pro upgrade +# 4) Überprüfen +openclaw memory-pro stats +``` + +Siehe `CHANGELOG-v1.1.0.md` für Verhaltensänderungen und Upgrade-Begründung. + +
+ +
+Telegram-Bot-Schnellimport (zum Aufklappen klicken) + +Wenn Sie die Telegram-Integration von OpenClaw verwenden, ist es am einfachsten, einen Importbefehl direkt an den Hauptbot zu senden, anstatt die Konfiguration manuell zu bearbeiten. + +Senden Sie diese Nachricht: + +```text +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction +6. extractMinMessages=2 +7. sessionMemory.enabled=false +8. captureAssistant=false +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. Generate the final openclaw.json config directly, not just an explanation +``` + +
+ +--- + +## Ökosystem + +memory-lancedb-pro ist das Kern-Plugin. Die Community hat Tools darum herum gebaut, um Einrichtung und tägliche Nutzung noch reibungsloser zu machen: + +### Setup-Skript — Ein-Klick-Installation, Upgrade und Reparatur + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +Nicht nur ein einfacher Installer — das Skript behandelt intelligent eine Vielzahl realer Szenarien: + +| Ihre Situation | Was das Skript macht | +|---|---| +| Nie installiert | Frischer Download → Abhängigkeiten installieren → Konfiguration wählen → in openclaw.json schreiben → Neustart | +| Per `git clone` installiert, auf altem Commit hängen geblieben | Automatisches `git fetch` + `checkout` auf neueste Version → Abhängigkeiten neu installieren → Verifizieren | +| Konfiguration hat ungültige Felder | Automatische Erkennung per Schema-Filter, nicht unterstützte Felder entfernen | +| Per `npm` installiert | Überspringt Git-Update, erinnert Sie daran, `npm update` selbst auszuführen | +| `openclaw` CLI durch ungültige Konfiguration defekt | Fallback: Workspace-Pfad direkt aus der `openclaw.json`-Datei lesen | +| `extensions/` statt `plugins/` | Automatische Erkennung des Plugin-Standorts aus Konfiguration oder Dateisystem | +| Bereits aktuell | Nur Gesundheitschecks ausführen, keine Änderungen | + +```bash +bash setup-memory.sh # Installieren oder upgraden +bash setup-memory.sh --dry-run # Nur Vorschau +bash setup-memory.sh --beta # Pre-Release-Versionen einschließen +bash setup-memory.sh --uninstall # Konfiguration zurücksetzen und Plugin entfernen +``` + +Eingebaute Anbieter-Presets: **Jina / DashScope / SiliconFlow / OpenAI / Ollama**, oder bringen Sie Ihre eigene OpenAI-kompatible API mit. Für die vollständige Nutzung (einschließlich `--ref`, `--selfcheck-only` und mehr) siehe das [Setup-Skript README](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup). 
+ +### Claude Code / OpenClaw Skill — KI-geführte Konfiguration + +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** + +Installieren Sie diesen Skill und Ihr KI-Agent (Claude Code oder OpenClaw) erhält tiefgreifendes Wissen über alle Funktionen von memory-lancedb-pro. Sagen Sie einfach **„hilf mir die beste Konfiguration zu aktivieren"** und erhalten Sie: + +- **Geführter 7-Schritte-Konfigurationsworkflow** mit 4 Bereitstellungsplänen: + - Full Power (Jina + OpenAI) / Budget (kostenloser SiliconFlow Reranker) / Simple (nur OpenAI) / Vollständig lokal (Ollama, null API-Kosten) +- **Alle 9 MCP-Tools** korrekt verwendet: `memory_recall`, `memory_store`, `memory_forget`, `memory_update`, `memory_stats`, `memory_list`, `self_improvement_log`, `self_improvement_extract_skill`, `self_improvement_review` *(vollständiges Toolset erfordert `enableManagementTools: true` — die Standard-Schnellstart-Konfiguration stellt nur die 4 Kern-Tools bereit)* +- **Vermeidung häufiger Fallstricke**: Workspace-Plugin-Aktivierung, `autoRecall` standardmäßig false, jiti-Cache, Umgebungsvariablen, Scope-Isolation und mehr + +**Installation für Claude Code:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro +``` + +**Installation für OpenClaw:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## Video-Tutorial + +> Vollständige Anleitung: Installation, Konfiguration und Funktionsweise der hybriden Suche. 
+ +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## Architektur + +``` +┌─────────────────────────────────────────────────────────┐ +│ index.ts (Einstiegspunkt) │ +│ Plugin-Registrierung · Config-Parsing · Lifecycle-Hooks│ +└────────┬──────────┬──────────┬──────────┬───────────────┘ + │ │ │ │ + ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ + │ store │ │embedder│ │retriever│ │ scopes │ + │ .ts │ │ .ts │ │ .ts │ │ .ts │ + └────────┘ └────────┘ └────────┘ └─────────────┘ + │ │ + ┌────▼───┐ ┌─────▼──────────┐ + │migrate │ │noise-filter.ts │ + │ .ts │ │adaptive- │ + └────────┘ │retrieval.ts │ + └────────────────┘ + ┌─────────────┐ ┌──────────┐ + │ tools.ts │ │ cli.ts │ + │ (Agent-API) │ │ (CLI) │ + └─────────────┘ └──────────┘ +``` + +> Für eine detaillierte Analyse der vollständigen Architektur siehe [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md). + +
+Dateireferenz (zum Aufklappen klicken) + +| Datei | Zweck | +| --- | --- | +| `index.ts` | Plugin-Einstiegspunkt. Registriert sich bei der OpenClaw Plugin API, parst Konfiguration, bindet Lifecycle-Hooks ein | +| `openclaw.plugin.json` | Plugin-Metadaten + vollständige JSON-Schema-Konfigurationsdeklaration | +| `cli.ts` | CLI-Befehle: `memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | LanceDB-Speicherschicht. Tabellenerstellung / FTS-Indexierung / Vektorsuche / BM25-Suche / CRUD | +| `src/embedder.ts` | Embedding-Abstraktion. Kompatibel mit jedem OpenAI-kompatiblen API-Anbieter | +| `src/retriever.ts` | Hybride Suchmaschine. Vektor + BM25 → Hybride Fusion → Rerank → Lifecycle-Zerfall → Filter | +| `src/scopes.ts` | Multi-Scope-Zugriffskontrolle | +| `src/tools.ts` | Agent-Tool-Definitionen: `memory_recall`, `memory_store`, `memory_forget`, `memory_update` + Verwaltungstools | +| `src/noise-filter.ts` | Filtert Agent-Ablehnungen, Meta-Fragen, Begrüßungen und minderwertige Inhalte | +| `src/adaptive-retrieval.ts` | Bestimmt, ob eine Abfrage Gedächtnisabruf benötigt | +| `src/migrate.ts` | Migration vom eingebauten `memory-lancedb` zu Pro | +| `src/smart-extractor.ts` | LLM-gestützte 6-Kategorien-Extraktion mit L0/L1/L2 Schichtspeicherung und zweistufiger Deduplizierung | +| `src/decay-engine.ts` | Weibull Stretched-Exponential-Zerfallsmodell | +| `src/tier-manager.ts` | Dreistufige Beförderung/Herabstufung: Peripheral ↔ Working ↔ Core | + +
+ +--- + +## Kernfunktionen + +### Hybride Suche + +``` +Query → embedQuery() ─┐ + ├─→ Hybride Fusion → Rerank → Lifecycle-Zerfall-Boost → Längennorm → Filter +Query → BM25 FTS ─────┘ +``` + +- **Vektorsuche** — semantische Ähnlichkeit über LanceDB ANN (Kosinus-Distanz) +- **BM25 Volltextsuche** — exakte Schlüsselwortübereinstimmung über LanceDB FTS-Index +- **Hybride Fusion** — Vektorscore als Basis, BM25-Treffer erhalten gewichteten Boost (kein Standard-RRF — optimiert für reale Abrufqualität) +- **Konfigurierbare Gewichte** — `vectorWeight`, `bm25Weight`, `minScore` + +### Cross-Encoder Reranking + +- Eingebaute Adapter für **Jina**, **SiliconFlow**, **Voyage AI** und **Pinecone** +- Kompatibel mit jedem Jina-kompatiblen Endpunkt (z.B. Hugging Face TEI, DashScope) +- Hybrid-Scoring: 60% Cross-Encoder + 40% ursprünglicher fusionierter Score +- Graceful Degradation: Rückfall auf Kosinus-Ähnlichkeit bei API-Ausfall + +### Mehrstufige Scoring-Pipeline + +| Stufe | Effekt | +| --- | --- | +| **Hybride Fusion** | Kombiniert semantische und exakte Suche | +| **Cross-Encoder Rerank** | Fördert semantisch präzise Treffer | +| **Lifecycle-Zerfall-Boost** | Weibull-Aktualität + Zugriffshäufigkeit + Wichtigkeit × Konfidenz | +| **Längennormalisierung** | Verhindert, dass lange Einträge dominieren (Anker: 500 Zeichen) | +| **Harter Mindestscore** | Entfernt irrelevante Ergebnisse (Standard: 0.35) | +| **MMR-Diversität** | Kosinus-Ähnlichkeit > 0.85 → herabgestuft | + +### Intelligente Gedächtnisextraktion (v1.1.0) + +- **LLM-gestützte 6-Kategorien-Extraktion**: Profil, Präferenzen, Entitäten, Ereignisse, Fälle, Muster +- **L0/L1/L2 Schichtspeicherung**: L0 (Einzeiler-Index) → L1 (strukturierte Zusammenfassung) → L2 (vollständige Erzählung) +- **Zweistufige Deduplizierung**: Vektor-Ähnlichkeits-Vorfilter (≥0.7) → LLM semantische Entscheidung (CREATE/MERGE/SKIP) +- **Kategoriebasierte Zusammenführung**: `profile` wird immer zusammengeführt, `events`/`cases` sind nur anfügbar + 
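Zur Veranschaulichung der oben beschriebenen Scoring-Pipeline eine minimale TypeScript-Skizze der hybriden Fusion und der Längennormalisierung. Funktionsnamen und genaue Formeln sind hier Annahmen zur Illustration, nicht der tatsächliche Plugin-Code:

```typescript
// Annahme: vereinfachte Skizze, nicht die echte Implementierung.
interface Candidate {
  text: string;
  vectorScore: number; // Kosinus-Ähnlichkeit, 0..1
  bm25Score: number;   // normalisierter BM25-Score, 0 = kein FTS-Treffer
}

// Hybride Fusion: Vektorscore als Basis, BM25-Treffer erhalten einen
// gewichteten Boost (kein Standard-RRF)
function fuse(c: Candidate, vectorWeight = 0.7, bm25Weight = 0.3): number {
  return c.vectorScore * vectorWeight + c.bm25Score * bm25Weight;
}

// Längennormalisierung: lange Einträge werden gegen einen Anker
// (Standard: 500 Zeichen) leicht abgewertet, damit sie nicht dominieren
function lengthNormalize(score: number, textLength: number, anchor = 500): number {
  const penalty = textLength <= anchor ? 1 : anchor / textLength;
  return score * Math.sqrt(penalty);
}
```

Ein Kandidat mit BM25-Treffer landet so vor einem reinen Vektor-Treffer gleicher Stärke; erst danach greifen Rerank, Zerfalls-Boost und `hardMinScore`.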
+### Gedächtnis-Lebenszyklusverwaltung (v1.1.0)
+
+- **Weibull-Zerfallsmotor**: Gesamtscore = Aktualität + Häufigkeit + intrinsischer Wert
+- **Dreistufige Beförderung**: `Peripheral ↔ Working ↔ Core` mit konfigurierbaren Schwellenwerten
+- **Zugriffsverstärkung**: Häufig abgerufene Erinnerungen zerfallen langsamer (Spaced-Repetition-Stil)
+- **Wichtigkeitsmodulierte Halbwertszeit**: Wichtige Erinnerungen zerfallen langsamer
+
+### Multi-Scope-Isolation
+
+- Eingebaute Scopes: `global`, `agent:`, `custom:`, `project:`, `user:`
+- Zugriffskontrolle auf Agentenebene über `scopes.agentAccess`
+- Standard: Jeder Agent greift auf `global` + seinen eigenen `agent:`-Scope zu
+
+### Auto-Capture und Auto-Recall
+
+- **Auto-Capture** (`agent_end`): Extrahiert Präferenzen/Fakten/Entscheidungen/Entitäten aus Gesprächen, dedupliziert, speichert bis zu 3 pro Runde
+- **Auto-Recall** (`before_agent_start`): Injiziert relevanten Gedächtniskontext (bis zu 3 Einträge) vor jeder Antwort
+
+### Rauschfilterung und adaptive Suche
+
+- Filtert minderwertige Inhalte: Agent-Ablehnungen, Meta-Fragen, Begrüßungen
+- Überspringt Suche bei: Begrüßungen, Slash-Befehlen, einfachen Bestätigungen, Emoji
+- Erzwingt Suche bei Gedächtnis-Schlüsselwörtern („erinnere dich", „vorher", „letztes Mal")
+- CJK-bewusste Schwellenwerte (Chinesisch: 6 Zeichen vs. Englisch: 15 Zeichen)
+
+---
+
+Vergleich mit dem eingebauten memory-lancedb (zum Aufklappen klicken) + +| Funktion | Eingebautes `memory-lancedb` | **memory-lancedb-pro** | +| --- | :---: | :---: | +| Vektorsuche | Ja | Ja | +| BM25 Volltextsuche | - | Ja | +| Hybride Fusion (Vektor + BM25) | - | Ja | +| Cross-Encoder Rerank (Multi-Anbieter) | - | Ja | +| Aktualitäts-Boost und Zeitzerfall | - | Ja | +| Längennormalisierung | - | Ja | +| MMR-Diversität | - | Ja | +| Multi-Scope-Isolation | - | Ja | +| Rauschfilterung | - | Ja | +| Adaptive Suche | - | Ja | +| Verwaltungs-CLI | - | Ja | +| Sitzungsgedächtnis | - | Ja | +| Aufgabenbezogene Embeddings | - | Ja | +| **LLM Intelligente Extraktion (6 Kategorien)** | - | Ja (v1.1.0) | +| **Weibull-Zerfall + Stufenbeförderung** | - | Ja (v1.1.0) | +| Beliebiges OpenAI-kompatibles Embedding | Eingeschränkt | Ja | + +
+ +--- + +## Konfiguration + +
+Vollständiges Konfigurationsbeispiel + +```json +{ + "embedding": { + "apiKey": "${JINA_API_KEY}", + "model": "jina-embeddings-v5-text-small", + "baseURL": "https://api.jina.ai/v1", + "dimensions": 1024, + "taskQuery": "retrieval.query", + "taskPassage": "retrieval.passage", + "normalized": true + }, + "dbPath": "~/.openclaw/memory/lancedb-pro", + "autoCapture": true, + "autoRecall": true, + "retrieval": { + "mode": "hybrid", + "vectorWeight": 0.7, + "bm25Weight": 0.3, + "minScore": 0.3, + "rerank": "cross-encoder", + "rerankApiKey": "${JINA_API_KEY}", + "rerankModel": "jina-reranker-v3", + "rerankEndpoint": "https://api.jina.ai/v1/rerank", + "rerankProvider": "jina", + "candidatePoolSize": 20, + "recencyHalfLifeDays": 14, + "recencyWeight": 0.1, + "filterNoise": true, + "lengthNormAnchor": 500, + "hardMinScore": 0.35, + "timeDecayHalfLifeDays": 60, + "reinforcementFactor": 0.5, + "maxHalfLifeMultiplier": 3 + }, + "enableManagementTools": false, + "scopes": { + "default": "global", + "definitions": { + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } + }, + "agentAccess": { + "discord-bot": ["global", "agent:discord-bot"] + } + }, + "sessionMemory": { + "enabled": false, + "messageCount": 15 + }, + "smartExtraction": true, + "llm": { + "apiKey": "${OPENAI_API_KEY}", + "model": "gpt-4o-mini", + "baseURL": "https://api.openai.com/v1" + }, + "extractMinMessages": 2, + "extractMaxChars": 8000 +} +``` + +
+ +
+Embedding-Anbieter + +Funktioniert mit **jeder OpenAI-kompatiblen Embedding-API**: + +| Anbieter | Modell | Basis-URL | Dimensionen | +| --- | --- | --- | --- | +| **Jina** (empfohlen) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | +| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | +| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | +| **Ollama** (lokal) | `nomic-embed-text` | `http://localhost:11434/v1` | anbieterspezifisch | + +
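Alle Anbieter aus der Tabelle sprechen denselben OpenAI-kompatiblen `/v1/embeddings`-Endpunkt. Eine minimale Skizze des Request-Bodys; welche Felder das Plugin intern genau sendet, ist hier eine Annahme:

```typescript
// Skizze: Request-Body für einen OpenAI-kompatiblen /v1/embeddings-Endpunkt.
// Hinweis: das Feld "dimensions" wird nicht von allen Anbietern unterstützt.
interface EmbeddingRequest {
  model: string;
  input: string[];
  dimensions?: number;
}

function buildEmbeddingRequest(texts: string[], model: string, dimensions?: number): EmbeddingRequest {
  const req: EmbeddingRequest = { model, input: texts };
  if (dimensions !== undefined) req.dimensions = dimensions;
  return req;
}

// Beispiel: Jina-Embedding mit 1024 Dimensionen
const body = buildEmbeddingRequest(["Nutzer bevorzugt Tabs"], "jina-embeddings-v5-text-small", 1024);
// Versand z.B. per POST an `${baseURL}/embeddings` mit JSON.stringify(body)
```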
+ +
+Rerank-Anbieter + +Cross-Encoder Reranking unterstützt mehrere Anbieter über `rerankProvider`: + +| Anbieter | `rerankProvider` | Beispielmodell | +| --- | --- | --- | +| **Jina** (Standard) | `jina` | `jina-reranker-v3` | +| **SiliconFlow** (kostenlose Stufe verfügbar) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | + +Jeder Jina-kompatible Rerank-Endpunkt funktioniert ebenfalls — setzen Sie `rerankProvider: "jina"` und verweisen Sie `rerankEndpoint` auf Ihren Dienst (z.B. Hugging Face TEI, DashScope `qwen3-rerank`). + +
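Der Jina-kompatible `/rerank`-Request und das oben dokumentierte Hybrid-Scoring (60% Cross-Encoder + 40% fusionierter Score) als Skizze; die exakten Feldnamen, die das Plugin sendet, sind Annahmen:

```typescript
// Skizze: Jina-kompatibler Rerank-Request, wie ihn rerankEndpoint erwartet
interface RerankRequest {
  model: string;
  query: string;
  documents: string[];
  top_n: number;
}

function buildRerankRequest(
  query: string,
  documents: string[],
  model = "jina-reranker-v3",
  topN?: number
): RerankRequest {
  return { model, query, documents, top_n: topN ?? documents.length };
}

// Hybrid-Scoring laut Doku: 60% Cross-Encoder + 40% ursprünglicher fusionierter Score
function rerankedScore(crossEncoderScore: number, fusedScore: number): number {
  return 0.6 * crossEncoderScore + 0.4 * fusedScore;
}
```

Fällt die Rerank-API aus, greift laut Doku die Graceful Degradation auf die reine Kosinus-Ähnlichkeit zurück.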
+ +
+Intelligente Extraktion (LLM) — v1.1.0 + +Wenn `smartExtraction` aktiviert ist (Standard: `true`), verwendet das Plugin ein LLM, um Erinnerungen intelligent zu extrahieren und zu klassifizieren, anstatt regex-basierte Auslöser zu verwenden. + +| Feld | Typ | Standard | Beschreibung | +|-------|------|---------|-------------| +| `smartExtraction` | boolean | `true` | LLM-gestützte 6-Kategorien-Extraktion aktivieren/deaktivieren | +| `llm.auth` | string | `api-key` | `api-key` verwendet `llm.apiKey` / `embedding.apiKey`; `oauth` verwendet standardmäßig eine plugin-spezifische OAuth-Token-Datei | +| `llm.apiKey` | string | *(Rückfall auf `embedding.apiKey`)* | API-Schlüssel für den LLM-Anbieter | +| `llm.model` | string | `openai/gpt-oss-120b` | LLM-Modellname | +| `llm.baseURL` | string | *(Rückfall auf `embedding.baseURL`)* | LLM-API-Endpunkt | +| `llm.oauthProvider` | string | `openai-codex` | OAuth-Anbieter-ID bei `llm.auth` = `oauth` | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | OAuth-Token-Datei bei `llm.auth` = `oauth` | +| `llm.timeoutMs` | number | `30000` | LLM-Anfrage-Timeout in Millisekunden | +| `extractMinMessages` | number | `2` | Mindestanzahl an Nachrichten bevor Extraktion ausgelöst wird | +| `extractMaxChars` | number | `8000` | Maximale Zeichenanzahl, die an das LLM gesendet wird | + + +OAuth `llm`-Konfiguration (vorhandenen Codex / ChatGPT Login-Cache für LLM-Aufrufe verwenden): +```json +{ + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } +} +``` + +Hinweise zu `llm.auth: "oauth"`: + +- `llm.oauthProvider` ist derzeit `openai-codex`. +- OAuth-Tokens werden standardmäßig unter `~/.openclaw/.memory-lancedb-pro/oauth.json` gespeichert. +- Sie können `llm.oauthPath` setzen, wenn Sie die Datei an einem anderen Ort speichern möchten. 
+- `auth login` erstellt eine Sicherung der vorherigen api-key `llm`-Konfiguration neben der OAuth-Datei, und `auth logout` stellt diese Sicherung bei Verfügbarkeit wieder her. +- Der Wechsel von `api-key` zu `oauth` überträgt `llm.baseURL` nicht automatisch. Setzen Sie es im OAuth-Modus nur manuell, wenn Sie absichtlich ein benutzerdefiniertes ChatGPT/Codex-kompatibles Backend verwenden möchten. + +
+ +
+Lebenszyklus-Konfiguration (Zerfall + Stufen) + +| Feld | Standard | Beschreibung | +|-------|---------|-------------| +| `decay.recencyHalfLifeDays` | `30` | Basis-Halbwertszeit für Weibull-Aktualitätszerfall | +| `decay.frequencyWeight` | `0.3` | Gewichtung der Zugriffshäufigkeit im Gesamtscore | +| `decay.intrinsicWeight` | `0.3` | Gewichtung von `Wichtigkeit × Konfidenz` | +| `decay.betaCore` | `0.8` | Weibull-Beta für `core`-Erinnerungen | +| `decay.betaWorking` | `1.0` | Weibull-Beta für `working`-Erinnerungen | +| `decay.betaPeripheral` | `1.3` | Weibull-Beta für `peripheral`-Erinnerungen | +| `tier.coreAccessThreshold` | `10` | Mindestanzahl Abrufe vor Beförderung zu `core` | +| `tier.peripheralAgeDays` | `60` | Altersschwelle für die Herabstufung veralteter Erinnerungen | + +
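Wie die Tabellenwerte zusammenspielen, zeigt diese Skizze: Die Weibull-Überlebensfunktion ist S(t) = exp(-(t/λ)^β), wobei λ so gewählt wird, dass S(halfLife) = 0.5. Die Sättigungsformel für die Zugriffshäufigkeit ist eine Annahme zur Illustration:

```typescript
// Weibull-Aktualität: S(t) = exp(-(t/λ)^β), mit S(halfLife) = 0.5.
// Kleineres β (core: 0.8) = flacherer Zerfall als β = 1.3 (peripheral).
function weibullRecency(ageDays: number, halfLifeDays: number, beta: number): number {
  const lambda = halfLifeDays / Math.pow(Math.LN2, 1 / beta);
  return Math.exp(-Math.pow(ageDays / lambda, beta));
}

// Gesamtscore = Aktualität + gewichtete Häufigkeit + gewichteter intrinsischer Wert
function lifecycleScore(
  ageDays: number,
  accessCount: number,
  importance: number,
  confidence: number,
  beta = 1.0,            // decay.betaWorking
  halfLife = 30,         // decay.recencyHalfLifeDays
  frequencyWeight = 0.3, // decay.frequencyWeight
  intrinsicWeight = 0.3  // decay.intrinsicWeight
): number {
  const recency = weibullRecency(ageDays, halfLife, beta);
  // Annahme: Häufigkeit logarithmisch gesättigt bei ca. 10 Zugriffen
  const frequency = Math.min(1, Math.log1p(accessCount) / Math.log(11));
  return recency + frequencyWeight * frequency + intrinsicWeight * importance * confidence;
}
```

Bei t = halfLife ergibt sich für jedes β genau 0.5; jenseits der Halbwertszeit zerfallen `peripheral`-Erinnerungen (β = 1.3) schneller als `core`-Erinnerungen (β = 0.8).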
+ +
+Zugriffsverstärkung + +Häufig abgerufene Erinnerungen zerfallen langsamer (Spaced-Repetition-Stil). + +Konfigurationsschlüssel (unter `retrieval`): +- `reinforcementFactor` (0-2, Standard: `0.5`) — auf `0` setzen zum Deaktivieren +- `maxHalfLifeMultiplier` (1-10, Standard: `3`) — harte Obergrenze für die effektive Halbwertszeit + +
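Eine mögliche Lesart der beiden Schlüssel als Skizze; die konkrete Formel des Plugins ist hier eine Annahme, nur das Verhalten an den Rändern (Faktor 0 deaktiviert, harte Obergrenze) entspricht der Doku:

```typescript
// Annahme: Zugriffsverstärkung verlängert die effektive Halbwertszeit
// logarithmisch mit der Zugriffszahl, gedeckelt durch maxHalfLifeMultiplier.
function effectiveHalfLife(
  baseHalfLifeDays: number,
  accessCount: number,
  reinforcementFactor = 0.5,  // retrieval.reinforcementFactor, 0 = deaktiviert
  maxHalfLifeMultiplier = 3   // retrieval.maxHalfLifeMultiplier, harte Obergrenze
): number {
  const multiplier = Math.min(
    1 + reinforcementFactor * Math.log1p(accessCount),
    maxHalfLifeMultiplier
  );
  return baseHalfLifeDays * multiplier;
}
```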
+
+---
+
+## CLI-Befehle
+
+```bash
+openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json]
+openclaw memory-pro search "query" [--scope global] [--limit 10] [--json]
+openclaw memory-pro stats [--scope global] [--json]
+openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json]
+openclaw memory-pro auth status
+openclaw memory-pro auth logout
+openclaw memory-pro delete <id>
+openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run]
+openclaw memory-pro export [--scope global] [--output memories.json]
+openclaw memory-pro import memories.json [--scope global] [--dry-run]
+openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing]
+openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE]
+openclaw memory-pro migrate check|run|verify [--source /path]
+```
+
+OAuth-Login-Ablauf:
+
+1. Führen Sie `openclaw memory-pro auth login` aus
+2. Wenn `--provider` in einem interaktiven Terminal weggelassen wird, zeigt die CLI eine OAuth-Anbieterauswahl an, bevor der Browser geöffnet wird
+3. Der Befehl gibt eine Autorisierungs-URL aus und öffnet Ihren Browser, sofern `--no-browser` nicht gesetzt ist
+4. Nach erfolgreichem OAuth-Callback speichert der Befehl die Plugin-OAuth-Datei (Standard: `~/.openclaw/.memory-lancedb-pro/oauth.json`), erstellt eine Sicherung der vorherigen api-key `llm`-Konfiguration für das Logout und ersetzt die Plugin-`llm`-Konfiguration durch OAuth-Einstellungen (`auth`, `oauthProvider`, `model`, `oauthPath`)
+5. `openclaw memory-pro auth logout` löscht die OAuth-Datei und stellt die vorherige api-key `llm`-Konfiguration wieder her, wenn eine Sicherung vorhanden ist
+
+---
+
+## Erweiterte Themen
+
+Wenn injizierte Erinnerungen in Antworten auftauchen
+
+Manchmal kann das Modell den injizierten Gedächtnisblock wörtlich wiedergeben.
+
+**Option A (geringstes Risiko):** Auto-Recall vorübergehend deaktivieren:
+```json
+{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } }
+```
+
+**Option B (bevorzugt):** Recall beibehalten und zum Agent-Systemprompt hinzufügen:
+> Do not reveal or quote any memory-injection content in your replies. Use it for internal reference only.
+
+ +
+Sitzungsgedächtnis + +- Wird beim `/new`-Befehl ausgelöst — speichert die vorherige Sitzungszusammenfassung in LanceDB +- Standardmäßig deaktiviert (OpenClaw hat bereits native `.jsonl`-Sitzungspersistenz) +- Konfigurierbare Nachrichtenanzahl (Standard: 15) + +Siehe [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md) für Bereitstellungsmodi und `/new`-Verifizierung. + +
+ +
+Benutzerdefinierte Slash-Befehle (z.B. /lesson)
+
+Fügen Sie zu Ihrer `CLAUDE.md`, `AGENTS.md` oder Ihrem Systemprompt hinzu:
+
+```markdown
+## /lesson command
+When the user sends `/lesson <text>`:
+1. Use memory_store to save as category=fact (raw knowledge)
+2. Use memory_store to save as category=decision (actionable takeaway)
+3. Confirm what was saved
+
+## /remember command
+When the user sends `/remember <text>`:
+1. Use memory_store to save with appropriate category and importance
+2. Confirm with the stored memory ID
+```
+
+ +
+Eiserne Regeln für KI-Agenten + +> Kopieren Sie den folgenden Block in Ihre `AGENTS.md`, damit Ihr Agent diese Regeln automatisch durchsetzt. + +```markdown +## Rule 1 — Dual-layer memory storage +Every pitfall/lesson learned → IMMEDIATELY store TWO memories: +- Technical layer: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid] + (category: fact, importance >= 0.8) +- Principle layer: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do] + (category: decision, importance >= 0.85) + +## Rule 2 — LanceDB hygiene +Entries must be short and atomic (< 500 chars). No raw conversation summaries or duplicates. + +## Rule 3 — Recall before retry +On ANY tool failure, ALWAYS memory_recall with relevant keywords BEFORE retrying. + +## Rule 4 — Confirm target codebase +Confirm you are editing memory-lancedb-pro vs built-in memory-lancedb before changes. + +## Rule 5 — Clear jiti cache after plugin code changes +After modifying .ts files under plugins/, MUST run rm -rf /tmp/jiti/ BEFORE openclaw gateway restart. +``` + +
+ +
+Datenbankschema + +LanceDB-Tabelle `memories`: + +| Feld | Typ | Beschreibung | +| --- | --- | --- | +| `id` | string (UUID) | Primärschlüssel | +| `text` | string | Gedächtnistext (FTS-indiziert) | +| `vector` | float[] | Embedding-Vektor | +| `category` | string | Speicherkategorie: `preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | Scope-Bezeichner (z.B. `global`, `agent:main`) | +| `importance` | float | Wichtigkeitsscore 0-1 | +| `timestamp` | int64 | Erstellungszeitstempel (ms) | +| `metadata` | string (JSON) | Erweiterte Metadaten | + +Häufige `metadata`-Schlüssel in v1.1.0: `l0_abstract`, `l1_overview`, `l2_content`, `memory_category`, `tier`, `access_count`, `confidence`, `last_accessed_at` + +> **Hinweis zu Kategorien:** Das Top-Level-Feld `category` verwendet 6 Speicherkategorien. Die 6-Kategorien-semantischen Labels der intelligenten Extraktion (`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`) werden in `metadata.memory_category` gespeichert. + +
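Das dokumentierte Schema als TypeScript-Typ, inklusive defensivem Parsen der JSON-Metadaten. Die Feldnamen folgen den Tabellen oben; die Typdeklaration selbst ist eine Skizze, kein exportierter Plugin-Typ:

```typescript
// Skizze des Tabellenschemas "memories" aus der Tabelle oben
interface MemoryRow {
  id: string;             // UUID, Primärschlüssel
  text: string;           // Gedächtnistext (FTS-indiziert)
  vector: number[];       // Embedding-Vektor
  category: "preference" | "fact" | "decision" | "entity" | "reflection" | "other";
  scope: string;          // z.B. "global", "agent:main"
  importance: number;     // 0..1
  timestamp: number;      // Erstellungszeitstempel (ms)
  metadata: string;       // JSON-String
}

// Häufige metadata-Schlüssel in v1.1.0 (alle optional)
interface MemoryMetadata {
  l0_abstract?: string;
  l1_overview?: string;
  l2_content?: string;
  memory_category?: string; // 6-Kategorien-Label der intelligenten Extraktion
  tier?: "core" | "working" | "peripheral";
  access_count?: number;
  confidence?: number;
  last_accessed_at?: number;
}

function parseMetadata(row: MemoryRow): MemoryMetadata {
  try {
    return JSON.parse(row.metadata) as MemoryMetadata;
  } catch {
    return {}; // defensiv: defekte Metadaten ignorieren
  }
}
```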
+ +
+Fehlerbehebung + +### "Cannot mix BigInt and other types" (LanceDB / Apache Arrow) + +Bei LanceDB 0.26+ können einige numerische Spalten als `BigInt` zurückgegeben werden. Aktualisieren Sie auf **memory-lancedb-pro >= 1.0.14** — dieses Plugin konvertiert Werte nun mit `Number(...)` vor arithmetischen Operationen. + +
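Das Muster hinter dem beschriebenen Fix (>= 1.0.14) lässt sich so skizzieren; die Hilfsfunktion ist ein hypothetisches Beispiel, kein Plugin-Export:

```typescript
// LanceDB/Arrow kann numerische Spalten als BigInt liefern.
// Vor Arithmetik mit normalen Numbers per Number(...) konvertieren,
// sonst: TypeError "Cannot mix BigInt and other types".
function toNumber(value: number | bigint): number {
  return typeof value === "bigint" ? Number(value) : value;
}

const timestamp = toNumber(BigInt("1700000000000")); // z.B. aus einer Arrow-Spalte
const ageMs = Date.now() - timestamp; // jetzt sichere Number-Arithmetik
```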
+ +--- + +## Dokumentation + +| Dokument | Beschreibung | +| --- | --- | +| [OpenClaw Integrations-Playbook](docs/openclaw-integration-playbook.md) | Bereitstellungsmodi, Verifizierung, Regressionsmatrix | +| [Gedächtnisarchitektur-Analyse](docs/memory_architecture_analysis.md) | Vollständige Architektur-Tiefenanalyse | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | Verhaltensänderungen v1.1.0 und Upgrade-Begründung | +| [Langkontext-Chunking](docs/long-context-chunking.md) | Chunking-Strategie für lange Dokumente | + +--- + +## Beta: Smart Memory v1.1.0 + +> Status: Beta — verfügbar über `npm i memory-lancedb-pro@beta`. Stabile Benutzer auf `latest` sind nicht betroffen. + +| Funktion | Beschreibung | +|---------|-------------| +| **Intelligente Extraktion** | LLM-gestützte 6-Kategorien-Extraktion mit L0/L1/L2 Metadaten. Rückfall auf Regex wenn deaktiviert. | +| **Lebenszyklus-Scoring** | Weibull-Zerfall in die Suche integriert — häufige und wichtige Erinnerungen ranken höher. | +| **Stufenverwaltung** | Dreistufiges System (Core → Working → Peripheral) mit automatischer Beförderung/Herabstufung. | + +Feedback: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · Zurücksetzen: `npm i memory-lancedb-pro@latest` + +--- + +## Abhängigkeiten + +| Paket | Zweck | +| --- | --- | +| `@lancedb/lancedb` ≥0.26.2 | Vektordatenbank (ANN + FTS) | +| `openai` ≥6.21.0 | OpenAI-kompatibler Embedding-API-Client | +| `@sinclair/typebox` 0.34.48 | JSON-Schema-Typdefinitionen | + +--- + +## Contributors + +

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +

+
+Full list: [Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors)
+
+## Star History
+
+*Star History Chart*
+
+## Lizenz
+
+MIT
+
+---
+
+## Mein WeChat QR-Code
+
diff --git a/README_ES.md b/README_ES.md
new file mode 100644
index 00000000..fe4d173c
--- /dev/null
+++ b/README_ES.md
@@ -0,0 +1,773 @@
+ +# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin + +**Asistente de Memoria IA para Agentes [OpenClaw](https://github.com/openclaw/openclaw)** + +*Dale a tu agente de IA un cerebro que realmente recuerda — entre sesiones, entre agentes, a lo largo del tiempo.* + +Un plugin de memoria para OpenClaw respaldado por LanceDB que almacena preferencias, decisiones y contexto de proyectos, y los recupera automáticamente en sesiones futuras. + +[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) +[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) +[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) + +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) + +
+ +--- + +## ¿Por qué memory-lancedb-pro? + +La mayoría de los agentes de IA tienen amnesia. Olvidan todo en el momento en que inicias un nuevo chat. + +**memory-lancedb-pro** es un plugin de memoria a largo plazo de nivel productivo para OpenClaw que convierte a tu agente en un **Asistente de Memoria IA** — captura automáticamente lo que importa, deja que el ruido se desvanezca naturalmente y recupera el recuerdo correcto en el momento adecuado. Sin etiquetado manual, sin complicaciones de configuración. + +### Tu Asistente de Memoria IA en acción + +**Sin memoria — cada sesión comienza desde cero:** + +> **Tú:** "Usa tabulaciones para la indentación, siempre agrega manejo de errores." +> *(siguiente sesión)* +> **Tú:** "¡Ya te lo dije — tabulaciones, no espacios!" 😤 +> *(siguiente sesión)* +> **Tú:** "...en serio, tabulaciones. Y manejo de errores. Otra vez." + +**Con memory-lancedb-pro — tu agente aprende y recuerda:** + +> **Tú:** "Usa tabulaciones para la indentación, siempre agrega manejo de errores." +> *(siguiente sesión — el agente recupera automáticamente tus preferencias)* +> **Agente:** *(aplica silenciosamente tabulaciones + manejo de errores)* ✅ +> **Tú:** "¿Por qué elegimos PostgreSQL en lugar de MongoDB el mes pasado?" +> **Agente:** "Basándome en nuestra discusión del 12 de febrero, las razones principales fueron..." ✅ + +Esa es la diferencia que hace un **Asistente de Memoria IA** — aprende tu estilo, recuerda decisiones pasadas y entrega respuestas personalizadas sin que tengas que repetirte. + +### ¿Qué más puede hacer? 
+ +| | Lo que obtienes | +|---|---| +| **Auto-Capture** | Tu agente aprende de cada conversación — sin necesidad de `memory_store` manual | +| **Smart Extraction** | Clasificación de 6 categorías impulsada por LLM: perfiles, preferencias, entidades, eventos, casos, patrones | +| **Olvido Inteligente** | Modelo de decaimiento Weibull — los recuerdos importantes permanecen, el ruido se desvanece naturalmente | +| **Recuperación Híbrida** | Búsqueda vectorial + BM25 de texto completo, fusionada con reranking por cross-encoder | +| **Inyección de Contexto** | Los recuerdos relevantes aparecen automáticamente antes de cada respuesta | +| **Aislamiento Multi-Scope** | Límites de memoria por agente, por usuario, por proyecto | +| **Cualquier Proveedor** | OpenAI, Jina, Gemini, Ollama, o cualquier API compatible con OpenAI | +| **Kit Completo de Herramientas** | CLI, respaldo, migración, actualización, exportar/importar — listo para producción | + +--- + +## Inicio Rápido + +### Opción A: Script de instalación con un clic (Recomendado) + +El **[script de instalación](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** mantenido por la comunidad gestiona la instalación, actualización y reparación en un solo comando: + +```bash +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> Consulta [Ecosistema](#ecosistema) más abajo para ver la lista completa de escenarios que cubre el script y otras herramientas de la comunidad. + +### Opción B: Instalación Manual + +**Mediante la CLI de OpenClaw (recomendado):** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**O mediante npm:** +```bash +npm i memory-lancedb-pro@beta +``` +> Si usas npm, también necesitarás agregar el directorio de instalación del plugin como una ruta **absoluta** en `plugins.load.paths` en tu `openclaw.json`. Este es el problema de configuración más común. 
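Un fragmento mínimo e hipotético de `openclaw.json` que ilustra esa entrada absoluta en `plugins.load.paths` (la ruta exacta depende de dónde instaló npm el paquete):

```json
{
  "plugins": {
    "load": {
      "paths": ["/home/user/project/node_modules/memory-lancedb-pro"]
    }
  }
}
```

Una ruta relativa aquí es la causa más frecuente de que OpenClaw no encuentre el plugin tras la instalación vía npm.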
+ +Agrega a tu `openclaw.json`: + +```json +{ + "plugins": { + "slots": { "memory": "memory-lancedb-pro" }, + "entries": { + "memory-lancedb-pro": { + "enabled": true, + "config": { + "embedding": { + "provider": "openai-compatible", + "apiKey": "${OPENAI_API_KEY}", + "model": "text-embedding-3-small" + }, + "autoCapture": true, + "autoRecall": true, + "smartExtraction": true, + "extractMinMessages": 2, + "extractMaxChars": 8000, + "sessionMemory": { "enabled": false } + } + } + } + } +} +``` + +**¿Por qué estos valores predeterminados?** +- `autoCapture` + `smartExtraction` → tu agente aprende de cada conversación automáticamente +- `autoRecall` → los recuerdos relevantes se inyectan antes de cada respuesta +- `extractMinMessages: 2` → la extracción se activa en chats normales de dos turnos +- `sessionMemory.enabled: false` → evita contaminar la recuperación con resúmenes de sesión desde el primer día + +Valida y reinicia: + +```bash +openclaw config validate +openclaw gateway restart +openclaw logs --follow --plain | grep "memory-lancedb-pro" +``` + +Deberías ver: +- `memory-lancedb-pro: smart extraction enabled` +- `memory-lancedb-pro@...: plugin registered` + +¡Listo! Tu agente ahora tiene memoria a largo plazo. + +
+Más rutas de instalación (usuarios existentes, actualizaciones) + +**¿Ya usas OpenClaw?** + +1. Agrega el plugin con una entrada **absoluta** en `plugins.load.paths` +2. Vincula el slot de memoria: `plugins.slots.memory = "memory-lancedb-pro"` +3. Verifica: `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**¿Actualizando desde una versión anterior a v1.1.0?** + +```bash +# 1) Respaldo +openclaw memory-pro export --scope global --output memories-backup.json +# 2) Ejecución de prueba +openclaw memory-pro upgrade --dry-run +# 3) Ejecutar actualización +openclaw memory-pro upgrade +# 4) Verificar +openclaw memory-pro stats +``` + +Consulta `CHANGELOG-v1.1.0.md` para los cambios de comportamiento y la justificación de la actualización. + +
+ +
+Importación rápida para Bot de Telegram (clic para expandir) + +Si usas la integración de Telegram de OpenClaw, la forma más fácil es enviar un comando de importación directamente al Bot principal en lugar de editar la configuración manualmente. + +Envía este mensaje (en inglés, ya que es un prompt para el bot): + +```text +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction +6. extractMinMessages=2 +7. sessionMemory.enabled=false +8. captureAssistant=false +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. Generate the final openclaw.json config directly, not just an explanation +``` + +
+ +--- + +## Ecosistema + +memory-lancedb-pro es el plugin principal. La comunidad ha desarrollado herramientas a su alrededor para hacer que la configuración y el uso diario sean aún más sencillos: + +### Script de Instalación — Instala, actualiza y repara con un solo clic + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +Mucho más que un simple instalador — el script gestiona de forma inteligente una amplia variedad de escenarios reales: + +| Tu situación | Lo que hace el script | +|---|---| +| Nunca instalado | Descarga nueva → instala dependencias → elige configuración → escribe en openclaw.json → reinicia | +| Instalado vía `git clone`, atascado en un commit antiguo | `git fetch` + `checkout` automático a la última versión → reinstala dependencias → verifica | +| La configuración tiene campos inválidos | Auto-detección mediante filtro de esquema, elimina campos no soportados | +| Instalado vía `npm` | Omite la actualización de git, te recuerda ejecutar `npm update` por tu cuenta | +| CLI de `openclaw` rota por configuración inválida | Alternativa: lee la ruta del workspace directamente del archivo `openclaw.json` | +| `extensions/` en lugar de `plugins/` | Auto-detección de la ubicación del plugin desde la configuración o el sistema de archivos | +| Ya está actualizado | Solo ejecuta verificaciones de salud, sin cambios | + +```bash +bash setup-memory.sh # Instalar o actualizar +bash setup-memory.sh --dry-run # Solo previsualización +bash setup-memory.sh --beta # Incluir versiones preliminares +bash setup-memory.sh --uninstall # Revertir configuración y eliminar plugin +``` + +Configuraciones preestablecidas de proveedores: **Jina / DashScope / SiliconFlow / OpenAI / Ollama**, o usa tu propia API compatible con OpenAI. 
Para la referencia completa (incluyendo `--ref`, `--selfcheck-only` y más), consulta el [README del script de instalación](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup). + +### Skill para Claude Code / OpenClaw — Configuración Guiada por IA + +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** + +Instala este skill y tu agente de IA (Claude Code u OpenClaw) obtiene un conocimiento profundo de cada característica de memory-lancedb-pro. Solo di **"ayúdame a habilitar la mejor configuración"** y obtén: + +- **Flujo de configuración guiado en 7 pasos** con 4 planes de despliegue: + - Potencia Total (Jina + OpenAI) / Económico (reranker gratuito de SiliconFlow) / Simple (solo OpenAI) / Totalmente Local (Ollama, sin costo de API) +- **Las 9 herramientas MCP** usadas correctamente: `memory_recall`, `memory_store`, `memory_forget`, `memory_update`, `memory_stats`, `memory_list`, `self_improvement_log`, `self_improvement_extract_skill`, `self_improvement_review` *(el conjunto completo de herramientas requiere `enableManagementTools: true` — la configuración de Inicio Rápido predeterminada expone las 4 herramientas principales)* +- **Prevención de errores comunes**: habilitación del plugin en el workspace, `autoRecall` desactivado por defecto, caché de jiti, variables de entorno, aislamiento de scope, y más + +**Instalar para Claude Code:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro +``` + +**Instalar para OpenClaw:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## Tutorial en Video + +> Recorrido completo: instalación, configuración y funcionamiento interno de la recuperación híbrida. 
+ +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## Arquitectura + +``` +┌─────────────────────────────────────────────────────────┐ +│ index.ts (Entry Point) │ +│ Plugin Registration · Config Parsing · Lifecycle Hooks │ +└────────┬──────────┬──────────┬──────────┬───────────────┘ + │ │ │ │ + ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ + │ store │ │embedder│ │retriever│ │ scopes │ + │ .ts │ │ .ts │ │ .ts │ │ .ts │ + └────────┘ └────────┘ └────────┘ └─────────────┘ + │ │ + ┌────▼───┐ ┌─────▼──────────┐ + │migrate │ │noise-filter.ts │ + │ .ts │ │adaptive- │ + └────────┘ │retrieval.ts │ + └────────────────┘ + ┌─────────────┐ ┌──────────┐ + │ tools.ts │ │ cli.ts │ + │ (Agent API) │ │ (CLI) │ + └─────────────┘ └──────────┘ +``` + +> Para un análisis detallado de la arquitectura completa, consulta [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md). + +
+Referencia de Archivos (clic para expandir) + +| Archivo | Propósito | +| --- | --- | +| `index.ts` | Punto de entrada del plugin. Se registra con la API de Plugins de OpenClaw, analiza la configuración, monta hooks de ciclo de vida | +| `openclaw.plugin.json` | Metadatos del plugin + declaración completa de configuración con JSON Schema | +| `cli.ts` | Comandos CLI: `memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | Capa de almacenamiento LanceDB. Creación de tablas / Indexación FTS / Búsqueda vectorial / Búsqueda BM25 / CRUD | +| `src/embedder.ts` | Abstracción de embeddings. Compatible con cualquier proveedor de API compatible con OpenAI | +| `src/retriever.ts` | Motor de recuperación híbrida. Vector + BM25 → Fusión Híbrida → Rerank → Decaimiento de Ciclo de Vida → Filtro | +| `src/scopes.ts` | Control de acceso multi-scope | +| `src/tools.ts` | Definiciones de herramientas del agente: `memory_recall`, `memory_store`, `memory_forget`, `memory_update` + herramientas de gestión | +| `src/noise-filter.ts` | Filtra rechazos del agente, meta-preguntas, saludos y contenido de baja calidad | +| `src/adaptive-retrieval.ts` | Determina si una consulta necesita recuperación de memoria | +| `src/migrate.ts` | Migración desde `memory-lancedb` integrado a Pro | +| `src/smart-extractor.ts` | Extracción de 6 categorías impulsada por LLM con almacenamiento en capas L0/L1/L2 y deduplicación en dos etapas | +| `src/decay-engine.ts` | Modelo de decaimiento exponencial estirado de Weibull | +| `src/tier-manager.ts` | Promoción/degradación en tres niveles: Peripheral ↔ Working ↔ Core | + +
+ +--- + +## Características Principales + +### Recuperación Híbrida + +``` +Query → embedQuery() ─┐ + ├─→ Hybrid Fusion → Rerank → Lifecycle Decay Boost → Length Norm → Filter +Query → BM25 FTS ─────┘ +``` + +- **Búsqueda Vectorial** — similitud semántica mediante LanceDB ANN (distancia coseno) +- **Búsqueda de Texto Completo BM25** — coincidencia exacta de palabras clave mediante índice FTS de LanceDB +- **Fusión Híbrida** — puntuación vectorial como base, los resultados de BM25 reciben un impulso ponderado (no es RRF estándar — ajustado para calidad de recuperación en el mundo real) +- **Pesos Configurables** — `vectorWeight`, `bm25Weight`, `minScore` + +### Reranking con Cross-Encoder + +- Adaptadores integrados para **Jina**, **SiliconFlow**, **Voyage AI** y **Pinecone** +- Compatible con cualquier endpoint compatible con Jina (por ejemplo, Hugging Face TEI, DashScope) +- Puntuación híbrida: 60% cross-encoder + 40% puntuación fusionada original +- Degradación elegante: recurre a similitud coseno en caso de fallo de la API + +### Pipeline de Puntuación Multi-Etapa + +| Etapa | Efecto | +| --- | --- | +| **Fusión Híbrida** | Combina recuperación semántica y de coincidencia exacta | +| **Rerank con Cross-Encoder** | Promueve resultados semánticamente precisos | +| **Impulso por Decaimiento de Ciclo de Vida** | Frescura Weibull + frecuencia de acceso + importancia × confianza | +| **Normalización de Longitud** | Evita que entradas largas dominen (ancla: 500 caracteres) | +| **Puntuación Mínima Estricta** | Elimina resultados irrelevantes (predeterminado: 0.35) | +| **Diversidad MMR** | Similitud coseno > 0.85 → degradado | + +### Extracción Inteligente de Memoria (v1.1.0) + +- **Extracción de 6 Categorías con LLM**: perfil, preferencias, entidades, eventos, casos, patrones +- **Almacenamiento en Capas L0/L1/L2**: L0 (índice de una oración) → L1 (resumen estructurado) → L2 (narrativa completa) +- **Deduplicación en Dos Etapas**: pre-filtro de similitud vectorial 
(≥0.7) → decisión semántica por LLM (CREATE/MERGE/SKIP) +- **Fusión por Categoría**: `profile` siempre se fusiona, `events`/`cases` son solo de adición + +### Gestión del Ciclo de Vida de la Memoria (v1.1.0) + +- **Motor de Decaimiento Weibull**: puntuación compuesta = recencia + frecuencia + valor intrínseco +- **Promoción en Tres Niveles**: `Peripheral ↔ Working ↔ Core` con umbrales configurables +- **Refuerzo por Acceso**: los recuerdos frecuentemente recuperados decaen más lentamente (estilo repetición espaciada) +- **Vida Media Modulada por Importancia**: los recuerdos importantes decaen más lentamente + +### Aislamiento Multi-Scope + +- Scopes integrados: `global`, `agent:`, `custom:`, `project:`, `user:` +- Control de acceso a nivel de agente mediante `scopes.agentAccess` +- Predeterminado: cada agente accede a `global` + su propio scope `agent:` + +### Auto-Capture y Auto-Recall + +- **Auto-Capture** (`agent_end`): extrae preferencia/hecho/decisión/entidad de las conversaciones, deduplica, almacena hasta 3 por turno +- **Auto-Recall** (`before_agent_start`): inyecta contexto `` (hasta 3 entradas) + +### Filtrado de Ruido y Recuperación Adaptativa + +- Filtra contenido de baja calidad: rechazos del agente, meta-preguntas, saludos +- Omite la recuperación para saludos, comandos slash, confirmaciones simples, emojis +- Fuerza la recuperación para palabras clave de memoria ("recuerda", "anteriormente", "la última vez") +- Umbrales adaptados a CJK (chino: 6 caracteres vs inglés: 15 caracteres) + +--- + +
+Comparación con memory-lancedb integrado (clic para expandir) + +| Característica | `memory-lancedb` integrado | **memory-lancedb-pro** | +| --- | :---: | :---: | +| Búsqueda vectorial | Sí | Sí | +| Búsqueda de texto completo BM25 | - | Sí | +| Fusión híbrida (Vector + BM25) | - | Sí | +| Rerank con cross-encoder (multi-proveedor) | - | Sí | +| Impulso por recencia y decaimiento temporal | - | Sí | +| Normalización de longitud | - | Sí | +| Diversidad MMR | - | Sí | +| Aislamiento multi-scope | - | Sí | +| Filtrado de ruido | - | Sí | +| Recuperación adaptativa | - | Sí | +| CLI de gestión | - | Sí | +| Memoria de sesión | - | Sí | +| Embeddings adaptados a la tarea | - | Sí | +| **Extracción Inteligente con LLM (6 categorías)** | - | Sí (v1.1.0) | +| **Decaimiento Weibull + Promoción por Niveles** | - | Sí (v1.1.0) | +| Cualquier embedding compatible con OpenAI | Limitado | Sí | + +
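La Fusión Híbrida descrita en las Características Principales (puntuación vectorial como base más un impulso ponderado de BM25) puede esbozarse así. Es un boceto ilustrativo con nombres hipotéticos (`hybridFuse`, `Candidate`), no el código real de `src/retriever.ts`:

```typescript
// Boceto: la similitud vectorial es la base y las coincidencias BM25
// añaden un impulso ponderado (no es RRF estándar).
interface Candidate {
  id: string;
  vectorScore: number; // similitud coseno normalizada a [0, 1]
  bm25Score: number;   // puntuación BM25 normalizada a [0, 1]; 0 si no hubo match FTS
}

function hybridFuse(c: Candidate, vectorWeight = 0.7, bm25Weight = 0.3): number {
  // base semántica ponderada + impulso por coincidencia exacta de palabras clave
  return vectorWeight * c.vectorScore + bm25Weight * c.bm25Score;
}

// Un candidato con buen match semántico Y match exacto supera a uno solo semántico
const semanticOnly = hybridFuse({ id: "a", vectorScore: 0.9, bm25Score: 0 });
const both = hybridFuse({ id: "b", vectorScore: 0.8, bm25Score: 0.9 });
```

Con los pesos predeterminados (`vectorWeight: 0.7`, `bm25Weight: 0.3`), un resultado con coincidencia exacta de palabras clave puede superar a otro con mejor puntuación vectorial pura.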
+ +--- + +## Configuración + +
+Ejemplo de Configuración Completa + +```json +{ + "embedding": { + "apiKey": "${JINA_API_KEY}", + "model": "jina-embeddings-v5-text-small", + "baseURL": "https://api.jina.ai/v1", + "dimensions": 1024, + "taskQuery": "retrieval.query", + "taskPassage": "retrieval.passage", + "normalized": true + }, + "dbPath": "~/.openclaw/memory/lancedb-pro", + "autoCapture": true, + "autoRecall": true, + "retrieval": { + "mode": "hybrid", + "vectorWeight": 0.7, + "bm25Weight": 0.3, + "minScore": 0.3, + "rerank": "cross-encoder", + "rerankApiKey": "${JINA_API_KEY}", + "rerankModel": "jina-reranker-v3", + "rerankEndpoint": "https://api.jina.ai/v1/rerank", + "rerankProvider": "jina", + "candidatePoolSize": 20, + "recencyHalfLifeDays": 14, + "recencyWeight": 0.1, + "filterNoise": true, + "lengthNormAnchor": 500, + "hardMinScore": 0.35, + "timeDecayHalfLifeDays": 60, + "reinforcementFactor": 0.5, + "maxHalfLifeMultiplier": 3 + }, + "enableManagementTools": false, + "scopes": { + "default": "global", + "definitions": { + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } + }, + "agentAccess": { + "discord-bot": ["global", "agent:discord-bot"] + } + }, + "sessionMemory": { + "enabled": false, + "messageCount": 15 + }, + "smartExtraction": true, + "llm": { + "apiKey": "${OPENAI_API_KEY}", + "model": "gpt-4o-mini", + "baseURL": "https://api.openai.com/v1" + }, + "extractMinMessages": 2, + "extractMaxChars": 8000 +} +``` + +
+ +
+Proveedores de Embedding + +Funciona con **cualquier API de embedding compatible con OpenAI**: + +| Proveedor | Modelo | URL Base | Dimensiones | +| --- | --- | --- | --- | +| **Jina** (recomendado) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | +| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | +| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | +| **Ollama** (local) | `nomic-embed-text` | `http://localhost:11434/v1` | específico del proveedor | + +
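Por ejemplo, un bloque `embedding` para Ollama local quedaría aproximadamente así (boceto: Ollama no valida la clave API, por lo que basta un valor de relleno; `nomic-embed-text` produce vectores de 768 dimensiones, pero verifica el valor con tu modelo):

```json
{
  "embedding": {
    "apiKey": "ollama",
    "model": "nomic-embed-text",
    "baseURL": "http://localhost:11434/v1",
    "dimensions": 768
  }
}
```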
+ +
+Proveedores de Rerank + +El reranking con cross-encoder admite múltiples proveedores mediante `rerankProvider`: + +| Proveedor | `rerankProvider` | Modelo de Ejemplo | +| --- | --- | --- | +| **Jina** (predeterminado) | `jina` | `jina-reranker-v3` | +| **SiliconFlow** (nivel gratuito disponible) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | + +Cualquier endpoint de rerank compatible con Jina también funciona — configura `rerankProvider: "jina"` y apunta `rerankEndpoint` a tu servicio (por ejemplo, Hugging Face TEI, DashScope `qwen3-rerank`). + +
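Por ejemplo, para usar el reranker de SiliconFlow (con nivel gratuito), el fragmento `retrieval` quedaría aproximadamente así (boceto; ajusta el resto de campos según tu configuración):

```json
{
  "retrieval": {
    "mode": "hybrid",
    "rerank": "cross-encoder",
    "rerankProvider": "siliconflow",
    "rerankModel": "BAAI/bge-reranker-v2-m3",
    "rerankApiKey": "${SILICONFLOW_API_KEY}"
  }
}
```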
+ +
+Smart Extraction (LLM) — v1.1.0 + +Cuando `smartExtraction` está habilitado (predeterminado: `true`), el plugin utiliza un LLM para extraer y clasificar recuerdos de forma inteligente en lugar de disparadores basados en regex. + +| Campo | Tipo | Predeterminado | Descripción | +|-------|------|----------------|-------------| +| `smartExtraction` | boolean | `true` | Habilitar/deshabilitar la extracción de 6 categorías impulsada por LLM | +| `llm.auth` | string | `api-key` | `api-key` usa `llm.apiKey` / `embedding.apiKey`; `oauth` usa un archivo de token OAuth con alcance de plugin por defecto | +| `llm.apiKey` | string | *(recurre a `embedding.apiKey`)* | Clave API para el proveedor de LLM | +| `llm.model` | string | `openai/gpt-oss-120b` | Nombre del modelo LLM | +| `llm.baseURL` | string | *(recurre a `embedding.baseURL`)* | Endpoint de la API del LLM | +| `llm.oauthProvider` | string | `openai-codex` | ID del proveedor OAuth usado cuando `llm.auth` es `oauth` | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | Archivo de token OAuth usado cuando `llm.auth` es `oauth` | +| `llm.timeoutMs` | number | `30000` | Tiempo de espera de solicitud LLM en milisegundos | +| `extractMinMessages` | number | `2` | Mensajes mínimos antes de que se active la extracción | +| `extractMaxChars` | number | `8000` | Máximo de caracteres enviados al LLM | + + +Configuración de `llm` con OAuth (usa la caché de inicio de sesión existente de Codex / ChatGPT para llamadas al LLM): +```json +{ + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } +} +``` + +Notas para `llm.auth: "oauth"`: + +- `llm.oauthProvider` es actualmente `openai-codex`. +- Los tokens OAuth se almacenan por defecto en `~/.openclaw/.memory-lancedb-pro/oauth.json`. +- Puedes configurar `llm.oauthPath` si deseas almacenar ese archivo en otra ubicación. 
+- `auth login` guarda una copia de la configuración anterior de `llm` con api-key junto al archivo OAuth, y `auth logout` restaura esa copia cuando está disponible. +- Cambiar de `api-key` a `oauth` no transfiere automáticamente `llm.baseURL`. Configúralo manualmente en modo OAuth solo cuando intencionalmente quieras un backend personalizado compatible con ChatGPT/Codex. + +
+ +
+Configuración del Ciclo de Vida (Decaimiento + Nivel) + +| Campo | Predeterminado | Descripción | +|-------|----------------|-------------| +| `decay.recencyHalfLifeDays` | `30` | Vida media base para el decaimiento de recencia Weibull | +| `decay.frequencyWeight` | `0.3` | Peso de la frecuencia de acceso en la puntuación compuesta | +| `decay.intrinsicWeight` | `0.3` | Peso de `importancia × confianza` | +| `decay.betaCore` | `0.8` | Beta de Weibull para memorias `core` | +| `decay.betaWorking` | `1.0` | Beta de Weibull para memorias `working` | +| `decay.betaPeripheral` | `1.3` | Beta de Weibull para memorias `peripheral` | +| `tier.coreAccessThreshold` | `10` | Mínimo de recuperaciones antes de promover a `core` | +| `tier.peripheralAgeDays` | `60` | Umbral de antigüedad para degradar memorias inactivas | + +
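El decaimiento de recencia de la tabla puede esbozarse como una función de supervivencia de Weibull parametrizada por vida media. Boceto ilustrativo (la implementación real en `src/decay-engine.ts` puede diferir):

```typescript
// Supervivencia de Weibull parametrizada por vida media:
// retention(halfLifeDays) = 0.5 para cualquier beta.
// Beta > 1 acelera el decaimiento después de la vida media; beta < 1 lo suaviza.
function weibullRetention(ageDays: number, halfLifeDays: number, beta: number): number {
  // lambda se elige para que la retención a la vida media sea exactamente 0.5
  const lambda = halfLifeDays / Math.pow(Math.LN2, 1 / beta);
  return Math.exp(-Math.pow(ageDays / lambda, beta));
}

// A los 30 días con vida media de 30: 0.5 sin importar beta
// (core beta=0.8, working beta=1.0, peripheral beta=1.3)
const core = weibullRetention(60, 30, 0.8);
const peripheral = weibullRetention(60, 30, 1.3);
```

Pasada la vida media, las memorias `peripheral` (beta alto) decaen más rápido que las `core` (beta bajo), que es justo el sesgo que buscan los valores predeterminados de la tabla.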
+ +
+Refuerzo por Acceso + +Los recuerdos frecuentemente recuperados decaen más lentamente (estilo repetición espaciada). + +Claves de configuración (bajo `retrieval`): +- `reinforcementFactor` (0-2, predeterminado: `0.5`) — establece `0` para deshabilitar +- `maxHalfLifeMultiplier` (1-10, predeterminado: `3`) — límite máximo de vida media efectiva + +
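Una forma plausible de modelar este refuerzo (boceto con una fórmula hipotética; la fórmula exacta del plugin puede diferir): la vida media efectiva crece de forma logarítmica con el número de accesos y `maxHalfLifeMultiplier` la limita.

```typescript
// Vida media efectiva: crece con los accesos (estilo repetición espaciada), con tope.
// reinforcementFactor = 0 deshabilita el refuerzo; maxHalfLifeMultiplier limita el crecimiento.
function effectiveHalfLife(
  baseDays: number,
  accessCount: number,
  reinforcementFactor = 0.5,
  maxHalfLifeMultiplier = 3
): number {
  const multiplier = 1 + reinforcementFactor * Math.log2(1 + accessCount);
  return baseDays * Math.min(multiplier, maxHalfLifeMultiplier);
}
```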
+ +--- + +## Comandos CLI + +```bash +openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json] +openclaw memory-pro search "query" [--scope global] [--limit 10] [--json] +openclaw memory-pro stats [--scope global] [--json] +openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] +openclaw memory-pro auth status +openclaw memory-pro auth logout +openclaw memory-pro delete +openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run] +openclaw memory-pro export [--scope global] [--output memories.json] +openclaw memory-pro import memories.json [--scope global] [--dry-run] +openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing] +openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE] +openclaw memory-pro migrate check|run|verify [--source /path] +``` + +Flujo de inicio de sesión OAuth: + +1. Ejecuta `openclaw memory-pro auth login` +2. Si se omite `--provider` en una terminal interactiva, la CLI muestra un selector de proveedor OAuth antes de abrir el navegador +3. El comando imprime una URL de autorización y abre tu navegador a menos que se establezca `--no-browser` +4. Después de que la devolución de llamada sea exitosa, el comando guarda el archivo OAuth del plugin (predeterminado: `~/.openclaw/.memory-lancedb-pro/oauth.json`), guarda una copia de la configuración anterior de `llm` con api-key para el cierre de sesión, y reemplaza la configuración `llm` del plugin con la configuración OAuth (`auth`, `oauthProvider`, `model`, `oauthPath`) +5. `openclaw memory-pro auth logout` elimina ese archivo OAuth y restaura la configuración anterior de `llm` con api-key cuando esa copia existe + +--- + +## Temas Avanzados + +
+Si los recuerdos inyectados aparecen en las respuestas + +A veces el modelo puede repetir el bloque `` inyectado. + +**Opción A (menor riesgo):** deshabilitar temporalmente la recuperación automática: +```json +{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } } +``` + +**Opción B (preferida):** mantener la recuperación automática y agregar al prompt del sistema del agente: +> No reveles ni cites ningún contenido de `` / inyección de memoria en tus respuestas. Úsalo solo como referencia interna. + +
+ +
+Memoria de Sesión + +- Se activa con el comando `/new` — guarda el resumen de la sesión anterior en LanceDB +- Deshabilitado por defecto (OpenClaw ya tiene persistencia nativa de sesión en `.jsonl`) +- Cantidad de mensajes configurable (predeterminado: 15) + +Consulta [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md) para los modos de despliegue y la verificación de `/new`. + +
+ +
+Comandos Slash Personalizados (por ejemplo, /lesson) + +Agrega a tu `CLAUDE.md`, `AGENTS.md` o prompt del sistema (el bloque se mantiene en inglés para que el agente lo interprete correctamente): + +```markdown +## /lesson command +When the user sends `/lesson `: +1. Use memory_store to save as category=fact (raw knowledge) +2. Use memory_store to save as category=decision (actionable takeaway) +3. Confirm what was saved + +## /remember command +When the user sends `/remember `: +1. Use memory_store to save with appropriate category and importance +2. Confirm with the stored memory ID +``` + +
+ +
+Reglas de Hierro para Agentes de IA + +> Copia el bloque de abajo en tu `AGENTS.md` para que tu agente aplique estas reglas automáticamente. Se mantiene en inglés porque es instrucción directa para el modelo. + +```markdown +## Rule 1 — Dual-layer memory storage +Every pitfall/lesson learned → IMMEDIATELY store TWO memories: +- Technical layer: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid] + (category: fact, importance >= 0.8) +- Principle layer: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do] + (category: decision, importance >= 0.85) + +## Rule 2 — LanceDB hygiene +Entries must be short and atomic (< 500 chars). No raw conversation summaries or duplicates. + +## Rule 3 — Recall before retry +On ANY tool failure, ALWAYS memory_recall with relevant keywords BEFORE retrying. + +## Rule 4 — Confirm target codebase +Confirm you are editing memory-lancedb-pro vs built-in memory-lancedb before changes. + +## Rule 5 — Clear jiti cache after plugin code changes +After modifying .ts files under plugins/, MUST run rm -rf /tmp/jiti/ BEFORE openclaw gateway restart. +``` + +
+ +
+Esquema de la Base de Datos + +Tabla LanceDB `memories`: + +| Campo | Tipo | Descripción | +| --- | --- | --- | +| `id` | string (UUID) | Clave primaria | +| `text` | string | Texto del recuerdo (indexado con FTS) | +| `vector` | float[] | Vector de embedding | +| `category` | string | Categoría de almacenamiento: `preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | Identificador de scope (por ejemplo, `global`, `agent:main`) | +| `importance` | float | Puntuación de importancia 0-1 | +| `timestamp` | int64 | Marca de tiempo de creación (ms) | +| `metadata` | string (JSON) | Metadatos extendidos | + +Claves comunes de `metadata` en v1.1.0: `l0_abstract`, `l1_overview`, `l2_content`, `memory_category`, `tier`, `access_count`, `confidence`, `last_accessed_at` + +> **Nota sobre categorías:** El campo de nivel superior `category` usa 6 categorías de almacenamiento. Las 6 etiquetas semánticas de categoría de Smart Extraction (`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`) se almacenan en `metadata.memory_category`. + +
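Para consumir el almacenamiento en capas desde la columna `metadata`, un lector externo puede hacer algo así (boceto ilustrativo con una fila de ejemplo; `layeredView` y `MemoryRow` son nombres hipotéticos, no API del plugin):

```typescript
// metadata es una cadena JSON; las claves l0_abstract / tier / access_count son opcionales.
interface MemoryRow { id: string; text: string; metadata: string }

function layeredView(row: MemoryRow) {
  const meta = JSON.parse(row.metadata || "{}");
  return {
    abstract: meta.l0_abstract ?? row.text, // L0: índice de una oración
    tier: meta.tier ?? "working",
    accessCount: meta.access_count ?? 0,
  };
}

const row: MemoryRow = {
  id: "00000000-0000-0000-0000-000000000000",
  text: "El usuario prefiere tabulaciones",
  metadata: JSON.stringify({ l0_abstract: "prefiere tabs", tier: "core", access_count: 12 }),
};
```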
+ +
+Solución de Problemas + +### "Cannot mix BigInt and other types" (LanceDB / Apache Arrow) + +En LanceDB 0.26+, algunas columnas numéricas pueden devolverse como `BigInt`. Actualiza a **memory-lancedb-pro >= 1.0.14** — este plugin ahora convierte los valores usando `Number(...)` antes de realizar operaciones aritméticas. + +
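Si mantienes código propio sobre filas devueltas por LanceDB, aplica el mismo patrón: convierte antes de mezclar tipos (boceto):

```typescript
// LanceDB/Apache Arrow puede devolver columnas int64 como BigInt; mezclar BigInt y
// number en aritmética lanza "Cannot mix BigInt and other types". Convertir primero lo evita.
function toNumber(v: number | bigint): number {
  return typeof v === "bigint" ? Number(v) : v;
}

const timestamp: number | bigint = BigInt(1736000000000); // int64 devuelto como BigInt
const ageMs = Date.now() - toNumber(timestamp);           // seguro: aritmética number-number
```

Ten en cuenta que `Number(...)` pierde precisión por encima de `Number.MAX_SAFE_INTEGER`, lo cual es irrelevante para marcas de tiempo en milisegundos.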
+ +--- + +## Documentación + +| Documento | Descripción | +| --- | --- | +| [Manual de Integración con OpenClaw](docs/openclaw-integration-playbook.md) | Modos de despliegue, verificación, matriz de regresión | +| [Análisis de la Arquitectura de Memoria](docs/memory_architecture_analysis.md) | Análisis detallado de la arquitectura completa | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | Cambios de comportamiento en v1.1.0 y justificación de la actualización | +| [Fragmentación de Contexto Largo](docs/long-context-chunking.md) | Estrategia de fragmentación para documentos largos | + +--- + +## Beta: Smart Memory v1.1.0 + +> Estado: Beta — disponible mediante `npm i memory-lancedb-pro@beta`. Los usuarios estables en `latest` no se ven afectados. + +| Característica | Descripción | +|----------------|-------------| +| **Smart Extraction** | Extracción de 6 categorías impulsada por LLM con metadatos L0/L1/L2. Recurre a regex cuando está deshabilitado. | +| **Puntuación de Ciclo de Vida** | Decaimiento Weibull integrado en la recuperación — los recuerdos de alta frecuencia y alta importancia se clasifican mejor. | +| **Gestión de Niveles** | Sistema de tres niveles (Core → Working → Peripheral) con promoción/degradación automática. | + +Comentarios: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · Revertir: `npm i memory-lancedb-pro@latest` + +--- + +## Dependencias + +| Paquete | Propósito | +| --- | --- | +| `@lancedb/lancedb` ≥0.26.2 | Base de datos vectorial (ANN + FTS) | +| `openai` ≥6.21.0 | Cliente de API de Embedding compatible con OpenAI | +| `@sinclair/typebox` 0.34.48 | Definiciones de tipos con JSON Schema | + +--- + +## Contributors + +

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +

+ +Full list: [Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors) + +## Star History + + + + + + Star History Chart + + + +## Licencia + +MIT + +--- + +## Mi Código QR de WeChat + + diff --git a/README_FR.md b/README_FR.md new file mode 100644 index 00000000..19f5fc45 --- /dev/null +++ b/README_FR.md @@ -0,0 +1,773 @@ +
+ +# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin + +**Assistant Mémoire IA pour les Agents [OpenClaw](https://github.com/openclaw/openclaw)** + +*Donnez à votre agent IA un cerveau qui se souvient vraiment — entre les sessions, entre les agents, dans le temps.* + +Un plugin de mémoire long terme pour OpenClaw basé sur LanceDB qui stocke les préférences, les décisions et le contexte du projet, puis les rappelle automatiquement dans les sessions futures. + +[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) +[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) +[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) + +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) + +
+ +--- + +## Pourquoi memory-lancedb-pro ? + +La plupart des agents IA souffrent d'amnésie. Ils oublient tout dès que vous démarrez une nouvelle conversation. + +**memory-lancedb-pro** est un plugin de mémoire long terme de niveau production pour OpenClaw qui transforme votre agent en un véritable **Assistant Mémoire IA** — il capture automatiquement ce qui compte, laisse le bruit s'estomper naturellement et retrouve le bon souvenir au bon moment. Pas d'étiquetage manuel, pas de configuration compliquée. + +### Votre Assistant Mémoire IA en action + +**Sans mémoire — chaque session repart de zéro :** + +> **Vous :** « Utilise des tabulations pour l'indentation, ajoute toujours la gestion d'erreurs. » +> *(session suivante)* +> **Vous :** « Je t'ai déjà dit — des tabulations, pas des espaces ! » 😤 +> *(session suivante)* +> **Vous :** « …sérieusement, des tabulations. Et la gestion d'erreurs. Encore. » + +**Avec memory-lancedb-pro — votre agent apprend et se souvient :** + +> **Vous :** « Utilise des tabulations pour l'indentation, ajoute toujours la gestion d'erreurs. » +> *(session suivante — l'agent rappelle automatiquement vos préférences)* +> **Agent :** *(applique silencieusement tabulations + gestion d'erreurs)* ✅ +> **Vous :** « Pourquoi avons-nous choisi PostgreSQL plutôt que MongoDB le mois dernier ? » +> **Agent :** « Selon notre discussion du 12 février, les raisons principales étaient… » ✅ + +Voilà la différence que fait un **Assistant Mémoire IA** — il apprend votre style, rappelle les décisions passées et fournit des réponses personnalisées sans que vous ayez à vous répéter. + +### Que peut-il faire d'autre ? 
+ +| | Ce que vous obtenez | +|---|---| +| **Capture automatique** | Votre agent apprend de chaque conversation — pas besoin de `memory_store` manuel | +| **Extraction intelligente** | Classification LLM en 6 catégories : profils, préférences, entités, événements, cas, patterns | +| **Oubli intelligent** | Modèle de décroissance Weibull — les souvenirs importants restent, le bruit s'estompe | +| **Recherche hybride** | Recherche vectorielle + BM25 plein texte, fusionnée avec un reranking cross-encoder | +| **Injection de contexte** | Les souvenirs pertinents remontent automatiquement avant chaque réponse | +| **Isolation multi-scope** | Limites mémoire par agent, par utilisateur, par projet | +| **Tout fournisseur** | OpenAI, Jina, Gemini, Ollama ou toute API compatible OpenAI | +| **Boîte à outils complète** | CLI, sauvegarde, migration, mise à niveau, export/import — prêt pour la production | + +--- + +## Démarrage rapide + +### Option A : Script d'installation en un clic (recommandé) + +Le **[script d'installation](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** maintenu par la communauté gère l'installation, la mise à niveau et la réparation en une seule commande : + +```bash +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> Consultez [Écosystème](#écosystème) ci-dessous pour la liste complète des scénarios couverts et les autres outils communautaires. + +### Option B : Installation manuelle + +**Via OpenClaw CLI (recommandé) :** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**Ou via npm :** +```bash +npm i memory-lancedb-pro@beta +``` +> Si vous utilisez npm, vous devrez également ajouter le répertoire d'installation du plugin comme chemin **absolu** dans `plugins.load.paths` de votre `openclaw.json`. C'est le problème de configuration le plus courant. 
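Par exemple, un fragment `plugins.load.paths` avec un chemin absolu ressemblerait à ceci (croquis : le chemin est un exemple hypothétique à adapter à votre répertoire d'installation réel) :

```json
{
  "plugins": {
    "load": {
      "paths": ["/home/user/project/node_modules/memory-lancedb-pro"]
    }
  }
}
```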
+ +Ajoutez à votre `openclaw.json` : + +```json +{ + "plugins": { + "slots": { "memory": "memory-lancedb-pro" }, + "entries": { + "memory-lancedb-pro": { + "enabled": true, + "config": { + "embedding": { + "provider": "openai-compatible", + "apiKey": "${OPENAI_API_KEY}", + "model": "text-embedding-3-small" + }, + "autoCapture": true, + "autoRecall": true, + "smartExtraction": true, + "extractMinMessages": 2, + "extractMaxChars": 8000, + "sessionMemory": { "enabled": false } + } + } + } + } +} +``` + +**Pourquoi ces valeurs par défaut ?** +- `autoCapture` + `smartExtraction` → votre agent apprend automatiquement de chaque conversation +- `autoRecall` → les souvenirs pertinents sont injectés avant chaque réponse +- `extractMinMessages: 2` → l'extraction se déclenche dans les conversations normales à deux tours +- `sessionMemory.enabled: false` → évite de polluer la recherche avec des résumés de session au début + +Validez et redémarrez : + +```bash +openclaw config validate +openclaw gateway restart +openclaw logs --follow --plain | grep "memory-lancedb-pro" +``` + +Vous devriez voir : +- `memory-lancedb-pro: smart extraction enabled` +- `memory-lancedb-pro@...: plugin registered` + +Terminé ! Votre agent dispose maintenant d'une mémoire long terme. + +
+Plus de chemins d'installation (utilisateurs existants, mises à niveau) + +**Déjà utilisateur d'OpenClaw ?** + +1. Ajoutez le plugin avec un chemin **absolu** dans `plugins.load.paths` +2. Liez le slot mémoire : `plugins.slots.memory = "memory-lancedb-pro"` +3. Vérifiez : `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**Mise à niveau depuis une version antérieure à v1.1.0 ?** + +```bash +# 1) Sauvegarde +openclaw memory-pro export --scope global --output memories-backup.json +# 2) Simulation +openclaw memory-pro upgrade --dry-run +# 3) Exécution de la mise à niveau +openclaw memory-pro upgrade +# 4) Vérification +openclaw memory-pro stats +``` + +Consultez `CHANGELOG-v1.1.0.md` pour les changements de comportement et la justification de la mise à niveau. + +
+ +
+Import rapide Telegram Bot (cliquez pour développer) + +Si vous utilisez l'intégration Telegram d'OpenClaw, le plus simple est d'envoyer une commande d'import directement au Bot principal au lieu de modifier manuellement la configuration. + +Envoyez ce message : + +```text +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction +6. extractMinMessages=2 +7. sessionMemory.enabled=false +8. captureAssistant=false +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. Generate the final openclaw.json config directly, not just an explanation +``` + +
+ +--- + +## Écosystème + +memory-lancedb-pro est le plugin principal. La communauté a construit des outils autour pour faciliter l'installation et l'utilisation quotidienne : + +### Script d'installation — Installation, mise à niveau et réparation en un clic + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +Pas un simple installateur — le script gère intelligemment de nombreux scénarios réels : + +| Votre situation | Ce que fait le script | +|---|---| +| Jamais installé | Téléchargement → installation des dépendances → choix de la config → écriture dans openclaw.json → redémarrage | +| Installé via `git clone`, bloqué sur un ancien commit | `git fetch` + `checkout` automatique vers la dernière version → réinstallation des dépendances → vérification | +| La config contient des champs invalides | Détection automatique via filtre de schéma, suppression des champs non supportés | +| Installé via `npm` | Saute la mise à jour git, rappelle d'exécuter `npm update` soi-même | +| CLI `openclaw` cassé à cause d'une config invalide | Solution de repli : lecture directe du chemin workspace depuis le fichier `openclaw.json` | +| `extensions/` au lieu de `plugins/` | Détection automatique de l'emplacement du plugin depuis la config ou le système de fichiers | +| Déjà à jour | Exécution des vérifications de santé uniquement, aucune modification | + +```bash +bash setup-memory.sh # Installer ou mettre à niveau +bash setup-memory.sh --dry-run # Aperçu uniquement +bash setup-memory.sh --beta # Inclure les versions pré-release +bash setup-memory.sh --uninstall # Restaurer la config et supprimer le plugin +``` + +Presets de fournisseurs intégrés : **Jina / DashScope / SiliconFlow / OpenAI / Ollama**, ou apportez votre propre API compatible OpenAI. 
For full usage (including `--ref`, `--selfcheck-only`, and more), see the [setup script README](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup).

### Claude Code / OpenClaw Skill — AI-guided configuration

> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)**

Install this Skill and your AI agent (Claude Code or OpenClaw) gains deep knowledge of every memory-lancedb-pro feature. Just say **"help me enable the best config"** and you get:

- **A guided 7-step configuration workflow** with 4 deployment plans:
  - Full Power (Jina + OpenAI) / Budget (free SiliconFlow reranker) / Simple (OpenAI only) / Fully local (Ollama, zero API cost)
- **All 9 MCP tools** used correctly: `memory_recall`, `memory_store`, `memory_forget`, `memory_update`, `memory_stats`, `memory_list`, `self_improvement_log`, `self_improvement_extract_skill`, `self_improvement_review` *(the full set requires `enableManagementTools: true` — the default Quick Start config exposes the 4 core tools)*
- **Common-pitfall avoidance**: workspace plugin activation, `autoRecall` defaulting to false, the jiti cache, environment variables, scope isolation, and more

**Install for Claude Code:**
```bash
git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro
```

**Install for OpenClaw:**
```bash
git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill
```

---

## Video tutorial

> Full walkthrough: installation, configuration, and how hybrid search works under the hood.

[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ)
**https://youtu.be/MtukF1C8epQ**

[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/)
**https://www.bilibili.com/video/BV1zUf2BGEgn/**

---

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                  index.ts (Entry Point)                 │
│  Plugin Registration · Config Parsing · Lifecycle Hooks │
└────────┬──────────┬──────────┬──────────┬───────────────┘
         │          │          │          │
    ┌────▼───┐ ┌────▼───┐ ┌────▼────┐ ┌───▼─────────┐
    │ store  │ │embedder│ │retriever│ │   scopes    │
    │  .ts   │ │  .ts   │ │   .ts   │ │    .ts      │
    └────────┘ └────────┘ └─────────┘ └─────────────┘
         │                     │
    ┌────▼───┐       ┌─────────▼──────┐
    │migrate │       │noise-filter.ts │
    │  .ts   │       │adaptive-       │
    └────────┘       │retrieval.ts    │
                     └────────────────┘
    ┌─────────────┐  ┌──────────┐
    │  tools.ts   │  │  cli.ts  │
    │ (Agent API) │  │  (CLI)   │
    └─────────────┘  └──────────┘
```

> For a deep dive into the full architecture, see [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md).
File reference (click to expand)

| File | Role |
| --- | --- |
| `index.ts` | Plugin entry point. Registers with the OpenClaw Plugin API, parses the config, mounts lifecycle hooks |
| `openclaw.plugin.json` | Plugin metadata + full JSON Schema declaration of the config |
| `cli.ts` | CLI commands: `memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` |
| `src/store.ts` | LanceDB storage layer. Table creation / FTS indexing / vector search / BM25 search / CRUD |
| `src/embedder.ts` | Embedding abstraction. Works with any OpenAI-compatible API provider |
| `src/retriever.ts` | Hybrid search engine. Vector + BM25 → hybrid fusion → rerank → lifecycle decay → filter |
| `src/scopes.ts` | Multi-scope access control |
| `src/tools.ts` | Agent tool definitions: `memory_recall`, `memory_store`, `memory_forget`, `memory_update` + management tools |
| `src/noise-filter.ts` | Filters agent refusals, meta-questions, greetings, and low-quality content |
| `src/adaptive-retrieval.ts` | Decides whether a query needs a memory lookup |
| `src/migrate.ts` | Migration from the built-in `memory-lancedb` to Pro |
| `src/smart-extractor.ts` | 6-category LLM extraction with L0/L1/L2 storage and two-stage deduplication |
| `src/decay-engine.ts` | Weibull stretched-exponential decay model |
| `src/tier-manager.ts` | Three-tier promotion/demotion: Peripheral ↔ Working ↔ Core |

---

## Core features

### Hybrid search

```
Query → embedQuery() ─┐
                      ├─→ Hybrid fusion → Rerank → Decay boost → Length normalization → Filter
Query → BM25 FTS ─────┘
```

- **Vector search** — semantic similarity via LanceDB ANN (cosine distance)
- **BM25 full-text search** — exact keyword matching via LanceDB's FTS index
- **Hybrid fusion** — the vector score is the base; BM25 hits receive a weighted boost (not standard RRF — tuned for real-world recall quality)
- **Configurable weights** — `vectorWeight`, `bm25Weight`, `minScore`

### Cross-encoder reranking

- Built-in adapters for **Jina**, **SiliconFlow**, **Voyage AI**, and **Pinecone**
- Works with any Jina-compatible endpoint (e.g. Hugging Face TEI, DashScope)
- Hybrid scoring: 60% cross-encoder + 40% original fused score
- Graceful degradation: falls back to cosine similarity if the API fails

### Multi-stage scoring pipeline

| Stage | Effect |
| --- | --- |
| **Hybrid fusion** | Combines semantic recall with exact matching |
| **Cross-encoder rerank** | Promotes semantically precise results |
| **Lifecycle decay boost** | Weibull freshness + access frequency + importance × confidence |
| **Length normalization** | Prevents long entries from dominating (anchor: 500 characters) |
| **Hard minimum score** | Drops irrelevant results (default: 0.35) |
| **MMR diversity** | Cosine similarity > 0.85 → demoted |

### Smart memory extraction (v1.1.0)

- **6-category LLM extraction**: profile, preferences, entities, events, cases, patterns
- **Layered L0/L1/L2 storage**: L0 (one-sentence index) → L1 (structured summary) → L2 (full narrative)
- **Two-stage deduplication**: vector-similarity pre-filter (≥0.7) → LLM semantic decision (CREATE/MERGE/SKIP)
- **Category-aware merging**:
  `profile` always merges, `events`/`cases` are append-only

### Memory lifecycle management (v1.1.0)

- **Weibull decay engine**: composite score = freshness + frequency + intrinsic value
- **Three-tier promotion**: `Peripheral ↔ Working ↔ Core` with configurable thresholds
- **Access reinforcement**: frequently recalled memories decay more slowly (spaced-repetition style)
- **Importance-modulated half-life**: important memories decay more slowly

### Multi-scope isolation

- Built-in scopes: `global`, `agent:`, `custom:`, `project:`, `user:`
- Agent-level access control via `scopes.agentAccess`
- Default: each agent can access `global` plus its own `agent:` scope

### Auto-capture and auto-recall

- **Auto-capture** (`agent_end`): extracts preferences/facts/decisions/entities from conversations, deduplicates, stores up to 3 per turn
- **Auto-recall** (`before_agent_start`): injects `` context (up to 3 entries)

### Noise filtering and adaptive retrieval

- Filters low-quality content: agent refusals, meta-questions, greetings
- Skips retrieval for: greetings, slash commands, simple confirmations, emoji
- Forces retrieval on memory keywords ("remember", "previously", "last time")
- CJK-aware thresholds (Chinese: 6 characters vs English: 15 characters)

---
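
The hybrid-fusion step described above (vector score as the base, weighted BM25 boost rather than standard RRF) can be sketched roughly as follows. The helper and its exact combination formula are illustrative assumptions, not the plugin's actual `retriever.ts` code:

```typescript
// Illustrative sketch only: fuse vector and BM25 scores per memory ID.
// The vector similarity is the base score; BM25 hits add a weighted boost.
interface Hit {
  id: string;
  vectorScore?: number; // cosine similarity in [0, 1]
  bm25Score?: number;   // normalized full-text relevance
}

function fuseHybrid(
  hits: Hit[],
  vectorWeight = 0.7, // matches the documented default
  bm25Weight = 0.3,
): Map<string, number> {
  const fused = new Map<string, number>();
  for (const h of hits) {
    const base = (h.vectorScore ?? 0) * vectorWeight;
    const boost = (h.bm25Score ?? 0) * bm25Weight;
    fused.set(h.id, base + boost);
  }
  return fused;
}

const scores = fuseHybrid([
  { id: "a", vectorScore: 0.9 },               // semantic-only hit
  { id: "b", vectorScore: 0.6, bm25Score: 1 }, // also an exact keyword match
]);
```

With the default 0.7/0.3 weights, a strong keyword match can outrank a purely semantic hit, which is exactly the behavior the exact-match boost is meant to produce.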
Comparison with the built-in memory-lancedb (click to expand)

| Feature | Built-in `memory-lancedb` | **memory-lancedb-pro** |
| --- | :---: | :---: |
| Vector search | Yes | Yes |
| BM25 full-text search | - | Yes |
| Hybrid fusion (vector + BM25) | - | Yes |
| Cross-encoder rerank (multi-provider) | - | Yes |
| Recency boost and time decay | - | Yes |
| Length normalization | - | Yes |
| MMR diversity | - | Yes |
| Multi-scope isolation | - | Yes |
| Noise filtering | - | Yes |
| Adaptive retrieval | - | Yes |
| Management CLI | - | Yes |
| Session memory | - | Yes |
| Task-aware embeddings | - | Yes |
| **LLM smart extraction (6 categories)** | - | Yes (v1.1.0) |
| **Weibull decay + tier promotion** | - | Yes (v1.1.0) |
| Any OpenAI-compatible embedding | Limited | Yes |

---

## Configuration
Full configuration example

```json
{
  "embedding": {
    "apiKey": "${JINA_API_KEY}",
    "model": "jina-embeddings-v5-text-small",
    "baseURL": "https://api.jina.ai/v1",
    "dimensions": 1024,
    "taskQuery": "retrieval.query",
    "taskPassage": "retrieval.passage",
    "normalized": true
  },
  "dbPath": "~/.openclaw/memory/lancedb-pro",
  "autoCapture": true,
  "autoRecall": true,
  "retrieval": {
    "mode": "hybrid",
    "vectorWeight": 0.7,
    "bm25Weight": 0.3,
    "minScore": 0.3,
    "rerank": "cross-encoder",
    "rerankApiKey": "${JINA_API_KEY}",
    "rerankModel": "jina-reranker-v3",
    "rerankEndpoint": "https://api.jina.ai/v1/rerank",
    "rerankProvider": "jina",
    "candidatePoolSize": 20,
    "recencyHalfLifeDays": 14,
    "recencyWeight": 0.1,
    "filterNoise": true,
    "lengthNormAnchor": 500,
    "hardMinScore": 0.35,
    "timeDecayHalfLifeDays": 60,
    "reinforcementFactor": 0.5,
    "maxHalfLifeMultiplier": 3
  },
  "enableManagementTools": false,
  "scopes": {
    "default": "global",
    "definitions": {
      "global": { "description": "Shared knowledge" },
      "agent:discord-bot": { "description": "Discord bot private" }
    },
    "agentAccess": {
      "discord-bot": ["global", "agent:discord-bot"]
    }
  },
  "sessionMemory": {
    "enabled": false,
    "messageCount": 15
  },
  "smartExtraction": true,
  "llm": {
    "apiKey": "${OPENAI_API_KEY}",
    "model": "gpt-4o-mini",
    "baseURL": "https://api.openai.com/v1"
  },
  "extractMinMessages": 2,
  "extractMaxChars": 8000
}
```
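
The example above uses `${JINA_API_KEY}`-style placeholders. OpenClaw's config loader resolves these against the environment; the tiny helper below only illustrates the placeholder syntax and is a hypothetical stand-in, not OpenClaw's actual resolver:

```typescript
// Hypothetical illustration of "${VAR}" placeholder expansion.
// An unset variable expands to an empty string in this sketch.
function expandEnv(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(/\$\{([A-Z0-9_]+)\}/g, (_m, name: string) => env[name] ?? "");
}

const expanded = expandEnv("${JINA_API_KEY}", { JINA_API_KEY: "jina_xxx" });
// expanded === "jina_xxx"
```

The practical takeaway: export `JINA_API_KEY` and `OPENAI_API_KEY` in the environment that runs the OpenClaw gateway, not just in your interactive shell.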
+ +
Embedding providers

Works with **any OpenAI-compatible embedding API**:

| Provider | Model | Base URL | Dimensions |
| --- | --- | --- | --- |
| **Jina** (recommended) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 |
| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 |
| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 |
| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 |
| **Ollama** (local) | `nomic-embed-text` | `http://localhost:11434/v1` | model-dependent |
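
"OpenAI-compatible" concretely means a `POST {baseURL}/embeddings` endpoint that accepts the standard request body. The sketch below builds that body (the helper itself is illustrative, not part of the plugin's API; `dimensions` is only honored by providers that support it):

```typescript
// Shape of the standard OpenAI-compatible /embeddings request body.
interface EmbedRequest {
  model: string;
  input: string[];
  dimensions?: number; // optional; e.g. 1536 for text-embedding-3-small
}

function buildEmbedRequest(
  model: string,
  texts: string[],
  dimensions?: number,
): EmbedRequest {
  return { model, input: texts, ...(dimensions ? { dimensions } : {}) };
}

const req = buildEmbedRequest("text-embedding-3-small", ["remember: use tabs"], 1536);
// Sent as POST {baseURL}/embeddings with JSON.stringify(req)
// and an "Authorization: Bearer <apiKey>" header.
```

Any provider in the table above accepts this same shape, which is why swapping providers is just a matter of changing `model`, `baseURL`, and `dimensions` in the config.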
+ +
Reranking providers

Cross-encoder reranking supports several providers via `rerankProvider`:

| Provider | `rerankProvider` | Example model |
| --- | --- | --- |
| **Jina** (default) | `jina` | `jina-reranker-v3` |
| **SiliconFlow** (free tier available) | `siliconflow` | `BAAI/bge-reranker-v2-m3` |
| **Voyage AI** | `voyage` | `rerank-2.5` |
| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` |

Any Jina-compatible reranking endpoint also works — set `rerankProvider: "jina"` and point `rerankEndpoint` at your service (e.g. Hugging Face TEI, DashScope `qwen3-rerank`).
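
"Jina-compatible" means the endpoint accepts a request body in the shape of Jina's public rerank API. The sketch below builds such a payload; treat it as an illustration of the wire format, not the plugin's internal code:

```typescript
// Jina-style rerank request body: the format `rerankEndpoint` receives
// when `rerankProvider` is "jina".
interface RerankRequest {
  model: string;
  query: string;
  documents: string[];
  top_n?: number; // optional cap on returned results
}

function buildRerankRequest(
  query: string,
  docs: string[],
  model = "jina-reranker-v3",
  topN?: number,
): RerankRequest {
  return { model, query, documents: docs, ...(topN ? { top_n: topN } : {}) };
}

const rr = buildRerankRequest(
  "postgres vs mongo decision",
  ["memo about DB choice", "memo about indentation"],
  "jina-reranker-v3",
  2,
);
```

The response contains per-document relevance scores, which the plugin then blends with the fused score (the 60%/40% split described under "Cross-encoder reranking").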
+ +
Smart extraction (LLM) — v1.1.0

When `smartExtraction` is enabled (default: `true`), the plugin uses an LLM to intelligently extract and classify memories instead of regex-based triggers.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `smartExtraction` | boolean | `true` | Enable/disable 6-category LLM extraction |
| `llm.auth` | string | `api-key` | `api-key` uses `llm.apiKey` / `embedding.apiKey`; `oauth` uses a plugin-level OAuth token file |
| `llm.apiKey` | string | *(falls back to `embedding.apiKey`)* | API key for the LLM provider |
| `llm.model` | string | `openai/gpt-oss-120b` | LLM model name |
| `llm.baseURL` | string | *(falls back to `embedding.baseURL`)* | LLM API endpoint |
| `llm.oauthProvider` | string | `openai-codex` | OAuth provider ID used when `llm.auth` is `oauth` |
| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | OAuth token file used when `llm.auth` is `oauth` |
| `llm.timeoutMs` | number | `30000` | LLM request timeout in milliseconds |
| `extractMinMessages` | number | `2` | Minimum number of messages before extraction triggers |
| `extractMaxChars` | number | `8000` | Maximum number of characters sent to the LLM |

OAuth `llm` config (reuse an existing Codex / ChatGPT login cache for LLM calls):
```json
{
  "llm": {
    "auth": "oauth",
    "oauthProvider": "openai-codex",
    "model": "gpt-5.4",
    "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json",
    "timeoutMs": 30000
  }
}
```

Notes for `llm.auth: "oauth"`:

- `llm.oauthProvider` is currently `openai-codex`.
- OAuth tokens are stored in `~/.openclaw/.memory-lancedb-pro/oauth.json` by default.
- You can set `llm.oauthPath` if you want to store this file elsewhere.
- `auth login` backs up the previous api-key `llm` configuration next to the OAuth file, and `auth logout` restores that backup when it is available.
- Switching from `api-key` to `oauth` does not automatically carry over `llm.baseURL`. Set it manually in OAuth mode only if you intentionally want a custom ChatGPT/Codex-compatible backend.
+ +
Lifecycle configuration (decay + tiers)

| Field | Default | Description |
|-------|---------|-------------|
| `decay.recencyHalfLifeDays` | `30` | Base half-life for Weibull decay |
| `decay.frequencyWeight` | `0.3` | Weight of access frequency in the composite score |
| `decay.intrinsicWeight` | `0.3` | Weight of `importance × confidence` |
| `decay.betaCore` | `0.8` | Weibull beta for `core` memories |
| `decay.betaWorking` | `1.0` | Weibull beta for `working` memories |
| `decay.betaPeripheral` | `1.3` | Weibull beta for `peripheral` memories |
| `tier.coreAccessThreshold` | `10` | Minimum recalls before promotion to `core` |
| `tier.peripheralAgeDays` | `60` | Age threshold for demoting stale memories |
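
A stretched-exponential (Weibull) freshness curve can be written as `freshness = exp(-ln 2 · (age / halfLife)^beta)`: every beta still yields exactly 0.5 at one half-life, while beta shapes the curve (beta < 1 decays slowly at first, beta > 1 drops off faster). The sketch below uses this standard form; `decay-engine.ts` may differ in detail:

```typescript
// Weibull freshness: 0.5 at age == halfLifeDays regardless of beta.
// betaCore (0.8) keeps core memories fresh longer early on;
// betaPeripheral (1.3) makes stale peripheral memories fade faster.
function weibullFreshness(
  ageDays: number,
  halfLifeDays = 30,
  beta = 1.0,
): number {
  return Math.exp(-Math.LN2 * Math.pow(ageDays / halfLifeDays, beta));
}

const fresh = weibullFreshness(0, 30, 0.8);           // 1.0 — brand new
const coreAtHalfLife = weibullFreshness(30, 30, 0.8); // 0.5 — one half-life
const peripheralAt60 = weibullFreshness(60, 30, 1.3); // < 0.25 — fading fast
```

With `beta = 1.0` this reduces to plain exponential decay, which is why `betaWorking` defaults to `1.0`.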
+ +
Access reinforcement

Frequently recalled memories decay more slowly (spaced-repetition style).

Config keys (under `retrieval`):
- `reinforcementFactor` (0–2, default: `0.5`) — set to `0` to disable
- `maxHalfLifeMultiplier` (1–10, default: `3`) — cap on the effective half-life
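
One plausible reading of these two knobs — the plugin's exact formula is not documented here, so treat this as an assumption — is that each recall stretches the effective half-life logarithmically, capped by `maxHalfLifeMultiplier`:

```typescript
// Assumed spaced-repetition-style reinforcement: frequently recalled
// memories get a longer effective half-life, capped at
// maxHalfLifeMultiplier × base. Illustrative, not the plugin's code.
function effectiveHalfLife(
  baseHalfLifeDays: number,
  accessCount: number,
  reinforcementFactor = 0.5, // 0 disables reinforcement entirely
  maxHalfLifeMultiplier = 3,
): number {
  const multiplier = 1 + reinforcementFactor * Math.log2(1 + accessCount);
  return baseHalfLifeDays * Math.min(multiplier, maxHalfLifeMultiplier);
}

const untouched = effectiveHalfLife(30, 0);  // 30 — no recalls, no boost
const hot = effectiveHalfLife(30, 100);      // 90 — capped at 3 × base
```

The cap matters: without it, a handful of very hot memories would effectively never decay and crowd out everything else.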

---

## CLI commands

```bash
openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json]
openclaw memory-pro search "query" [--scope global] [--limit 10] [--json]
openclaw memory-pro stats [--scope global] [--json]
openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json]
openclaw memory-pro auth status
openclaw memory-pro auth logout
openclaw memory-pro delete
openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run]
openclaw memory-pro export [--scope global] [--output memories.json]
openclaw memory-pro import memories.json [--scope global] [--dry-run]
openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing]
openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE]
openclaw memory-pro migrate check|run|verify [--source /path]
```

OAuth login flow:

1. Run `openclaw memory-pro auth login`
2. If `--provider` is omitted in an interactive terminal, the CLI shows an OAuth provider picker before opening the browser
3. The command prints an authorization URL and opens your browser unless `--no-browser` is set
4. After the callback succeeds, the command saves the plugin's OAuth file (default: `~/.openclaw/.memory-lancedb-pro/oauth.json`), backs up the previous api-key `llm` configuration for logout, and replaces the plugin's `llm` configuration with the OAuth settings (`auth`, `oauthProvider`, `model`, `oauthPath`)
5. `openclaw memory-pro auth logout` deletes this OAuth file and restores the previous api-key `llm` configuration when the backup exists

---

## Advanced topics
If injected memories appear in responses

Occasionally the model may echo the injected `` block.

**Option A (safest):** temporarily disable auto-recall:
```json
{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } }
```

**Option B (preferred):** keep recall and add to the agent's system prompt:
> Do not reveal or quote `` / memory-injection content in your responses. Use it only as internal reference.
+ +
Session memory

- Triggered by the `/new` command — saves a summary of the previous session to LanceDB
- Disabled by default (OpenClaw already has native `.jsonl` session persistence)
- Configurable message count (default: 15)

See [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md) for deployment modes and `/new` verification.
+ +
Custom slash commands (e.g. /lesson)

Add to your `CLAUDE.md`, `AGENTS.md`, or system prompt:

```markdown
## /lesson command
When the user sends `/lesson `:
1. Use memory_store to save as category=fact (raw knowledge)
2. Use memory_store to save as category=decision (actionable takeaway)
3. Confirm what was saved

## /remember command
When the user sends `/remember `:
1. Use memory_store to save with appropriate category and importance
2. Confirm with the stored memory ID
```
+ +
Golden rules for AI agents

> Copy the block below into your `AGENTS.md` so your agent applies these rules automatically.

```markdown
## Rule 1 — Dual-layer memory storage
Every pitfall/lesson learned → IMMEDIATELY store TWO memories:
- Technical layer: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid]
  (category: fact, importance >= 0.8)
- Principle layer: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do]
  (category: decision, importance >= 0.85)

## Rule 2 — LanceDB hygiene
Entries must be short and atomic (< 500 chars). No raw conversation summaries or duplicates.

## Rule 3 — Recall before retry
On ANY tool failure, ALWAYS memory_recall with relevant keywords BEFORE retrying.

## Rule 4 — Confirm target codebase
Confirm you are editing memory-lancedb-pro vs built-in memory-lancedb before changes.

## Rule 5 — Clear jiti cache after plugin code changes
After modifying .ts files under plugins/, MUST run rm -rf /tmp/jiti/ BEFORE openclaw gateway restart.
```
+ +
Database schema

LanceDB table `memories`:

| Field | Type | Description |
| --- | --- | --- |
| `id` | string (UUID) | Primary key |
| `text` | string | Memory text (FTS-indexed) |
| `vector` | float[] | Embedding vector |
| `category` | string | Storage category: `preference` / `fact` / `decision` / `entity` / `reflection` / `other` |
| `scope` | string | Scope identifier (e.g. `global`, `agent:main`) |
| `importance` | float | Importance score, 0–1 |
| `timestamp` | int64 | Creation timestamp (ms) |
| `metadata` | string (JSON) | Extended metadata |

Common v1.1.0 `metadata` keys: `l0_abstract`, `l1_overview`, `l2_content`, `memory_category`, `tier`, `access_count`, `confidence`, `last_accessed_at`

> **Category note:** the top-level `category` field uses 6 storage categories. The Smart Extraction semantic labels (`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`) are stored in `metadata.memory_category`.
+ +
Troubleshooting

### "Cannot mix BigInt and other types" (LanceDB / Apache Arrow)

With LanceDB 0.26+, some numeric columns may come back as `BigInt`. Upgrade to **memory-lancedb-pro >= 1.0.14** — the plugin now converts values with `Number(...)` before doing arithmetic.

---

## Documentation

| Document | Description |
| --- | --- |
| [OpenClaw integration playbook](docs/openclaw-integration-playbook.md) | Deployment modes, verification, regression matrix |
| [Memory architecture analysis](docs/memory_architecture_analysis.md) | Deep dive into the full architecture |
| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | v1.1.0 behavior changes and upgrade rationale |
| [Long-context chunking](docs/long-context-chunking.md) | Chunking strategy for long documents |

---

## Beta: Smart Memory v1.1.0

> Status: Beta — available via `npm i memory-lancedb-pro@beta`. Stable users on `latest` are unaffected.

| Feature | Description |
|---------|-------------|
| **Smart extraction** | 6-category LLM extraction with L0/L1/L2 metadata. Falls back to regex when disabled. |
| **Lifecycle scoring** | Weibull decay built into retrieval — frequently used, important memories rank higher. |
| **Tier management** | Three-tier system (Core → Working → Peripheral) with automatic promotion/demotion. |

Feedback: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · Rollback: `npm i memory-lancedb-pro@latest`

---

## Dependencies

| Package | Role |
| --- | --- |
| `@lancedb/lancedb` ≥0.26.2 | Vector database (ANN + FTS) |
| `openai` ≥6.21.0 | OpenAI-compatible embedding API client |
| `@sinclair/typebox` 0.34.48 | JSON Schema type definitions |

---

## Contributors

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +


Full list: [Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors)

## Star History

Star History Chart

## License

MIT

---

## My WeChat QR Code

diff --git a/README_IT.md b/README_IT.md
new file mode 100644
index 00000000..b1679682
--- /dev/null
+++ b/README_IT.md
@@ -0,0 +1,773 @@

# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin

**AI Memory Assistant for [OpenClaw](https://github.com/openclaw/openclaw) Agents**

*Give your AI agent a brain that actually remembers — across sessions, across agents, across time.*

A LanceDB-backed long-term memory plugin for OpenClaw that stores preferences, decisions, and project context, then auto-recalls them in future sessions.

[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw)
[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro)
[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md)

---

## Why memory-lancedb-pro?

Most AI agents suffer from amnesia. They forget everything the moment a new chat starts.

**memory-lancedb-pro** is a production-grade long-term memory plugin for OpenClaw that turns your agent into a true **AI Memory Assistant** — it automatically captures what matters, lets noise fade away naturally, and surfaces the right memory at the right time. No manual tagging, no complicated setup.

### Your AI Memory Assistant in action

**Without memory — every session starts from zero:**

> **You:** "Use tabs for indentation, always add error handling."
> *(next session)*
> **You:** "I already told you — tabs, not spaces!" 😤
> *(next session)*
> **You:** "…seriously, tabs. And error handling. Again."

**With memory-lancedb-pro — your agent learns and remembers:**

> **You:** "Use tabs for indentation, always add error handling."
> *(next session — the agent automatically recalls your preferences)*
> **Agent:** *(silently applies tabs + error handling)* ✅
> **You:** "Why did we pick PostgreSQL over MongoDB last month?"
> **Agent:** "Based on our February 12 discussion, the main reasons were…" ✅

That is the difference an **AI Memory Assistant** makes — it learns your style, remembers past decisions, and gives personalized answers without you having to repeat yourself.

### What else can it do?

| | What you get |
|---|---|
| **Auto-capture** | Your agent learns from every conversation — no manual `memory_store` needed |
| **Smart extraction** | 6-category LLM classification: profiles, preferences, entities, events, cases, patterns |
| **Smart forgetting** | Weibull decay model — important memories stay, noise fades |
| **Hybrid search** | Vector + BM25 full-text search, fused with cross-encoder reranking |
| **Context injection** | Relevant memories surface automatically before every reply |
| **Multi-scope isolation** | Per-agent, per-user, per-project memory boundaries |
| **Any provider** | OpenAI, Jina, Gemini, Ollama, or any OpenAI-compatible API |
| **Full toolkit** | CLI, backup, migration, upgrade, export/import — production-ready |

---

## Quick start

### Option A: One-command setup script (recommended)

The community-maintained **[setup script](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** handles install, upgrade, and repair in a single command:

```bash
curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh
bash setup-memory.sh
```

> See [Ecosystem](#ecosystem) below for the full list of covered scenarios and other community tools.

### Option B: Manual install

**Via the OpenClaw CLI (recommended):**
```bash
openclaw plugins install memory-lancedb-pro@beta
```

**Or via npm:**
```bash
npm i memory-lancedb-pro@beta
```
> If you use npm, you must also add the plugin's install directory as an **absolute** path in `plugins.load.paths` in your `openclaw.json`. This is the most common configuration mistake.
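
For the npm route, a `plugins.load.paths` entry looks like the fragment below. The path shown is an example only — use wherever npm actually placed the package on your machine (`npm root` prints the `node_modules` directory):

```json
{
  "plugins": {
    "load": {
      "paths": ["/home/user/my-project/node_modules/memory-lancedb-pro"]
    }
  }
}
```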

Add to your `openclaw.json`:

```json
{
  "plugins": {
    "slots": { "memory": "memory-lancedb-pro" },
    "entries": {
      "memory-lancedb-pro": {
        "enabled": true,
        "config": {
          "embedding": {
            "provider": "openai-compatible",
            "apiKey": "${OPENAI_API_KEY}",
            "model": "text-embedding-3-small"
          },
          "autoCapture": true,
          "autoRecall": true,
          "smartExtraction": true,
          "extractMinMessages": 2,
          "extractMaxChars": 8000,
          "sessionMemory": { "enabled": false }
        }
      }
    }
  }
}
```

**Why these defaults?**
- `autoCapture` + `smartExtraction` → your agent learns from every conversation automatically
- `autoRecall` → relevant memories are injected before every reply
- `extractMinMessages: 2` → extraction triggers in ordinary two-turn chats
- `sessionMemory.enabled: false` → avoids polluting retrieval with session summaries early on

Validate and restart:

```bash
openclaw config validate
openclaw gateway restart
openclaw logs --follow --plain | grep "memory-lancedb-pro"
```

You should see:
- `memory-lancedb-pro: smart extraction enabled`
- `memory-lancedb-pro@...: plugin registered`

Done! Your agent now has long-term memory.
More install paths (existing users, upgrades)

**Already running OpenClaw?**

1. Add the plugin with an **absolute** path in `plugins.load.paths`
2. Bind the memory slot: `plugins.slots.memory = "memory-lancedb-pro"`
3. Verify: `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats`

**Upgrading from a pre-v1.1.0 release?**

```bash
# 1) Backup
openclaw memory-pro export --scope global --output memories-backup.json
# 2) Dry run
openclaw memory-pro upgrade --dry-run
# 3) Run upgrade
openclaw memory-pro upgrade
# 4) Verify
openclaw memory-pro stats
```

See `CHANGELOG-v1.1.0.md` for behavior changes and upgrade rationale.
+ +
Telegram Bot quick import (click to expand)

If you use OpenClaw's Telegram integration, the easiest path is to send an import command directly to the main Bot instead of editing the configuration by hand.

Send this message:

```text
Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro

Requirements:
1. Set it as the only active memory plugin
2. Use Jina for embedding
3. Use Jina for reranker
4. Use gpt-4o-mini for the smart-extraction LLM
5. Enable autoCapture, autoRecall, smartExtraction
6. extractMinMessages=2
7. sessionMemory.enabled=false
8. captureAssistant=false
9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3
10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62
11. Generate the final openclaw.json config directly, not just an explanation
```

---

## Ecosystem

memory-lancedb-pro is the core plugin. The community has built tooling to make installation and day-to-day use even smoother:

### Setup script — one-command install, upgrade, and repair

> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)**

More than a simple installer — the script intelligently handles many real-world scenarios:

| Your situation | What the script does |
|---|---|
| Never installed | Download → install dependencies → pick a config → write to openclaw.json → restart |
| Installed via `git clone`, stuck on an old commit | Automatic `git fetch` + `checkout` to the latest release → reinstall dependencies → verify |
| Config contains invalid fields | Auto-detects via schema filter, removes unsupported fields |
| Installed via `npm` | Skips the git update, reminds you to run `npm update` yourself |
| `openclaw` CLI broken by an invalid config | Fallback: reads the workspace path directly from the `openclaw.json` file |
| `extensions/` instead of `plugins/` | Auto-detects the plugin location from the config or the filesystem |
| Already up to date | Runs health checks only, changes nothing |

```bash
bash setup-memory.sh              # Install or upgrade
bash setup-memory.sh --dry-run    # Preview only
bash setup-memory.sh --beta      # Include pre-release versions
bash setup-memory.sh --uninstall  # Restore config and remove the plugin
```

Built-in provider presets: **Jina / DashScope / SiliconFlow / OpenAI / Ollama**, or bring your own OpenAI-compatible API. For full usage (including `--ref`, `--selfcheck-only`, and more), see the [setup script README](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup).
+ +### Claude Code / OpenClaw Skill — Configurazione guidata dall'IA + +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** + +Installa questa Skill e il tuo agente IA (Claude Code o OpenClaw) acquisisce una conoscenza approfondita di tutte le funzionalità di memory-lancedb-pro. Basta dire **"aiutami ad attivare la configurazione migliore"** per ottenere: + +- **Workflow di configurazione guidato in 7 passaggi** con 4 piani di distribuzione: + - Full Power (Jina + OpenAI) / Budget (reranker SiliconFlow gratuito) / Simple (solo OpenAI) / Completamente locale (Ollama, zero costi API) +- **Tutti i 9 strumenti MCP** usati correttamente: `memory_recall`, `memory_store`, `memory_forget`, `memory_update`, `memory_stats`, `memory_list`, `self_improvement_log`, `self_improvement_extract_skill`, `self_improvement_review` *(il set completo richiede `enableManagementTools: true` — la configurazione Quick Start predefinita espone i 4 strumenti principali)* +- **Prevenzione delle insidie comuni**: attivazione plugin workspace, `autoRecall` predefinito a false, cache jiti, variabili d'ambiente, isolamento scope, ecc. + +**Installazione per Claude Code:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro +``` + +**Installazione per OpenClaw:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## Tutorial video + +> Guida completa: installazione, configurazione e funzionamento interno della ricerca ibrida. 
+ +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## Architettura + +``` +┌─────────────────────────────────────────────────────────┐ +│ index.ts (Entry Point) │ +│ Plugin Registration · Config Parsing · Lifecycle Hooks │ +└────────┬──────────┬──────────┬──────────┬───────────────┘ + │ │ │ │ + ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ + │ store │ │embedder│ │retriever│ │ scopes │ + │ .ts │ │ .ts │ │ .ts │ │ .ts │ + └────────┘ └────────┘ └────────┘ └─────────────┘ + │ │ + ┌────▼───┐ ┌─────▼──────────┐ + │migrate │ │noise-filter.ts │ + │ .ts │ │adaptive- │ + └────────┘ │retrieval.ts │ + └────────────────┘ + ┌─────────────┐ ┌──────────┐ + │ tools.ts │ │ cli.ts │ + │ (Agent API) │ │ (CLI) │ + └─────────────┘ └──────────┘ +``` + +> Per un approfondimento sull'architettura completa, consulta [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md). + +
+Riferimento file (clicca per espandere) + +| File | Scopo | +| --- | --- | +| `index.ts` | Punto di ingresso del plugin. Si registra con l'API Plugin di OpenClaw, analizza la configurazione, monta gli hook del ciclo di vita | +| `openclaw.plugin.json` | Metadati del plugin + dichiarazione completa della configurazione JSON Schema | +| `cli.ts` | Comandi CLI: `memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | Layer di storage LanceDB. Creazione tabelle / indicizzazione FTS / ricerca vettoriale / ricerca BM25 / CRUD | +| `src/embedder.ts` | Astrazione embedding. Compatibile con qualsiasi provider API compatibile OpenAI | +| `src/retriever.ts` | Motore di ricerca ibrido. Vettoriale + BM25 → Fusione ibrida → Rerank → Decadimento ciclo di vita → Filtro | +| `src/scopes.ts` | Controllo accessi multi-scope | +| `src/tools.ts` | Definizioni degli strumenti agente: `memory_recall`, `memory_store`, `memory_forget`, `memory_update` + strumenti di gestione | +| `src/noise-filter.ts` | Filtra rifiuti dell'agente, meta-domande, saluti e contenuti di bassa qualità | +| `src/adaptive-retrieval.ts` | Determina se una query necessita di ricerca nella memoria | +| `src/migrate.ts` | Migrazione dal `memory-lancedb` integrato a Pro | +| `src/smart-extractor.ts` | Estrazione LLM in 6 categorie con archiviazione a strati L0/L1/L2 e deduplicazione in due fasi | +| `src/decay-engine.ts` | Modello di decadimento esponenziale esteso Weibull | +| `src/tier-manager.ts` | Promozione/retrocessione a tre livelli: Peripheral ↔ Working ↔ Core | + +
+ +--- + +## Funzionalità principali + +### Ricerca ibrida + +``` +Query → embedQuery() ─┐ + ├─→ Hybrid Fusion → Rerank → Lifecycle Decay Boost → Length Norm → Filter +Query → BM25 FTS ─────┘ +``` + +- **Ricerca vettoriale** — similarità semantica tramite LanceDB ANN (distanza del coseno) +- **Ricerca full-text BM25** — corrispondenza esatta delle parole chiave tramite indice FTS di LanceDB +- **Fusione ibrida** — punteggio vettoriale come base, i risultati BM25 ricevono un boost ponderato (non RRF standard — ottimizzato per la qualità di richiamo nel mondo reale) +- **Pesi configurabili** — `vectorWeight`, `bm25Weight`, `minScore` + +### Reranking Cross-Encoder + +- Adattatori integrati per **Jina**, **SiliconFlow**, **Voyage AI** e **Pinecone** +- Compatibile con qualsiasi endpoint compatibile Jina (ad es. Hugging Face TEI, DashScope) +- Punteggio ibrido: 60% cross-encoder + 40% punteggio fuso originale +- Degradazione elegante: fallback sulla similarità del coseno in caso di errore API + +### Pipeline di punteggio multi-fase + +| Fase | Effetto | +| --- | --- | +| **Fusione ibrida** | Combina richiamo semantico e corrispondenza esatta | +| **Rerank Cross-Encoder** | Promuove risultati semanticamente precisi | +| **Boost decadimento ciclo di vita** | Freschezza Weibull + frequenza di accesso + importance × confidence | +| **Normalizzazione lunghezza** | Impedisce alle voci lunghe di dominare (ancora: 500 caratteri) | +| **Punteggio minimo rigido** | Rimuove risultati irrilevanti (predefinito: 0.35) | +| **Diversità MMR** | Similarità coseno > 0.85 → retrocesso | + +### Estrazione intelligente della memoria (v1.1.0) + +- **Estrazione LLM in 6 categorie**: profilo, preferenze, entità, eventi, casi, pattern +- **Archiviazione a strati L0/L1/L2**: L0 (indice in una frase) → L1 (riepilogo strutturato) → L2 (narrazione completa) +- **Deduplicazione in due fasi**: pre-filtro similarità vettoriale (≥0.7) → decisione semantica LLM (CREATE/MERGE/SKIP) +- **Fusione 
consapevole delle categorie**: `profile` viene sempre fuso, `events`/`cases` solo in aggiunta + +### Gestione del ciclo di vita della memoria (v1.1.0) + +- **Motore di decadimento Weibull**: punteggio composito = freschezza + frequenza + valore intrinseco +- **Promozione a tre livelli**: `Peripheral ↔ Working ↔ Core` con soglie configurabili +- **Rinforzo per accesso**: i ricordi richiamati frequentemente decadono più lentamente (stile ripetizione spaziata) +- **Emivita modulata dall'importanza**: i ricordi importanti decadono più lentamente + +### Isolamento multi-scope + +- Scope integrati: `global`, `agent:`, `custom:`, `project:`, `user:` +- Controllo accessi a livello agente tramite `scopes.agentAccess` +- Predefinito: ogni agente accede a `global` + il proprio scope `agent:` + +### Auto-Capture e Auto-Recall + +- **Auto-Capture** (`agent_end`): estrae preferenze/fatti/decisioni/entità dalle conversazioni, deduplica, memorizza fino a 3 per turno +- **Auto-Recall** (`before_agent_start`): inietta il contesto `` (fino a 3 voci) + +### Filtraggio del rumore e ricerca adattiva + +- Filtra contenuti di bassa qualità: rifiuti dell'agente, meta-domande, saluti +- Salta la ricerca per: saluti, comandi slash, conferme semplici, emoji +- Forza la ricerca per parole chiave della memoria ("ricorda", "precedentemente", "l'ultima volta") +- Soglie CJK (cinese: 6 caratteri vs inglese: 15 caratteri) + +--- + +
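La fusione ibrida descritta sopra (punteggio vettoriale come base, boost ponderato per i risultati BM25) può essere abbozzata così. È uno schizzo puramente illustrativo con i pesi predefiniti `vectorWeight`/`bm25Weight`, non il codice reale di `src/retriever.ts`:

```javascript
// Schizzo ipotetico della fusione ibrida: il punteggio vettoriale fa da base
// e i risultati trovati anche da BM25 ricevono un boost ponderato.
// Nomi e firme (fuseScores, vectorWeight, bm25Weight) sono illustrativi.
function fuseScores(vectorHits, bm25Hits, { vectorWeight = 0.7, bm25Weight = 0.3 } = {}) {
  const bm25ById = new Map(bm25Hits.map((h) => [h.id, h.score]));
  return vectorHits
    .map((hit) => ({
      id: hit.id,
      // base semantica + boost per la corrispondenza esatta delle parole chiave
      score: vectorWeight * hit.score + bm25Weight * (bm25ById.get(hit.id) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}

const fused = fuseScores(
  [{ id: "a", score: 0.9 }, { id: "b", score: 0.8 }],
  [{ id: "b", score: 1.0 }],
);
console.log(fused[0].id); // "b": 0.7*0.8 + 0.3*1.0 = 0.86 supera 0.7*0.9 = 0.63
```

Con i pesi predefiniti 0.7/0.3, un risultato con forte corrispondenza BM25 può quindi superare un risultato con similarità vettoriale leggermente più alta.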
+Confronto con memory-lancedb integrato (clicca per espandere) + +| Funzionalità | `memory-lancedb` integrato | **memory-lancedb-pro** | +| --- | :---: | :---: | +| Ricerca vettoriale | Sì | Sì | +| Ricerca full-text BM25 | - | Sì | +| Fusione ibrida (Vettoriale + BM25) | - | Sì | +| Rerank cross-encoder (multi-provider) | - | Sì | +| Boost di freschezza e decadimento temporale | - | Sì | +| Normalizzazione lunghezza | - | Sì | +| Diversità MMR | - | Sì | +| Isolamento multi-scope | - | Sì | +| Filtraggio del rumore | - | Sì | +| Ricerca adattiva | - | Sì | +| CLI di gestione | - | Sì | +| Memoria di sessione | - | Sì | +| Embedding task-aware | - | Sì | +| **Estrazione intelligente LLM (6 categorie)** | - | Sì (v1.1.0) | +| **Decadimento Weibull + promozione livelli** | - | Sì (v1.1.0) | +| Qualsiasi embedding compatibile OpenAI | Limitato | Sì | + +
+ +--- + +## Configurazione + +
+Esempio di configurazione completa + +```json +{ + "embedding": { + "apiKey": "${JINA_API_KEY}", + "model": "jina-embeddings-v5-text-small", + "baseURL": "https://api.jina.ai/v1", + "dimensions": 1024, + "taskQuery": "retrieval.query", + "taskPassage": "retrieval.passage", + "normalized": true + }, + "dbPath": "~/.openclaw/memory/lancedb-pro", + "autoCapture": true, + "autoRecall": true, + "retrieval": { + "mode": "hybrid", + "vectorWeight": 0.7, + "bm25Weight": 0.3, + "minScore": 0.3, + "rerank": "cross-encoder", + "rerankApiKey": "${JINA_API_KEY}", + "rerankModel": "jina-reranker-v3", + "rerankEndpoint": "https://api.jina.ai/v1/rerank", + "rerankProvider": "jina", + "candidatePoolSize": 20, + "recencyHalfLifeDays": 14, + "recencyWeight": 0.1, + "filterNoise": true, + "lengthNormAnchor": 500, + "hardMinScore": 0.35, + "timeDecayHalfLifeDays": 60, + "reinforcementFactor": 0.5, + "maxHalfLifeMultiplier": 3 + }, + "enableManagementTools": false, + "scopes": { + "default": "global", + "definitions": { + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } + }, + "agentAccess": { + "discord-bot": ["global", "agent:discord-bot"] + } + }, + "sessionMemory": { + "enabled": false, + "messageCount": 15 + }, + "smartExtraction": true, + "llm": { + "apiKey": "${OPENAI_API_KEY}", + "model": "gpt-4o-mini", + "baseURL": "https://api.openai.com/v1" + }, + "extractMinMessages": 2, + "extractMaxChars": 8000 +} +``` + +
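Nell'esempio sopra, valori come `"${JINA_API_KEY}"` sono segnaposto risolti a runtime dalle variabili d'ambiente. Uno schizzo illustrativo di questa espansione (ipotetico, non il codice di OpenClaw):

```javascript
// Schizzo illustrativo: espansione dei segnaposto "${VAR}" nei valori di
// configurazione usando le variabili d'ambiente. I segnaposto non risolti
// restano invariati.
function expandEnv(value, env = process.env) {
  return value.replace(/\$\{(\w+)\}/g, (match, name) => env[name] ?? match);
}

console.log(expandEnv("${JINA_API_KEY}", { JINA_API_KEY: "jina_abc123" })); // "jina_abc123"
console.log(expandEnv("${MISSING}", {})); // "${MISSING}" (lasciato invariato)
```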
+ +
+Provider di embedding + +Funziona con **qualsiasi API di embedding compatibile OpenAI**: + +| Provider | Modello | Base URL | Dimensioni | +| --- | --- | --- | --- | +| **Jina** (consigliato) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | +| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | +| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | +| **Ollama** (locale) | `nomic-embed-text` | `http://localhost:11434/v1` | specifico del provider | + +
+ +
+
+
Provider di rerank

+
+Il reranking cross-encoder supporta più provider tramite `rerankProvider`:
+
+| Provider | `rerankProvider` | Modello di esempio |
+| --- | --- | --- |
+| **Jina** (predefinito) | `jina` | `jina-reranker-v3` |
+| **SiliconFlow** (piano gratuito disponibile) | `siliconflow` | `BAAI/bge-reranker-v2-m3` |
+| **Voyage AI** | `voyage` | `rerank-2.5` |
+| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` |
+
+Funziona anche con qualsiasi endpoint di rerank compatibile Jina — imposta `rerankProvider: "jina"` e punta `rerankEndpoint` al tuo servizio (ad es. Hugging Face TEI, DashScope `qwen3-rerank`).
+
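Il punteggio ibrido del rerank (60% cross-encoder + 40% punteggio fuso originale, con fallback in caso di errore dell'API) può essere abbozzato così; nomi e struttura sono illustrativi, non il codice reale del plugin:

```javascript
// Schizzo ipotetico del blending 60/40 con degradazione elegante:
// se i punteggi del cross-encoder non sono disponibili (API fallita),
// si ricade sul punteggio fuso, derivato dalla similarità del coseno.
function blendRerank(candidates, ceScores) {
  return candidates.map((c, i) => {
    const ce = ceScores?.[i]; // null/undefined => rerank non disponibile
    return {
      id: c.id,
      score: ce == null ? c.fusedScore : 0.6 * ce + 0.4 * c.fusedScore,
    };
  });
}

const blended = blendRerank([{ id: "a", fusedScore: 0.5 }], [0.9]);
console.log(blended[0].score.toFixed(2)); // "0.74" = 0.6*0.9 + 0.4*0.5
```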
+ +
+Estrazione intelligente (LLM) — v1.1.0 + +Quando `smartExtraction` è abilitato (predefinito: `true`), il plugin utilizza un LLM per estrarre e classificare intelligentemente i ricordi invece di trigger basati su regex. + +| Campo | Tipo | Predefinito | Descrizione | +|-------|------|---------|-------------| +| `smartExtraction` | boolean | `true` | Abilita/disabilita l'estrazione LLM in 6 categorie | +| `llm.auth` | string | `api-key` | `api-key` usa `llm.apiKey` / `embedding.apiKey`; `oauth` usa un file token OAuth con scope plugin per impostazione predefinita | +| `llm.apiKey` | string | *(fallback su `embedding.apiKey`)* | Chiave API per il provider LLM | +| `llm.model` | string | `openai/gpt-oss-120b` | Nome del modello LLM | +| `llm.baseURL` | string | *(fallback su `embedding.baseURL`)* | Endpoint API LLM | +| `llm.oauthProvider` | string | `openai-codex` | ID del provider OAuth usato quando `llm.auth` è `oauth` | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | File token OAuth usato quando `llm.auth` è `oauth` | +| `llm.timeoutMs` | number | `30000` | Timeout della richiesta LLM in millisecondi | +| `extractMinMessages` | number | `2` | Messaggi minimi prima che l'estrazione si attivi | +| `extractMaxChars` | number | `8000` | Caratteri massimi inviati al LLM | + + +Configurazione `llm` OAuth (usa la cache di login esistente di Codex / ChatGPT per le chiamate LLM): +```json +{ + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } +} +``` + +Note per `llm.auth: "oauth"`: + +- `llm.oauthProvider` è attualmente `openai-codex`. +- I token OAuth sono salvati di default in `~/.openclaw/.memory-lancedb-pro/oauth.json`. +- Puoi impostare `llm.oauthPath` se vuoi salvare quel file altrove. 
+- `auth login` crea uno snapshot della configurazione `llm` precedente con api-key accanto al file OAuth, e `auth logout` ripristina quello snapshot quando disponibile. +- Il passaggio da `api-key` a `oauth` non trasferisce automaticamente `llm.baseURL`. Impostalo manualmente in modalità OAuth solo quando vuoi intenzionalmente un backend personalizzato compatibile ChatGPT/Codex. + +
+ +
+Configurazione ciclo di vita (Decadimento + Livelli) + +| Campo | Predefinito | Descrizione | +|-------|---------|-------------| +| `decay.recencyHalfLifeDays` | `30` | Emivita base per il decadimento di freschezza Weibull | +| `decay.frequencyWeight` | `0.3` | Peso della frequenza di accesso nel punteggio composito | +| `decay.intrinsicWeight` | `0.3` | Peso di `importance × confidence` | +| `decay.betaCore` | `0.8` | Beta Weibull per i ricordi `core` | +| `decay.betaWorking` | `1.0` | Beta Weibull per i ricordi `working` | +| `decay.betaPeripheral` | `1.3` | Beta Weibull per i ricordi `peripheral` | +| `tier.coreAccessThreshold` | `10` | Conteggio minimo richiami prima della promozione a `core` | +| `tier.peripheralAgeDays` | `60` | Soglia di età per retrocedere i ricordi inattivi | + +
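A titolo illustrativo, il decadimento di freschezza Weibull e il punteggio composito descritti sopra possono essere abbozzati così. Formula esatta e peso residuo della freschezza sono assunzioni, non il codice reale di `src/decay-engine.ts`:

```javascript
// Schizzo ipotetico: con beta = 1 si ottiene il classico decadimento
// esponenziale (valore 0.5 dopo un'emivita); beta > 1 fa decadere più in
// fretta i ricordi "peripheral", beta < 1 preserva più a lungo i "core".
function weibullFreshness(ageDays, halfLifeDays, beta) {
  return Math.exp(-Math.LN2 * Math.pow(ageDays / halfLifeDays, beta));
}

// Punteggio composito = freschezza + frequenza + valore intrinseco.
// Il peso della freschezza (1 - frequencyWeight - intrinsicWeight) è un'assunzione.
function compositeScore(
  { ageDays, halfLifeDays, beta, accessFreq, importance, confidence },
  { frequencyWeight = 0.3, intrinsicWeight = 0.3 } = {},
) {
  const freshnessWeight = 1 - frequencyWeight - intrinsicWeight;
  return (
    freshnessWeight * weibullFreshness(ageDays, halfLifeDays, beta) +
    frequencyWeight * accessFreq +
    intrinsicWeight * importance * confidence
  );
}

console.log(weibullFreshness(30, 30, 1.0).toFixed(2)); // "0.50" dopo un'emivita
```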
+ +
+Rinforzo per accesso + +I ricordi richiamati frequentemente decadono più lentamente (stile ripetizione spaziata). + +Chiavi di configurazione (sotto `retrieval`): +- `reinforcementFactor` (0-2, predefinito: `0.5`) — imposta `0` per disabilitare +- `maxHalfLifeMultiplier` (1-10, predefinito: `3`) — limite massimo sull'emivita effettiva + +
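L'effetto di `reinforcementFactor` e `maxHalfLifeMultiplier` può essere abbozzato così. La formula è un'assunzione illustrativa (non quella di memory-lancedb-pro): ogni richiamo allunga l'emivita effettiva, con un tetto dato dal moltiplicatore massimo.

```javascript
// Schizzo ipotetico del rinforzo per accesso in stile ripetizione spaziata:
// più richiami => emivita effettiva più lunga, fino a maxHalfLifeMultiplier.
function effectiveHalfLife(
  baseDays,
  accessCount,
  { reinforcementFactor = 0.5, maxHalfLifeMultiplier = 3 } = {},
) {
  const multiplier = Math.min(
    1 + reinforcementFactor * Math.log2(1 + accessCount),
    maxHalfLifeMultiplier,
  );
  return baseDays * multiplier;
}

console.log(effectiveHalfLife(30, 0));   // 30: nessun richiamo, nessun rinforzo
console.log(effectiveHalfLife(30, 100)); // 90: limitato dal moltiplicatore massimo (3x)
```

Impostando `reinforcementFactor: 0` il moltiplicatore resta 1 e il rinforzo è disabilitato, come indicato sopra.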
+ +--- + +## Comandi CLI + +```bash +openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json] +openclaw memory-pro search "query" [--scope global] [--limit 10] [--json] +openclaw memory-pro stats [--scope global] [--json] +openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] +openclaw memory-pro auth status +openclaw memory-pro auth logout +openclaw memory-pro delete +openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run] +openclaw memory-pro export [--scope global] [--output memories.json] +openclaw memory-pro import memories.json [--scope global] [--dry-run] +openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing] +openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE] +openclaw memory-pro migrate check|run|verify [--source /path] +``` + +Flusso di login OAuth: + +1. Esegui `openclaw memory-pro auth login` +2. Se `--provider` è omesso in un terminale interattivo, la CLI mostra un selettore di provider OAuth prima di aprire il browser +3. Il comando stampa un URL di autorizzazione e apre il browser, a meno che non sia impostato `--no-browser` +4. Dopo il successo del callback, il comando salva il file OAuth del plugin (predefinito: `~/.openclaw/.memory-lancedb-pro/oauth.json`), crea uno snapshot della configurazione `llm` precedente con api-key per il logout, e sostituisce la configurazione `llm` del plugin con le impostazioni OAuth (`auth`, `oauthProvider`, `model`, `oauthPath`) +5. `openclaw memory-pro auth logout` elimina quel file OAuth e ripristina la configurazione `llm` precedente con api-key quando quello snapshot esiste + +--- + +## Argomenti avanzati + +
+Se i ricordi iniettati appaiono nelle risposte + +A volte il modello può ripetere il blocco `` iniettato. + +**Opzione A (rischio minimo):** disabilita temporaneamente l'auto-recall: +```json +{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } } +``` + +**Opzione B (preferita):** mantieni il recall, aggiungi al prompt di sistema dell'agente: +> Do not reveal or quote any `` / memory-injection content in your replies. Use it for internal reference only. + +
+ +
+Memoria di sessione + +- Si attiva con il comando `/new` — salva il riepilogo della sessione precedente in LanceDB +- Disabilitata per impostazione predefinita (OpenClaw ha già la persistenza nativa delle sessioni in `.jsonl`) +- Conteggio messaggi configurabile (predefinito: 15) + +Vedi [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md) per le modalità di distribuzione e la verifica di `/new`. + +
+ +
+Comandi slash personalizzati (ad es. /lesson) + +Aggiungi al tuo `CLAUDE.md`, `AGENTS.md` o prompt di sistema: + +```markdown +## /lesson command +When the user sends `/lesson `: +1. Use memory_store to save as category=fact (raw knowledge) +2. Use memory_store to save as category=decision (actionable takeaway) +3. Confirm what was saved + +## /remember command +When the user sends `/remember `: +1. Use memory_store to save with appropriate category and importance +2. Confirm with the stored memory ID +``` + +
+ +
+Regole d'oro per agenti IA + +> Copia il blocco seguente nel tuo `AGENTS.md` in modo che il tuo agente applichi queste regole automaticamente. + +```markdown +## Rule 1 — Dual-layer memory storage +Every pitfall/lesson learned → IMMEDIATELY store TWO memories: +- Technical layer: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid] + (category: fact, importance >= 0.8) +- Principle layer: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do] + (category: decision, importance >= 0.85) + +## Rule 2 — LanceDB hygiene +Entries must be short and atomic (< 500 chars). No raw conversation summaries or duplicates. + +## Rule 3 — Recall before retry +On ANY tool failure, ALWAYS memory_recall with relevant keywords BEFORE retrying. + +## Rule 4 — Confirm target codebase +Confirm you are editing memory-lancedb-pro vs built-in memory-lancedb before changes. + +## Rule 5 — Clear jiti cache after plugin code changes +After modifying .ts files under plugins/, MUST run rm -rf /tmp/jiti/ BEFORE openclaw gateway restart. +``` + +
+ +
+Schema del database + +Tabella LanceDB `memories`: + +| Campo | Tipo | Descrizione | +| --- | --- | --- | +| `id` | string (UUID) | Chiave primaria | +| `text` | string | Testo del ricordo (indicizzato FTS) | +| `vector` | float[] | Vettore di embedding | +| `category` | string | Categoria di archiviazione: `preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | Identificatore scope (ad es. `global`, `agent:main`) | +| `importance` | float | Punteggio di importanza 0-1 | +| `timestamp` | int64 | Timestamp di creazione (ms) | +| `metadata` | string (JSON) | Metadati estesi | + +Chiavi `metadata` comuni nella v1.1.0: `l0_abstract`, `l1_overview`, `l2_content`, `memory_category`, `tier`, `access_count`, `confidence`, `last_accessed_at` + +> **Nota sulle categorie:** Il campo `category` di primo livello usa 6 categorie di archiviazione. Le 6 etichette semantiche dell'Estrazione Intelligente (`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`) sono memorizzate in `metadata.memory_category`. + +
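Uno schizzo illustrativo di come leggere i campi v1.1.0 dal campo `metadata` (stringa JSON) di una riga della tabella `memories`. La riga è un esempio inventato; i nomi delle chiavi seguono la tabella sopra:

```javascript
// Riga di esempio (inventata) della tabella `memories`: il vettore è omesso,
// i metadati estesi v1.1.0 vivono nella stringa JSON `metadata`.
const row = {
  id: "00000000-0000-0000-0000-000000000001",
  text: "Preferenza: indentazione con tab",
  category: "preference",        // categoria di archiviazione di primo livello
  scope: "global",
  importance: 0.8,
  timestamp: 1736899200000,
  metadata: JSON.stringify({
    l0_abstract: "L'utente preferisce i tab",
    memory_category: "preferences", // etichetta semantica dell'estrazione intelligente
    tier: "working",
    access_count: 4,
    confidence: 0.9,
  }),
};

const meta = JSON.parse(row.metadata);
console.log(meta.tier, meta.memory_category); // "working" "preferences"
```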
+ +
+Risoluzione dei problemi + +### "Cannot mix BigInt and other types" (LanceDB / Apache Arrow) + +Con LanceDB 0.26+, alcune colonne numeriche potrebbero essere restituite come `BigInt`. Aggiorna a **memory-lancedb-pro >= 1.0.14** — questo plugin ora converte i valori usando `Number(...)` prima delle operazioni aritmetiche. + +
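Riproduzione minima dell'errore e della correzione descritta sopra (il valore `BigInt` è qui simulato con un letterale):

```javascript
// LanceDB/Arrow può restituire colonne int64 come BigInt (qui simulato).
const timestamp = 1736899200000n;

let failed = false;
try {
  // TypeError: "Cannot mix BigInt and other types, use explicit conversions"
  Date.now() - timestamp;
} catch (err) {
  failed = true;
}

// Correzione applicata in memory-lancedb-pro >= 1.0.14: convertire con Number(...)
const ageMs = Date.now() - Number(timestamp);
console.log(failed, typeof ageMs); // true "number"
```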
+ +--- + +## Documentazione + +| Documento | Descrizione | +| --- | --- | +| [Playbook di integrazione OpenClaw](docs/openclaw-integration-playbook.md) | Modalità di distribuzione, verifica, matrice di regressione | +| [Analisi dell'architettura della memoria](docs/memory_architecture_analysis.md) | Analisi approfondita dell'architettura completa | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | Modifiche comportamentali v1.1.0 e motivazioni per l'upgrade | +| [Chunking contesto lungo](docs/long-context-chunking.md) | Strategia di chunking per documenti lunghi | + +--- + +## Beta: Smart Memory v1.1.0 + +> Stato: Beta — disponibile tramite `npm i memory-lancedb-pro@beta`. Gli utenti stabili su `latest` non sono interessati. + +| Funzionalità | Descrizione | +|---------|-------------| +| **Estrazione intelligente** | Estrazione LLM in 6 categorie con metadati L0/L1/L2. Fallback su regex se disabilitato. | +| **Punteggio ciclo di vita** | Decadimento Weibull integrato nella ricerca — i ricordi frequenti e importanti si posizionano più in alto. | +| **Gestione livelli** | Sistema a tre livelli (Core → Working → Peripheral) con promozione/retrocessione automatica. | + +Feedback: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · Ripristina: `npm i memory-lancedb-pro@latest` + +--- + +## Dipendenze + +| Pacchetto | Scopo | +| --- | --- | +| `@lancedb/lancedb` ≥0.26.2 | Database vettoriale (ANN + FTS) | +| `openai` ≥6.21.0 | Client API Embedding compatibile OpenAI | +| `@sinclair/typebox` 0.34.48 | Definizioni di tipo JSON Schema | + +--- + +## Contributors + +

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +

+
+
+Full list: [Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors)
+
+## Star History
+
+Star History Chart
+
+## Licenza
+
+MIT
+
+---
+
+## Il mio QR Code WeChat
+
diff --git a/README_JA.md b/README_JA.md
new file mode 100644
index 00000000..0627a2de
--- /dev/null
+++ b/README_JA.md
@@ -0,0 +1,773 @@
+ +# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin + +**[OpenClaw](https://github.com/openclaw/openclaw) エージェント向け AI メモリアシスタント** + +*あなたの AI エージェントに本物の記憶力を——セッションを超え、エージェントを超え、時間を超えて。* + +LanceDB ベースの OpenClaw 長期メモリプラグイン。好み・意思決定・プロジェクトコンテキストを自動保存し、将来のセッションで自動的に想起します。 + +[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) +[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) +[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) + +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) + +
+ +--- + +## なぜ memory-lancedb-pro なのか? + +ほとんどの AI エージェントは「記憶喪失」です——新しいチャットを始めるたびに、以前の会話内容はすべてリセットされます。 + +**memory-lancedb-pro** は OpenClaw 向けのプロダクショングレードの長期メモリプラグインです。エージェントを真の **AI メモリアシスタント** に変えます——重要な情報を自動的にキャプチャし、ノイズを自然に減衰させ、適切なタイミングで適切な記憶を呼び出します。手動タグ付けも複雑な設定も不要です。 + +### AI メモリアシスタントの実際の動作 + +**メモリなし——毎回ゼロからスタート:** + +> **あなた:** 「インデントはタブで、常にエラーハンドリングを追加して。」 +> *(次のセッション)* +> **あなた:** 「前に言ったでしょ——タブであってスペースじゃない!」 😤 +> *(さらに次のセッション)* +> **あなた:** 「……本当にもう3回目だよ、タブ。あとエラーハンドリングも。」 + +**memory-lancedb-pro あり——エージェントが学習し記憶する:** + +> **あなた:** 「インデントはタブで、常にエラーハンドリングを追加して。」 +> *(次のセッション——エージェントが自動的にあなたの好みを想起)* +> **エージェント:** *(黙ってタブインデント+エラーハンドリングを適用)* ✅ +> **あなた:** 「先月なぜ MongoDB ではなく PostgreSQL を選んだんだっけ?」 +> **エージェント:** 「2月12日の議論に基づくと、主な理由は……」 ✅ + +これが **AI メモリアシスタント** の価値です——あなたのスタイルを学び、過去の意思決定を想起し、繰り返し説明することなくパーソナライズされた応答を提供します。 + +### 他に何ができる? + +| | 得られるもの | +|---|---| +| **自動キャプチャ** | エージェントが毎回の会話から学習——手動で `memory_store` を呼ぶ必要なし | +| **スマート抽出** | LLM 駆動の6カテゴリ分類:プロフィール、好み、エンティティ、イベント、ケース、パターン | +| **インテリジェント忘却** | Weibull 減衰モデル——重要な記憶は残り、ノイズは自然に消える | +| **ハイブリッド検索** | ベクトル + BM25 全文検索、クロスエンコーダーリランキングで融合 | +| **コンテキスト注入** | 関連する記憶が各応答前に自動的に浮上 | +| **マルチスコープ分離** | エージェント別、ユーザー別、プロジェクト別のメモリ境界 | +| **任意のプロバイダー** | OpenAI、Jina、Gemini、Ollama、または任意の OpenAI 互換 API | +| **フルツールキット** | CLI、バックアップ、マイグレーション、アップグレード、エクスポート/インポート——本番運用対応 | + +--- + +## クイックスタート + +### 方法 A:ワンクリックインストールスクリプト(推奨) + +コミュニティが管理する **[セットアップスクリプト](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** で、インストール・アップグレード・修復を1コマンドで実行: + +```bash +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> スクリプトがカバーするシナリオの完全なリストとその他のコミュニティツールは、以下の [エコシステム](#エコシステム) をご覧ください。 + +### 方法 B:手動インストール + +**OpenClaw CLI 経由(推奨):** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**または npm 経由:** +```bash +npm i memory-lancedb-pro@beta +``` +> npm を使用する場合、`openclaw.json` の 
`plugins.load.paths` にプラグインのインストールディレクトリの **絶対パス** を追加する必要があります。これが最も一般的なセットアップの問題です。 + +`openclaw.json` に以下を追加: + +```json +{ + "plugins": { + "slots": { "memory": "memory-lancedb-pro" }, + "entries": { + "memory-lancedb-pro": { + "enabled": true, + "config": { + "embedding": { + "provider": "openai-compatible", + "apiKey": "${OPENAI_API_KEY}", + "model": "text-embedding-3-small" + }, + "autoCapture": true, + "autoRecall": true, + "smartExtraction": true, + "extractMinMessages": 2, + "extractMaxChars": 8000, + "sessionMemory": { "enabled": false } + } + } + } + } +} +``` + +**これらのデフォルト値の理由:** +- `autoCapture` + `smartExtraction` → エージェントが毎回の会話から自動的に学習 +- `autoRecall` → 関連する記憶が各応答前に自動注入 +- `extractMinMessages: 2` → 通常の2ターン会話で抽出がトリガー +- `sessionMemory.enabled: false` → 初期段階でセッション要約が検索結果を汚染するのを回避 + +検証と再起動: + +```bash +openclaw config validate +openclaw gateway restart +openclaw logs --follow --plain | grep "memory-lancedb-pro" +``` + +以下が表示されるはずです: +- `memory-lancedb-pro: smart extraction enabled` +- `memory-lancedb-pro@...: plugin registered` + +完了!あなたのエージェントは長期メモリを持つようになりました。 + +
+その他のインストール方法(既存ユーザー、アップグレード) + +**既に OpenClaw を使用中?** + +1. `plugins.load.paths` にプラグインの **絶対パス** を追加 +2. メモリスロットをバインド:`plugins.slots.memory = "memory-lancedb-pro"` +3. 検証:`openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**v1.1.0 以前からのアップグレード?** + +```bash +# 1) バックアップ +openclaw memory-pro export --scope global --output memories-backup.json +# 2) ドライラン +openclaw memory-pro upgrade --dry-run +# 3) アップグレード実行 +openclaw memory-pro upgrade +# 4) 検証 +openclaw memory-pro stats +``` + +動作変更とアップグレードの詳細は `CHANGELOG-v1.1.0.md` を参照してください。 + +
+ +
+Telegram Bot クイックインポート(クリックで展開) + +OpenClaw の Telegram 連携を使用している場合、設定ファイルを手動で編集するより、メイン Bot にインポートコマンドを直接送信するのが最も簡単です。 + +以下のメッセージを送信してください: + +```text +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction +6. extractMinMessages=2 +7. sessionMemory.enabled=false +8. captureAssistant=false +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. Generate the final openclaw.json config directly, not just an explanation +``` + +
+ +--- + +## エコシステム + +memory-lancedb-pro はコアプラグインです。コミュニティがセットアップと日常利用をさらにスムーズにするツールを構築しています: + +### セットアップスクリプト——ワンクリックでインストール・アップグレード・修復 + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +単なるインストーラーではありません——さまざまな実際のシナリオをインテリジェントに処理します: + +| あなたの状況 | スクリプトの動作 | +|---|---| +| 未インストール | 新規ダウンロード → 依存関係インストール → 設定選択 → openclaw.json に書き込み → 再起動 | +| `git clone` でインストール済み、古いコミットで停滞 | 自動で `git fetch` + `checkout` を最新版に → 依存関係再インストール → 検証 | +| 設定に無効なフィールドがある | スキーマフィルターで自動検出し、サポートされていないフィールドを除去 | +| `npm` でインストール済み | git 更新をスキップし、`npm update` の実行を促す | +| 無効な設定で `openclaw` CLI が壊れている | フォールバック:`openclaw.json` ファイルからワークスペースパスを直接読み取り | +| `plugins/` ではなく `extensions/` | 設定またはファイルシステムからプラグインの場所を自動検出 | +| 既に最新版 | ヘルスチェックのみ実行、変更なし | + +```bash +bash setup-memory.sh # インストールまたはアップグレード +bash setup-memory.sh --dry-run # プレビューのみ +bash setup-memory.sh --beta # プレリリース版を含む +bash setup-memory.sh --uninstall # 設定を元に戻しプラグインを削除 +``` + +内蔵プロバイダープリセット:**Jina / DashScope / SiliconFlow / OpenAI / Ollama**、または任意の OpenAI 互換 API を利用可能。完全な使用方法(`--ref`、`--selfcheck-only` など)は [セットアップスクリプト README](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup) を参照してください。 + +### Claude Code / OpenClaw Skill——AI ガイド付き設定 + +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** + +この Skill をインストールすると、AI エージェント(Claude Code または OpenClaw)が memory-lancedb-pro のすべての機能を深く理解できるようになります。**「最適な設定を有効にして」** と言うだけで: + +- **7ステップのガイド付き設定ワークフロー**、4つのデプロイプランを提供: + - フルパワー版(Jina + OpenAI)/ コスト削減版(無料の SiliconFlow リランカー)/ シンプル版(OpenAI のみ)/ 完全ローカル版(Ollama、API コストゼロ) +- **全9つの MCP ツール** の正しい使い方:`memory_recall`、`memory_store`、`memory_forget`、`memory_update`、`memory_stats`、`memory_list`、`self_improvement_log`、`self_improvement_extract_skill`、`self_improvement_review` *(フルツールセットには `enableManagementTools: true` が必要——デフォルトのクイックスタート設定では4つのコアツールのみ公開)* +- 
**よくある落とし穴の回避**:ワークスペースプラグインの有効化、`autoRecall` のデフォルト false、jiti キャッシュ、環境変数、スコープ分離など + +**Claude Code へのインストール:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro +``` + +**OpenClaw へのインストール:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## 動画チュートリアル + +> フルウォークスルー:インストール、設定、ハイブリッド検索の内部構造。 + +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## アーキテクチャ + +``` +┌─────────────────────────────────────────────────────────┐ +│ index.ts(エントリポイント) │ +│ プラグイン登録 · 設定解析 · ライフサイクルフック │ +└────────┬──────────┬──────────┬──────────┬───────────────┘ + │ │ │ │ + ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ + │ store │ │embedder│ │retriever│ │ scopes │ + │ .ts │ │ .ts │ │ .ts │ │ .ts │ + └────────┘ └────────┘ └────────┘ └─────────────┘ + │ │ + ┌────▼───┐ ┌─────▼──────────┐ + │migrate │ │noise-filter.ts │ + │ .ts │ │adaptive- │ + └────────┘ │retrieval.ts │ + └────────────────┘ + ┌─────────────┐ ┌──────────┐ + │ tools.ts │ │ cli.ts │ + │(エージェントAPI)│ │ (CLI) │ + └─────────────┘ └──────────┘ +``` + +> 完全なアーキテクチャの詳細は [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md) を参照してください。 + +
+ファイルリファレンス(クリックで展開) + +| ファイル | 用途 | +| --- | --- | +| `index.ts` | プラグインエントリポイント。OpenClaw Plugin API に登録、設定解析、ライフサイクルフックのマウント | +| `openclaw.plugin.json` | プラグインメタデータ + 完全な JSON Schema 設定宣言 | +| `cli.ts` | CLI コマンド:`memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | LanceDB ストレージレイヤー。テーブル作成 / FTS インデックス / ベクトル検索 / BM25 検索 / CRUD | +| `src/embedder.ts` | Embedding 抽象レイヤー。任意の OpenAI 互換 API プロバイダーに対応 | +| `src/retriever.ts` | ハイブリッド検索エンジン。ベクトル + BM25 → ハイブリッド融合 → リランク → ライフサイクル減衰 → フィルタ | +| `src/scopes.ts` | マルチスコープアクセス制御 | +| `src/tools.ts` | エージェントツール定義:`memory_recall`、`memory_store`、`memory_forget`、`memory_update` + 管理ツール | +| `src/noise-filter.ts` | エージェントの拒否応答、メタ質問、挨拶などの低品質コンテンツをフィルタリング | +| `src/adaptive-retrieval.ts` | クエリがメモリ検索を必要とするかどうかを判定 | +| `src/migrate.ts` | 内蔵 `memory-lancedb` から Pro へのマイグレーション | +| `src/smart-extractor.ts` | LLM 駆動の6カテゴリ抽出、L0/L1/L2 階層ストレージと2段階重複排除対応 | +| `src/decay-engine.ts` | Weibull 伸長指数関数減衰モデル | +| `src/tier-manager.ts` | 3段階昇格/降格:周辺 ↔ ワーキング ↔ コア | + +
+ +--- + +## コア機能 + +### ハイブリッド検索 + +``` +クエリ → embedQuery() ─┐ + ├─→ ハイブリッド融合 → リランク → ライフサイクル減衰ブースト → 長さ正規化 → フィルタ +クエリ → BM25 全文 ─────┘ +``` + +- **ベクトル検索** — LanceDB ANN によるセマンティック類似度(コサイン距離) +- **BM25 全文検索** — LanceDB FTS インデックスによる正確なキーワードマッチング +- **ハイブリッド融合** — ベクトルスコアをベースに、BM25 ヒットに重み付きブーストを適用(標準 RRF ではなく、実際の再現率品質に最適化) +- **設定可能な重み** — `vectorWeight`、`bm25Weight`、`minScore` + +### クロスエンコーダーリランキング + +- **Jina**、**SiliconFlow**、**Voyage AI**、**Pinecone** の組み込みアダプター +- 任意の Jina 互換エンドポイント(例:Hugging Face TEI、DashScope)に対応 +- ハイブリッドスコアリング:60% クロスエンコーダー + 40% 元の融合スコア +- グレースフルデグラデーション:API 失敗時にコサイン類似度にフォールバック + +### マルチステージスコアリングパイプライン + +| ステージ | 効果 | +| --- | --- | +| **ハイブリッド融合** | セマンティック検索と完全一致検索を統合 | +| **クロスエンコーダーリランク** | セマンティックに正確なヒットを上位に昇格 | +| **ライフサイクル減衰ブースト** | Weibull 鮮度 + アクセス頻度 + 重要度 × 信頼度 | +| **長さ正規化** | 長いエントリが結果を支配するのを防止(アンカー:500文字) | +| **ハード最低スコア** | 無関係な結果を除去(デフォルト:0.35) | +| **MMR 多様性** | コサイン類似度 > 0.85 → 降格 | + +### スマートメモリ抽出(v1.1.0) + +- **LLM 駆動の6カテゴリ抽出**:プロフィール、好み、エンティティ、イベント、ケース、パターン +- **L0/L1/L2 階層ストレージ**:L0(一文の索引)→ L1(構造化サマリー)→ L2(完全な記述) +- **2段階重複排除**:ベクトル類似度プレフィルタ(≥0.7)→ LLM セマンティック判定(CREATE/MERGE/SKIP) +- **カテゴリ対応マージ**:`profile` は常にマージ、`events`/`cases` は追記のみ + +### メモリライフサイクル管理(v1.1.0) + +- **Weibull 減衰エンジン**:複合スコア = 鮮度 + 頻度 + 内在的価値 +- **3段階昇格**:`周辺 ↔ ワーキング ↔ コア`、閾値は設定可能 +- **アクセス強化**:頻繁に想起される記憶はより遅く減衰(間隔反復スタイル) +- **重要度変調半減期**:重要な記憶はより遅く減衰 + +### マルチスコープ分離 + +- 組み込みスコープ:`global`、`agent:`、`custom:`、`project:`、`user:` +- `scopes.agentAccess` によるエージェントレベルのアクセス制御 +- デフォルト:各エージェントが `global` + 自身の `agent:` スコープにアクセス + +### 自動キャプチャ&自動想起 + +- **自動キャプチャ**(`agent_end`):会話から好み/事実/決定/エンティティを抽出、重複排除後、1ターンあたり最大3件を保存 +- **自動想起**(`before_agent_start`):`` コンテキストを注入(最大3件) + +### ノイズフィルタリング&アダプティブ検索 + +- 低品質コンテンツをフィルタリング:エージェントの拒否応答、メタ質問、挨拶 +- 検索をスキップ:挨拶、スラッシュコマンド、簡単な確認、絵文字 +- 強制検索:メモリキーワード(「覚えている」「以前」「前回」) +- CJK 対応の閾値(中国語:6文字、英語:15文字) + +--- + +
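上記のコア機能で説明した多段階スコアリング(ハイブリッド融合 → リランク混合 → 長さ正規化)は、概念的には次のように表せます。実際の `src/retriever.ts` の実装そのものではなく、本ドキュメントのデフォルト重み(`vectorWeight=0.7` など)をそのまま使った説明用のスケッチで、定数名・関数名・長さ正規化の式はすべて仮のものです。

```typescript
// ハイブリッド融合 → リランク混合 → 長さ正規化の概念スケッチ(仮定)。
interface Candidate {
  vectorScore: number; // ベクトル検索のコサイン類似度スコア(0-1)
  bm25Hit: boolean;    // BM25 全文検索でヒットしたか
  textLength: number;  // エントリの文字数
}

const vectorWeight = 0.7;     // retrieval.vectorWeight
const bm25Weight = 0.3;       // retrieval.bm25Weight
const lengthNormAnchor = 500; // retrieval.lengthNormAnchor

// ベクトルスコアをベースに、BM25 ヒットへ重み付きブーストを適用
function fuse(c: Candidate): number {
  return vectorWeight * c.vectorScore + (c.bm25Hit ? bm25Weight : 0);
}

// ハイブリッドスコアリング:60% クロスエンコーダー + 40% 元の融合スコア
function blendRerank(fused: number, crossEncoder: number): number {
  return 0.6 * crossEncoder + 0.4 * fused;
}

// アンカー(500文字)を超える長いエントリを緩やかに減点(式は仮定)
function lengthNormalize(score: number, textLength: number): number {
  return score * Math.sqrt(lengthNormAnchor / Math.max(textLength, lengthNormAnchor));
}

const c: Candidate = { vectorScore: 0.8, bm25Hit: true, textLength: 2000 };
const fused = fuse(c);                                     // ≈ 0.86
const blended = blendRerank(fused, 0.9);                   // ≈ 0.884
const finalScore = lengthNormalize(blended, c.textLength); // ≈ 0.442
```

`hardMinScore`(デフォルト 0.35)は、この最終スコアに対する足切りとして働きます。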
+内蔵 memory-lancedb との比較(クリックで展開) + +| 機能 | 内蔵 `memory-lancedb` | **memory-lancedb-pro** | +| --- | :---: | :---: | +| ベクトル検索 | あり | あり | +| BM25 全文検索 | - | あり | +| ハイブリッド融合(ベクトル + BM25) | - | あり | +| クロスエンコーダーリランク(マルチプロバイダー) | - | あり | +| 鮮度ブースト&時間減衰 | - | あり | +| 長さ正規化 | - | あり | +| MMR 多様性 | - | あり | +| マルチスコープ分離 | - | あり | +| ノイズフィルタリング | - | あり | +| アダプティブ検索 | - | あり | +| 管理 CLI | - | あり | +| セッションメモリ | - | あり | +| タスク対応 Embedding | - | あり | +| **LLM スマート抽出(6カテゴリ)** | - | あり(v1.1.0) | +| **Weibull 減衰 + 階層昇格** | - | あり(v1.1.0) | +| 任意の OpenAI 互換 Embedding | 限定的 | あり | + +
+ +--- + +## 設定 + +
+完全な設定例 + +```json +{ + "embedding": { + "apiKey": "${JINA_API_KEY}", + "model": "jina-embeddings-v5-text-small", + "baseURL": "https://api.jina.ai/v1", + "dimensions": 1024, + "taskQuery": "retrieval.query", + "taskPassage": "retrieval.passage", + "normalized": true + }, + "dbPath": "~/.openclaw/memory/lancedb-pro", + "autoCapture": true, + "autoRecall": true, + "retrieval": { + "mode": "hybrid", + "vectorWeight": 0.7, + "bm25Weight": 0.3, + "minScore": 0.3, + "rerank": "cross-encoder", + "rerankApiKey": "${JINA_API_KEY}", + "rerankModel": "jina-reranker-v3", + "rerankEndpoint": "https://api.jina.ai/v1/rerank", + "rerankProvider": "jina", + "candidatePoolSize": 20, + "recencyHalfLifeDays": 14, + "recencyWeight": 0.1, + "filterNoise": true, + "lengthNormAnchor": 500, + "hardMinScore": 0.35, + "timeDecayHalfLifeDays": 60, + "reinforcementFactor": 0.5, + "maxHalfLifeMultiplier": 3 + }, + "enableManagementTools": false, + "scopes": { + "default": "global", + "definitions": { + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } + }, + "agentAccess": { + "discord-bot": ["global", "agent:discord-bot"] + } + }, + "sessionMemory": { + "enabled": false, + "messageCount": 15 + }, + "smartExtraction": true, + "llm": { + "apiKey": "${OPENAI_API_KEY}", + "model": "gpt-4o-mini", + "baseURL": "https://api.openai.com/v1" + }, + "extractMinMessages": 2, + "extractMaxChars": 8000 +} +``` + +
+ +
+Embedding プロバイダー + +**任意の OpenAI 互換 Embedding API** で動作: + +| プロバイダー | モデル | Base URL | 次元数 | +| --- | --- | --- | --- | +| **Jina**(推奨) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | +| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | +| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | +| **Ollama**(ローカル) | `nomic-embed-text` | `http://localhost:11434/v1` | プロバイダー依存 | + +
+ +
+リランクプロバイダー + +クロスエンコーダーリランキングは `rerankProvider` で複数のプロバイダーをサポート: + +| プロバイダー | `rerankProvider` | モデル例 | +| --- | --- | --- | +| **Jina**(デフォルト) | `jina` | `jina-reranker-v3` | +| **SiliconFlow**(無料枠あり) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | + +任意の Jina 互換リランクエンドポイントも使用可能——`rerankProvider: "jina"` を設定し、`rerankEndpoint` をあなたのサービスに向けてください(例:Hugging Face TEI、DashScope `qwen3-rerank`)。 + +
+ +
+スマート抽出(LLM)— v1.1.0 + +`smartExtraction` が有効(デフォルト:`true`)の場合、プラグインは正規表現ベースのトリガーの代わりに LLM を使用してインテリジェントにメモリを抽出・分類します。 + +| フィールド | 型 | デフォルト | 説明 | +|-------|------|---------|-------------| +| `smartExtraction` | boolean | `true` | LLM 駆動の6カテゴリ抽出の有効化/無効化 | +| `llm.auth` | string | `api-key` | `api-key` は `llm.apiKey` / `embedding.apiKey` を使用;`oauth` はデフォルトでプラグインスコープの OAuth トークンファイルを使用 | +| `llm.apiKey` | string | *(`embedding.apiKey` にフォールバック)* | LLM プロバイダーの API キー | +| `llm.model` | string | `openai/gpt-oss-120b` | LLM モデル名 | +| `llm.baseURL` | string | *(`embedding.baseURL` にフォールバック)* | LLM API エンドポイント | +| `llm.oauthProvider` | string | `openai-codex` | `llm.auth` が `oauth` の場合に使用する OAuth プロバイダー ID | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | `llm.auth` が `oauth` の場合に使用する OAuth トークンファイル | +| `llm.timeoutMs` | number | `30000` | LLM リクエストタイムアウト(ミリ秒) | +| `extractMinMessages` | number | `2` | 抽出がトリガーされる最小メッセージ数 | +| `extractMaxChars` | number | `8000` | LLM に送信される最大文字数 | + + +OAuth `llm` 設定(既存の Codex / ChatGPT ログインキャッシュを使用して LLM 呼び出しを行う): +```json +{ + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } +} +``` + +`llm.auth: "oauth"` に関する注意点: + +- `llm.oauthProvider` は現在 `openai-codex` です。 +- OAuth トークンのデフォルト保存先は `~/.openclaw/.memory-lancedb-pro/oauth.json` です。 +- 別の場所に保存したい場合は `llm.oauthPath` を設定してください。 +- `auth login` は OAuth ファイルの隣に以前の api-key モードの `llm` 設定のスナップショットを保存し、`auth logout` は利用可能な場合にそのスナップショットを復元します。 +- `api-key` から `oauth` への切り替え時、`llm.baseURL` は自動的に引き継がれません。意図的にカスタム ChatGPT/Codex 互換バックエンドを使用する場合のみ、OAuth モードで手動設定してください。 + +
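スマート抽出では、抽出したメモリを保存する前に2段階の重複排除(ベクトル類似度プレフィルタ ≥ 0.7 → LLM セマンティック判定)を行います。その流れは概念的には次のスケッチの通りです。LLM 判定はモック関数で代用しており、関数名とフローは説明用の仮定です。

```typescript
// 2段階重複排除の概念スケッチ(仮定):
//  段階1:ベクトル類似度 >= 0.7 の既存メモリだけを候補に残す
//  段階2:LLM が CREATE / MERGE / SKIP を判定(ここではモックで代用)
type DedupDecision = "CREATE" | "MERGE" | "SKIP";

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// LLM 判定のモック(仮定):実際にはプロンプト経由でセマンティック判定する
function mockLlmJudge(_newText: string, _existingText: string): DedupDecision {
  return "MERGE";
}

function dedup(
  newVec: number[],
  newText: string,
  existing: { vector: number[]; text: string }[],
): DedupDecision {
  // 段階1:ベクトル類似度プレフィルタ(>= 0.7)
  const candidates = existing.filter((m) => cosine(newVec, m.vector) >= 0.7);
  if (candidates.length === 0) return "CREATE"; // 類似候補なし → 新規作成
  // 段階2:LLM セマンティック判定
  return mockLlmJudge(newText, candidates[0].text);
}

const decision1 = dedup([1, 0], "タブを好む", [{ vector: [0, 1], text: "無関係" }]);
const decision2 = dedup([1, 0], "タブを好む", [{ vector: [0.9, 0.1], text: "タブ派" }]);
```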
+ +
+ライフサイクル設定(減衰 + 階層) + +| フィールド | デフォルト | 説明 | +|-------|---------|-------------| +| `decay.recencyHalfLifeDays` | `30` | Weibull 鮮度減衰のベース半減期 | +| `decay.frequencyWeight` | `0.3` | 複合スコアにおけるアクセス頻度の重み | +| `decay.intrinsicWeight` | `0.3` | `重要度 × 信頼度` の重み | +| `decay.betaCore` | `0.8` | `コア` メモリの Weibull ベータ | +| `decay.betaWorking` | `1.0` | `ワーキング` メモリの Weibull ベータ | +| `decay.betaPeripheral` | `1.3` | `周辺` メモリの Weibull ベータ | +| `tier.coreAccessThreshold` | `10` | `コア` に昇格するために必要な最小想起回数 | +| `tier.peripheralAgeDays` | `60` | 古いメモリを降格するための日数閾値 | + +
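上記パラメータのうち減衰部分の式の形は、次の最小スケッチで確認できます。`src/decay-engine.ts` の正確な実装ではなく、「複合スコア = 鮮度 + 頻度 + 内在的価値」という構成を示すための仮定コードです(頻度項の飽和式や重みの残余配分は仮のものです)。

```typescript
// Weibull 減衰による複合スコアの概念スケッチ(仮定)。
const recencyHalfLifeDays = 30; // decay.recencyHalfLifeDays
const frequencyWeight = 0.3;    // decay.frequencyWeight
const intrinsicWeight = 0.3;    // decay.intrinsicWeight
const betaWorking = 1.0;        // decay.betaWorking(beta=1 は通常の指数減衰)

// Weibull 型の鮮度:経過日数 = recencyHalfLifeDays でちょうど 0.5 になるよう正規化
function freshness(ageDays: number, beta: number): number {
  const lambda = recencyHalfLifeDays / Math.pow(Math.LN2, 1 / beta);
  return Math.exp(-Math.pow(ageDays / lambda, beta));
}

// アクセス回数を 0-1 に飽和させる簡易な頻度項(式は仮定)
function frequency(accessCount: number): number {
  return 1 - Math.exp(-accessCount / 10);
}

// 複合スコア = 鮮度 + 頻度 + 内在的価値(importance × confidence)
function compositeScore(
  ageDays: number,
  accessCount: number,
  importance: number,
  confidence: number,
): number {
  const freshnessWeight = 1 - frequencyWeight - intrinsicWeight; // 残りを鮮度に配分(仮定)
  return (
    freshnessWeight * freshness(ageDays, betaWorking) +
    frequencyWeight * frequency(accessCount) +
    intrinsicWeight * importance * confidence
  );
}

const halfLifePoint = freshness(30, betaWorking); // ≈ 0.5(半減期の検算)
const score = compositeScore(30, 5, 0.9, 0.8);
```

β が小さいほど減衰カーブの裾が長くなるため、`betaCore`(0.8)のコアメモリは `betaPeripheral`(1.3)の周辺メモリより長期にわたって残りやすくなります。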
+ +
+アクセス強化 + +頻繁に想起されるメモリはより遅く減衰します(間隔反復スタイル)。 + +設定キー(`retrieval` 内): +- `reinforcementFactor`(0-2、デフォルト:`0.5`)— `0` に設定すると無効化 +- `maxHalfLifeMultiplier`(1-10、デフォルト:`3`)— 実効半減期のハードキャップ + +
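`reinforcementFactor` と `maxHalfLifeMultiplier` の効き方は、次のような仮定スケッチでイメージできます。実装の式そのものではなく、「想起回数が多いほど実効半減期が伸び、上限で頭打ちになる」という挙動だけを示したものです(対数飽和の式は仮定です)。

```typescript
// アクセス強化の概念スケッチ(仮定):式は実装と同一ではありません。
const reinforcementFactor = 0.5; // 0 にすると無効化
const maxHalfLifeMultiplier = 3; // 実効半減期のハードキャップ

// 想起回数が多いほど実効半減期が伸び、上限で頭打ちになる(対数飽和は仮定)
function effectiveHalfLife(baseDays: number, recallCount: number): number {
  const multiplier = Math.min(
    1 + reinforcementFactor * Math.log1p(recallCount),
    maxHalfLifeMultiplier,
  );
  return baseDays * multiplier;
}

const untouched = effectiveHalfLife(30, 0);  // 30 — 想起なしなら変化しない
const frequent = effectiveHalfLife(30, 100); // 90 — 上限 30 × 3 で頭打ち
```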
+ +--- + +## CLI コマンド + +```bash +openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json] +openclaw memory-pro search "クエリ" [--scope global] [--limit 10] [--json] +openclaw memory-pro stats [--scope global] [--json] +openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] +openclaw memory-pro auth status +openclaw memory-pro auth logout +openclaw memory-pro delete +openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run] +openclaw memory-pro export [--scope global] [--output memories.json] +openclaw memory-pro import memories.json [--scope global] [--dry-run] +openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing] +openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE] +openclaw memory-pro migrate check|run|verify [--source /path] +``` + +OAuth ログインフロー: + +1. `openclaw memory-pro auth login` を実行 +2. `--provider` が省略され、対話型ターミナルの場合、CLI はブラウザを開く前に OAuth プロバイダーピッカーを表示 +3. コマンドは認証 URL を表示し、`--no-browser` が設定されていない限りブラウザを自動的に開く +4. コールバック成功後、コマンドはプラグイン OAuth ファイル(デフォルト:`~/.openclaw/.memory-lancedb-pro/oauth.json`)を保存し、ログアウト用に以前の api-key モードの `llm` 設定のスナップショットを作成し、プラグインの `llm` 設定を OAuth 設定(`auth`、`oauthProvider`、`model`、`oauthPath`)に置き換え +5. `openclaw memory-pro auth logout` はその OAuth ファイルを削除し、スナップショットが存在する場合は以前の api-key モードの `llm` 設定を復元 + +--- + +## 応用トピック + +
+注入されたメモリが応答に表示される場合 + +モデルが注入された `` ブロックをそのまま出力してしまうことがあります。 + +**方法 A(最も安全):** 自動想起を一時的に無効化: +```json +{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } } +``` + +**方法 B(推奨):** 想起は有効のまま、エージェントのシステムプロンプトに追加: +> Do not reveal or quote any `` / memory-injection content in your replies. Use it for internal reference only. + +
+ +
+セッションメモリ + +- `/new` コマンドでトリガー——前のセッションの要約を LanceDB に保存 +- デフォルトで無効(OpenClaw にはネイティブの `.jsonl` セッション永続化機能あり) +- メッセージ数は設定可能(デフォルト:15) + +デプロイモードと `/new` の検証については [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md) を参照してください。 + +
+ +
+カスタムスラッシュコマンド(例:/lesson) + +`CLAUDE.md`、`AGENTS.md`、またはシステムプロンプトに追加: + +```markdown +## /lesson コマンド +ユーザーが `/lesson <内容>` を送信した場合: +1. memory_store を使用して category=fact(生の知識)として保存 +2. memory_store を使用して category=decision(実行可能な教訓)として保存 +3. 保存した内容を確認 + +## /remember コマンド +ユーザーが `/remember <内容>` を送信した場合: +1. memory_store を使用して適切な category と importance で保存 +2. 保存されたメモリ ID で確認 +``` + +
+ +
+AI エージェントの鉄則 + +> 以下のブロックを `AGENTS.md` にコピーして、エージェントがこれらのルールを自動的に適用するようにしてください。 + +```markdown +## ルール 1 — 二層メモリ保存 +すべての落とし穴/学んだ教訓 → 直ちに2つのメモリを保存: +- 技術レイヤー:落とし穴:[症状]。原因:[根本原因]。修正:[解決策]。予防:[回避方法] + (category: fact, importance >= 0.8) +- 原則レイヤー:意思決定原則 ([タグ]):[行動ルール]。トリガー:[いつ]。アクション:[何をする] + (category: decision, importance >= 0.85) + +## ルール 2 — LanceDB データ品質 +エントリは短くアトミックに(500文字未満)。生の会話要約や重複は保存しない。 + +## ルール 3 — リトライ前に想起 +いかなるツール失敗時も、リトライする前に必ず関連キーワードで memory_recall を実行。 + +## ルール 4 — 対象コードベースの確認 +変更前に、操作対象が memory-lancedb-pro なのか内蔵 memory-lancedb なのかを確認。 + +## ルール 5 — プラグインコード変更後に jiti キャッシュをクリア +plugins/ 配下の .ts ファイルを変更した後、openclaw gateway restart の前に必ず rm -rf /tmp/jiti/ を実行。 +``` + +
+ +
+データベーススキーマ + +LanceDB テーブル `memories`: + +| フィールド | 型 | 説明 | +| --- | --- | --- | +| `id` | string (UUID) | 主キー | +| `text` | string | メモリテキスト(FTS インデックス付き) | +| `vector` | float[] | Embedding ベクトル | +| `category` | string | ストレージカテゴリ:`preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | スコープ識別子(例:`global`、`agent:main`) | +| `importance` | float | 重要度スコア 0-1 | +| `timestamp` | int64 | 作成タイムスタンプ(ミリ秒) | +| `metadata` | string (JSON) | 拡張メタデータ | + +v1.1.0 の一般的な `metadata` キー:`l0_abstract`、`l1_overview`、`l2_content`、`memory_category`、`tier`、`access_count`、`confidence`、`last_accessed_at` + +> **カテゴリに関する注意:** トップレベルの `category` フィールドは6つのストレージカテゴリを使用します。スマート抽出の6カテゴリセマンティックラベル(`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`)は `metadata.memory_category` に保存されます。 + +
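上の `memories` テーブルを TypeScript の型として読むと、おおよそ次のようになります(プラグイン内部の正確な型定義ではなく、スキーマ表を写した仮定のスケッチです)。

```typescript
// LanceDB `memories` テーブルの 1 行に対応する型のスケッチ(仮定)。
// timestamp は int64(ミリ秒)だが、JS 側では number として扱う。
type StorageCategory =
  | "preference" | "fact" | "decision" | "entity" | "reflection" | "other";

interface MemoryRow {
  id: string;          // UUID 主キー
  text: string;        // FTS インデックス付きメモリテキスト
  vector: number[];    // Embedding ベクトル(次元数は embedding.dimensions に一致)
  category: StorageCategory;
  scope: string;       // 例: "global", "agent:main"
  importance: number;  // 重要度スコア 0-1
  timestamp: number;   // 作成タイムスタンプ(ミリ秒)
  metadata: string;    // JSON 文字列。v1.1.0 では l0_abstract / tier / access_count など
}

const row: MemoryRow = {
  id: "00000000-0000-0000-0000-000000000000",
  text: "ユーザーはインデントにタブを好む",
  vector: [0.1, 0.2, 0.3],
  category: "preference",
  scope: "global",
  importance: 0.8,
  timestamp: Date.now(),
  // スマート抽出のセマンティックラベルは metadata.memory_category に入る
  metadata: JSON.stringify({ memory_category: "preferences", tier: "working" }),
};

// metadata は JSON 文字列なので、読み出し時にパースする
const meta = JSON.parse(row.metadata) as { memory_category: string; tier: string };
```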
+ +
+トラブルシューティング + +### "Cannot mix BigInt and other types"(LanceDB / Apache Arrow) + +LanceDB 0.26+ では、一部の数値カラムが `BigInt` として返されることがあります。**memory-lancedb-pro >= 1.0.14** にアップグレードしてください——プラグインは算術演算の前に `Number(...)` で値を変換するようになっています。 + +
+ +--- + +## ドキュメント + +| ドキュメント | 説明 | +| --- | --- | +| [OpenClaw 統合プレイブック](docs/openclaw-integration-playbook.md) | デプロイモード、検証、リグレッションマトリックス | +| [メモリアーキテクチャ分析](docs/memory_architecture_analysis.md) | 完全なアーキテクチャ詳細解説 | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | v1.1.0 の動作変更とアップグレード根拠 | +| [ロングコンテキストチャンキング](docs/long-context-chunking.md) | 長文ドキュメントのチャンキング戦略 | + +--- + +## Beta:スマートメモリ v1.1.0 + +> ステータス:Beta——`npm i memory-lancedb-pro@beta` でインストール可能。`latest` を使用している安定版ユーザーには影響しません。 + +| 機能 | 説明 | +|---------|-------------| +| **スマート抽出** | LLM 駆動の6カテゴリ抽出、L0/L1/L2 メタデータ対応。無効時は正規表現にフォールバック。 | +| **ライフサイクルスコアリング** | Weibull 減衰を検索に統合——高頻度・高重要度のメモリが上位にランク。 | +| **階層管理** | 3段階システム(コア → ワーキング → 周辺)、自動昇格/降格。 | + +フィードバック:[GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · 元に戻す:`npm i memory-lancedb-pro@latest` + +--- + +## 依存関係 + +| パッケージ | 用途 | +| --- | --- | +| `@lancedb/lancedb` ≥0.26.2 | ベクトルデータベース(ANN + FTS) | +| `openai` ≥6.21.0 | OpenAI 互換 Embedding API クライアント | +| `@sinclair/typebox` 0.34.48 | JSON Schema 型定義 | + +--- + +## コントリビューター + +

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +

+ +全リスト:[Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors) + +## Star 履歴 + + + + + + Star History Chart + + + +## ライセンス + +MIT + +--- + +## WeChat QR コード + + diff --git a/README_KO.md b/README_KO.md new file mode 100644 index 00000000..c8f165e6 --- /dev/null +++ b/README_KO.md @@ -0,0 +1,773 @@ +
+ +# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin + +**[OpenClaw](https://github.com/openclaw/openclaw) 에이전트를 위한 AI 메모리 어시스턴트** + +*AI 에이전트에게 진짜 기억하는 두뇌를 선물하세요 — 세션을 넘어, 에이전트를 넘어, 시간을 넘어.* + +LanceDB 기반 OpenClaw 메모리 플러그인으로, 사용자 선호도·의사결정·프로젝트 맥락을 저장하고 이후 세션에서 자동으로 불러옵니다. + +[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) +[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) +[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) + +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) + +
+ +--- + +## 왜 memory-lancedb-pro인가? + +대부분의 AI 에이전트는 건망증이 있습니다. 새 채팅을 시작하는 순간 모든 것을 잊어버립니다. + +**memory-lancedb-pro**는 OpenClaw를 위한 프로덕션 수준의 장기 기억 플러그인으로, 에이전트를 **AI 메모리 어시스턴트**로 바꿔줍니다 — 중요한 내용을 자동으로 캡처하고, 노이즈는 자연스럽게 희미해지게 하며, 적시에 적절한 기억을 검색합니다. 수동 태그 지정도, 복잡한 설정도 필요 없습니다. + +### AI 메모리 어시스턴트 실제 사용 모습 + +**기억 없이 — 매 세션이 처음부터 시작:** + +> **사용자:** "들여쓰기에 탭을 사용하고, 항상 에러 처리를 추가해." +> *(다음 세션)* +> **사용자:** "이미 말했잖아 — 스페이스 말고 탭이라고!" 😤 +> *(다음 세션)* +> **사용자:** "...진짜로, 탭이라고. 에러 처리도. 또." + +**memory-lancedb-pro와 함께 — 에이전트가 학습하고 기억합니다:** + +> **사용자:** "들여쓰기에 탭을 사용하고, 항상 에러 처리를 추가해." +> *(다음 세션 — 에이전트가 사용자 선호도를 자동으로 불러옴)* +> **에이전트:** *(자동으로 탭 + 에러 처리 적용)* ✅ +> **사용자:** "지난달에 왜 MongoDB 대신 PostgreSQL을 선택했지?" +> **에이전트:** "2월 12일 논의 내용에 따르면, 주요 이유는..." ✅ + +이것이 **AI 메모리 어시스턴트**가 만드는 차이입니다 — 사용자의 스타일을 학습하고, 과거 결정을 불러오며, 반복 없이 개인화된 응답을 제공합니다. + +### 그 외 무엇을 할 수 있나요? + +| | 제공 기능 | +|---|---| +| **Auto-Capture** | 에이전트가 모든 대화에서 학습 — 수동 `memory_store` 불필요 | +| **Smart Extraction** | LLM 기반 6개 카테고리 분류: profile, preferences, entities, events, cases, patterns | +| **Intelligent Forgetting** | Weibull 감쇠 모델 — 중요한 기억은 유지, 노이즈는 자연스럽게 사라짐 | +| **Hybrid Retrieval** | 벡터 + BM25 전문 검색, Cross-Encoder 리랭킹으로 융합 | +| **Context Injection** | 관련 기억이 매 응답 전에 자동으로 불러와짐 | +| **Multi-Scope Isolation** | 에이전트별, 사용자별, 프로젝트별 메모리 경계 | +| **Any Provider** | OpenAI, Jina, Gemini, Ollama 또는 OpenAI 호환 API 모두 지원 | +| **Full Toolkit** | CLI, 백업, 마이그레이션, 업그레이드, 내보내기/가져오기 — 프로덕션 환경에 적합 | + +--- + +## 빠른 시작 + +### 옵션 A: 원클릭 설치 스크립트 (권장) + +커뮤니티에서 관리하는 **[설치 스크립트](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)**가 설치, 업그레이드, 복구를 하나의 명령어로 처리합니다: + +```bash +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> 스크립트가 다루는 전체 시나리오와 기타 커뮤니티 도구 목록은 아래 [에코시스템](#에코시스템)을 참조하세요. 
+ +### 옵션 B: 수동 설치 + +**OpenClaw CLI를 통한 설치 (권장):** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**또는 npm을 통한 설치:** +```bash +npm i memory-lancedb-pro@beta +``` +> npm을 사용하는 경우, `openclaw.json`의 `plugins.load.paths`에 플러그인 설치 디렉터리의 **절대** 경로를 추가해야 합니다. 이것이 가장 흔한 설정 문제입니다. + +`openclaw.json`에 다음을 추가하세요: + +```json +{ + "plugins": { + "slots": { "memory": "memory-lancedb-pro" }, + "entries": { + "memory-lancedb-pro": { + "enabled": true, + "config": { + "embedding": { + "provider": "openai-compatible", + "apiKey": "${OPENAI_API_KEY}", + "model": "text-embedding-3-small" + }, + "autoCapture": true, + "autoRecall": true, + "smartExtraction": true, + "extractMinMessages": 2, + "extractMaxChars": 8000, + "sessionMemory": { "enabled": false } + } + } + } + } +} +``` + +**왜 이러한 기본값인가?** +- `autoCapture` + `smartExtraction` → 에이전트가 모든 대화에서 자동으로 학습 +- `autoRecall` → 매 응답 전에 관련 기억이 주입됨 +- `extractMinMessages: 2` → 일반적인 두 턴 대화에서 추출이 시작됨 +- `sessionMemory.enabled: false` → 초기에 세션 요약으로 검색이 오염되는 것을 방지 + +검증 및 재시작: + +```bash +openclaw config validate +openclaw gateway restart +openclaw logs --follow --plain | grep "memory-lancedb-pro" +``` + +다음이 표시되어야 합니다: +- `memory-lancedb-pro: smart extraction enabled` +- `memory-lancedb-pro@...: plugin registered` + +완료! 이제 에이전트가 장기 기억을 갖게 됩니다. + +
+추가 설치 경로 (기존 사용자, 업그레이드) + +**이미 OpenClaw를 사용 중인 경우:** + +1. **절대** 경로의 `plugins.load.paths` 항목으로 플러그인 추가 +2. 메모리 슬롯 바인딩: `plugins.slots.memory = "memory-lancedb-pro"` +3. 확인: `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**v1.1.0 이전 버전에서 업그레이드하는 경우:** + +```bash +# 1) 백업 +openclaw memory-pro export --scope global --output memories-backup.json +# 2) 시뮬레이션 실행 +openclaw memory-pro upgrade --dry-run +# 3) 업그레이드 실행 +openclaw memory-pro upgrade +# 4) 확인 +openclaw memory-pro stats +``` + +동작 변경사항과 업그레이드 근거는 `CHANGELOG-v1.1.0.md`를 참조하세요. + +
+ +
+Telegram 봇 빠른 가져오기 (클릭하여 펼치기) + +OpenClaw의 Telegram 연동을 사용하는 경우, 수동으로 설정을 편집하는 대신 메인 봇에 가져오기 명령어를 직접 보내는 것이 가장 쉬운 방법입니다. + +다음 메시지를 전송하세요 (봇에 그대로 복사하여 붙여넣기하는 영문 프롬프트입니다): + +```text +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction +6. extractMinMessages=2 +7. sessionMemory.enabled=false +8. captureAssistant=false +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. Generate the final openclaw.json config directly, not just an explanation +``` + +
+ +--- + +## 에코시스템 + +memory-lancedb-pro는 핵심 플러그인입니다. 커뮤니티에서 설정과 일상적인 사용을 더욱 원활하게 만드는 도구들을 구축했습니다: + +### 설치 스크립트 — 원클릭 설치, 업그레이드 및 복구 + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +단순한 인스톨러가 아닙니다 — 스크립트가 다양한 실제 시나리오를 지능적으로 처리합니다: + +| 상황 | 스크립트의 동작 | +|---|---| +| 설치한 적 없음 | 새로 다운로드 → 의존성 설치 → 설정 선택 → openclaw.json에 기록 → 재시작 | +| `git clone`으로 설치, 이전 커밋에서 멈춤 | 자동 `git fetch` + `checkout`으로 최신 버전 이동 → 의존성 재설치 → 확인 | +| 설정에 유효하지 않은 필드 존재 | 스키마 필터를 통한 자동 감지, 지원되지 않는 필드 제거 | +| `npm`으로 설치 | git 업데이트 건너뜀, `npm update` 직접 실행 알림 | +| 유효하지 않은 설정으로 `openclaw` CLI 동작 불가 | 대체 방법: `openclaw.json` 파일에서 직접 워크스페이스 경로 읽기 | +| `plugins/` 대신 `extensions/` 사용 | 설정 또는 파일시스템에서 플러그인 위치 자동 감지 | +| 이미 최신 상태 | 상태 확인만 실행, 변경 없음 | + +```bash +bash setup-memory.sh # 설치 또는 업그레이드 +bash setup-memory.sh --dry-run # 미리보기만 +bash setup-memory.sh --beta # 사전 릴리스 버전 포함 +bash setup-memory.sh --uninstall # 설정 복원 및 플러그인 제거 +``` + +내장 프로바이더 프리셋: **Jina / DashScope / SiliconFlow / OpenAI / Ollama**, 또는 자체 OpenAI 호환 API를 사용할 수 있습니다. `--ref`, `--selfcheck-only` 등 전체 사용법은 [설치 스크립트 README](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)를 참조하세요. + +### Claude Code / OpenClaw Skill — AI 가이드 설정 + +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** + +이 Skill을 설치하면 AI 에이전트(Claude Code 또는 OpenClaw)가 memory-lancedb-pro의 모든 기능에 대한 깊은 지식을 갖게 됩니다. 
**"최적의 설정을 도와줘"**라고 말하면 다음을 제공합니다: + +- **가이드 7단계 설정 워크플로우**와 4가지 배포 계획: + - Full Power (Jina + OpenAI) / Budget (무료 SiliconFlow 리랭커) / Simple (OpenAI만) / Fully Local (Ollama, API 비용 제로) +- **모든 9개 MCP 도구**의 올바른 사용법: `memory_recall`, `memory_store`, `memory_forget`, `memory_update`, `memory_stats`, `memory_list`, `self_improvement_log`, `self_improvement_extract_skill`, `self_improvement_review` *(전체 도구 세트를 사용하려면 `enableManagementTools: true`가 필요합니다 — 기본 빠른 시작 설정은 4개 핵심 도구만 노출합니다)* +- **일반적인 함정 방지**: 워크스페이스 플러그인 활성화, `autoRecall` 기본값 false, jiti 캐시, 환경 변수, 스코프 격리 등 + +**Claude Code용 설치:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro +``` + +**OpenClaw용 설치:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## 비디오 튜토리얼 + +> 전체 안내: 설치, 설정, 하이브리드 검색 내부 구조. + +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## 아키텍처 + +``` +┌─────────────────────────────────────────────────────────┐ +│ index.ts (진입점) │ +│ 플러그인 등록 · 설정 파싱 · 라이프사이클 훅 │ +└────────┬──────────┬──────────┬──────────┬───────────────┘ + │ │ │ │ + ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ + │ store │ │embedder│ │retriever│ │ scopes │ + │ .ts │ │ .ts │ │ .ts │ │ .ts │ + └────────┘ └────────┘ └────────┘ └─────────────┘ + │ │ + ┌────▼───┐ ┌─────▼──────────┐ + │migrate │ │noise-filter.ts │ + │ .ts │ │adaptive- │ + └────────┘ │retrieval.ts │ + └────────────────┘ + ┌─────────────┐ ┌──────────┐ + │ tools.ts │ │ cli.ts │ + │ (에이전트API)│ │ (CLI) │ + └─────────────┘ └──────────┘ +``` + +> 전체 
아키텍처에 대한 심층 분석은 [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md)를 참조하세요. + +
+파일 레퍼런스 (클릭하여 펼치기) + +| 파일 | 용도 | +| --- | --- | +| `index.ts` | 플러그인 진입점. OpenClaw Plugin API에 등록, 설정 파싱, 라이프사이클 훅 마운트 | +| `openclaw.plugin.json` | 플러그인 메타데이터 + 전체 JSON Schema 설정 선언 | +| `cli.ts` | CLI 명령어: `memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | LanceDB 스토리지 레이어. 테이블 생성 / FTS 인덱싱 / 벡터 검색 / BM25 검색 / CRUD | +| `src/embedder.ts` | 임베딩 추상화. OpenAI 호환 API 프로바이더 모두 지원 | +| `src/retriever.ts` | 하이브리드 검색 엔진. 벡터 + BM25 → 하이브리드 퓨전 → 리랭크 → 라이프사이클 감쇠 → 필터 | +| `src/scopes.ts` | 멀티 스코프 접근 제어 | +| `src/tools.ts` | 에이전트 도구 정의: `memory_recall`, `memory_store`, `memory_forget`, `memory_update` + 관리 도구 | +| `src/noise-filter.ts` | 에이전트 거절, 메타 질문, 인사, 저품질 콘텐츠 필터링 | +| `src/adaptive-retrieval.ts` | 쿼리에 메모리 검색이 필요한지 판단 | +| `src/migrate.ts` | 내장 `memory-lancedb`에서 Pro로의 마이그레이션 | +| `src/smart-extractor.ts` | LLM 기반 6개 카테고리 추출 + L0/L1/L2 계층 저장 + 2단계 중복 제거 | +| `src/decay-engine.ts` | Weibull 확장 지수 감쇠 모델 | +| `src/tier-manager.ts` | 3단계 승격/강등: Peripheral ↔ Working ↔ Core | + +
+ +--- + +## 핵심 기능 + +### 하이브리드 검색 + +``` +Query → embedQuery() ─┐ + ├─→ Hybrid Fusion → Rerank → Lifecycle Decay Boost → Length Norm → Filter +Query → BM25 FTS ─────┘ +``` + +- **벡터 검색** — LanceDB ANN을 통한 의미적 유사도 (코사인 거리) +- **BM25 전문 검색** — LanceDB FTS 인덱스를 통한 정확한 키워드 매칭 +- **하이브리드 퓨전** — 벡터 스코어를 기본으로, BM25 히트에 가중 부스트 적용 (표준 RRF가 아님 — 실제 검색 품질에 맞게 튜닝됨) +- **가중치 설정 가능** — `vectorWeight`, `bm25Weight`, `minScore` + +### Cross-Encoder 리랭킹 + +- **Jina**, **SiliconFlow**, **Voyage AI**, **Pinecone** 내장 어댑터 +- Jina 호환 엔드포인트와 호환 (예: Hugging Face TEI, DashScope) +- 하이브리드 스코어링: Cross-Encoder 60% + 원래 퓨전 스코어 40% +- 그레이스풀 디그레이데이션: API 실패 시 코사인 유사도로 폴백 + +### 다단계 스코어링 파이프라인 + +| 단계 | 효과 | +| --- | --- | +| **하이브리드 퓨전** | 의미적 검색과 정확한 매칭 결합 | +| **Cross-Encoder 리랭크** | 의미적으로 정확한 결과 승격 | +| **라이프사이클 감쇠 부스트** | Weibull 최신성 + 접근 빈도 + 중요도 × 신뢰도 | +| **길이 정규화** | 긴 항목이 결과를 지배하는 것을 방지 (앵커: 500자) | +| **최소 점수 하한** | 관련 없는 결과 제거 (기본값: 0.35) | +| **MMR 다양성** | 코사인 유사도 > 0.85 → 강등 | + +### Smart Memory Extraction (v1.1.0) + +- **LLM 기반 6개 카테고리 추출**: profile, preferences, entities, events, cases, patterns +- **L0/L1/L2 계층 저장**: L0 (한 줄 인덱스) → L1 (구조화된 요약) → L2 (전체 내러티브) +- **2단계 중복 제거**: 벡터 유사도 사전 필터 (≥0.7) → LLM 의미 판단 (CREATE/MERGE/SKIP) +- **카테고리 인식 병합**: `profile`은 항상 병합, `events`/`cases`는 추가 전용 + +### 메모리 라이프사이클 관리 (v1.1.0) + +- **Weibull 감쇠 엔진**: 복합 점수 = 최신성 + 빈도 + 내재적 가치 +- **3단계 승격**: `Peripheral ↔ Working ↔ Core`, 설정 가능한 임계값 +- **접근 강화**: 자주 불러오는 기억은 더 느리게 감쇠 (간격 반복 학습 방식) +- **중요도 조절 반감기**: 중요한 기억은 더 느리게 감쇠 + +### Multi-Scope 격리 + +- 내장 스코프: `global`, `agent:`, `custom:`, `project:`, `user:` +- `scopes.agentAccess`를 통한 에이전트 수준 접근 제어 +- 기본값: 각 에이전트가 `global` + 자체 `agent:` 스코프에 접근 + +### Auto-Capture 및 Auto-Recall + +- **Auto-Capture** (`agent_end`): 대화에서 선호도/사실/결정/엔티티를 추출, 중복 제거, 턴당 최대 3개 저장 +- **Auto-Recall** (`before_agent_start`): `` 컨텍스트 주입 (최대 3개 항목) + +### 노이즈 필터링 및 적응형 검색 + +- 저품질 콘텐츠 필터링: 에이전트 거절, 메타 질문, 인사 +- 인사, 슬래시 명령어, 간단한 확인, 이모지에 대해서는 검색 건너뜀 +- 기억 키워드에 대해서는 검색 강제 실행 
("기억해", "이전에", "지난번에") +- CJK 인식 임계값 (중국어: 6자 vs 영어: 15자) + +--- + +
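위 핵심 기능에서 설명한 다단계 스코어링(하이브리드 퓨전 → 리랭크 혼합 → 길이 정규화)은 개념적으로 다음과 같이 표현할 수 있습니다. 실제 `src/retriever.ts` 구현이 아니라, 문서의 기본 가중치(`vectorWeight=0.7` 등)를 그대로 사용한 설명용 스케치이며, 상수·함수 이름과 길이 정규화 식은 모두 가정입니다.

```typescript
// 하이브리드 퓨전 → 리랭크 혼합 → 길이 정규화 개념 스케치 (가정).
interface Candidate {
  vectorScore: number; // 벡터 검색 코사인 유사도 스코어 (0-1)
  bm25Hit: boolean;    // BM25 전문 검색 히트 여부
  textLength: number;  // 항목의 문자 수
}

const vectorWeight = 0.7;     // retrieval.vectorWeight
const bm25Weight = 0.3;       // retrieval.bm25Weight
const lengthNormAnchor = 500; // retrieval.lengthNormAnchor

// 벡터 스코어를 기본으로, BM25 히트에 가중 부스트 적용
function fuse(c: Candidate): number {
  return vectorWeight * c.vectorScore + (c.bm25Hit ? bm25Weight : 0);
}

// 하이브리드 스코어링: Cross-Encoder 60% + 원래 퓨전 스코어 40%
function blendRerank(fused: number, crossEncoder: number): number {
  return 0.6 * crossEncoder + 0.4 * fused;
}

// 앵커(500자)를 넘는 긴 항목을 완만하게 감점 (식은 가정)
function lengthNormalize(score: number, textLength: number): number {
  return score * Math.sqrt(lengthNormAnchor / Math.max(textLength, lengthNormAnchor));
}

const c: Candidate = { vectorScore: 0.8, bm25Hit: true, textLength: 2000 };
const fused = fuse(c);                                     // ≈ 0.86
const blended = blendRerank(fused, 0.9);                   // ≈ 0.884
const finalScore = lengthNormalize(blended, c.textLength); // ≈ 0.442
```

`hardMinScore`(기본값 0.35)는 이 최종 점수에 대한 하한 필터로 동작합니다.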
+내장 memory-lancedb와의 비교 (클릭하여 펼치기) + +| 기능 | 내장 `memory-lancedb` | **memory-lancedb-pro** | +| --- | :---: | :---: | +| 벡터 검색 | 예 | 예 | +| BM25 전문 검색 | - | 예 | +| 하이브리드 퓨전 (벡터 + BM25) | - | 예 | +| Cross-Encoder 리랭크 (멀티 프로바이더) | - | 예 | +| 최신성 부스트 및 시간 감쇠 | - | 예 | +| 길이 정규화 | - | 예 | +| MMR 다양성 | - | 예 | +| 멀티 스코프 격리 | - | 예 | +| 노이즈 필터링 | - | 예 | +| 적응형 검색 | - | 예 | +| 관리 CLI | - | 예 | +| 세션 메모리 | - | 예 | +| 태스크 인식 임베딩 | - | 예 | +| **LLM Smart Extraction (6개 카테고리)** | - | 예 (v1.1.0) | +| **Weibull 감쇠 + 단계 승격** | - | 예 (v1.1.0) | +| OpenAI 호환 임베딩 | 제한적 | 예 | + +
+ +--- + +## 설정 + +
+전체 설정 예시 + +```json +{ + "embedding": { + "apiKey": "${JINA_API_KEY}", + "model": "jina-embeddings-v5-text-small", + "baseURL": "https://api.jina.ai/v1", + "dimensions": 1024, + "taskQuery": "retrieval.query", + "taskPassage": "retrieval.passage", + "normalized": true + }, + "dbPath": "~/.openclaw/memory/lancedb-pro", + "autoCapture": true, + "autoRecall": true, + "retrieval": { + "mode": "hybrid", + "vectorWeight": 0.7, + "bm25Weight": 0.3, + "minScore": 0.3, + "rerank": "cross-encoder", + "rerankApiKey": "${JINA_API_KEY}", + "rerankModel": "jina-reranker-v3", + "rerankEndpoint": "https://api.jina.ai/v1/rerank", + "rerankProvider": "jina", + "candidatePoolSize": 20, + "recencyHalfLifeDays": 14, + "recencyWeight": 0.1, + "filterNoise": true, + "lengthNormAnchor": 500, + "hardMinScore": 0.35, + "timeDecayHalfLifeDays": 60, + "reinforcementFactor": 0.5, + "maxHalfLifeMultiplier": 3 + }, + "enableManagementTools": false, + "scopes": { + "default": "global", + "definitions": { + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } + }, + "agentAccess": { + "discord-bot": ["global", "agent:discord-bot"] + } + }, + "sessionMemory": { + "enabled": false, + "messageCount": 15 + }, + "smartExtraction": true, + "llm": { + "apiKey": "${OPENAI_API_KEY}", + "model": "gpt-4o-mini", + "baseURL": "https://api.openai.com/v1" + }, + "extractMinMessages": 2, + "extractMaxChars": 8000 +} +``` + +
+ +
+임베딩 프로바이더 + +**OpenAI 호환 임베딩 API**와 모두 동작합니다: + +| 프로바이더 | 모델 | Base URL | 차원 | +| --- | --- | --- | --- | +| **Jina** (권장) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | +| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | +| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | +| **Ollama** (로컬) | `nomic-embed-text` | `http://localhost:11434/v1` | 프로바이더별 상이 | + +
+ +
+리랭크 프로바이더 + +Cross-Encoder 리랭킹은 `rerankProvider`를 통해 여러 프로바이더를 지원합니다: + +| 프로바이더 | `rerankProvider` | 예시 모델 | +| --- | --- | --- | +| **Jina** (기본값) | `jina` | `jina-reranker-v3` | +| **SiliconFlow** (무료 티어 제공) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | + +Jina 호환 리랭크 엔드포인트도 사용 가능합니다 — `rerankProvider: "jina"`로 설정하고 `rerankEndpoint`를 해당 서비스로 지정하세요 (예: Hugging Face TEI, DashScope `qwen3-rerank`). + +
+ +
+Smart Extraction (LLM) — v1.1.0 + +`smartExtraction`이 활성화되면 (기본값: `true`), 플러그인이 정규식 기반 트리거 대신 LLM을 사용하여 기억을 지능적으로 추출하고 분류합니다. + +| 필드 | 타입 | 기본값 | 설명 | +|-------|------|---------|-------------| +| `smartExtraction` | boolean | `true` | LLM 기반 6개 카테고리 추출 활성화/비활성화 | +| `llm.auth` | string | `api-key` | `api-key`는 `llm.apiKey` / `embedding.apiKey`를 사용; `oauth`는 기본적으로 플러그인 범위의 OAuth 토큰 파일을 사용 | +| `llm.apiKey` | string | *(`embedding.apiKey`로 폴백)* | LLM 프로바이더용 API 키 | +| `llm.model` | string | `openai/gpt-oss-120b` | LLM 모델명 | +| `llm.baseURL` | string | *(`embedding.baseURL`로 폴백)* | LLM API 엔드포인트 | +| `llm.oauthProvider` | string | `openai-codex` | `llm.auth`가 `oauth`일 때 사용되는 OAuth 프로바이더 ID | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | `llm.auth`가 `oauth`일 때 사용되는 OAuth 토큰 파일 | +| `llm.timeoutMs` | number | `30000` | LLM 요청 타임아웃 (밀리초) | +| `extractMinMessages` | number | `2` | 추출이 시작되는 최소 메시지 수 | +| `extractMaxChars` | number | `8000` | LLM에 전송되는 최대 문자 수 | + + +OAuth `llm` 설정 (기존 Codex / ChatGPT 로그인 캐시를 LLM 호출에 사용): +```json +{ + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } +} +``` + +`llm.auth: "oauth"` 참고사항: + +- `llm.oauthProvider`는 현재 `openai-codex`입니다. +- OAuth 토큰은 기본적으로 `~/.openclaw/.memory-lancedb-pro/oauth.json`에 저장됩니다. +- 파일을 다른 곳에 저장하려면 `llm.oauthPath`를 설정하세요. +- `auth login`은 OAuth 파일 옆에 이전 api-key `llm` 설정의 스냅샷을 저장하며, `auth logout`은 해당 스냅샷이 있을 때 복원합니다. +- `api-key`에서 `oauth`로 전환할 때 `llm.baseURL`이 자동으로 이전되지 않습니다. OAuth 모드에서 의도적으로 사용자 정의 ChatGPT/Codex 호환 백엔드를 원하는 경우에만 수동으로 설정하세요. + +
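Smart Extraction은 추출한 기억을 저장하기 전에 2단계 중복 제거(벡터 유사도 사전 필터 ≥ 0.7 → LLM 의미 판단)를 수행합니다. 그 흐름은 개념적으로 다음 스케치와 같습니다. LLM 판단은 모의 함수로 대체했으며, 함수 이름과 흐름은 설명용 가정입니다.

```typescript
// 2단계 중복 제거 개념 스케치 (가정):
//  1단계: 벡터 유사도 >= 0.7 인 기존 기억만 후보로 남김
//  2단계: LLM이 CREATE / MERGE / SKIP 판단 (여기서는 모의 함수로 대체)
type DedupDecision = "CREATE" | "MERGE" | "SKIP";

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// LLM 판단의 모의 구현 (가정): 실제로는 프롬프트를 통해 의미 판단
function mockLlmJudge(_newText: string, _existingText: string): DedupDecision {
  return "MERGE";
}

function dedup(
  newVec: number[],
  newText: string,
  existing: { vector: number[]; text: string }[],
): DedupDecision {
  // 1단계: 벡터 유사도 사전 필터 (>= 0.7)
  const candidates = existing.filter((m) => cosine(newVec, m.vector) >= 0.7);
  if (candidates.length === 0) return "CREATE"; // 유사 후보 없음 → 신규 생성
  // 2단계: LLM 의미 판단
  return mockLlmJudge(newText, candidates[0].text);
}

const decision1 = dedup([1, 0], "탭 선호", [{ vector: [0, 1], text: "무관한 기억" }]);
const decision2 = dedup([1, 0], "탭 선호", [{ vector: [0.9, 0.1], text: "탭을 선호함" }]);
```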
+ +
+라이프사이클 설정 (감쇠 + 단계) + +| 필드 | 기본값 | 설명 | +|-------|---------|-------------| +| `decay.recencyHalfLifeDays` | `30` | Weibull 최신성 감쇠의 기본 반감기 | +| `decay.frequencyWeight` | `0.3` | 복합 점수에서 접근 빈도의 가중치 | +| `decay.intrinsicWeight` | `0.3` | `importance × confidence`의 가중치 | +| `decay.betaCore` | `0.8` | `core` 기억의 Weibull 베타 | +| `decay.betaWorking` | `1.0` | `working` 기억의 Weibull 베타 | +| `decay.betaPeripheral` | `1.3` | `peripheral` 기억의 Weibull 베타 | +| `tier.coreAccessThreshold` | `10` | `core`로 승격하기 위한 최소 호출 횟수 | +| `tier.peripheralAgeDays` | `60` | 오래된 기억을 강등하기 위한 경과 일수 임계값 | + +
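위 파라미터 중 감쇠 부분이 어떤 형태의 식인지는 아래 최소 스케치로 확인할 수 있습니다. `src/decay-engine.ts`의 정확한 구현이 아니라, "복합 점수 = 최신성 + 빈도 + 내재적 가치" 구성을 보여주기 위한 가정 코드입니다(빈도 항의 포화식과 잔여 가중치 배분은 가정입니다).

```typescript
// Weibull 감쇠 복합 점수 개념 스케치 (가정).
const recencyHalfLifeDays = 30; // decay.recencyHalfLifeDays
const frequencyWeight = 0.3;    // decay.frequencyWeight
const intrinsicWeight = 0.3;    // decay.intrinsicWeight
const betaWorking = 1.0;        // decay.betaWorking (beta=1 은 일반 지수 감쇠)

// Weibull 형 최신성: 경과 일수 = recencyHalfLifeDays 에서 정확히 0.5 가 되도록 정규화
function freshness(ageDays: number, beta: number): number {
  const lambda = recencyHalfLifeDays / Math.pow(Math.LN2, 1 / beta);
  return Math.exp(-Math.pow(ageDays / lambda, beta));
}

// 접근 횟수를 0-1 로 포화시키는 간단한 빈도 항 (식은 가정)
function frequency(accessCount: number): number {
  return 1 - Math.exp(-accessCount / 10);
}

// 복합 점수 = 최신성 + 빈도 + 내재적 가치 (importance × confidence)
function compositeScore(
  ageDays: number,
  accessCount: number,
  importance: number,
  confidence: number,
): number {
  const freshnessWeight = 1 - frequencyWeight - intrinsicWeight; // 나머지를 최신성에 배분 (가정)
  return (
    freshnessWeight * freshness(ageDays, betaWorking) +
    frequencyWeight * frequency(accessCount) +
    intrinsicWeight * importance * confidence
  );
}

const halfLifePoint = freshness(30, betaWorking); // ≈ 0.5 (반감기 검산)
const score = compositeScore(30, 5, 0.9, 0.8);
```

β가 작을수록 감쇠 곡선의 꼬리가 길어지므로, `betaCore`(0.8)의 core 기억은 `betaPeripheral`(1.3)의 peripheral 기억보다 장기간 유지되기 쉽습니다.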
+ +
+접근 강화 + +자주 불러오는 기억은 더 느리게 감쇠합니다 (간격 반복 학습 방식). + +설정 키 (`retrieval` 하위): +- `reinforcementFactor` (0-2, 기본값: `0.5`) — `0`으로 설정하면 비활성화 +- `maxHalfLifeMultiplier` (1-10, 기본값: `3`) — 유효 반감기의 하드 캡 + +
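`reinforcementFactor`와 `maxHalfLifeMultiplier`가 어떻게 작용하는지는 다음 가정 스케치로 감을 잡을 수 있습니다. 실제 구현의 식이 아니라, "호출 횟수가 많을수록 유효 반감기가 늘어나고 상한에서 멈춘다"는 동작만 보여주는 것입니다(로그 포화식은 가정입니다).

```typescript
// 접근 강화 개념 스케치 (가정): 식은 실제 구현과 동일하지 않습니다.
const reinforcementFactor = 0.5; // 0 으로 설정하면 비활성화
const maxHalfLifeMultiplier = 3; // 유효 반감기의 하드 캡

// 호출 횟수가 많을수록 유효 반감기가 늘어나고 상한에서 멈춤 (로그 포화는 가정)
function effectiveHalfLife(baseDays: number, recallCount: number): number {
  const multiplier = Math.min(
    1 + reinforcementFactor * Math.log1p(recallCount),
    maxHalfLifeMultiplier,
  );
  return baseDays * multiplier;
}

const untouched = effectiveHalfLife(30, 0);  // 30 — 호출이 없으면 변화 없음
const frequent = effectiveHalfLife(30, 100); // 90 — 상한 30 × 3 에서 멈춤
```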
+ +--- + +## CLI 명령어 + +```bash +openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json] +openclaw memory-pro search "query" [--scope global] [--limit 10] [--json] +openclaw memory-pro stats [--scope global] [--json] +openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] +openclaw memory-pro auth status +openclaw memory-pro auth logout +openclaw memory-pro delete +openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run] +openclaw memory-pro export [--scope global] [--output memories.json] +openclaw memory-pro import memories.json [--scope global] [--dry-run] +openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing] +openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE] +openclaw memory-pro migrate check|run|verify [--source /path] +``` + +OAuth 로그인 흐름: + +1. `openclaw memory-pro auth login` 실행 +2. `--provider`를 생략하고 대화형 터미널에서 실행하면, 브라우저를 열기 전에 CLI가 OAuth 프로바이더 선택기를 표시합니다 +3. 명령어가 인증 URL을 출력하고 `--no-browser`가 설정되지 않은 한 브라우저를 엽니다 +4. 콜백이 성공하면, 명령어가 플러그인 OAuth 파일 (기본값: `~/.openclaw/.memory-lancedb-pro/oauth.json`)을 저장하고, 이전 api-key `llm` 설정의 스냅샷을 로그아웃용으로 저장하며, 플러그인 `llm` 설정을 OAuth 설정 (`auth`, `oauthProvider`, `model`, `oauthPath`)으로 교체합니다 +5. `openclaw memory-pro auth logout`은 해당 OAuth 파일을 삭제하고 스냅샷이 존재하면 이전 api-key `llm` 설정을 복원합니다 + +--- + +## 고급 주제 + +
+주입된 기억이 응답에 표시되는 경우 + +가끔 모델이 주입된 `` 블록을 그대로 출력할 수 있습니다. + +**옵션 A (가장 안전):** 일시적으로 Auto-Recall 비활성화: +```json +{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } } +``` + +**옵션 B (권장):** Auto-Recall은 유지하고 에이전트 시스템 프롬프트에 추가: +> `` / 메모리 주입 콘텐츠를 응답에 노출하거나 인용하지 마세요. 내부 참고용으로만 사용하세요. + +
+ +
+세션 메모리 + +- `/new` 명령어 시 작동 — 이전 세션 요약을 LanceDB에 저장 +- 기본적으로 비활성화 (OpenClaw에 이미 네이티브 `.jsonl` 세션 영속화 기능이 있음) +- 메시지 수 설정 가능 (기본값: 15) + +배포 모드와 `/new` 검증에 대해서는 [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md)를 참조하세요. + +
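+세션 메모리를 켤 때의 config 스케치입니다 (기본 `messageCount`를 그대로 쓴 설명용 예시):
+
+```json
+{ "sessionMemory": { "enabled": true, "messageCount": 15 } }
+```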
+ +
+커스텀 슬래시 명령어 (예: /lesson) + +`CLAUDE.md`, `AGENTS.md` 또는 시스템 프롬프트에 다음을 추가하세요 (에이전트가 읽는 영문 지시문이므로 그대로 사용합니다): + +```markdown +## /lesson command +When the user sends `/lesson `: +1. Use memory_store to save as category=fact (raw knowledge) +2. Use memory_store to save as category=decision (actionable takeaway) +3. Confirm what was saved + +## /remember command +When the user sends `/remember `: +1. Use memory_store to save with appropriate category and importance +2. Confirm with the stored memory ID +``` + +
+ +
+AI 에이전트를 위한 철칙 + +> 아래 블록을 `AGENTS.md`에 복사하여 에이전트가 이 규칙을 자동으로 적용하도록 하세요 (에이전트가 읽는 영문 지시문이므로 그대로 사용합니다). + +```markdown +## Rule 1 — Dual-layer memory storage +Every pitfall/lesson learned → IMMEDIATELY store TWO memories: +- Technical layer: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid] + (category: fact, importance >= 0.8) +- Principle layer: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do] + (category: decision, importance >= 0.85) + +## Rule 2 — LanceDB hygiene +Entries must be short and atomic (< 500 chars). No raw conversation summaries or duplicates. + +## Rule 3 — Recall before retry +On ANY tool failure, ALWAYS memory_recall with relevant keywords BEFORE retrying. + +## Rule 4 — Confirm target codebase +Confirm you are editing memory-lancedb-pro vs built-in memory-lancedb before changes. + +## Rule 5 — Clear jiti cache after plugin code changes +After modifying .ts files under plugins/, MUST run rm -rf /tmp/jiti/ BEFORE openclaw gateway restart. +``` + +
+ +
+데이터베이스 스키마 + +LanceDB 테이블 `memories`: + +| 필드 | 타입 | 설명 | +| --- | --- | --- | +| `id` | string (UUID) | 기본 키 | +| `text` | string | 기억 텍스트 (FTS 인덱싱됨) | +| `vector` | float[] | 임베딩 벡터 | +| `category` | string | 저장 카테고리: `preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | 스코프 식별자 (예: `global`, `agent:main`) | +| `importance` | float | 중요도 점수 0-1 | +| `timestamp` | int64 | 생성 타임스탬프 (ms) | +| `metadata` | string (JSON) | 확장 메타데이터 | + +v1.1.0의 주요 `metadata` 키: `l0_abstract`, `l1_overview`, `l2_content`, `memory_category`, `tier`, `access_count`, `confidence`, `last_accessed_at` + +> **카테고리 참고:** 최상위 `category` 필드는 6개 저장 카테고리를 사용합니다. Smart Extraction의 6개 카테고리 의미 라벨 (`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`)은 `metadata.memory_category`에 저장됩니다. + +
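+v1.1.0 레코드에서 `metadata` 필드 (JSON 문자열)가 가질 수 있는 형태의 가상 예시입니다 (모든 값은 설명용 가정치):
+
+```json
+{
+  "l0_abstract": "사용자는 들여쓰기에 탭을 선호",
+  "l1_overview": "코드 스타일 선호: 탭 들여쓰기, 에러 처리 필수",
+  "l2_content": "사용자가 여러 세션에 걸쳐 탭 들여쓰기와 에러 처리 추가를 반복 요청함",
+  "memory_category": "preferences",
+  "tier": "working",
+  "access_count": 4,
+  "confidence": 0.9,
+  "last_accessed_at": 1739500000000
+}
+```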
+ +
+문제 해결 + +### "Cannot mix BigInt and other types" (LanceDB / Apache Arrow) + +LanceDB 0.26 이상에서 일부 숫자 열이 `BigInt`로 반환될 수 있습니다. **memory-lancedb-pro >= 1.0.14**로 업그레이드하세요 — 이 플러그인은 이제 산술 연산 전에 `Number(...)`를 사용하여 값을 변환합니다. + +
+ +--- + +## 문서 + +| 문서 | 설명 | +| --- | --- | +| [OpenClaw 통합 플레이북](docs/openclaw-integration-playbook.md) | 배포 모드, 검증, 회귀 매트릭스 | +| [메모리 아키텍처 분석](docs/memory_architecture_analysis.md) | 전체 아키텍처 심층 분석 | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | v1.1.0 동작 변경사항 및 업그레이드 근거 | +| [장문 컨텍스트 청킹](docs/long-context-chunking.md) | 긴 문서를 위한 청킹 전략 | + +--- + +## 베타: Smart Memory v1.1.0 + +> 상태: 베타 — `npm i memory-lancedb-pro@beta`로 사용 가능. `latest`를 사용하는 안정 버전 사용자는 영향 없음. + +| 기능 | 설명 | +|---------|-------------| +| **Smart Extraction** | LLM 기반 6개 카테고리 추출 + L0/L1/L2 메타데이터. 비활성화 시 정규식으로 폴백. | +| **라이프사이클 스코어링** | 검색에 Weibull 감쇠 통합 — 높은 빈도와 높은 중요도의 기억이 상위에 랭크. | +| **단계 관리** | 3단계 시스템 (Core → Working → Peripheral), 자동 승격/강등. | + +피드백: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · 되돌리기: `npm i memory-lancedb-pro@latest` + +--- + +## 의존성 + +| 패키지 | 용도 | +| --- | --- | +| `@lancedb/lancedb` ≥0.26.2 | 벡터 데이터베이스 (ANN + FTS) | +| `openai` ≥6.21.0 | OpenAI 호환 Embedding API 클라이언트 | +| `@sinclair/typebox` 0.34.48 | JSON Schema 타입 정의 | + +--- + +## Contributors + +

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +

+ +Full list: [Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors) + +## Star History + + + + + + Star History Chart + + + +## 라이선스 + +MIT + +--- + +## WeChat QR 코드 + + diff --git a/README_PT-BR.md b/README_PT-BR.md new file mode 100644 index 00000000..65d721f8 --- /dev/null +++ b/README_PT-BR.md @@ -0,0 +1,773 @@ +
+ +# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin + +**Assistente de Memória IA para Agentes [OpenClaw](https://github.com/openclaw/openclaw)** + +*Dê ao seu agente de IA um cérebro que realmente lembra — entre sessões, entre agentes, ao longo do tempo.* + +Um plugin de memória de longo prazo para OpenClaw baseado em LanceDB que armazena preferências, decisões e contexto de projetos, e os recupera automaticamente em sessões futuras. + +[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) +[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) +[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) + +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) + +
+ +--- + +## Por que memory-lancedb-pro? + +A maioria dos agentes de IA sofre de amnésia. Eles esquecem tudo no momento em que você inicia um novo chat. + +**memory-lancedb-pro** é um plugin de memória de longo prazo de nível de produção para OpenClaw que transforma seu agente em um verdadeiro **Assistente de Memória IA** — captura automaticamente o que importa, deixa o ruído desaparecer naturalmente e recupera a memória certa no momento certo. Sem tags manuais, sem dores de cabeça com configuração. + +### Seu Assistente de Memória IA em ação + +**Sem memória — cada sessão começa do zero:** + +> **Você:** "Use tabs para indentação, sempre adicione tratamento de erros." +> *(próxima sessão)* +> **Você:** "Eu já te disse — tabs, não espaços!" 😤 +> *(próxima sessão)* +> **Você:** "…sério, tabs. E tratamento de erros. De novo." + +**Com memory-lancedb-pro — seu agente aprende e lembra:** + +> **Você:** "Use tabs para indentação, sempre adicione tratamento de erros." +> *(próxima sessão — agente recupera automaticamente suas preferências)* +> **Agente:** *(aplica silenciosamente tabs + tratamento de erros)* ✅ +> **Você:** "Por que escolhemos PostgreSQL em vez de MongoDB no mês passado?" +> **Agente:** "Com base na nossa discussão de 12 de fevereiro, os principais motivos foram…" ✅ + +Essa é a diferença que um **Assistente de Memória IA** faz — aprende seu estilo, lembra decisões passadas e entrega respostas personalizadas sem você precisar se repetir. + +### O que mais ele pode fazer? 
+ +| | O que você obtém | +|---|---| +| **Auto-Capture** | Seu agente aprende de cada conversa — sem necessidade de `memory_store` manual | +| **Extração inteligente** | Classificação LLM em 6 categorias: perfis, preferências, entidades, eventos, casos, padrões | +| **Esquecimento inteligente** | Modelo de decaimento Weibull — memórias importantes permanecem, ruído desaparece | +| **Busca híbrida** | Busca vetorial + BM25 full-text, fundida com reranking cross-encoder | +| **Injeção de contexto** | Memórias relevantes aparecem automaticamente antes de cada resposta | +| **Isolamento multi-scope** | Limites de memória por agente, por usuário, por projeto | +| **Qualquer provedor** | OpenAI, Jina, Gemini, Ollama ou qualquer API compatível com OpenAI | +| **Toolkit completo** | CLI, backup, migração, upgrade, exportação/importação — pronto para produção | + +--- + +## Início rápido + +### Opção A: Script de instalação com um clique (recomendado) + +O **[script de instalação](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** mantido pela comunidade gerencia instalação, atualização e reparo em um único comando: + +```bash +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> Veja [Ecossistema](#ecossistema) abaixo para a lista completa de cenários cobertos e outras ferramentas da comunidade. + +### Opção B: Instalação manual + +**Via OpenClaw CLI (recomendado):** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**Ou via npm:** +```bash +npm i memory-lancedb-pro@beta +``` +> Se usar npm, você também precisará adicionar o diretório de instalação do plugin como caminho **absoluto** em `plugins.load.paths` no seu `openclaw.json`. Este é o problema de configuração mais comum. 
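+Um esboço hipotético do trecho de `openclaw.json` a que a nota acima se refere — o caminho abaixo é apenas ilustrativo; use o diretório real onde o npm instalou o plugin:
+
+```json
+{
+  "plugins": {
+    "load": {
+      "paths": ["/home/user/.openclaw/node_modules/memory-lancedb-pro"]
+    }
+  }
+}
+```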
+ +Adicione ao seu `openclaw.json`: + +```json +{ + "plugins": { + "slots": { "memory": "memory-lancedb-pro" }, + "entries": { + "memory-lancedb-pro": { + "enabled": true, + "config": { + "embedding": { + "provider": "openai-compatible", + "apiKey": "${OPENAI_API_KEY}", + "model": "text-embedding-3-small" + }, + "autoCapture": true, + "autoRecall": true, + "smartExtraction": true, + "extractMinMessages": 2, + "extractMaxChars": 8000, + "sessionMemory": { "enabled": false } + } + } + } + } +} +``` + +**Por que esses valores padrão?** +- `autoCapture` + `smartExtraction` → seu agente aprende automaticamente de cada conversa +- `autoRecall` → memórias relevantes são injetadas antes de cada resposta +- `extractMinMessages: 2` → a extração é acionada em chats normais de dois turnos +- `sessionMemory.enabled: false` → evita poluir a busca com resumos de sessão no início + +Valide e reinicie: + +```bash +openclaw config validate +openclaw gateway restart +openclaw logs --follow --plain | grep "memory-lancedb-pro" +``` + +Você deve ver: +- `memory-lancedb-pro: smart extraction enabled` +- `memory-lancedb-pro@...: plugin registered` + +Pronto! Seu agente agora tem memória de longo prazo. + +
+Mais caminhos de instalação (usuários existentes, upgrades) + +**Já está usando OpenClaw?** + +1. Adicione o plugin com um caminho **absoluto** em `plugins.load.paths` +2. Vincule o slot de memória: `plugins.slots.memory = "memory-lancedb-pro"` +3. Verifique: `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**Atualizando de versões anteriores ao v1.1.0?** + +```bash +# 1) Backup +openclaw memory-pro export --scope global --output memories-backup.json +# 2) Dry run +openclaw memory-pro upgrade --dry-run +# 3) Run upgrade +openclaw memory-pro upgrade +# 4) Verify +openclaw memory-pro stats +``` + +Veja `CHANGELOG-v1.1.0.md` para mudanças de comportamento e justificativa de upgrade. + +
+ +
+Importação rápida via Telegram Bot (clique para expandir) + +Se você está usando a integração Telegram do OpenClaw, a maneira mais fácil é enviar um comando de importação diretamente para o Bot principal em vez de editar a configuração manualmente. + +Envie esta mensagem: + +```text +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction +6. extractMinMessages=2 +7. sessionMemory.enabled=false +8. captureAssistant=false +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. Generate the final openclaw.json config directly, not just an explanation +``` + +
+ +--- + +## Ecossistema + +memory-lancedb-pro é o plugin principal. A comunidade construiu ferramentas ao redor dele para tornar a configuração e o uso diário ainda mais suaves: + +### Script de instalação — Instalação, atualização e reparo com um clique + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +Não é apenas um instalador simples — o script lida inteligentemente com diversos cenários reais: + +| Sua situação | O que o script faz | +|---|---| +| Nunca instalou | Download → instalar dependências → escolher config → gravar em openclaw.json → reiniciar | +| Instalado via `git clone`, preso em um commit antigo | `git fetch` + `checkout` automático para a versão mais recente → reinstalar dependências → verificar | +| Config tem campos inválidos | Detecção automática via filtro de schema, remoção de campos não suportados | +| Instalado via `npm` | Pula atualização git, lembra de executar `npm update` por conta própria | +| CLI `openclaw` quebrado por config inválida | Fallback: ler caminho do workspace diretamente do arquivo `openclaw.json` | +| `extensions/` em vez de `plugins/` | Detecção automática da localização do plugin a partir da config ou sistema de arquivos | +| Já está atualizado | Executa apenas verificações de saúde, sem alterações | + +```bash +bash setup-memory.sh # Install or upgrade +bash setup-memory.sh --dry-run # Preview only +bash setup-memory.sh --beta # Include pre-release versions +bash setup-memory.sh --uninstall # Revert config and remove plugin +``` + +Presets de provedores integrados: **Jina / DashScope / SiliconFlow / OpenAI / Ollama**, ou traga sua própria API compatível com OpenAI. Para uso completo (incluindo `--ref`, `--selfcheck-only` e mais), veja o [README do script de instalação](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup). 
+ +### Claude Code / OpenClaw Skill — Configuração guiada por IA + +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** + +Instale esta Skill e seu agente de IA (Claude Code ou OpenClaw) ganha conhecimento profundo de todas as funcionalidades do memory-lancedb-pro. Basta dizer **"me ajude a ativar a melhor configuração"** e obtenha: + +- **Workflow de configuração guiado em 7 etapas** com 4 planos de implantação: + - Full Power (Jina + OpenAI) / Budget (reranker SiliconFlow gratuito) / Simple (apenas OpenAI) / Totalmente local (Ollama, custo API zero) +- **Todas as 9 ferramentas MCP** usadas corretamente: `memory_recall`, `memory_store`, `memory_forget`, `memory_update`, `memory_stats`, `memory_list`, `self_improvement_log`, `self_improvement_extract_skill`, `self_improvement_review` *(o toolkit completo requer `enableManagementTools: true` — a configuração padrão do Quick Start expõe as 4 ferramentas principais)* +- **Prevenção de armadilhas comuns**: ativação de plugin workspace, `autoRecall` padrão false, cache jiti, variáveis de ambiente, isolamento de scope, etc. + +**Instalação para Claude Code:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro +``` + +**Instalação para OpenClaw:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## Tutorial em vídeo + +> Guia completo: instalação, configuração e funcionamento interno da busca híbrida. 
+ +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## Arquitetura + +``` +┌─────────────────────────────────────────────────────────┐ +│ index.ts (Entry Point) │ +│ Plugin Registration · Config Parsing · Lifecycle Hooks │ +└────────┬──────────┬──────────┬──────────┬───────────────┘ + │ │ │ │ + ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ + │ store │ │embedder│ │retriever│ │ scopes │ + │ .ts │ │ .ts │ │ .ts │ │ .ts │ + └────────┘ └────────┘ └────────┘ └─────────────┘ + │ │ + ┌────▼───┐ ┌─────▼──────────┐ + │migrate │ │noise-filter.ts │ + │ .ts │ │adaptive- │ + └────────┘ │retrieval.ts │ + └────────────────┘ + ┌─────────────┐ ┌──────────┐ + │ tools.ts │ │ cli.ts │ + │ (Agent API) │ │ (CLI) │ + └─────────────┘ └──────────┘ +``` + +> Para um mergulho profundo na arquitetura completa, veja [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md). + +
+Referência de arquivos (clique para expandir) + +| Arquivo | Finalidade | +| --- | --- | +| `index.ts` | Ponto de entrada do plugin. Registra na API de Plugin do OpenClaw, analisa config, monta lifecycle hooks | +| `openclaw.plugin.json` | Metadados do plugin + declaração completa de config via JSON Schema | +| `cli.ts` | Comandos CLI: `memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | Camada de armazenamento LanceDB. Criação de tabelas / Indexação FTS / Busca vetorial / Busca BM25 / CRUD | +| `src/embedder.ts` | Abstração de embedding. Compatível com qualquer provedor de API compatível com OpenAI | +| `src/retriever.ts` | Motor de busca híbrida. Vector + BM25 → Fusão Híbrida → Rerank → Decaimento do Ciclo de Vida → Filtro | +| `src/scopes.ts` | Controle de acesso multi-scope | +| `src/tools.ts` | Definições de ferramentas do agente: `memory_recall`, `memory_store`, `memory_forget`, `memory_update` + ferramentas de gerenciamento | +| `src/noise-filter.ts` | Filtra recusas do agente, meta-perguntas, saudações e conteúdo de baixa qualidade | +| `src/adaptive-retrieval.ts` | Determina se uma consulta precisa de busca na memória | +| `src/migrate.ts` | Migração do `memory-lancedb` integrado para o Pro | +| `src/smart-extractor.ts` | Extração LLM em 6 categorias com armazenamento em camadas L0/L1/L2 e deduplicação em dois estágios | +| `src/decay-engine.ts` | Modelo de decaimento exponencial esticado Weibull | +| `src/tier-manager.ts` | Promoção/rebaixamento em três níveis: Peripheral ↔ Working ↔ Core | + +
+ +--- + +## Funcionalidades principais + +### Busca híbrida + +``` +Query → embedQuery() ─┐ + ├─→ Hybrid Fusion → Rerank → Lifecycle Decay Boost → Length Norm → Filter +Query → BM25 FTS ─────┘ +``` + +- **Busca vetorial** — similaridade semântica via LanceDB ANN (distância cosseno) +- **Busca full-text BM25** — correspondência exata de palavras-chave via índice FTS do LanceDB +- **Fusão híbrida** — pontuação vetorial como base, resultados BM25 recebem boost ponderado (não é RRF padrão — ajustado para qualidade de recall no mundo real) +- **Pesos configuráveis** — `vectorWeight`, `bm25Weight`, `minScore` + +### Reranking Cross-Encoder + +- Adaptadores integrados para **Jina**, **SiliconFlow**, **Voyage AI** e **Pinecone** +- Compatível com qualquer endpoint compatível com Jina (ex.: Hugging Face TEI, DashScope) +- Pontuação híbrida: 60% cross-encoder + 40% pontuação fundida original +- Degradação elegante: fallback para similaridade cosseno em caso de falha da API + +### Pipeline de pontuação multi-estágio + +| Estágio | Efeito | +| --- | --- | +| **Fusão híbrida** | Combina recall semântico e correspondência exata | +| **Rerank Cross-Encoder** | Promove resultados semanticamente precisos | +| **Boost de decaimento do ciclo de vida** | Frescor Weibull + frequência de acesso + importância × confiança | +| **Normalização de comprimento** | Impede que entradas longas dominem (âncora: 500 caracteres) | +| **Pontuação mínima rígida** | Remove resultados irrelevantes (padrão: 0.35) | +| **Diversidade MMR** | Similaridade cosseno > 0.85 → rebaixado | + +### Extração inteligente de memória (v1.1.0) + +- **Extração LLM em 6 categorias**: perfil, preferências, entidades, eventos, casos, padrões +- **Armazenamento em camadas L0/L1/L2**: L0 (índice em uma frase) → L1 (resumo estruturado) → L2 (narrativa completa) +- **Deduplicação em dois estágios**: pré-filtro de similaridade vetorial (≥0.7) → decisão semântica LLM (CREATE/MERGE/SKIP) +- **Fusão consciente de categorias**: 
`profile` sempre funde, `events`/`cases` apenas adicionam + +### Gerenciamento do ciclo de vida da memória (v1.1.0) + +- **Motor de decaimento Weibull**: pontuação composta = frescor + frequência + valor intrínseco +- **Promoção em três níveis**: `Peripheral ↔ Working ↔ Core` com limiares configuráveis +- **Reforço por acesso**: memórias recuperadas frequentemente decaem mais lentamente (estilo repetição espaçada) +- **Meia-vida modulada pela importância**: memórias importantes decaem mais lentamente + +### Isolamento multi-scope + +- Scopes integrados: `global`, `agent:`, `custom:`, `project:`, `user:` +- Controle de acesso no nível do agente via `scopes.agentAccess` +- Padrão: cada agente acessa `global` + seu próprio scope `agent:` + +### Auto-Capture e Auto-Recall + +- **Auto-Capture** (`agent_end`): extrai preferências/fatos/decisões/entidades das conversas, deduplica, armazena até 3 por turno +- **Auto-Recall** (`before_agent_start`): injeta contexto `` (até 3 entradas) + +### Filtragem de ruído e busca adaptativa + +- Filtra conteúdo de baixa qualidade: recusas do agente, meta-perguntas, saudações +- Pula a busca para: saudações, comandos slash, confirmações simples, emoji +- Força a busca para palavras-chave de memória ("lembra", "anteriormente", "da última vez") +- Limiares CJK (chinês: 6 caracteres vs inglês: 15 caracteres) + +--- + +
+Comparação com o memory-lancedb integrado (clique para expandir) + +| Funcionalidade | `memory-lancedb` integrado | **memory-lancedb-pro** | +| --- | :---: | :---: | +| Busca vetorial | Sim | Sim | +| Busca full-text BM25 | - | Sim | +| Fusão híbrida (Vector + BM25) | - | Sim | +| Rerank cross-encoder (multi-provedor) | - | Sim | +| Boost de frescor e decaimento temporal | - | Sim | +| Normalização de comprimento | - | Sim | +| Diversidade MMR | - | Sim | +| Isolamento multi-scope | - | Sim | +| Filtragem de ruído | - | Sim | +| Busca adaptativa | - | Sim | +| CLI de gerenciamento | - | Sim | +| Memória de sessão | - | Sim | +| Embeddings conscientes de tarefa | - | Sim | +| **Extração inteligente LLM (6 categorias)** | - | Sim (v1.1.0) | +| **Decaimento Weibull + Promoção de nível** | - | Sim (v1.1.0) | +| Qualquer embedding compatível com OpenAI | Limitado | Sim | + +
+ +--- + +## Configuração + +
+Exemplo de configuração completa + +```json +{ + "embedding": { + "apiKey": "${JINA_API_KEY}", + "model": "jina-embeddings-v5-text-small", + "baseURL": "https://api.jina.ai/v1", + "dimensions": 1024, + "taskQuery": "retrieval.query", + "taskPassage": "retrieval.passage", + "normalized": true + }, + "dbPath": "~/.openclaw/memory/lancedb-pro", + "autoCapture": true, + "autoRecall": true, + "retrieval": { + "mode": "hybrid", + "vectorWeight": 0.7, + "bm25Weight": 0.3, + "minScore": 0.3, + "rerank": "cross-encoder", + "rerankApiKey": "${JINA_API_KEY}", + "rerankModel": "jina-reranker-v3", + "rerankEndpoint": "https://api.jina.ai/v1/rerank", + "rerankProvider": "jina", + "candidatePoolSize": 20, + "recencyHalfLifeDays": 14, + "recencyWeight": 0.1, + "filterNoise": true, + "lengthNormAnchor": 500, + "hardMinScore": 0.35, + "timeDecayHalfLifeDays": 60, + "reinforcementFactor": 0.5, + "maxHalfLifeMultiplier": 3 + }, + "enableManagementTools": false, + "scopes": { + "default": "global", + "definitions": { + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } + }, + "agentAccess": { + "discord-bot": ["global", "agent:discord-bot"] + } + }, + "sessionMemory": { + "enabled": false, + "messageCount": 15 + }, + "smartExtraction": true, + "llm": { + "apiKey": "${OPENAI_API_KEY}", + "model": "gpt-4o-mini", + "baseURL": "https://api.openai.com/v1" + }, + "extractMinMessages": 2, + "extractMaxChars": 8000 +} +``` + +
+ +
+Provedores de Embedding + +Funciona com **qualquer API de embedding compatível com OpenAI**: + +| Provedor | Modelo | Base URL | Dimensões | +| --- | --- | --- | --- | +| **Jina** (recomendado) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | +| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | +| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | +| **Ollama** (local) | `nomic-embed-text` | `http://localhost:11434/v1` | específico do provedor | + +
+ +
+Provedores de Rerank + +O reranking cross-encoder suporta múltiplos provedores via `rerankProvider`: + +| Provedor | `rerankProvider` | Modelo de exemplo | +| --- | --- | --- | +| **Jina** (padrão) | `jina` | `jina-reranker-v3` | +| **SiliconFlow** (plano gratuito disponível) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | + +Qualquer endpoint de rerank compatível com Jina também funciona — defina `rerankProvider: "jina"` e aponte `rerankEndpoint` para seu serviço (ex.: Hugging Face TEI, DashScope `qwen3-rerank`). + +
+ +
+Extração inteligente (LLM) — v1.1.0 + +Quando `smartExtraction` está habilitado (padrão: `true`), o plugin usa um LLM para extrair e classificar memórias de forma inteligente em vez de gatilhos baseados em regex. + +| Campo | Tipo | Padrão | Descrição | +|-------|------|--------|-----------| +| `smartExtraction` | boolean | `true` | Habilitar/desabilitar extração LLM em 6 categorias | +| `llm.auth` | string | `api-key` | `api-key` usa `llm.apiKey` / `embedding.apiKey`; `oauth` usa um arquivo de token OAuth com escopo de plugin por padrão | +| `llm.apiKey` | string | *(fallback para `embedding.apiKey`)* | Chave de API para o provedor LLM | +| `llm.model` | string | `openai/gpt-oss-120b` | Nome do modelo LLM | +| `llm.baseURL` | string | *(fallback para `embedding.baseURL`)* | Endpoint da API LLM | +| `llm.oauthProvider` | string | `openai-codex` | ID do provedor OAuth usado quando `llm.auth` é `oauth` | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | Arquivo de token OAuth usado quando `llm.auth` é `oauth` | +| `llm.timeoutMs` | number | `30000` | Timeout da requisição LLM em milissegundos | +| `extractMinMessages` | number | `2` | Mensagens mínimas antes da extração ser acionada | +| `extractMaxChars` | number | `8000` | Máximo de caracteres enviados ao LLM | + + +Configuração `llm` com OAuth (usa cache de login existente do Codex / ChatGPT para chamadas LLM): +```json +{ + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } +} +``` + +Notas para `llm.auth: "oauth"`: + +- `llm.oauthProvider` é atualmente `openai-codex`. +- Tokens OAuth têm como padrão `~/.openclaw/.memory-lancedb-pro/oauth.json`. +- Você pode definir `llm.oauthPath` se quiser armazenar esse arquivo em outro lugar. 
+- `auth login` faz snapshot da configuração `llm` anterior (api-key) ao lado do arquivo OAuth, e `auth logout` restaura esse snapshot quando disponível. +- Mudar de `api-key` para `oauth` não transfere automaticamente `llm.baseURL`. Defina-o manualmente no modo OAuth apenas quando você intencionalmente quiser um backend personalizado compatível com ChatGPT/Codex. + +
+ +
+Configuração do ciclo de vida (Decaimento + Nível) + +| Campo | Padrão | Descrição | +|-------|--------|-----------| +| `decay.recencyHalfLifeDays` | `30` | Meia-vida base para decaimento de frescor Weibull | +| `decay.frequencyWeight` | `0.3` | Peso da frequência de acesso na pontuação composta | +| `decay.intrinsicWeight` | `0.3` | Peso de `importance × confidence` | +| `decay.betaCore` | `0.8` | Beta Weibull para memórias `core` | +| `decay.betaWorking` | `1.0` | Beta Weibull para memórias `working` | +| `decay.betaPeripheral` | `1.3` | Beta Weibull para memórias `peripheral` | +| `tier.coreAccessThreshold` | `10` | Contagem mínima de recall antes de promover para `core` | +| `tier.peripheralAgeDays` | `60` | Limiar de idade para rebaixar memórias inativas | + +
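+Um esboço de config do plugin usando exatamente os padrões da tabela de ciclo de vida (exemplo ilustrativo; confira a estrutura real das chaves no JSON Schema de `openclaw.plugin.json`):
+
+```json
+{
+  "decay": {
+    "recencyHalfLifeDays": 30,
+    "frequencyWeight": 0.3,
+    "intrinsicWeight": 0.3,
+    "betaCore": 0.8,
+    "betaWorking": 1.0,
+    "betaPeripheral": 1.3
+  },
+  "tier": {
+    "coreAccessThreshold": 10,
+    "peripheralAgeDays": 60
+  }
+}
+```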
+ +
+Reforço por acesso + +Memórias recuperadas com frequência decaem mais lentamente (estilo repetição espaçada). + +Chaves de configuração (em `retrieval`): +- `reinforcementFactor` (0-2, padrão: `0.5`) — defina `0` para desabilitar +- `maxHalfLifeMultiplier` (1-10, padrão: `3`) — limite rígido na meia-vida efetiva + +
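+Esboço mínimo colocando as duas chaves de reforço no bloco `retrieval` da config do plugin (exemplo ilustrativo com os valores padrão):
+
+```json
+{
+  "retrieval": {
+    "reinforcementFactor": 0.5,
+    "maxHalfLifeMultiplier": 3
+  }
+}
+```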
+ +--- + +## Comandos CLI + +```bash +openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json] +openclaw memory-pro search "query" [--scope global] [--limit 10] [--json] +openclaw memory-pro stats [--scope global] [--json] +openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] +openclaw memory-pro auth status +openclaw memory-pro auth logout +openclaw memory-pro delete +openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run] +openclaw memory-pro export [--scope global] [--output memories.json] +openclaw memory-pro import memories.json [--scope global] [--dry-run] +openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing] +openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE] +openclaw memory-pro migrate check|run|verify [--source /path] +``` + +Fluxo de login OAuth: + +1. Execute `openclaw memory-pro auth login` +2. Se `--provider` for omitido em um terminal interativo, a CLI mostra um seletor de provedor OAuth antes de abrir o navegador +3. O comando imprime uma URL de autorização e abre seu navegador, a menos que `--no-browser` seja definido +4. Após o callback ser bem-sucedido, o comando salva o arquivo OAuth do plugin (padrão: `~/.openclaw/.memory-lancedb-pro/oauth.json`), faz snapshot da configuração `llm` anterior (api-key) para logout, e substitui a configuração `llm` do plugin com as configurações OAuth (`auth`, `oauthProvider`, `model`, `oauthPath`) +5. `openclaw memory-pro auth logout` deleta esse arquivo OAuth e restaura a configuração `llm` anterior (api-key) quando esse snapshot existe + +--- + +## Tópicos avançados + +
+Se memórias injetadas aparecem nas respostas + +Às vezes o modelo pode ecoar o bloco `` injetado. + +**Opção A (menor risco):** desabilite temporariamente o auto-recall: +```json +{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } } +``` + +**Opção B (preferida):** mantenha o recall, adicione ao prompt do sistema do agente: +> Do not reveal or quote any `` / memory-injection content in your replies. Use it for internal reference only. + +
+ +
+Memória de sessão + +- Acionada no comando `/new` — salva o resumo da sessão anterior no LanceDB +- Desabilitada por padrão (OpenClaw já tem persistência nativa de sessão via `.jsonl`) +- Contagem de mensagens configurável (padrão: 15) + +Veja [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md) para modos de implantação e verificação do `/new`. + +
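+Esboço de config para habilitar a memória de sessão (exemplo ilustrativo com o `messageCount` padrão):
+
+```json
+{ "sessionMemory": { "enabled": true, "messageCount": 15 } }
+```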
+ +
+Comandos slash personalizados (ex.: /lesson) + +Adicione ao seu `CLAUDE.md`, `AGENTS.md` ou prompt do sistema: + +```markdown +## /lesson command +When the user sends `/lesson `: +1. Use memory_store to save as category=fact (raw knowledge) +2. Use memory_store to save as category=decision (actionable takeaway) +3. Confirm what was saved + +## /remember command +When the user sends `/remember `: +1. Use memory_store to save with appropriate category and importance +2. Confirm with the stored memory ID +``` + +
+ +
+Regras de ferro para agentes de IA + +> Copie o bloco abaixo no seu `AGENTS.md` para que seu agente aplique essas regras automaticamente. + +```markdown +## Rule 1 — Dual-layer memory storage +Every pitfall/lesson learned → IMMEDIATELY store TWO memories: +- Technical layer: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid] + (category: fact, importance >= 0.8) +- Principle layer: Decision principle ([tag]): [behavioral rule]. Trigger: [when]. Action: [what to do] + (category: decision, importance >= 0.85) + +## Rule 2 — LanceDB hygiene +Entries must be short and atomic (< 500 chars). No raw conversation summaries or duplicates. + +## Rule 3 — Recall before retry +On ANY tool failure, ALWAYS memory_recall with relevant keywords BEFORE retrying. + +## Rule 4 — Confirm target codebase +Confirm you are editing memory-lancedb-pro vs built-in memory-lancedb before changes. + +## Rule 5 — Clear jiti cache after plugin code changes +After modifying .ts files under plugins/, MUST run rm -rf /tmp/jiti/ BEFORE openclaw gateway restart. +``` + +
+ +
+Schema do banco de dados + +Tabela LanceDB `memories`: + +| Campo | Tipo | Descrição | +| --- | --- | --- | +| `id` | string (UUID) | Chave primária | +| `text` | string | Texto da memória (indexado FTS) | +| `vector` | float[] | Vetor de embedding | +| `category` | string | Categoria de armazenamento: `preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | Identificador de scope (ex.: `global`, `agent:main`) | +| `importance` | float | Pontuação de importância 0-1 | +| `timestamp` | int64 | Timestamp de criação (ms) | +| `metadata` | string (JSON) | Metadados estendidos | + +Chaves `metadata` comuns no v1.1.0: `l0_abstract`, `l1_overview`, `l2_content`, `memory_category`, `tier`, `access_count`, `confidence`, `last_accessed_at` + +> **Nota sobre categorias:** O campo `category` de nível superior usa 6 categorias de armazenamento. As 6 categorias semânticas da Extração Inteligente (`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`) são armazenadas em `metadata.memory_category`. + +
+ +
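Como ilustração de como as camadas L0/L1/L2 podem ser lidas de volta do campo `metadata` (string JSON), um esboço hipotético em TypeScript — não é uma API do plugin, e o objeto `row` abaixo é inventado apenas para o exemplo:

```typescript
// Esboço ilustrativo: lê o campo `metadata` (string JSON) de uma linha
// da tabela `memories` e extrai as chaves L0/L1/L2 descritas acima.

interface MemoryMetadata {
  l0_abstract?: string;
  l1_overview?: string;
  l2_content?: string;
  memory_category?: string;
  tier?: string;
  access_count?: number;
}

function parseMetadata(raw: string): MemoryMetadata {
  try {
    return JSON.parse(raw) as MemoryMetadata;
  } catch {
    return {}; // metadados ausentes ou corrompidos → objeto vazio
  }
}

// Linha de exemplo (hipotética, não vinda do plugin)
const row = {
  category: "fact",
  metadata:
    '{"l0_abstract":"User prefers tabs","memory_category":"preferences","tier":"working","access_count":3}',
};

const meta = parseMetadata(row.metadata);
console.log(meta.l0_abstract, meta.tier); // User prefers tabs working
```

Campos ausentes viram `undefined`, e JSON inválido vira um objeto vazio em vez de quebrar a leitura.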
+Solução de problemas + +### "Cannot mix BigInt and other types" (LanceDB / Apache Arrow) + +No LanceDB 0.26+, algumas colunas numéricas podem ser retornadas como `BigInt`. Atualize para **memory-lancedb-pro >= 1.0.14** — este plugin agora converte valores usando `Number(...)` antes de operações aritméticas. + +
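Um exemplo mínimo do comportamento por trás desse erro e da correção aplicada (esboço ilustrativo, não o código real do plugin):

```typescript
// O LanceDB 0.26+ pode devolver colunas numéricas como BigInt.
// Misturar BigInt com number em aritmética lança TypeError:
//   timestamp + 1  // TypeError: Cannot mix BigInt and other types
// A correção é converter explicitamente com Number(...) antes de operar.

const timestamp: bigint = BigInt("1739355600000"); // valor de exemplo

const asNumber = Number(timestamp) + 1; // funciona: number + number

console.log(typeof timestamp, typeof asNumber); // bigint number
```

`Number(...)` é seguro enquanto o valor couber em `Number.MAX_SAFE_INTEGER` (2^53 − 1), o que vale para timestamps em milissegundos.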
+ +--- + +## Documentação + +| Documento | Descrição | +| --- | --- | +| [Playbook de integração OpenClaw](docs/openclaw-integration-playbook.md) | Modos de implantação, verificação, matriz de regressão | +| [Análise da arquitetura de memória](docs/memory_architecture_analysis.md) | Análise aprofundada da arquitetura completa | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | Mudanças de comportamento v1.1.0 e justificativa de upgrade | +| [Chunking de contexto longo](docs/long-context-chunking.md) | Estratégia de chunking para documentos longos | + +--- + +## Beta: Smart Memory v1.1.0 + +> Status: Beta — disponível via `npm i memory-lancedb-pro@beta`. Usuários estáveis no `latest` não são afetados. + +| Funcionalidade | Descrição | +|---------|-------------| +| **Extração inteligente** | Extração LLM em 6 categorias com metadados L0/L1/L2. Fallback para regex quando desabilitado. | +| **Pontuação do ciclo de vida** | Decaimento Weibull integrado à busca — memórias frequentes e importantes ficam mais bem ranqueadas. | +| **Gerenciamento de níveis** | Sistema de três níveis (Core → Working → Peripheral) com promoção/rebaixamento automático. | + +Feedback: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · Reverter: `npm i memory-lancedb-pro@latest` + +--- + +## Dependências + +| Pacote | Finalidade | +| --- | --- | +| `@lancedb/lancedb` ≥0.26.2 | Banco de dados vetorial (ANN + FTS) | +| `openai` ≥6.21.0 | Cliente de API de Embedding compatível com OpenAI | +| `@sinclair/typebox` 0.34.48 | Definições de tipo JSON Schema | + +--- + +## Contributors + +

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +

+ +Full list: [Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors) + +## Star History + + + + + + Star History Chart + + + +## Licença + +MIT + +--- + +## Meu QR Code WeChat + + diff --git a/README_RU.md b/README_RU.md new file mode 100644 index 00000000..8fcb1031 --- /dev/null +++ b/README_RU.md @@ -0,0 +1,773 @@ +
+ +# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin + +**ИИ-ассистент памяти для агентов [OpenClaw](https://github.com/openclaw/openclaw)** + +*Дайте вашему ИИ-агенту мозг, который действительно помнит: между сессиями, между агентами и с течением времени.* + +Плагин долгосрочной памяти для OpenClaw на базе LanceDB, который сохраняет предпочтения, решения и контекст проекта, а затем автоматически вспоминает их в будущих сессиях. + +[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) +[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) +[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) + +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) + +
+ +--- + +## Почему memory-lancedb-pro? + +Большинство ИИ-агентов страдают амнезией. Они забывают все, как только вы начинаете новый чат. + +**memory-lancedb-pro** — это production-grade плагин долгосрочной памяти для OpenClaw, который превращает вашего агента в настоящего **ИИ-ассистента памяти**. Он автоматически фиксирует важное, позволяет шуму естественно угасать и поднимает нужное воспоминание в нужный момент. Никаких ручных тегов, никаких мучений с конфигурацией. + +### Как это выглядит на практике + +**Без памяти: каждая сессия начинается с нуля** + +> **Вы:** "Используй табы для отступов и всегда добавляй обработку ошибок." +> *(следующая сессия)* +> **Вы:** "Я же уже говорил: табы, а не пробелы!" 😤 +> *(еще одна сессия)* +> **Вы:** "...серьезно, табы. И обработка ошибок. Снова." + +**С memory-lancedb-pro агент учится и помнит** + +> **Вы:** "Используй табы для отступов и всегда добавляй обработку ошибок." +> *(следующая сессия: агент автоматически вспоминает ваши предпочтения)* +> **Агент:** *(молча применяет табы + обработку ошибок)* ✅ +> **Вы:** "Почему в прошлом месяце мы выбрали PostgreSQL, а не MongoDB?" +> **Агент:** "Судя по нашему обсуждению 12 февраля, основные причины были..." ✅ + +В этом и есть разница: **ИИ-ассистент памяти** изучает ваш стиль, вспоминает прошлые решения и дает персонализированные ответы без необходимости повторять одно и то же. + +### Что еще он умеет? 
+ +| | Что вы получаете | +|---|---| +| **Автозахват** | Агент учится на каждом разговоре, без ручного `memory_store` | +| **Умное извлечение** | Классификация на основе LLM по 6 категориям: профили, предпочтения, сущности, события, кейсы, паттерны | +| **Интеллектуальное забывание** | Модель затухания Weibull: важные воспоминания остаются, шум естественно исчезает | +| **Гибридный поиск** | Векторный поиск + полнотекстовый BM25 с объединением и cross-encoder rerank | +| **Инъекция контекста** | Релевантные воспоминания автоматически подаются перед каждым ответом | +| **Изоляция областей памяти** | Границы памяти на уровне агента, пользователя и проекта | +| **Любой провайдер** | OpenAI, Jina, Gemini, Ollama или любой OpenAI-compatible API | +| **Полный набор инструментов** | CLI, backup, migration, upgrade, export/import — готово к продакшену | + +--- + +## Быстрый старт + +### Вариант A: скрипт установки в один клик (рекомендуется) + +Поддерживаемый сообществом **[скрипт установки](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** берет на себя установку, обновление и восстановление одной командой: + +```bash +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> Полный список сценариев, которые покрывает скрипт, и другие инструменты сообщества смотрите ниже в разделе [Экосистема](#экосистема). + +### Вариант B: ручная установка + +**Через OpenClaw CLI (рекомендуется):** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**Или через npm:** +```bash +npm i memory-lancedb-pro@beta +``` +> Если используете npm, вам также нужно добавить директорию установки плагина как **абсолютный** путь в `plugins.load.paths` вашего `openclaw.json`. Это самая частая проблема при настройке. 
+ +Добавьте в `openclaw.json`: + +```json +{ + "plugins": { + "slots": { "memory": "memory-lancedb-pro" }, + "entries": { + "memory-lancedb-pro": { + "enabled": true, + "config": { + "embedding": { + "provider": "openai-compatible", + "apiKey": "${OPENAI_API_KEY}", + "model": "text-embedding-3-small" + }, + "autoCapture": true, + "autoRecall": true, + "smartExtraction": true, + "extractMinMessages": 2, + "extractMaxChars": 8000, + "sessionMemory": { "enabled": false } + } + } + } + } +} +``` + +**Почему именно такие значения по умолчанию?** +- `autoCapture` + `smartExtraction` → агент автоматически учится на каждом разговоре +- `autoRecall` → релевантные воспоминания подставляются перед каждым ответом +- `extractMinMessages: 2` → извлечение срабатывает в обычном двухходовом диалоге +- `sessionMemory.enabled: false` → поиск не засоряется сводками сессий с первого дня + +Проверьте и перезапустите: + +```bash +openclaw config validate +openclaw gateway restart +openclaw logs --follow --plain | grep "memory-lancedb-pro" +``` + +Вы должны увидеть: +- `memory-lancedb-pro: smart extraction enabled` +- `memory-lancedb-pro@...: plugin registered` + +Готово. Теперь у вашего агента есть долгосрочная память. + +
+Дополнительные варианты установки (для действующих пользователей и апгрейдов) + +**Уже используете OpenClaw?** + +1. Добавьте плагин в `plugins.load.paths` как **абсолютный** путь +2. Привяжите memory slot: `plugins.slots.memory = "memory-lancedb-pro"` +3. Проверьте: `openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**Обновляетесь с версии до v1.1.0?** + +```bash +# 1) Резервная копия +openclaw memory-pro export --scope global --output memories-backup.json +# 2) Пробный запуск +openclaw memory-pro upgrade --dry-run +# 3) Выполнить апгрейд +openclaw memory-pro upgrade +# 4) Проверка +openclaw memory-pro stats +``` + +Изменения поведения и причины апгрейда описаны в `CHANGELOG-v1.1.0.md`. + +
+ +
+Быстрый импорт для Telegram Bot (нажмите, чтобы раскрыть) + +Если вы используете Telegram-интеграцию OpenClaw, самый простой путь — отправить команду импорта прямо основному боту вместо ручного редактирования конфига. + +Отправьте такое сообщение: + +```text +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction +6. extractMinMessages=2 +7. sessionMemory.enabled=false +8. captureAssistant=false +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. Generate the final openclaw.json config directly, not just an explanation +``` + +
+ +--- + +## Экосистема + +memory-lancedb-pro — это основной плагин. Сообщество построило вокруг него инструменты, чтобы установка и ежедневная работа были еще проще. + +### Скрипт установки: установка, апгрейд и ремонт в один клик + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +Это не просто установщик: скрипт грамотно обрабатывает широкий набор реальных сценариев. + +| Ваша ситуация | Что делает скрипт | +|---|---| +| Никогда не устанавливали | Скачивает заново → ставит зависимости → помогает выбрать конфиг → записывает в `openclaw.json` → перезапускает | +| Установлено через `git clone`, но застряли на старом коммите | Автоматически делает `git fetch` + `checkout` на актуальную версию → переустанавливает зависимости → проверяет | +| В конфиге есть невалидные поля | Автоматически находит их через schema filter и удаляет неподдерживаемые значения | +| Установлено через `npm` | Пропускает git-обновление и напоминает вручную запустить `npm update` | +| `openclaw` CLI сломан из-за невалидного конфига | Фолбэк: читает путь workspace напрямую из файла `openclaw.json` | +| Используется `extensions/`, а не `plugins/` | Автоматически определяет расположение плагина по конфигу или файловой системе | +| Уже актуальная версия | Запускает только health checks, без изменений | + +```bash +bash setup-memory.sh # Установить или обновить +bash setup-memory.sh --dry-run # Только предпросмотр +bash setup-memory.sh --beta # Включить pre-release версии +bash setup-memory.sh --uninstall # Откатить конфиг и удалить плагин +``` + +Встроенные пресеты провайдеров: **Jina / DashScope / SiliconFlow / OpenAI / Ollama**, либо любой собственный OpenAI-compatible API. Полное использование (включая `--ref`, `--selfcheck-only` и другое) смотрите в [README скрипта установки](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup). 
+ +### Навык Claude Code / OpenClaw: настройка под управлением ИИ + +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** + +Установите этот навык, и ваш ИИ-агент (Claude Code или OpenClaw) получит глубокое знание всех возможностей memory-lancedb-pro. Достаточно сказать **"help me enable the best config"**, и вы получите: + +- **Пошаговый процесс настройки из 7 шагов** с 4 вариантами деплоя: + - Полная мощность (Jina + OpenAI) / Экономный (бесплатный reranker от SiliconFlow) / Простой (только OpenAI) / Полностью локальный (Ollama, нулевая стоимость API) +- **Корректное использование всех 9 инструментов MCP**: `memory_recall`, `memory_store`, `memory_forget`, `memory_update`, `memory_stats`, `memory_list`, `self_improvement_log`, `self_improvement_extract_skill`, `self_improvement_review` *(полный набор доступен при `enableManagementTools: true` — стандартный Quick Start открывает только 4 базовых инструмента)* +- **Защиту от типичных ошибок**: включение плагина в workspace, `autoRecall` со значением false по умолчанию, кэш jiti, переменные окружения, изоляция областей памяти и другое + +**Установка для Claude Code:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro +``` + +**Установка для OpenClaw:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## Видеоруководство + +> Полный разбор: установка, настройка и внутреннее устройство гибридного поиска. 
+ +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## Архитектура + +``` +┌─────────────────────────────────────────────────────────┐ +│ index.ts (Entry Point) │ +│ Plugin Registration · Config Parsing · Lifecycle Hooks │ +└────────┬──────────┬──────────┬──────────┬───────────────┘ + │ │ │ │ + ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ + │ store │ │embedder│ │retriever│ │ scopes │ + │ .ts │ │ .ts │ │ .ts │ │ .ts │ + └────────┘ └────────┘ └────────┘ └─────────────┘ + │ │ + ┌────▼───┐ ┌─────▼──────────┐ + │migrate │ │noise-filter.ts │ + │ .ts │ │adaptive- │ + └────────┘ │retrieval.ts │ + └────────────────┘ + ┌─────────────┐ ┌──────────┐ + │ tools.ts │ │ cli.ts │ + │ (Agent API) │ │ (CLI) │ + └─────────────┘ └──────────┘ +``` + +> Для глубокого разбора полной архитектуры смотрите [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md). + +
+Справочник по файлам (нажмите, чтобы раскрыть) + +| Файл | Назначение | +| --- | --- | +| `index.ts` | Точка входа плагина. Регистрация в API плагинов OpenClaw, разбор конфига, подключение хуков жизненного цикла | +| `openclaw.plugin.json` | Метаданные плагина + полная декларация JSON Schema для конфига | +| `cli.ts` | CLI-команды: `memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | Слой хранения LanceDB. Создание таблиц / FTS-индекс / векторный поиск / BM25-поиск / CRUD | +| `src/embedder.ts` | Абстракция эмбеддингов. Совместима с любым провайдером OpenAI-compatible API | +| `src/retriever.ts` | Движок гибридного поиска. Векторный поиск + BM25 → гибридное объединение → реранжирование → затухание жизненного цикла → фильтрация | +| `src/scopes.ts` | Контроль доступа для нескольких областей памяти | +| `src/tools.ts` | Определения инструментов агента: `memory_recall`, `memory_store`, `memory_forget`, `memory_update` + административные инструменты | +| `src/noise-filter.ts` | Фильтрует отказы агента, мета-вопросы, приветствия и низкокачественный контент | +| `src/adaptive-retrieval.ts` | Определяет, нужен ли конкретному запросу поиск по памяти | +| `src/migrate.ts` | Миграция со встроенного `memory-lancedb` на Pro | +| `src/smart-extractor.ts` | Извлечение по 6 категориям на базе LLM с многослойным хранением L0/L1/L2 и двухэтапной дедупликацией | +| `src/decay-engine.ts` | Модель растянутого экспоненциального затухания Weibull | +| `src/tier-manager.ts` | Трехуровневое продвижение/понижение: Peripheral ↔ Working ↔ Core | + +
+ +--- + +## Ключевые возможности + +### Гибридный поиск + +``` +Query → embedQuery() ─┐ + ├─→ Hybrid Fusion → Rerank → Lifecycle Decay Boost → Length Norm → Filter +Query → BM25 FTS ─────┘ +``` + +- **Векторный поиск** — семантическая близость через LanceDB ANN (cosine distance) +- **Полнотекстовый BM25** — точное совпадение по ключевым словам через LanceDB FTS index +- **Hybrid Fusion** — векторный score служит базой, а BM25-попадания получают взвешенный буст (это не стандартный RRF, а вариант, настроенный под качество реального recall) +- **Настраиваемые веса** — `vectorWeight`, `bm25Weight`, `minScore` + +### Кросс-энкодерное реранжирование + +- Встроенные адаптеры для **Jina**, **SiliconFlow**, **Voyage AI** и **Pinecone** +- Совместимо с любым Jina-compatible endpoint (например, Hugging Face TEI, DashScope) +- Гибридный скоринг: 60% cross-encoder + 40% исходный fused score +- Graceful degradation: при сбое API откатывается к cosine similarity + +### Многоэтапный пайплайн скоринга + +| Этап | Эффект | +| --- | --- | +| **Hybrid Fusion** | Комбинирует семантический recall и точное совпадение | +| **Cross-Encoder Rerank** | Продвигает семантически точные попадания | +| **Lifecycle Decay Boost** | Свежесть по Weibull + частота доступа + важность × уверенность | +| **Length Normalization** | Не дает длинным записям доминировать (anchor: 500 chars) | +| **Hard Min Score** | Убирает нерелевантные результаты (по умолчанию: 0.35) | +| **MMR Diversity** | Cosine similarity > 0.85 → понижается | + +### Умное извлечение памяти (v1.1.0) + +- **LLM-powered извлечение по 6 категориям**: profile, preferences, entities, events, cases, patterns +- **Многослойное хранение L0/L1/L2**: L0 (одно предложение-индекс) → L1 (структурированное summary) → L2 (полный narrative) +- **Двухэтапная дедупликация**: предварительный фильтр по векторному сходству (≥0.7) → LLM-решение по смыслу (CREATE/MERGE/SKIP) +- **Слияние с учетом категории**: `profile` всегда merge, `events` и `cases` 
добавляются append-only + +### Управление жизненным циклом памяти (v1.1.0) + +- **Weibull Decay Engine**: composite score = recency + frequency + intrinsic value +- **Трехуровневое продвижение**: `Peripheral ↔ Working ↔ Core` с настраиваемыми порогами +- **Усиление при доступе**: часто вспоминаемые записи затухают медленнее (в духе spaced repetition) +- **Half-life с учетом важности**: важные воспоминания живут дольше + +### Изоляция между областями памяти + +- Встроенные области памяти: `global`, `agent:`, `custom:`, `project:`, `user:` +- Контроль доступа агента через `scopes.agentAccess` +- По умолчанию каждый агент видит `global` + собственную область `agent:` + +### Auto-Capture и Auto-Recall + +- **Auto-Capture** (`agent_end`): извлекает preference/fact/decision/entity из диалога, дедуплицирует и сохраняет до 3 записей за ход +- **Auto-Recall** (`before_agent_start`): внедряет контекст `` (до 3 записей) + +### Фильтрация шума и адаптивный поиск по памяти + +- Фильтрует низкокачественный контент: отказы агента, мета-вопросы, приветствия +- Пропускает поиск по памяти для приветствий, slash-команд, простых подтверждений и emoji +- Принудительно включает поиск по памяти по ключевым словам ("remember", "previously", "last time") +- Пороги с учетом CJK (китайский: 6 символов против английского: 15 символов) + +--- + +
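Логику гибридного слияния и 60/40-смешения с cross-encoder, описанную выше, можно набросать так (упрощенная иллюстрация по тексту README, а не точная формула из `src/retriever.ts` — реальные коэффициенты и нормализация могут отличаться):

```typescript
// Векторный score — база, попадания BM25 получают взвешенный буст
// (это не стандартный RRF); затем 60/40-смешение с cross-encoder score.

function fuse(
  vectorScore: number,
  bm25Score: number | null,
  vectorWeight = 0.7,
  bm25Weight = 0.3
): number {
  const boost = bm25Score === null ? 0 : bm25Weight * bm25Score;
  return vectorWeight * vectorScore + boost;
}

function rerankMix(fusedScore: number, crossEncoderScore: number): number {
  // «60% cross-encoder + 40% исходный fused score»
  return 0.6 * crossEncoderScore + 0.4 * fusedScore;
}

console.log(fuse(0.8, 0.5));  // ≈ 0.71 — документ найден обоими путями
console.log(fuse(0.8, null)); // ≈ 0.56 — только векторное совпадение
```

При `bm25Score = null` (нет точного совпадения по ключевым словам) остается только векторная база, поэтому чисто семантические попадания все равно проходят в пул кандидатов.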
+Сравнение со встроенным memory-lancedb (нажмите, чтобы раскрыть) + +| Возможность | Встроенный `memory-lancedb` | **memory-lancedb-pro** | +| --- | :---: | :---: | +| Векторный поиск | Yes | Yes | +| Полнотекстовый BM25 | - | Yes | +| Гибридное объединение (Vector + BM25) | - | Yes | +| Реранжирование cross-encoder (несколько провайдеров) | - | Yes | +| Буст по свежести и затухание во времени | - | Yes | +| Нормализация по длине | - | Yes | +| MMR-диверсификация | - | Yes | +| Изоляция областей памяти | - | Yes | +| Фильтрация шума | - | Yes | +| Адаптивный поиск по памяти | - | Yes | +| Административный CLI | - | Yes | +| Память сессий | - | Yes | +| Эмбеддинги с учетом задачи | - | Yes | +| **Умное извлечение LLM (6 категорий)** | - | Yes (v1.1.0) | +| **Затухание Weibull + продвижение по уровням** | - | Yes (v1.1.0) | +| Любые OpenAI-compatible эмбеддинги | Limited | Yes | + +
+ +--- + +## Конфигурация + +
+Полный пример конфигурации + +```json +{ + "embedding": { + "apiKey": "${JINA_API_KEY}", + "model": "jina-embeddings-v5-text-small", + "baseURL": "https://api.jina.ai/v1", + "dimensions": 1024, + "taskQuery": "retrieval.query", + "taskPassage": "retrieval.passage", + "normalized": true + }, + "dbPath": "~/.openclaw/memory/lancedb-pro", + "autoCapture": true, + "autoRecall": true, + "retrieval": { + "mode": "hybrid", + "vectorWeight": 0.7, + "bm25Weight": 0.3, + "minScore": 0.3, + "rerank": "cross-encoder", + "rerankApiKey": "${JINA_API_KEY}", + "rerankModel": "jina-reranker-v3", + "rerankEndpoint": "https://api.jina.ai/v1/rerank", + "rerankProvider": "jina", + "candidatePoolSize": 20, + "recencyHalfLifeDays": 14, + "recencyWeight": 0.1, + "filterNoise": true, + "lengthNormAnchor": 500, + "hardMinScore": 0.35, + "timeDecayHalfLifeDays": 60, + "reinforcementFactor": 0.5, + "maxHalfLifeMultiplier": 3 + }, + "enableManagementTools": false, + "scopes": { + "default": "global", + "definitions": { + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } + }, + "agentAccess": { + "discord-bot": ["global", "agent:discord-bot"] + } + }, + "sessionMemory": { + "enabled": false, + "messageCount": 15 + }, + "smartExtraction": true, + "llm": { + "apiKey": "${OPENAI_API_KEY}", + "model": "gpt-4o-mini", + "baseURL": "https://api.openai.com/v1" + }, + "extractMinMessages": 2, + "extractMaxChars": 8000 +} +``` + +
+ +
+Провайдеры эмбеддингов + +Работает с **любым OpenAI-compatible API для эмбеддингов**: + +| Provider | Model | Base URL | Dimensions | +| --- | --- | --- | --- | +| **Jina** (recommended) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | +| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | +| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | +| **Ollama** (local) | `nomic-embed-text` | `http://localhost:11434/v1` | зависит от провайдера | + +
+ +
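Например, полностью локальная конфигурация на Ollama из таблицы выше могла бы выглядеть так (набросок: значение `dimensions: 768` типично для `nomic-embed-text`, но сверьтесь с вашей моделью; `apiKey` для локального Ollama — просто заглушка):

```json
{
  "embedding": {
    "provider": "openai-compatible",
    "apiKey": "ollama",
    "model": "nomic-embed-text",
    "baseURL": "http://localhost:11434/v1",
    "dimensions": 768
  }
}
```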
+Провайдеры реранжирования + +Кросс-энкодерное реранжирование поддерживает несколько провайдеров через `rerankProvider`: + +| Provider | `rerankProvider` | Example Model | +| --- | --- | --- | +| **Jina** (default) | `jina` | `jina-reranker-v3` | +| **SiliconFlow** (есть бесплатный тариф) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | + +Подойдет и любой Jina-compatible rerank endpoint: задайте `rerankProvider: "jina"` и укажите ваш `rerankEndpoint` (например, Hugging Face TEI, DashScope `qwen3-rerank`). + +
+ +
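Пример фрагмента `retrieval` для бесплатного тарифа SiliconFlow (набросок по таблице выше; значение `rerankEndpoint` здесь — предположение, уточните актуальный URL в документации провайдера):

```json
{
  "retrieval": {
    "rerank": "cross-encoder",
    "rerankProvider": "siliconflow",
    "rerankModel": "BAAI/bge-reranker-v2-m3",
    "rerankApiKey": "${SILICONFLOW_API_KEY}",
    "rerankEndpoint": "https://api.siliconflow.cn/v1/rerank"
  }
}
```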
+Smart Extraction (LLM) — v1.1.0 + +Когда включен `smartExtraction` (по умолчанию: `true`), плагин использует LLM для интеллектуального извлечения и классификации воспоминаний вместо правил на регулярных выражениях. + +| Поле | Тип | По умолчанию | Описание | +|-------|------|---------|-------------| +| `smartExtraction` | boolean | `true` | Включить/выключить извлечение по 6 категориям на базе LLM | +| `llm.auth` | string | `api-key` | `api-key` использует `llm.apiKey` / `embedding.apiKey`; `oauth` по умолчанию использует OAuth-файл токена в области плагина | +| `llm.apiKey` | string | *(по умолчанию берется из `embedding.apiKey`)* | API-ключ провайдера LLM | +| `llm.model` | string | `openai/gpt-oss-120b` | Имя модели LLM | +| `llm.baseURL` | string | *(по умолчанию берется из `embedding.baseURL`)* | URL LLM API | +| `llm.oauthProvider` | string | `openai-codex` | Идентификатор OAuth-провайдера, используемый при `llm.auth = "oauth"` | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | Путь к OAuth-файлу токена при `llm.auth = "oauth"` | +| `llm.timeoutMs` | number | `30000` | Таймаут запроса к LLM в миллисекундах | +| `extractMinMessages` | number | `2` | Минимум сообщений до срабатывания извлечения | +| `extractMaxChars` | number | `8000` | Максимум символов, отправляемых в LLM | + + +OAuth `llm` config (использует существующий кэш логина Codex / ChatGPT для LLM-запросов): +```json +{ + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } +} +``` + +Примечания для `llm.auth: "oauth"`: + +- `llm.oauthProvider` сейчас равен `openai-codex`. +- По умолчанию OAuth token хранится в `~/.openclaw/.memory-lancedb-pro/oauth.json`. +- Если хотите хранить этот файл в другом месте, можно задать `llm.oauthPath`. 
+- `auth login` сохраняет снимок предыдущего `llm` конфига в режиме api-key рядом с OAuth-файлом, а `auth logout` восстанавливает этот снимок, если он доступен. +- При переключении с `api-key` на `oauth` значение `llm.baseURL` автоматически не переносится. Указывайте его вручную в OAuth-режиме только если вам действительно нужен кастомный ChatGPT/Codex-compatible backend. + +
+ +
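Как `extractMinMessages` и `extractMaxChars` могут ограничивать обращение к LLM, можно показать на упрощенном наброске (гипотетическая иллюстрация; реальный `src/smart-extractor.ts` может, например, обрезать текст иначе):

```typescript
// Набросок: извлечение не запускается, пока сообщений меньше
// extractMinMessages, а отправляемый в LLM текст не превышает
// extractMaxChars символов.

interface ExtractConfig {
  extractMinMessages: number; // минимум сообщений до срабатывания
  extractMaxChars: number;    // максимум символов в запросе к LLM
}

function prepareExtraction(
  messages: string[],
  cfg: ExtractConfig
): string | null {
  if (messages.length < cfg.extractMinMessages) return null; // рано извлекать
  const joined = messages.join("\n");
  return joined.length > cfg.extractMaxChars
    ? joined.slice(0, cfg.extractMaxChars) // обрезка — упрощение для примера
    : joined;
}

const cfg = { extractMinMessages: 2, extractMaxChars: 8000 };
console.log(prepareExtraction(["hi"], cfg)); // null — всего одно сообщение
```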
+Конфигурация жизненного цикла (Decay + Tier) + +| Поле | По умолчанию | Описание | +|-------|---------|-------------| +| `decay.recencyHalfLifeDays` | `30` | Базовый период полураспада для Weibull recency decay | +| `decay.frequencyWeight` | `0.3` | Вес частоты доступа в composite score | +| `decay.intrinsicWeight` | `0.3` | Вес `importance × confidence` | +| `decay.betaCore` | `0.8` | Weibull beta для воспоминаний уровня `core` | +| `decay.betaWorking` | `1.0` | Weibull beta для `working` | +| `decay.betaPeripheral` | `1.3` | Weibull beta для `peripheral` | +| `tier.coreAccessThreshold` | `10` | Минимальное число recall перед повышением в `core` | +| `tier.peripheralAgeDays` | `60` | Порог возраста для понижения устаревших воспоминаний | + +
+ +
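Формулу затухания из таблицы можно проиллюстрировать так (набросок: предполагается классическая функция Weibull `exp(-(t/λ)^β)`, откалиброванная так, чтобы при возрасте, равном half-life, оставалось ровно 0.5; реальный composite score движка дополнительно учитывает частоту доступа и важность):

```typescript
// Иллюстрация Weibull-затухания с beta по уровням из таблицы выше.

const BETA: Record<string, number> = {
  core: 0.8,       // decay.betaCore
  working: 1.0,    // decay.betaWorking — при beta = 1 это обычная экспонента
  peripheral: 1.3, // decay.betaPeripheral
};

function weibullRetention(
  ageDays: number,
  halfLifeDays: number,
  tier: string
): number {
  const beta = BETA[tier] ?? 1.0;
  // Масштаб λ подобран так, чтобы при ageDays = halfLifeDays retention = 0.5
  const lambda = halfLifeDays / Math.pow(Math.LN2, 1 / beta);
  return Math.exp(-Math.pow(ageDays / lambda, beta));
}

console.log(weibullRetention(30, 30, "working").toFixed(2)); // ≈ 0.50
```

Чем выше beta, тем быстрее запись теряет вес после половины жизни: peripheral (β = 1.3) затухает агрессивнее, чем core (β = 0.8).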
+Усиление за счет доступа + +Часто вспоминаемые записи затухают медленнее (в духе spaced repetition). + +Ключи конфига (в разделе `retrieval`): +- `reinforcementFactor` (0-2, по умолчанию: `0.5`) — задайте `0`, чтобы отключить +- `maxHalfLifeMultiplier` (1-10, по умолчанию: `3`) — жесткий потолок эффективного периода полураспада + +
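Возможная схема такого усиления (гипотетический набросок: логарифмическая формула ниже — предположение для иллюстрации, точная формула движка может отличаться; совпадают только смысл `reinforcementFactor` и потолок `maxHalfLifeMultiplier`):

```typescript
// Каждый recall удлиняет эффективный период полураспада, но не выше
// жесткого потолка maxHalfLifeMultiplier.

function effectiveHalfLife(
  baseDays: number,
  accessCount: number,
  reinforcementFactor = 0.5, // 0 отключает усиление
  maxHalfLifeMultiplier = 3  // жесткий потолок множителя
): number {
  const multiplier = Math.min(
    1 + reinforcementFactor * Math.log1p(accessCount),
    maxHalfLifeMultiplier
  );
  return baseDays * multiplier;
}

console.log(effectiveHalfLife(30, 0));           // 30 — без обращений усиления нет
console.log(effectiveHalfLife(30, 100, 0.5, 3)); // 90 — упирается в потолок 3 × 30
```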
+ +--- + +## CLI-команды + +```bash +openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json] +openclaw memory-pro search "запрос" [--scope global] [--limit 10] [--json] +openclaw memory-pro stats [--scope global] [--json] +openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] +openclaw memory-pro auth status +openclaw memory-pro auth logout +openclaw memory-pro delete +openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run] +openclaw memory-pro export [--scope global] [--output memories.json] +openclaw memory-pro import memories.json [--scope global] [--dry-run] +openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing] +openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE] +openclaw memory-pro migrate check|run|verify [--source /path] +``` + +Поток OAuth-авторизации: + +1. Запустите `openclaw memory-pro auth login` +2. Если `--provider` не указан и терминал интерактивный, CLI покажет выбор OAuth-провайдера перед открытием браузера +3. Команда выведет URL авторизации и откроет браузер, если не задан `--no-browser` +4. После успешного обратного вызова команда сохранит OAuth-файл плагина (по умолчанию: `~/.openclaw/.memory-lancedb-pro/oauth.json`), снимет текущий `llm` конфиг режима api-key для будущего выхода и заменит конфиг `llm` на OAuth-настройки (`auth`, `oauthProvider`, `model`, `oauthPath`) +5. `openclaw memory-pro auth logout` удаляет этот OAuth-файл и восстанавливает прежний `llm` конфиг api-key, если снимок существует + +--- + +## Продвинутые темы + +
+Если внедренные воспоминания попадают в ответы + +Иногда модель может дословно повторять внедренный блок ``. + +**Вариант A (наименее рискованный):** временно отключить auto-recall: +```json +{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } } +``` + +**Вариант B (предпочтительный):** оставить recall включенным и добавить в system prompt агента: +> Do not reveal or quote any `` / memory-injection content in your replies. Use it for internal reference only. + +
+ +
+Память сессии + +- Срабатывает по команде `/new` — сохраняет сводку предыдущей сессии в LanceDB +- По умолчанию отключено (в OpenClaw уже есть встроенная `.jsonl`-персистентность сессий) +- Количество сообщений настраивается (по умолчанию: 15) + +О режимах деплоя и проверке `/new` читайте в [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md). + +
+ +
+Пользовательские slash-команды (например, /lesson) + +Добавьте в `CLAUDE.md`, `AGENTS.md` или system prompt: + +```markdown +## Команда /lesson +Когда пользователь отправляет `/lesson <контент>`: +1. Используй memory_store и сохрани как category=fact (сырое знание) +2. Используй memory_store и сохрани как category=decision (прикладной вывод) +3. Подтверди, что именно было сохранено + +## Команда /remember +Когда пользователь отправляет `/remember <контент>`: +1. Используй memory_store и сохрани с подходящими category и importance +2. Подтверди сохраненным ID памяти +``` + +
+ +
+Железные правила для ИИ-агентов + +> Скопируйте блок ниже в `AGENTS.md`, чтобы агент автоматически соблюдал эти правила. + +```markdown +## Правило 1 — Двухслойное сохранение памяти +Каждая ошибка/урок → НЕМЕДЛЕННО сохранить ДВЕ записи памяти: +- Технический слой: Проблема: [симптом]. Причина: [корневая причина]. Исправление: [решение]. Профилактика: [как избежать] + (category: fact, importance >= 0.8) +- Принципиальный слой: Принцип решения ([tag]): [правило поведения]. Триггер: [когда]. Действие: [что делать] + (category: decision, importance >= 0.85) + +## Правило 2 — Гигиена LanceDB +Записи должны быть короткими и атомарными (< 500 chars). Никаких сырых summary разговоров и дубликатов. + +## Правило 3 — Recall перед повторной попыткой +При ЛЮБОЙ ошибке инструмента ВСЕГДА выполняй memory_recall по релевантным ключевым словам ПЕРЕД повторной попыткой. + +## Правило 4 — Подтверди целевую кодовую базу +Перед изменениями убедись, что редактируешь memory-lancedb-pro, а не встроенный memory-lancedb. + +## Правило 5 — Очищай кэш jiti после изменений кода плагина +После изменения .ts-файлов в plugins/ ОБЯЗАТЕЛЬНО выполни rm -rf /tmp/jiti/ перед openclaw gateway restart. +``` + +
+ +
+Схема базы данных + +Таблица LanceDB `memories`: + +| Поле | Тип | Описание | +| --- | --- | --- | +| `id` | string (UUID) | Первичный ключ | +| `text` | string | Текст памяти (индексируется для FTS) | +| `vector` | float[] | Вектор эмбеддинга | +| `category` | string | Категория хранения: `preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | Идентификатор области памяти (например, `global`, `agent:main`) | +| `importance` | float | Оценка важности от 0 до 1 | +| `timestamp` | int64 | Временная метка создания (мс) | +| `metadata` | string (JSON) | Расширенные метаданные | + +Обычные ключи `metadata` в v1.1.0: `l0_abstract`, `l1_overview`, `l2_content`, `memory_category`, `tier`, `access_count`, `confidence`, `last_accessed_at` + +> **Примечание о категориях:** поле верхнего уровня `category` использует 6 storage categories. Семантические метки Smart Extraction (`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`) сохраняются в `metadata.memory_category`. + +
+ +
+Устранение неполадок + +### "Cannot mix BigInt and other types" (LanceDB / Apache Arrow) + +Начиная с LanceDB 0.26+, некоторые числовые колонки могут возвращаться как `BigInt`. Обновитесь до **memory-lancedb-pro >= 1.0.14**: теперь плагин приводит такие значения через `Number(...)` перед арифметикой. + +
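Суть исправления можно проиллюстрировать минимальным наброском (имена условные, это не реальный код плагина):

```typescript
// Набросок: int64-колонки (например, timestamp или access_count) могут
// приходить из Arrow/LanceDB как BigInt; перед арифметикой приводим к number.
function toNumber(value: number | bigint): number {
  return typeof value === "bigint" ? Number(value) : value;
}

const accessCount: number | bigint = 3n;
// accessCount + 1 → TypeError: Cannot mix BigInt and other types
const boosted = toNumber(accessCount) + 1; // 4
```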
+ +--- + +## Документация + +| Документ | Описание | +| --- | --- | +| [OpenClaw Integration Playbook](docs/openclaw-integration-playbook.md) | Режимы деплоя, проверка, матрица регрессии | +| [Memory Architecture Analysis](docs/memory_architecture_analysis.md) | Глубокий разбор полной архитектуры | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | Изменения поведения в v1.1.0 и причины апгрейда | +| [Long-Context Chunking](docs/long-context-chunking.md) | Стратегия разбиения длинных документов | + +--- + +## Бета: Smart Memory v1.1.0 + +> Статус: Beta — доступно через `npm i memory-lancedb-pro@beta`. Пользователи стабильного `latest` не затронуты. + +| Возможность | Описание | +|---------|-------------| +| **Умное извлечение** | Извлечение по 6 категориям на базе LLM с метаданными L0/L1/L2. При отключении откатывается к регулярным правилам. | +| **Оценка жизненного цикла** | Затухание Weibull встроено в поиск по памяти: записи с высокой частотой и важностью ранжируются выше. | +| **Управление уровнями** | Трехуровневая система (Core → Working → Peripheral) с автоматическим повышением и понижением. | + +Обратная связь: [GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · Откат: `npm i memory-lancedb-pro@latest` + +--- + +## Зависимости + +| Пакет | Назначение | +| --- | --- | +| `@lancedb/lancedb` ≥0.26.2 | Векторная база данных (ANN + FTS) | +| `openai` ≥6.21.0 | Клиент OpenAI-compatible Embedding API | +| `@sinclair/typebox` 0.34.48 | Определения типов для JSON Schema | + +--- + +## Участники + +

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +

+
+Полный список: [Участники](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors)
+
+## История звезд
+
+Star History Chart
+
+## Лицензия
+
+MIT
+
+---
+
+## Мой QR-код WeChat
+
+
diff --git a/README_TW.md b/README_TW.md
new file mode 100644
index 00000000..b582d40f
--- /dev/null
+++ b/README_TW.md
@@ -0,0 +1,773 @@
+ +# 🧠 memory-lancedb-pro · 🦞OpenClaw Plugin + +**[OpenClaw](https://github.com/openclaw/openclaw) 智慧體的 AI 記憶助理** + +*讓你的 AI 智慧體擁有真正的記憶力——跨工作階段、跨智慧體、跨時間。* + +基於 LanceDB 的 OpenClaw 長期記憶外掛,自動儲存偏好、決策和專案上下文,在後續工作階段中自動回憶。 + +[![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-blue)](https://github.com/openclaw/openclaw) +[![npm version](https://img.shields.io/npm/v/memory-lancedb-pro)](https://www.npmjs.com/package/memory-lancedb-pro) +[![LanceDB](https://img.shields.io/badge/LanceDB-Vectorstore-orange)](https://lancedb.com) +[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE) + +[English](README.md) | [简体中文](README_CN.md) | [繁體中文](README_TW.md) | [日本語](README_JA.md) | [한국어](README_KO.md) | [Français](README_FR.md) | [Español](README_ES.md) | [Deutsch](README_DE.md) | [Italiano](README_IT.md) | [Русский](README_RU.md) | [Português (Brasil)](README_PT-BR.md) + +
+ +--- + +## 為什麼選 memory-lancedb-pro? + +大多數 AI 智慧體都有「失憶症」——每次新對話,之前聊過的全部清零。 + +**memory-lancedb-pro** 是 OpenClaw 的生產級長期記憶外掛,把你的智慧體變成一個真正的 **AI 記憶助理**——自動擷取重要資訊,讓雜訊自然衰減,在恰當的時候回憶起恰當的內容。無需手動標記,無需複雜設定。 + +### AI 記憶助理實際效果 + +**沒有記憶——每次都從零開始:** + +> **你:** 「縮排用 tab,所有函式都要加錯誤處理。」 +> *(下一次工作階段)* +> **你:** 「我都說了用 tab 不是空格!」 😤 +> *(再下一次工作階段)* +> **你:** 「……我真的說了第三遍了,tab,還有錯誤處理。」 + +**有了 memory-lancedb-pro——你的智慧體學會了、記住了:** + +> **你:** 「縮排用 tab,所有函式都要加錯誤處理。」 +> *(下一次工作階段——智慧體自動回憶你的偏好)* +> **智慧體:** *(默默改成 tab 縮排,並補上錯誤處理)* ✅ +> **你:** 「上個月我們為什麼選了 PostgreSQL 而不是 MongoDB?」 +> **智慧體:** 「根據我們 2 月 12 日的討論,主要原因是……」 ✅ + +這就是 **AI 記憶助理** 的價值——學習你的風格,回憶過去的決策,提供個人化的回應,不再讓你重複自己。 + +### 還能做什麼? + +| | 你能得到的 | +|---|---| +| **自動擷取** | 智慧體從每次對話中學習——不需要手動呼叫 `memory_store` | +| **智慧擷取** | LLM 驅動的 6 類分類:使用者輪廓、偏好、實體、事件、案例、模式 | +| **智慧遺忘** | Weibull 衰減模型——重要記憶留存,雜訊自然消退 | +| **混合檢索** | 向量 + BM25 全文搜尋,融合交叉編碼器重排序 | +| **上下文注入** | 相關記憶在每次回覆前自動浮現 | +| **多作用域隔離** | 按智慧體、按使用者、按專案隔離記憶邊界 | +| **任意服務商** | OpenAI、Jina、Gemini、Ollama 或任意 OpenAI 相容 API | +| **完整工具鏈** | CLI、備份、遷移、升級、匯入匯出——生產可用 | + +--- + +## 快速開始 + +### 方式 A:一鍵安裝指令碼(推薦) + +社群維護的 **[安裝指令碼](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** 一條指令搞定安裝、升級和修復: + +```bash +curl -fsSL https://raw.githubusercontent.com/CortexReach/toolbox/main/memory-lancedb-pro-setup/setup-memory.sh -o setup-memory.sh +bash setup-memory.sh +``` + +> 指令碼涵蓋的完整場景和其他社群工具,詳見下方 [生態工具](#生態工具)。 + +### 方式 B:手動安裝 + +**透過 OpenClaw CLI(推薦):** +```bash +openclaw plugins install memory-lancedb-pro@beta +``` + +**或透過 npm:** +```bash +npm i memory-lancedb-pro@beta +``` +> 如果用 npm 安裝,你還需要在 `openclaw.json` 的 `plugins.load.paths` 中新增外掛安裝目錄的 **絕對路徑**。這是最常見的安裝問題。 + +在 `openclaw.json` 中新增設定: + +```json +{ + "plugins": { + "slots": { "memory": "memory-lancedb-pro" }, + "entries": { + "memory-lancedb-pro": { + "enabled": true, + "config": { + "embedding": { + "provider": "openai-compatible", + "apiKey": "${OPENAI_API_KEY}", + "model": "text-embedding-3-small" + }, + 
"autoCapture": true, + "autoRecall": true, + "smartExtraction": true, + "extractMinMessages": 2, + "extractMaxChars": 8000, + "sessionMemory": { "enabled": false } + } + } + } + } +} +``` + +**為什麼用這些預設值?** +- `autoCapture` + `smartExtraction` → 智慧體自動從每次對話中學習 +- `autoRecall` → 相關記憶在每次回覆前自動注入 +- `extractMinMessages: 2` → 正常兩輪對話即觸發擷取 +- `sessionMemory.enabled: false` → 避免工作階段摘要在初期汙染檢索結果 + +驗證並重啟: + +```bash +openclaw config validate +openclaw gateway restart +openclaw logs --follow --plain | grep "memory-lancedb-pro" +``` + +你應該能看到: +- `memory-lancedb-pro: smart extraction enabled` +- `memory-lancedb-pro@...: plugin registered` + +完成!你的智慧體現在擁有長期記憶了。 + +
+更多安裝路徑(現有使用者、升級) + +**已在使用 OpenClaw?** + +1. 在 `plugins.load.paths` 中新增外掛的 **絕對路徑** +2. 繫結記憶插槽:`plugins.slots.memory = "memory-lancedb-pro"` +3. 驗證:`openclaw plugins info memory-lancedb-pro && openclaw memory-pro stats` + +**從 v1.1.0 之前的版本升級?** + +```bash +# 1) 備份 +openclaw memory-pro export --scope global --output memories-backup.json +# 2) 試執行 +openclaw memory-pro upgrade --dry-run +# 3) 執行升級 +openclaw memory-pro upgrade +# 4) 驗證 +openclaw memory-pro stats +``` + +詳見 [`CHANGELOG-v1.1.0.md`](docs/CHANGELOG-v1.1.0.md) 了解行為變更和升級說明。 + +
+ +
+Telegram Bot 快速匯入(點選展開) + +如果你在使用 OpenClaw 的 Telegram 整合,最簡單的方式是直接給主 Bot 發訊息,而不是手動編輯設定檔。 + +以下為英文原文,方便直接複製傳送給 Bot: + +```text +Help me connect this memory plugin with the most user-friendly configuration: https://github.com/CortexReach/memory-lancedb-pro + +Requirements: +1. Set it as the only active memory plugin +2. Use Jina for embedding +3. Use Jina for reranker +4. Use gpt-4o-mini for the smart-extraction LLM +5. Enable autoCapture, autoRecall, smartExtraction +6. extractMinMessages=2 +7. sessionMemory.enabled=false +8. captureAssistant=false +9. retrieval mode=hybrid, vectorWeight=0.7, bm25Weight=0.3 +10. rerank=cross-encoder, candidatePoolSize=12, minScore=0.6, hardMinScore=0.62 +11. Generate the final openclaw.json config directly, not just an explanation +``` + +
+ +--- + +## 生態工具 + +memory-lancedb-pro 是核心外掛。社群圍繞它建構了配套工具,讓安裝和日常使用更加順暢: + +### 安裝指令碼——一鍵安裝、升級和修復 + +> **[CortexReach/toolbox/memory-lancedb-pro-setup](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)** + +不只是簡單的安裝器——指令碼能智慧處理各種常見場景: + +| 你的情況 | 指令碼會做什麼 | +|---|---| +| 從未安裝 | 全新下載 → 安裝依賴 → 選擇設定 → 寫入 openclaw.json → 重啟 | +| 透過 `git clone` 安裝,卡在舊版本 | 自動 `git fetch` + `checkout` 到最新 → 重裝依賴 → 驗證 | +| 設定中有無效欄位 | 自動偵測並透過 schema 過濾移除不支援的欄位 | +| 透過 `npm` 安裝 | 跳過 git 更新,提醒你自行執行 `npm update` | +| `openclaw` CLI 因無效設定崩潰 | 降級方案:直接從 `openclaw.json` 檔案讀取工作目錄路徑 | +| `extensions/` 而非 `plugins/` | 從設定或檔案系統自動偵測外掛位置 | +| 已是最新版 | 僅執行健康檢查,不做變動 | + +```bash +bash setup-memory.sh # 安裝或升級 +bash setup-memory.sh --dry-run # 僅預覽 +bash setup-memory.sh --beta # 包含預發布版本 +bash setup-memory.sh --uninstall # 還原設定並移除外掛 +``` + +內建服務商預設:**Jina / DashScope / SiliconFlow / OpenAI / Ollama**,或自帶任意 OpenAI 相容 API。完整用法(含 `--ref`、`--selfcheck-only` 等)詳見[安裝指令碼 README](https://github.com/CortexReach/toolbox/tree/main/memory-lancedb-pro-setup)。 + +### Claude Code / OpenClaw Skill——AI 引導式設定 + +> **[CortexReach/memory-lancedb-pro-skill](https://github.com/CortexReach/memory-lancedb-pro-skill)** + +安裝這個 Skill,你的 AI 智慧體(Claude Code 或 OpenClaw)就能深度掌握 memory-lancedb-pro 的所有功能。只需說 **「help me enable the best config」** 即可獲得: + +- **7 步引導式設定流程**,提供 4 套部署方案: + - 滿血版(Jina + OpenAI)/ 省錢版(免費 SiliconFlow 重排序)/ 簡約版(僅 OpenAI)/ 全本機版(Ollama,零 API 成本) +- **全部 9 個 MCP 工具** 的正確用法:`memory_recall`、`memory_store`、`memory_forget`、`memory_update`、`memory_stats`、`memory_list`、`self_improvement_log`、`self_improvement_extract_skill`、`self_improvement_review` *(完整工具集需要設定 `enableManagementTools: true`——預設快速設定僅公開 4 個核心工具)* +- **避開常見陷阱**:workspace 外掛啟用、`autoRecall` 預設 false、jiti 快取、環境變數、作用域隔離等 + +**Claude Code 安裝:** +```bash +git clone https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.claude/skills/memory-lancedb-pro +``` + +**OpenClaw 安裝:** +```bash +git clone 
https://github.com/CortexReach/memory-lancedb-pro-skill.git ~/.openclaw/workspace/skills/memory-lancedb-pro-skill +``` + +--- + +## 影片教學 + +> 完整演示:安裝、設定、混合檢索內部原理。 + +[![YouTube Video](https://img.shields.io/badge/YouTube-Watch%20Now-red?style=for-the-badge&logo=youtube)](https://youtu.be/MtukF1C8epQ) +**https://youtu.be/MtukF1C8epQ** + +[![Bilibili Video](https://img.shields.io/badge/Bilibili-Watch%20Now-00A1D6?style=for-the-badge&logo=bilibili&logoColor=white)](https://www.bilibili.com/video/BV1zUf2BGEgn/) +**https://www.bilibili.com/video/BV1zUf2BGEgn/** + +--- + +## 架構 + +``` +┌─────────────────────────────────────────────────────────┐ +│ index.ts (入口) │ +│ 外掛註冊 · 設定解析 · 生命週期鉤子 │ +└────────┬──────────┬──────────┬──────────┬───────────────┘ + │ │ │ │ + ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐ + │ store │ │embedder│ │retriever│ │ scopes │ + │ .ts │ │ .ts │ │ .ts │ │ .ts │ + └────────┘ └────────┘ └────────┘ └─────────────┘ + │ │ + ┌────▼───┐ ┌─────▼──────────┐ + │migrate │ │noise-filter.ts │ + │ .ts │ │adaptive- │ + └────────┘ │retrieval.ts │ + └────────────────┘ + ┌─────────────┐ ┌──────────┐ + │ tools.ts │ │ cli.ts │ + │ (智慧體 API)│ │ (CLI) │ + └─────────────┘ └──────────┘ +``` + +> 完整架構解析見 [docs/memory_architecture_analysis.md](docs/memory_architecture_analysis.md)。 + +
+檔案說明(點選展開) + +| 檔案 | 用途 | +| --- | --- | +| `index.ts` | 外掛入口,註冊 OpenClaw Plugin API、解析設定、掛載生命週期鉤子 | +| `openclaw.plugin.json` | 外掛中繼資料 + 完整 JSON Schema 設定宣告 | +| `cli.ts` | CLI 指令:`memory-pro list/search/stats/delete/delete-bulk/export/import/reembed/upgrade/migrate` | +| `src/store.ts` | LanceDB 儲存層:建表 / 全文索引 / 向量搜尋 / BM25 搜尋 / CRUD | +| `src/embedder.ts` | Embedding 抽象層,相容任意 OpenAI 相容 API | +| `src/retriever.ts` | 混合檢索引擎:向量 + BM25 → 混合融合 → 重排序 → 生命週期衰減 → 過濾 | +| `src/scopes.ts` | 多作用域存取控制 | +| `src/tools.ts` | 智慧體工具定義:`memory_recall`、`memory_store`、`memory_forget`、`memory_update` + 管理工具 | +| `src/noise-filter.ts` | 過濾智慧體拒絕回覆、元問題、打招呼等低品質內容 | +| `src/adaptive-retrieval.ts` | 判斷查詢是否需要記憶檢索 | +| `src/migrate.ts` | 從內建 `memory-lancedb` 遷移到 Pro | +| `src/smart-extractor.ts` | LLM 驅動的 6 類擷取,支援 L0/L1/L2 分層儲存和兩階段去重 | +| `src/decay-engine.ts` | Weibull 拉伸指數衰減模型 | +| `src/tier-manager.ts` | 三級晉升/降級:外圍 ↔ 工作 ↔ 核心 | + +
+ +--- + +## 核心功能 + +### 混合檢索 + +``` +查詢 → embedQuery() ─┐ + ├─→ 混合融合 → 重排序 → 生命週期衰減加權 → 長度正規化 → 過濾 +查詢 → BM25 全文 ─────┘ +``` + +- **向量搜尋** — 基於 LanceDB ANN 的語意相似度(餘弦距離) +- **BM25 全文搜尋** — 透過 LanceDB FTS 索引進行精確關鍵字比對 +- **混合融合** — 以向量分數為基礎,BM25 命中結果獲得加權提升(非標準 RRF——針對實際召回品質調優) +- **可設定權重** — `vectorWeight`、`bm25Weight`、`minScore` + +### 交叉編碼器重排序 + +- 內建 **Jina**、**SiliconFlow**、**Voyage AI** 和 **Pinecone** 適配器 +- 相容任意 Jina 相容端點(如 Hugging Face TEI、DashScope) +- 混合打分:60% 交叉編碼器 + 40% 原始融合分數 +- 優雅降級:API 失敗時回退到餘弦相似度 + +### 多階段評分管線 + +| 階段 | 效果 | +| --- | --- | +| **混合融合** | 結合語意召回和精確比對召回 | +| **交叉編碼器重排序** | 提升語意精確命中的排名 | +| **生命週期衰減加權** | Weibull 時效性 + 存取頻率 + 重要性 × 置信度 | +| **長度正規化** | 防止長條目主導結果(錨點:500 字元) | +| **硬最低分** | 移除無關結果(預設:0.35) | +| **MMR 多樣性** | 餘弦相似度 > 0.85 → 降權 | + +### 智慧記憶擷取(v1.1.0) + +- **LLM 驅動的 6 類擷取**:使用者輪廓、偏好、實體、事件、案例、模式 +- **L0/L1/L2 分層儲存**:L0(一句話索引)→ L1(結構化摘要)→ L2(完整敘述) +- **兩階段去重**:向量相似度預過濾(≥0.7)→ LLM 語意決策(CREATE/MERGE/SKIP) +- **類別感知合併**:`profile` 始終合併,`events`/`cases` 僅追加 + +### 記憶生命週期管理(v1.1.0) + +- **Weibull 衰減引擎**:綜合分數 = 時效性 + 頻率 + 內在價值 +- **三級晉升**:`外圍 ↔ 工作 ↔ 核心`,閾值可設定 +- **存取強化**:頻繁被召回的記憶衰減更慢(類似間隔重複機制) +- **重要性調制半衰期**:重要記憶衰減更慢 + +### 多作用域隔離 + +- 內建作用域:`global`、`agent:`、`custom:`、`project:`、`user:` +- 透過 `scopes.agentAccess` 實現智慧體級別的存取控制 +- 預設:每個智慧體存取 `global` + 自己的 `agent:` 作用域 + +### 自動擷取與自動回憶 + +- **自動擷取**(`agent_end`):從對話中擷取偏好/事實/決策/實體,去重後每輪最多儲存 3 條 +- **自動回憶**(`before_agent_start`):注入 `` 上下文(最多 3 條) + +### 雜訊過濾與自適應檢索 + +- 過濾低品質內容:智慧體拒絕回覆、元問題、打招呼 +- 跳過檢索:打招呼、斜線指令、簡單確認、表情符號 +- 強制檢索:記憶關鍵字(「記得」、「之前」、「上次」) +- CJK 感知閾值(中文:6 字元 vs 英文:15 字元) + +--- + +
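上文的混合融合可以用一個極簡示意來理解(假設性實作,僅呈現「以向量分數為基礎、BM25 命中加權提升」的思路;實際計分邏輯以 `src/retriever.ts` 為準):

```typescript
// 混合融合計分的示意(假設性實作,非標準 RRF)。
interface Candidate {
  id: string;
  vectorScore: number; // 餘弦相似度,0-1
  bm25Hit: boolean;    // 是否同時被 BM25 全文搜尋命中
}

function fuse(c: Candidate, vectorWeight = 0.7, bm25Weight = 0.3): number {
  const boost = c.bm25Hit ? bm25Weight : 0; // BM25 命中獲得加權提升
  return c.vectorScore * vectorWeight + boost;
}

const a = fuse({ id: "a", vectorScore: 0.8, bm25Hit: false }); // ≈ 0.56
const b = fuse({ id: "b", vectorScore: 0.7, bm25Hit: true });  // ≈ 0.79
```

在這個示意中,向量分數較低但被精確關鍵字命中的候選(b)反而排在前面——這正是混合檢索相對純向量搜尋的價值。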
+與內建 memory-lancedb 的對比(點選展開) + +| 功能 | 內建 `memory-lancedb` | **memory-lancedb-pro** | +| --- | :---: | :---: | +| 向量搜尋 | 有 | 有 | +| BM25 全文搜尋 | - | 有 | +| 混合融合(向量 + BM25) | - | 有 | +| 交叉編碼器重排序(多服務商) | - | 有 | +| 時效性提升和時間衰減 | - | 有 | +| 長度正規化 | - | 有 | +| MMR 多樣性 | - | 有 | +| 多作用域隔離 | - | 有 | +| 雜訊過濾 | - | 有 | +| 自適應檢索 | - | 有 | +| 管理 CLI | - | 有 | +| 工作階段記憶 | - | 有 | +| 任務感知 Embedding | - | 有 | +| **LLM 智慧擷取(6 類)** | - | 有(v1.1.0) | +| **Weibull 衰減 + 層級晉升** | - | 有(v1.1.0) | +| 任意 OpenAI 相容 Embedding | 有限 | 有 | + +
+ +--- + +## 設定 + +
+完整設定範例 + +```json +{ + "embedding": { + "apiKey": "${JINA_API_KEY}", + "model": "jina-embeddings-v5-text-small", + "baseURL": "https://api.jina.ai/v1", + "dimensions": 1024, + "taskQuery": "retrieval.query", + "taskPassage": "retrieval.passage", + "normalized": true + }, + "dbPath": "~/.openclaw/memory/lancedb-pro", + "autoCapture": true, + "autoRecall": true, + "retrieval": { + "mode": "hybrid", + "vectorWeight": 0.7, + "bm25Weight": 0.3, + "minScore": 0.3, + "rerank": "cross-encoder", + "rerankApiKey": "${JINA_API_KEY}", + "rerankModel": "jina-reranker-v3", + "rerankEndpoint": "https://api.jina.ai/v1/rerank", + "rerankProvider": "jina", + "candidatePoolSize": 20, + "recencyHalfLifeDays": 14, + "recencyWeight": 0.1, + "filterNoise": true, + "lengthNormAnchor": 500, + "hardMinScore": 0.35, + "timeDecayHalfLifeDays": 60, + "reinforcementFactor": 0.5, + "maxHalfLifeMultiplier": 3 + }, + "enableManagementTools": false, + "scopes": { + "default": "global", + "definitions": { + "global": { "description": "Shared knowledge" }, + "agent:discord-bot": { "description": "Discord bot private" } + }, + "agentAccess": { + "discord-bot": ["global", "agent:discord-bot"] + } + }, + "sessionMemory": { + "enabled": false, + "messageCount": 15 + }, + "smartExtraction": true, + "llm": { + "apiKey": "${OPENAI_API_KEY}", + "model": "gpt-4o-mini", + "baseURL": "https://api.openai.com/v1" + }, + "extractMinMessages": 2, + "extractMaxChars": 8000 +} +``` + +
+ +
+Embedding 服務商 + +相容 **任意 OpenAI 相容 Embedding API**: + +| 服務商 | 模型 | Base URL | 維度 | +| --- | --- | --- | --- | +| **Jina**(推薦) | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 | +| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 | +| **Voyage** | `voyage-4-lite` / `voyage-4` | `https://api.voyageai.com/v1` | 1024 / 1024 | +| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 | +| **Ollama**(本地) | `nomic-embed-text` | `http://localhost:11434/v1` | 取決於模型 | + +
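例如,使用本機 Ollama 時,`embedding` 設定片段大致如下(`apiKey` 為佔位值,Ollama 通常不驗證;實際支援欄位以外掛 schema 為準):

```json
{
  "embedding": {
    "provider": "openai-compatible",
    "apiKey": "ollama",
    "model": "nomic-embed-text",
    "baseURL": "http://localhost:11434/v1"
  }
}
```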
+ +
+重排序服務商 + +交叉編碼器重排序透過 `rerankProvider` 支援多個服務商: + +| 服務商 | `rerankProvider` | 範例模型 | +| --- | --- | --- | +| **Jina**(預設) | `jina` | `jina-reranker-v3` | +| **SiliconFlow**(有免費額度) | `siliconflow` | `BAAI/bge-reranker-v2-m3` | +| **Voyage AI** | `voyage` | `rerank-2.5` | +| **Pinecone** | `pinecone` | `bge-reranker-v2-m3` | + +任何 Jina 相容的重排序端點也可以使用——設定 `rerankProvider: "jina"` 並將 `rerankEndpoint` 指向你的服務(如 Hugging Face TEI、DashScope `qwen3-rerank`)。 + +
+ +
+智慧擷取(LLM)— v1.1.0 + +當 `smartExtraction` 啟用(預設 `true`)時,外掛使用 LLM 智慧擷取和分類記憶,取代基於正則的觸發方式。 + +| 欄位 | 類型 | 預設值 | 說明 | +|------|------|--------|------| +| `smartExtraction` | boolean | `true` | 是否啟用 LLM 智慧 6 類別擷取 | +| `llm.auth` | string | `api-key` | `api-key` 使用 `llm.apiKey` / `embedding.apiKey`;`oauth` 預設使用外掛級 OAuth token 檔案 | +| `llm.apiKey` | string | *(複用 `embedding.apiKey`)* | LLM 服務商 API Key | +| `llm.model` | string | `openai/gpt-oss-120b` | LLM 模型名稱 | +| `llm.baseURL` | string | *(複用 `embedding.baseURL`)* | LLM API 端點 | +| `llm.oauthProvider` | string | `openai-codex` | `llm.auth` 為 `oauth` 時使用的 OAuth provider id | +| `llm.oauthPath` | string | `~/.openclaw/.memory-lancedb-pro/oauth.json` | `llm.auth` 為 `oauth` 時使用的 OAuth token 檔案 | +| `llm.timeoutMs` | number | `30000` | LLM 請求逾時(毫秒) | +| `extractMinMessages` | number | `2` | 觸發擷取的最小訊息數 | +| `extractMaxChars` | number | `8000` | 傳送給 LLM 的最大字元數 | + + +OAuth `llm` 設定(使用現有 Codex / ChatGPT 登入快取來發送 LLM 請求): +```json +{ + "llm": { + "auth": "oauth", + "oauthProvider": "openai-codex", + "model": "gpt-5.4", + "oauthPath": "${HOME}/.openclaw/.memory-lancedb-pro/oauth.json", + "timeoutMs": 30000 + } +} +``` + +`llm.auth: "oauth"` 說明: + +- `llm.oauthProvider` 目前僅支援 `openai-codex`。 +- OAuth token 預設存放在 `~/.openclaw/.memory-lancedb-pro/oauth.json`。 +- 如需自訂路徑,可設定 `llm.oauthPath`。 +- `auth login` 會在 OAuth 檔案旁邊快照原來的 `api-key` 模式 `llm` 設定;`auth logout` 在可用時會恢復這份快照。 +- 從 `api-key` 切到 `oauth` 時不會自動沿用 `llm.baseURL`;只有在你明確需要自訂 ChatGPT/Codex 相容後端時,才應在 `oauth` 模式下手動設定。 + +
+ +
+生命週期設定(衰減 + 層級) + +| 欄位 | 預設值 | 說明 | +|------|--------|------| +| `decay.recencyHalfLifeDays` | `30` | Weibull 時效性衰減的基礎半衰期 | +| `decay.frequencyWeight` | `0.3` | 存取頻率在綜合分數中的權重 | +| `decay.intrinsicWeight` | `0.3` | `重要性 × 置信度` 的權重 | +| `decay.betaCore` | `0.8` | `核心` 記憶的 Weibull beta | +| `decay.betaWorking` | `1.0` | `工作` 記憶的 Weibull beta | +| `decay.betaPeripheral` | `1.3` | `外圍` 記憶的 Weibull beta | +| `tier.coreAccessThreshold` | `10` | 晉升到 `核心` 所需的最小召回次數 | +| `tier.peripheralAgeDays` | `60` | 降級過期記憶的天數閾值 | + +
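表中的 Weibull 衰減可以用一小段示意程式碼理解(假設性公式:S(t) = exp(−(t/λ)^β),由半衰期反推尺度參數 λ;實際實作以 `src/decay-engine.ts` 為準):

```typescript
// Weibull(拉伸指數)衰減的示意:S(t) = exp(-(t/λ)^β)。
// β 越大,早期衰減越慢、後期越快;半衰期 t½ 滿足 S(t½) = 0.5。
function weibullRetention(ageDays: number, halfLifeDays: number, beta: number): number {
  const lambda = halfLifeDays / Math.pow(Math.log(2), 1 / beta); // 由半衰期反推 λ
  return Math.exp(-Math.pow(ageDays / lambda, beta));
}

weibullRetention(0, 30, 1.0);  // 1 —— 剛建立的記憶不衰減
weibullRetention(30, 30, 1.0); // 0.5 —— 到達半衰期時剩一半
```

這也解釋了 `betaCore` < `betaWorking` < `betaPeripheral` 的設計:`核心` 記憶衰減曲線更平緩,`外圍` 記憶則更快被淡出。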
+ +
+存取強化 + +頻繁被召回的記憶衰減更慢(類似間隔重複機制)。 + +設定項(在 `retrieval` 下): +- `reinforcementFactor`(0-2,預設 `0.5`)— 設為 `0` 可停用 +- `maxHalfLifeMultiplier`(1-10,預設 `3`)— 有效半衰期的硬上限 + +
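強化效果可以用下面的假設性公式示意(對數成長 + 硬上限;實際公式以外掛原始碼為準):

```typescript
// 存取強化的示意(假設性公式):召回次數越多,有效半衰期越長,
// 但受 maxHalfLifeMultiplier 硬上限限制。
function effectiveHalfLife(
  baseDays: number,
  accessCount: number,
  reinforcementFactor = 0.5,
  maxHalfLifeMultiplier = 3,
): number {
  const multiplier = 1 + reinforcementFactor * Math.log1p(accessCount);
  return baseDays * Math.min(multiplier, maxHalfLifeMultiplier);
}

effectiveHalfLife(60, 0);   // 60 —— 從未被召回,保持基礎半衰期
effectiveHalfLife(60, 500); // 180 —— 被 3 倍硬上限封頂
```

把 `reinforcementFactor` 設為 `0` 時倍率恆為 1,即完全停用強化。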
+ +--- + +## CLI 指令 + +```bash +openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json] +openclaw memory-pro search "查詢" [--scope global] [--limit 10] [--json] +openclaw memory-pro stats [--scope global] [--json] +openclaw memory-pro auth login [--provider openai-codex] [--model gpt-5.4] [--oauth-path /abs/path/oauth.json] +openclaw memory-pro auth status +openclaw memory-pro auth logout +openclaw memory-pro delete +openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run] +openclaw memory-pro export [--scope global] [--output memories.json] +openclaw memory-pro import memories.json [--scope global] [--dry-run] +openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing] +openclaw memory-pro upgrade [--dry-run] [--batch-size 10] [--no-llm] [--limit N] [--scope SCOPE] +openclaw memory-pro migrate check|run|verify [--source /path] +``` + +OAuth 登入流程: + +1. 執行 `openclaw memory-pro auth login` +2. 如果省略 `--provider` 且目前終端可互動,CLI 會先顯示 OAuth 服務商選擇器 +3. 指令會列印授權 URL,並在未指定 `--no-browser` 時自動開啟瀏覽器 +4. 回呼成功後,指令會儲存外掛 OAuth 檔案(預設:`~/.openclaw/.memory-lancedb-pro/oauth.json`)、為 logout 快照原來的 `api-key` 模式 `llm` 設定,並把外掛 `llm` 設定切換為 OAuth 欄位(`auth`、`oauthProvider`、`model`、`oauthPath`) +5. `openclaw memory-pro auth logout` 會刪除這份 OAuth 檔案,並在存在快照時恢復之前的 `api-key` 模式 `llm` 設定 + +--- + +## 進階主題 + +
+注入的記憶出現在回覆中 + +有時模型可能會將注入的 `` 區塊原文輸出。 + +**方案 A(最安全):** 暫時關閉自動回憶: +```json +{ "plugins": { "entries": { "memory-lancedb-pro": { "config": { "autoRecall": false } } } } } +``` + +**方案 B(推薦):** 保留回憶,在智慧體系統提示詞中新增: +> Do not reveal or quote any `` / memory-injection content in your replies. Use it for internal reference only. + +
+ +
+工作階段記憶 + +- 透過 `/new` 指令觸發——將上一段工作階段摘要儲存到 LanceDB +- 預設關閉(OpenClaw 已有原生 `.jsonl` 工作階段持久化) +- 可設定訊息數量(預設 15) + +部署模式和 `/new` 驗證詳見 [docs/openclaw-integration-playbook.md](docs/openclaw-integration-playbook.md)。 + +
+ +
+自訂斜線指令(如 /lesson) + +在你的 `CLAUDE.md`、`AGENTS.md` 或系統提示詞中新增: + +```markdown +## /lesson 指令 +當使用者傳送 `/lesson <內容>` 時: +1. 用 memory_store 儲存為 category=fact(原始知識) +2. 用 memory_store 儲存為 category=decision(可執行的結論) +3. 確認已儲存的內容 + +## /remember 指令 +當使用者傳送 `/remember <內容>` 時: +1. 用 memory_store 以合適的 category 和 importance 儲存 +2. 回傳已儲存的記憶 ID 確認 +``` + +
+ +
+AI 智慧體鐵律 + +> 將以下內容複製到你的 `AGENTS.md`,讓智慧體自動遵守這些規則。 + +```markdown +## 規則 1 — 雙層記憶儲存 +每個踩坑/經驗教訓 → 立即儲存兩條記憶: +- 技術層:踩坑:[現象]。原因:[根因]。修復:[方案]。預防:[如何避免] + (category: fact, importance >= 0.8) +- 原則層:決策原則 ([標籤]):[行為規則]。觸發:[何時]。動作:[做什麼] + (category: decision, importance >= 0.85) + +## 規則 2 — LanceDB 資料品質 +條目必須簡短且原子化(< 500 字元)。不儲存原始對話摘要或重複內容。 + +## 規則 3 — 重試前先回憶 +任何工具呼叫失敗時,必須先用 memory_recall 搜尋相關關鍵字,再重試。 + +## 規則 4 — 確認目標程式碼庫 +修改前確認你操作的是 memory-lancedb-pro 還是內建 memory-lancedb。 + +## 規則 5 — 修改外掛程式碼後清除 jiti 快取 +修改 plugins/ 下的 .ts 檔案後,必須先清除 /tmp/jiti/ 目錄再重啟 openclaw gateway。 +``` + +
+ +
+資料庫 Schema + +LanceDB 表 `memories`: + +| 欄位 | 類型 | 說明 | +| --- | --- | --- | +| `id` | string (UUID) | 主鍵 | +| `text` | string | 記憶文字(全文索引) | +| `vector` | float[] | Embedding 向量 | +| `category` | string | 儲存類別:`preference` / `fact` / `decision` / `entity` / `reflection` / `other` | +| `scope` | string | 作用域識別碼(如 `global`、`agent:main`) | +| `importance` | float | 重要性分數 0-1 | +| `timestamp` | int64 | 建立時間戳記(毫秒) | +| `metadata` | string (JSON) | 擴充中繼資料 | + +v1.1.0 常用 `metadata` 欄位:`l0_abstract`、`l1_overview`、`l2_content`、`memory_category`、`tier`、`access_count`、`confidence`、`last_accessed_at` + +> **關於分類的說明:** 頂層 `category` 欄位使用 6 個儲存類別。智慧擷取的 6 類語意標籤(`profile` / `preferences` / `entities` / `events` / `cases` / `patterns`)儲存在 `metadata.memory_category` 中。 + +
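為了直觀理解,下面用一筆假設性記錄對應上述 schema(值為虛構;實際結構以 `src/store.ts` 為準):

```typescript
// 符合 `memories` 資料表 schema 的假設性記錄(值為虛構)。
interface MemoryRow {
  id: string;          // UUID 主鍵
  text: string;        // 記憶文字,建立全文索引
  vector: number[];    // Embedding 向量
  category: "preference" | "fact" | "decision" | "entity" | "reflection" | "other";
  scope: string;       // 如 "global"、"agent:main"
  importance: number;  // 0-1
  timestamp: number;   // 毫秒時間戳記(資料庫中為 int64)
  metadata: string;    // JSON 字串,存放擴充中繼資料
}

const row: MemoryRow = {
  id: "9b1f4c7a-2222-4d5e-8f00-000000000000",
  text: "使用者偏好 tab 縮排而非空格",
  vector: [0.02, -0.01, 0.05],
  category: "preference",
  scope: "global",
  importance: 0.9,
  timestamp: Date.now(),
  metadata: JSON.stringify({
    memory_category: "preferences", // 智慧擷取的語意標籤
    tier: "working",
    access_count: 0,
  }),
};
```

注意頂層的儲存類別(`category`)與 `metadata.memory_category` 中的語意標籤是兩個不同的欄位。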
+ +
+故障排除 + +### "Cannot mix BigInt and other types"(LanceDB / Apache Arrow) + +在 LanceDB 0.26+ 上,某些數值欄位可能以 `BigInt` 形式回傳。升級到 **memory-lancedb-pro >= 1.0.14**——外掛現在會在運算前使用 `Number(...)` 進行類型轉換。 + +
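修復思路可以用一個極簡示意說明(函式名稱為假設,非外掛實際程式碼):

```typescript
// 示意:int64 欄位(如 timestamp、access_count)可能以 BigInt 回傳,
// 參與運算前先用 Number(...) 轉型。
function toNumber(value: number | bigint): number {
  return typeof value === "bigint" ? Number(value) : value;
}

const accessCount: number | bigint = 3n;
// accessCount + 1 會拋出 TypeError: Cannot mix BigInt and other types
const boosted = toNumber(accessCount) + 1; // 4
```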
+ +--- + +## 文件 + +| 文件 | 說明 | +| --- | --- | +| [OpenClaw 整合手冊](docs/openclaw-integration-playbook.md) | 部署模式、驗證、迴歸矩陣 | +| [記憶架構分析](docs/memory_architecture_analysis.md) | 完整架構深度解析 | +| [CHANGELOG v1.1.0](docs/CHANGELOG-v1.1.0.md) | v1.1.0 行為變更和升級說明 | +| [長上下文分塊](docs/long-context-chunking.md) | 長文件分塊策略 | + +--- + +## 測試版:智慧記憶 v1.1.0 + +> 狀態:Beta(測試版)——透過 `npm i memory-lancedb-pro@beta` 安裝。使用 `latest` 的穩定版使用者不受影響。 + +| 功能 | 說明 | +|------|------| +| **智慧擷取** | LLM 驅動的 6 類擷取,支援 L0/L1/L2 中繼資料。停用時回退到正則模式。 | +| **生命週期評分** | Weibull 衰減整合到檢索中——高頻和高重要性記憶排名更高。 | +| **層級管理** | 三級系統(核心 → 工作 → 外圍),自動晉升/降級。 | + +回饋:[GitHub Issues](https://github.com/CortexReach/memory-lancedb-pro/issues) · 回退:`npm i memory-lancedb-pro@latest` + +--- + +## 依賴 + +| 套件 | 用途 | +| --- | --- | +| `@lancedb/lancedb` ≥0.26.2 | 向量資料庫(ANN + FTS) | +| `openai` ≥6.21.0 | OpenAI 相容 Embedding API 客戶端 | +| `@sinclair/typebox` 0.34.48 | JSON Schema 類型定義 | + +--- + +## Contributors + +

+@win4r +@kctony +@Akatsuki-Ryu +@JasonSuz +@Minidoracat +@furedericca-lab +@joe2643 +@AliceLJY +@chenjiyong +

+ +Full list: [Contributors](https://github.com/CortexReach/memory-lancedb-pro/graphs/contributors) + +## Star History + + + + + + Star History Chart + + + +## 授權條款 + +MIT + +--- + +## 我的微信 QR Code + + diff --git a/cli.ts b/cli.ts index 0781f217..dd062ad3 100644 --- a/cli.ts +++ b/cli.ts @@ -4,12 +4,26 @@ import type { Command } from "commander"; import { readFileSync } from "node:fs"; +import { mkdir, readFile, rm, writeFile } from "node:fs/promises"; +import { homedir } from "node:os"; +import path from "node:path"; +import * as readline from "node:readline"; +import JSON5 from "json5"; import { loadLanceDB, type MemoryEntry, type MemoryStore } from "./src/store.js"; import { createRetriever, type MemoryRetriever } from "./src/retriever.js"; import type { MemoryScopeManager } from "./src/scopes.js"; import type { MemoryMigrator } from "./src/migrate.js"; import { createMemoryUpgrader } from "./src/memory-upgrader.js"; import type { LlmClient } from "./src/llm-client.js"; +import { + getDefaultOauthModelForProvider, + getOAuthProviderLabel, + isOauthModelSupported, + listOAuthProviders, + normalizeOauthModel, + normalizeOAuthProviderId, + performOAuthLogin, +} from "./src/llm-oauth.js"; // ============================================================================ // Types @@ -22,6 +36,16 @@ interface CLIContext { migrator: MemoryMigrator; embedder?: import("./src/embedder.js").Embedder; llmClient?: LlmClient; + pluginId?: string; + pluginConfig?: Record; + oauthTestHooks?: { + openUrl?: (url: string) => void | Promise; + authorizeUrl?: (url: string) => void | Promise; + chooseProvider?: ( + providers: Array<{ id: string; label: string; defaultModel: string }>, + currentProviderId: string, + ) => string | Promise; + }; } // ============================================================================ @@ -43,6 +67,335 @@ function clampInt(value: number, min: number, max: number): number { return Math.max(min, Math.min(max, Math.trunc(n))); } +function 
resolveOpenClawConfigPath(explicit?: string): string { + const openclawHome = resolveOpenClawHome(); + if (explicit && explicit.trim()) { + return path.resolve(explicit.trim()); + } + + const fromEnv = process.env.OPENCLAW_CONFIG_PATH?.trim(); + if (fromEnv) { + return path.resolve(fromEnv); + } + + return path.join(openclawHome, "openclaw.json"); +} + +function resolveOpenClawHome(): string { + return process.env.OPENCLAW_HOME?.trim() + ? path.resolve(process.env.OPENCLAW_HOME.trim()) + : path.join(homedir(), ".openclaw"); +} + +function resolveDefaultOauthPath(): string { + return path.join(resolveOpenClawHome(), ".memory-lancedb-pro", "oauth.json"); +} + +function resolveLoginOauthPath(rawPath: unknown): string { + const trimmed = typeof rawPath === "string" ? rawPath.trim() : ""; + const candidate = trimmed || resolveDefaultOauthPath(); + return path.resolve(candidate); +} + +function resolveConfiguredOauthPath(configPath: string, rawPath: unknown): string { + const trimmed = typeof rawPath === "string" ? 
rawPath.trim() : ""; + if (!trimmed) { + return resolveDefaultOauthPath(); + } + if (path.isAbsolute(trimmed)) { + return trimmed; + } + return path.resolve(path.dirname(configPath), trimmed); +} + +type RestorableApiKeyLlmConfig = { + auth?: "api-key"; + apiKey?: string; + model?: string; + baseURL?: string; + timeoutMs?: number; +}; + +type OAuthLlmBackup = { + version: 1; + hadLlmConfig: boolean; + llm: RestorableApiKeyLlmConfig; +}; + +function isPlainObject(value: unknown): value is Record { + return typeof value === "object" && value !== null && !Array.isArray(value); +} + +function isOauthLlmConfig(value: unknown): boolean { + return isPlainObject(value) && value.auth === "oauth"; +} + +function extractRestorableApiKeyLlmConfig(value: unknown): RestorableApiKeyLlmConfig { + if (!isPlainObject(value)) { + return {}; + } + + const result: RestorableApiKeyLlmConfig = {}; + if (value.auth === "api-key") { + result.auth = "api-key"; + } + if (typeof value.apiKey === "string") { + result.apiKey = value.apiKey; + } + if (typeof value.model === "string") { + result.model = value.model; + } + if (typeof value.baseURL === "string") { + result.baseURL = value.baseURL; + } + if (typeof value.timeoutMs === "number" && Number.isFinite(value.timeoutMs) && value.timeoutMs > 0) { + result.timeoutMs = Math.trunc(value.timeoutMs); + } + return result; +} + +function extractOauthSafeLlmConfig(value: unknown): RestorableApiKeyLlmConfig { + if (!isPlainObject(value)) { + return {}; + } + + const result: RestorableApiKeyLlmConfig = {}; + if (typeof value.baseURL === "string") { + result.baseURL = value.baseURL; + } + if (typeof value.timeoutMs === "number" && Number.isFinite(value.timeoutMs) && value.timeoutMs > 0) { + result.timeoutMs = Math.trunc(value.timeoutMs); + } + return result; +} + +function hasRestorableApiKeyLlmConfig(value: RestorableApiKeyLlmConfig): boolean { + return Object.keys(value).length > 0; +} + +function buildLogoutFallbackLlmConfig(value: unknown): 
RestorableApiKeyLlmConfig { + if (isOauthLlmConfig(value)) { + return extractOauthSafeLlmConfig(value); + } + return extractRestorableApiKeyLlmConfig(value); +} + +function getOauthBackupPath(oauthPath: string): string { + const parsed = path.parse(oauthPath); + const fileName = parsed.ext + ? `${parsed.name}.llm-backup${parsed.ext}` + : `${parsed.base}.llm-backup.json`; + return path.join(parsed.dir, fileName); +} + +async function saveOauthLlmBackup(oauthPath: string, llm: unknown, hadLlmConfig: boolean): Promise { + const backupPath = getOauthBackupPath(oauthPath); + const payload: OAuthLlmBackup = { + version: 1, + hadLlmConfig, + llm: extractRestorableApiKeyLlmConfig(llm), + }; + await mkdir(path.dirname(backupPath), { recursive: true }); + await writeFile(backupPath, JSON.stringify(payload, null, 2) + "\n", "utf8"); +} + +async function loadOauthLlmBackup(oauthPath: string): Promise { + const backupPath = getOauthBackupPath(oauthPath); + try { + const raw = await readFile(backupPath, "utf8"); + const parsed = JSON.parse(raw); + if (!isPlainObject(parsed) || parsed.version !== 1 || typeof parsed.hadLlmConfig !== "boolean") { + return null; + } + return { + version: 1, + hadLlmConfig: parsed.hadLlmConfig, + llm: extractRestorableApiKeyLlmConfig(parsed.llm), + }; + } catch { + return null; + } +} + +const OAUTH_PROVIDER_CHOICES = listOAuthProviders() + .map((provider) => `${provider.id} (${provider.label})`) + .join(", "); + +function pickOauthProvider(currentProvider: string | undefined, overrideProvider: string | undefined): { + providerId: string; + source: "override" | "config" | "default"; +} { + if (overrideProvider && overrideProvider.trim()) { + return { providerId: normalizeOAuthProviderId(overrideProvider), source: "override" }; + } + + if (currentProvider && currentProvider.trim()) { + try { + return { providerId: normalizeOAuthProviderId(currentProvider), source: "config" }; + } catch { + // Fall back to the default provider when the saved config is 
stale or invalid. + } + } + + return { providerId: normalizeOAuthProviderId(), source: "default" }; +} + +async function promptOauthProviderSelection( + currentProviderId: string, + testHook?: CLIContext["oauthTestHooks"]["chooseProvider"], +): Promise<{ providerId: string; source: "prompt" | "default" }> { + const providers = listOAuthProviders(); + if (providers.length === 0) { + throw new Error("No OAuth providers are available."); + } + + if (testHook) { + const selected = await testHook(providers, currentProviderId); + return { providerId: normalizeOAuthProviderId(selected), source: "prompt" }; + } + + if (!process.stdin.isTTY || !process.stdout.isTTY) { + return { providerId: currentProviderId, source: "default" }; + } + + let selectedIndex = providers.findIndex((provider) => provider.id === currentProviderId); + if (selectedIndex < 0) selectedIndex = 0; + + readline.emitKeypressEvents(process.stdin); + const canSetRawMode = typeof process.stdin.setRawMode === "function"; + const previousRawMode = canSetRawMode ? !!process.stdin.isRaw : false; + const menuLines = 2 + providers.length; + let hasRendered = false; + + const render = () => { + if (hasRendered) { + readline.moveCursor(process.stdout, 0, -menuLines); + readline.cursorTo(process.stdout, 0); + readline.clearScreenDown(process.stdout); + } else { + process.stdout.write("\n"); + hasRendered = true; + } + + process.stdout.write("Select OAuth provider\n"); + process.stdout.write("Use arrow keys and Enter.\n"); + providers.forEach((provider, index) => { + const marker = index === selectedIndex ? 
">" : " "; + process.stdout.write( + `${marker} ${provider.label} (${provider.id}) [default model: ${provider.defaultModel}]\n`, + ); + }); + }; + + return await new Promise((resolve, reject) => { + const cleanup = () => { + process.stdin.off("keypress", onKeypress); + if (canSetRawMode) { + process.stdin.setRawMode(previousRawMode); + } + process.stdin.pause(); + process.stdout.write("\n"); + }; + + const onKeypress = (_str: string, key: { name?: string; ctrl?: boolean }) => { + if (key.ctrl && key.name === "c") { + cleanup(); + reject(new Error("OAuth login cancelled while selecting a provider.")); + return; + } + + if (key.name === "escape") { + cleanup(); + reject(new Error("OAuth login cancelled while selecting a provider.")); + return; + } + + if (key.name === "up" || key.name === "left") { + selectedIndex = (selectedIndex - 1 + providers.length) % providers.length; + render(); + return; + } + + if (key.name === "down" || key.name === "right") { + selectedIndex = (selectedIndex + 1) % providers.length; + render(); + return; + } + + if (key.name === "return" || key.name === "enter") { + const provider = providers[selectedIndex]; + cleanup(); + resolve({ providerId: provider.id, source: "prompt" }); + } + }; + + render(); + process.stdin.on("keypress", onKeypress); + process.stdin.resume(); + if (canSetRawMode) { + process.stdin.setRawMode(true); + } + }); +} + +async function resolveOauthProviderSelection( + currentProvider: string | undefined, + overrideProvider: string | undefined, + chooseProviderHook?: CLIContext["oauthTestHooks"]["chooseProvider"], +): Promise<{ providerId: string; source: "override" | "config" | "default" | "prompt" }> { + if (overrideProvider && overrideProvider.trim()) { + return pickOauthProvider(currentProvider, overrideProvider); + } + + const initial = pickOauthProvider(currentProvider, undefined); + return await promptOauthProviderSelection(initial.providerId, chooseProviderHook); +} + +function pickOauthModel( + providerId: 
string,
+  currentModel: string | undefined,
+  overrideModel: string | undefined,
+): { model: string; source: "override" | "config" | "default" } {
+  if (overrideModel && overrideModel.trim()) {
+    if (!isOauthModelSupported(providerId, overrideModel)) {
+      throw new Error(
+        `Model "${overrideModel}" is not supported for OAuth provider ${providerId}. Use a compatible model such as ${getDefaultOauthModelForProvider(providerId)}.`,
+      );
+    }
+    return { model: overrideModel.trim(), source: "override" };
+  }
+
+  if (isOauthModelSupported(providerId, currentModel)) {
+    return { model: currentModel!.trim(), source: "config" };
+  }
+
+  return { model: getDefaultOauthModelForProvider(providerId), source: "default" };
+}
+
+async function loadOpenClawConfig(configPath: string): Promise<Record<string, any>> {
+  const raw = await readFile(configPath, "utf8");
+  const parsed = JSON5.parse(raw);
+  if (!parsed || typeof parsed !== "object" || Array.isArray(parsed)) {
+    throw new Error(`Invalid OpenClaw config at ${configPath}: expected object`);
+  }
+  return parsed as Record<string, any>;
+}
+
+function ensurePluginConfigRoot(config: Record<string, any>, pluginId: string): Record<string, any> {
+  config.plugins ||= {};
+  config.plugins.entries ||= {};
+  config.plugins.entries[pluginId] ||= { enabled: true, config: {} };
+  const entry = config.plugins.entries[pluginId];
+  entry.enabled = true;
+  entry.config ||= {};
+  return entry.config as Record<string, any>;
+}
+
+async function saveOpenClawConfig(configPath: string, config: Record<string, any>): Promise<void> {
+  await mkdir(path.dirname(configPath), { recursive: true });
+  await writeFile(configPath, JSON.stringify(config, null, 2) + "\n", "utf8");
+}
+
 function formatMemory(memory: any, index?: number): string {
   const prefix = index !== undefined ? `${index + 1}. ` : "";
   const id = memory?.id ?
String(memory.id) : "unknown";
@@ -112,6 +465,187 @@ export function registerMemoryCLI(program: Command, context: CLIContext): void {
     console.log(getPluginVersion());
   });
 
+  const auth = memory
+    .command("auth")
+    .description("Manage OAuth authentication for smart-extraction LLM access");
+
+  auth
+    .command("login")
+    .description("Authenticate with ChatGPT/Codex in a browser, save the plugin OAuth file, and switch this plugin to llm.auth=oauth")
+    .option("--config <path>", "OpenClaw config file to update")
+    .option("--provider <provider>", `OAuth provider to use (${OAUTH_PROVIDER_CHOICES})`)
+    .option("--model <model>", "Override the model saved into llm.model")
+    .option("--oauth-path <path>", "OAuth file path (default: ~/.openclaw/.memory-lancedb-pro/oauth.json)")
+    .option("--timeout <seconds>", "OAuth callback timeout in seconds", "120")
+    .option("--no-browser", "Do not auto-open the browser; print the authorization URL only")
+    .action(async (options) => {
+      try {
+        const pluginId = context.pluginId || "memory-lancedb-pro";
+        const currentLlm = context.pluginConfig?.llm;
+        const currentProvider = currentLlm && typeof currentLlm === "object" && typeof (currentLlm as any).oauthProvider === "string"
+          ? String((currentLlm as any).oauthProvider)
+          : undefined;
+        const selectedProvider = await resolveOauthProviderSelection(
+          currentProvider,
+          options.provider,
+          context.oauthTestHooks?.chooseProvider,
+        );
+        const currentModel = currentLlm && typeof currentLlm === "object" && typeof (currentLlm as any).model === "string"
+          ?
String((currentLlm as any).model)
+          : undefined;
+        const selectedModel = pickOauthModel(selectedProvider.providerId, currentModel, options.model);
+        const oauthModel = normalizeOauthModel(selectedModel.model);
+        const configPath = resolveOpenClawConfigPath(options.config);
+        const oauthPath = resolveLoginOauthPath(options.oauthPath);
+        const timeoutMs = clampInt((parseInt(options.timeout, 10) || 120) * 1000, 15_000, 900_000);
+
+        if (selectedModel.source === "default" && currentModel && currentModel.trim()) {
+          console.log(
+            `Configured llm.model "${currentModel}" is not supported by provider ${selectedProvider.providerId}. Falling back to ${getDefaultOauthModelForProvider(selectedProvider.providerId)}.`,
+          );
+        }
+
+        console.log(`Config file: ${configPath}`);
+        console.log(`Provider: ${getOAuthProviderLabel(selectedProvider.providerId)} (${selectedProvider.providerId}, ${selectedProvider.source})`);
+        console.log(`OAuth file: ${oauthPath}`);
+        console.log(`Model: ${oauthModel} (${selectedModel.source})`);
+
+        const { session } = await performOAuthLogin({
+          authPath: oauthPath,
+          timeoutMs,
+          noBrowser: options.browser === false,
+          model: selectedModel.model,
+          providerId: selectedProvider.providerId,
+          onOpenUrl: context.oauthTestHooks?.openUrl,
+          onAuthorizeUrl: async (url) => {
+            console.log(`Authorization URL: ${url}`);
+            await context.oauthTestHooks?.authorizeUrl?.(url);
+          },
+        });
+
+        const openclawConfig = await loadOpenClawConfig(configPath);
+        const pluginConfig = ensurePluginConfigRoot(openclawConfig, pluginId);
+        const hadLlmConfig = isPlainObject(pluginConfig.llm);
+        const existingLlm = hadLlmConfig ? { ...(pluginConfig.llm as Record<string, any>) } : {};
+        const wasOauthMode = isOauthLlmConfig(existingLlm);
+
+        if (!wasOauthMode) {
+          await saveOauthLlmBackup(oauthPath, pluginConfig.llm, hadLlmConfig);
+        }
+
+        const nextLlm = wasOauthMode ?
{ ...existingLlm } : extractOauthSafeLlmConfig(existingLlm);
+        delete nextLlm.apiKey;
+        if (!wasOauthMode) {
+          delete nextLlm.baseURL;
+        }
+        pluginConfig.llm = {
+          ...nextLlm,
+          auth: "oauth",
+          oauthProvider: selectedProvider.providerId,
+          model: oauthModel,
+          oauthPath,
+        };
+        await saveOpenClawConfig(configPath, openclawConfig);
+
+        console.log(`OAuth login completed for account ${session.accountId}.`);
+        console.log(
+          `Updated ${pluginId} config: llm.auth=oauth, llm.oauthProvider=${selectedProvider.providerId}, llm.oauthPath=${oauthPath}, llm.model=${oauthModel}`,
+        );
+      } catch (error) {
+        console.error("OAuth login failed:", error);
+        process.exit(1);
+      }
+    });
+
+  auth
+    .command("status")
+    .description("Show the current OAuth configuration for this plugin")
+    .option("--config <path>", "OpenClaw config file to inspect")
+    .action(async (options) => {
+      try {
+        const pluginId = context.pluginId || "memory-lancedb-pro";
+        const configPath = resolveOpenClawConfigPath(options.config);
+        const openclawConfig = await loadOpenClawConfig(configPath);
+        const pluginConfig = ensurePluginConfigRoot(openclawConfig, pluginId);
+        const llm = typeof pluginConfig.llm === "object" && pluginConfig.llm ? pluginConfig.llm as Record<string, any> : {};
+        const oauthProviderRaw = typeof llm.oauthProvider === "string" && llm.oauthProvider.trim()
+          ? llm.oauthProvider.trim()
+          : normalizeOAuthProviderId();
+        let oauthProviderDisplay = `${oauthProviderRaw} (unknown)`;
+        try {
+          oauthProviderDisplay = `${normalizeOAuthProviderId(oauthProviderRaw)} (${getOAuthProviderLabel(oauthProviderRaw)})`;
+        } catch {
+          // Leave the raw provider id visible for debugging stale or unsupported configs.
+        }
+        const oauthPath = resolveConfiguredOauthPath(configPath, llm.oauthPath);
+
+        let tokenInfo = "missing";
+        try {
+          const session = await readFile(oauthPath, "utf8");
+          tokenInfo = session.trim() ?
"present" : "empty";
+        } catch {
+          tokenInfo = "missing";
+        }
+
+        console.log(`Config file: ${configPath}`);
+        console.log(`Plugin: ${pluginId}`);
+        console.log(`llm.auth: ${typeof llm.auth === "string" ? llm.auth : "api-key"}`);
+        console.log(`llm.oauthProvider: ${oauthProviderDisplay}`);
+        console.log(`llm.model: ${typeof llm.model === "string" ? llm.model : "openai/gpt-oss-120b"}`);
+        console.log(`llm.oauthPath: ${oauthPath}`);
+        console.log(`oauth file: ${tokenInfo}`);
+      } catch (error) {
+        console.error("OAuth status failed:", error);
+        process.exit(1);
+      }
+    });
+
+  auth
+    .command("logout")
+    .description("Delete the plugin OAuth file and switch this plugin back to llm.auth=api-key")
+    .option("--config <path>", "OpenClaw config file to update")
+    .option("--oauth-path <path>", "OAuth file path to remove")
+    .action(async (options) => {
+      try {
+        const pluginId = context.pluginId || "memory-lancedb-pro";
+        const configPath = resolveOpenClawConfigPath(options.config);
+        const openclawConfig = await loadOpenClawConfig(configPath);
+        const pluginConfig = ensurePluginConfigRoot(openclawConfig, pluginId);
+        const llm = typeof pluginConfig.llm === "object" && pluginConfig.llm ? pluginConfig.llm as Record<string, any> : {};
+        const oauthPath =
+          options.oauthPath && String(options.oauthPath).trim()
+            ?
resolveLoginOauthPath(options.oauthPath)
+            : resolveConfiguredOauthPath(configPath, llm.oauthPath);
+        const backupPath = getOauthBackupPath(oauthPath);
+        const backup = await loadOauthLlmBackup(oauthPath);
+
+        await rm(oauthPath, { force: true });
+        await rm(backupPath, { force: true });
+
+        if (backup) {
+          if (backup.hadLlmConfig) {
+            pluginConfig.llm = { ...backup.llm };
+          } else {
+            delete pluginConfig.llm;
+          }
+        } else {
+          const fallbackLlm = buildLogoutFallbackLlmConfig(llm);
+          if (hasRestorableApiKeyLlmConfig(fallbackLlm)) {
+            pluginConfig.llm = fallbackLlm;
+          } else {
+            delete pluginConfig.llm;
+          }
+        }
+        await saveOpenClawConfig(configPath, openclawConfig);
+
+        console.log(`Deleted OAuth file: ${oauthPath}`);
+        console.log(`Updated ${pluginId} config: llm.auth=api-key`);
+      } catch (error) {
+        console.error("OAuth logout failed:", error);
+        process.exit(1);
+      }
+    });
+
   // List memories
   memory
     .command("list")
@@ -502,6 +1036,131 @@ export function registerMemoryCLI(program: Command, context: CLIContext): void {
     }
   });
 
+  /**
+   * import-markdown: Import memories from Markdown memory files into the plugin store.
+   * Targets MEMORY.md and memory/YYYY-MM-DD.md files found in OpenClaw workspaces.
+   */
+  memory
+    .command("import-markdown [workspace-glob]")
+    .description("Import memories from Markdown files (MEMORY.md, memory/YYYY-MM-DD.md) into the plugin store")
+    .option("--dry-run", "Show what would be imported without importing")
+    .option("--scope <scope>", "Import into specific scope (default: global)")
+    .option(
+      "--openclaw-home <path>",
+      "OpenClaw home directory (default: ~/.openclaw)",
+    )
+    .action(async (workspaceGlob, options) => {
+      const openclawHome = options.openclawHome
+        ? path.resolve(options.openclawHome)
+        : path.join(homedir(), ".openclaw");
+
+      const workspaceDir = path.join(openclawHome, "workspace");
+      let imported = 0;
+      let skipped = 0;
+      let foundFiles = 0;
+
+      if (!context.embedder) {
+        console.error(
+          "import-markdown requires an embedder. Use via plugin CLI or ensure embedder is configured.",
+        );
+        process.exit(1);
+      }
+
+      // Scan workspace directories
+      let workspaceEntries: Array<import("node:fs").Dirent>;
+      try {
+        const fsPromises = await import("node:fs/promises");
+        workspaceEntries = await fsPromises.readdir(workspaceDir, { withFileTypes: true });
+      } catch {
+        console.error(`Failed to read workspace directory: ${workspaceDir}`);
+        process.exit(1);
+      }
+
+      // Collect all markdown files to scan
+      const mdFiles: Array<{ filePath: string; scope: string }> = [];
+
+      for (const entry of workspaceEntries) {
+        if (!entry.isDirectory()) continue;
+        if (workspaceGlob && !entry.name.includes(workspaceGlob)) continue;
+
+        const workspacePath = path.join(workspaceDir, entry.name);
+
+        // MEMORY.md
+        const memoryMd = path.join(workspacePath, "MEMORY.md");
+        try {
+          const { stat } = await import("node:fs/promises");
+          await stat(memoryMd);
+          mdFiles.push({ filePath: memoryMd, scope: entry.name });
+        } catch { /* not found */ }
+
+        // memory/ directory
+        const memoryDir = path.join(workspacePath, "memory");
+        try {
+          const { stat } = await import("node:fs/promises");
+          const stats = await stat(memoryDir);
+          if (stats.isDirectory()) {
+            const { readdir } = await import("node:fs/promises");
+            const files = await readdir(memoryDir);
+            for (const f of files) {
+              if (f.endsWith(".md") && /^\d{4}-\d{2}-\d{2}/.test(f)) {
+                mdFiles.push({ filePath: path.join(memoryDir, f), scope: entry.name });
+              }
+            }
+          }
+        } catch { /* not found */ }
+      }
+
+      if (mdFiles.length === 0) {
+        console.log("No Markdown memory files found.");
+        return;
+      }
+
+      const targetScope = options.scope || "global";
+
+      // Parse each file for memory entries (lines starting with "- ")
+      for (const { filePath, scope } of mdFiles) {
+        foundFiles++;
+        const { readFile } = await import("node:fs/promises");
+        const content = await readFile(filePath, "utf-8");
+        const lines = content.split("\n");
+
+        for (const line of lines) {
+          // Skip non-memory lines
+          if
(!line.startsWith("- ")) continue;
+          const text = line.slice(2).trim();
+          if (text.length < 5) { skipped++; continue; }
+
+          if (options.dryRun) {
+            console.log(`  [dry-run] would import: ${text.slice(0, 80)}...`);
+            imported++;
+            continue;
+          }
+
+          try {
+            const vector = await context.embedder!.embedQuery(text);
+            await context.store.store({
+              text,
+              vector,
+              importance: 0.7,
+              category: "other",
+              scope: targetScope,
+              metadata: { importedFrom: filePath, sourceScope: scope },
+            });
+            imported++;
+          } catch (err) {
+            console.warn(`  Failed to import: ${text.slice(0, 60)}... — ${err}`);
+            skipped++;
+          }
+        }
+      }
+
+      if (options.dryRun) {
+        console.log(`\nDRY RUN — found ${foundFiles} files, ${imported} entries would be imported, ${skipped} skipped`);
+      } else {
+        console.log(`\nImport complete: ${imported} imported, ${skipped} skipped (scanned ${foundFiles} files)`);
+      }
+    });
+
   // Re-embed an existing LanceDB into the current target DB (A/B testing)
   memory
     .command("reembed")
diff --git a/index.ts b/index.ts
index 32f5e778..52f1962e 100644
--- a/index.ts
+++ b/index.ts
@@ -13,16 +13,29 @@ import { pathToFileURL } from "node:url";
 import { createRequire } from "node:module";
 import { spawn } from "node:child_process";
 
+// Detect CLI mode: when running as a CLI subcommand (e.g. `openclaw memory-pro stats`),
+// OpenClaw sets OPENCLAW_CLI=1 in the process environment. Registration and
+// lifecycle logs are noisy in CLI context (printed to stderr before command output),
+// so we downgrade them to debug level when running in CLI mode.
+const isCliMode = () => process.env.OPENCLAW_CLI === "1";
+
 // Import core components
 import { MemoryStore, validateStoragePath } from "./src/store.js";
 import { createEmbedder, getVectorDimensions } from "./src/embedder.js";
 import { createRetriever, DEFAULT_RETRIEVAL_CONFIG } from "./src/retriever.js";
-import { createScopeManager } from "./src/scopes.js";
+import { createScopeManager, resolveScopeFilter, isSystemBypassId, parseAgentIdFromSessionKey } from "./src/scopes.js";
 import { createMigrator } from "./src/migrate.js";
 import { registerAllMemoryTools } from "./src/tools.js";
 import { appendSelfImprovementEntry, ensureSelfImprovementLearningFiles } from "./src/self-improvement-files.js";
 import type { MdMirrorWriter } from "./src/tools.js";
 import { shouldSkipRetrieval } from "./src/adaptive-retrieval.js";
+import { parseClawteamScopes, applyClawteamScopes } from "./src/clawteam-scope.js";
+import {
+  runCompaction,
+  shouldRunCompaction,
+  recordCompactionRun,
+  type CompactionConfig,
+} from "./src/memory-compactor.js";
 import { runWithReflectionTransientRetryOnce } from "./src/reflection-retry.js";
 import { resolveReflectionSessionSearchDirs, stripResetSuffix } from "./src/session-recovery.js";
 import {
@@ -38,9 +51,11 @@ import { createReflectionEventId } from "./src/reflection-event-store.js";
 import { buildReflectionMappedMetadata } from "./src/reflection-mapped-metadata.js";
 import { createMemoryCLI } from "./cli.js";
 import { isNoise } from "./src/noise-filter.js";
+import { normalizeAutoCaptureText } from "./src/auto-capture-cleanup.js";
 
 // Import smart extraction & lifecycle components
-import { SmartExtractor } from "./src/smart-extractor.js";
+import { SmartExtractor, createExtractionRateLimiter } from "./src/smart-extractor.js";
+import { compressTexts, estimateConversationValue } from "./src/session-compressor.js";
 import { NoisePrototypeBank } from "./src/noise-prototypes.js";
 import { createLlmClient } from "./src/llm-client.js";
 import {
createDecayEngine, DEFAULT_DECAY_CONFIG } from "./src/decay-engine.js";
@@ -52,6 +67,18 @@ import {
   stringifySmartMetadata,
   toLifecycleMemory,
 } from "./src/smart-metadata.js";
+import {
+  filterUserMdExclusiveRecallResults,
+  isUserMdExclusiveMemory,
+  type WorkspaceBoundaryConfig,
+} from "./src/workspace-boundary.js";
+import {
+  normalizeAdmissionControlConfig,
+  resolveRejectedAuditFilePath,
+  type AdmissionControlConfig,
+  type AdmissionRejectionAuditEntry,
+} from "./src/admission-control.js";
+import { analyzeIntent, applyCategoryBoost } from "./src/intent-analyzer.js";
 
 // ============================================================================
 // Configuration & Types
@@ -64,6 +91,7 @@ interface PluginConfig {
     model?: string;
     baseURL?: string;
     dimensions?: number;
+    omitDimensions?: boolean;
     taskQuery?: string;
     taskPassage?: string;
     normalized?: boolean;
@@ -74,6 +102,13 @@ interface PluginConfig {
   autoRecall?: boolean;
   autoRecallMinLength?: number;
   autoRecallMinRepeated?: number;
+  autoRecallTimeoutMs?: number;
+  autoRecallMaxItems?: number;
+  autoRecallMaxChars?: number;
+  autoRecallPerItemMaxChars?: number;
+  /** Hard per-turn injection cap (safety valve). Overrides autoRecallMaxItems if lower. Default: 10.
*/
+  maxRecallPerTurn?: number;
+  recallMode?: "full" | "summary" | "adaptive" | "off";
   captureAssistant?: boolean;
   retrieval?: {
     mode?: "hybrid" | "vector";
@@ -85,7 +120,13 @@ interface PluginConfig {
     rerankApiKey?: string;
     rerankModel?: string;
     rerankEndpoint?: string;
-    rerankProvider?: "jina" | "siliconflow" | "voyage" | "pinecone" | "dashscope";
+    rerankProvider?:
+      | "jina"
+      | "siliconflow"
+      | "voyage"
+      | "pinecone"
+      | "dashscope"
+      | "tei";
     recencyHalfLifeDays?: number;
     recencyWeight?: number;
     filterNoise?: boolean;
@@ -122,9 +163,13 @@ interface PluginConfig {
   // Smart extraction config
   smartExtraction?: boolean;
   llm?: {
+    auth?: "api-key" | "oauth";
     apiKey?: string;
     model?: string;
     baseURL?: string;
+    oauthProvider?: string;
+    oauthPath?: string;
+    timeoutMs?: number;
   };
   extractMinMessages?: number;
   extractMaxChars?: number;
@@ -156,6 +201,24 @@ interface PluginConfig {
     dedupeErrorSignals?: boolean;
   };
   mdMirror?: { enabled?: boolean; dir?: string };
+  workspaceBoundary?: WorkspaceBoundaryConfig;
+  admissionControl?: AdmissionControlConfig;
+  memoryCompaction?: {
+    enabled?: boolean;
+    minAgeDays?: number;
+    similarityThreshold?: number;
+    minClusterSize?: number;
+    maxMemoriesToScan?: number;
+    cooldownHours?: number;
+  };
+  sessionCompression?: {
+    enabled?: boolean;
+    minScoreToKeep?: number;
+  };
+  extractionThrottle?: {
+    skipLowValue?: boolean;
+    maxExtractionsPerHour?: number;
+  };
 }
 
 type ReflectionThinkLevel = "off" | "minimal" | "low" | "medium" | "high";
@@ -191,6 +254,23 @@ function resolveEnvVars(value: string): string {
   });
 }
 
+function resolveFirstApiKey(apiKey: string | string[]): string {
+  const key = Array.isArray(apiKey) ? apiKey[0] : apiKey;
+  if (!key) {
+    throw new Error("embedding.apiKey is empty");
+  }
+  return resolveEnvVars(key);
+}
+
+function resolveOptionalPathWithEnv(
+  api: Pick<OpenClawPluginApi, "resolvePath">,
+  value: string | undefined,
+  fallback: string,
+): string {
+  const raw = typeof value === "string" && value.trim().length > 0 ?
value.trim() : fallback;
+  return api.resolvePath(resolveEnvVars(raw));
+}
+
 function parsePositiveInt(value: unknown): number | undefined {
   if (typeof value === "number" && Number.isFinite(value) && value > 0) {
     return Math.floor(value);
@@ -205,11 +285,30 @@ function parsePositiveInt(value: unknown): number | undefined {
   return undefined;
 }
 
+function clampInt(value: number, min: number, max: number): number {
+  if (!Number.isFinite(value)) return min;
+  return Math.min(max, Math.max(min, Math.floor(value)));
+}
+
+function resolveLlmTimeoutMs(config: PluginConfig): number {
+  return parsePositiveInt(config.llm?.timeoutMs) ?? 30000;
+}
+
 function resolveHookAgentId(
   explicitAgentId: string | undefined,
   sessionKey: string | undefined,
 ): string {
-  return explicitAgentId || parseAgentIdFromSessionKey(sessionKey) || "main";
+  const trimmedExplicit = explicitAgentId?.trim();
+  return (trimmedExplicit && trimmedExplicit.length > 0
+    ? trimmedExplicit
+    : parseAgentIdFromSessionKey(sessionKey)) || "main";
+}
+
+function resolveSourceFromSessionKey(sessionKey: string | undefined): string {
+  const trimmed = sessionKey?.trim() ?? "";
+  const match = /^agent:[^:]+:([^:]+)/.exec(trimmed);
+  const source = match?.[1]?.trim();
+  return source || "unknown";
 }
 
 function summarizeAgentEndMessages(messages: unknown[]): string {
@@ -328,6 +427,7 @@ function getExtensionApiImportSpecifiers(): string[] {
 
   specifiers.push(toImportSpecifier("/usr/lib/node_modules/openclaw/dist/extensionAPI.js"));
   specifiers.push(toImportSpecifier("/usr/local/lib/node_modules/openclaw/dist/extensionAPI.js"));
+  specifiers.push(toImportSpecifier("/opt/homebrew/lib/node_modules/openclaw/dist/extensionAPI.js"));
 
   return [...new Set(specifiers.filter(Boolean))];
 }
@@ -541,13 +641,6 @@ async function loadSelfImprovementReminderContent(workspaceDir?: string): Promis
 }
 
-function parseAgentIdFromSessionKey(sessionKey: string | undefined): string | undefined {
-  const sk = (sessionKey ??
"").trim();
-  const parts = sk.split(":");
-  if (parts.length >= 2 && parts[0] === "agent" && parts[1]) return parts[1];
-  return undefined;
-}
-
 function resolveAgentPrimaryModelRef(cfg: unknown, agentId: string): string | undefined {
   try {
     const root = cfg as Record<string, any>;
@@ -649,65 +742,10 @@ function shouldSkipReflectionMessage(role: string, text: string): boolean {
   return false;
 }
 
-const AUTO_CAPTURE_INBOUND_META_SENTINELS = [
-  "Conversation info (untrusted metadata):",
-  "Sender (untrusted metadata):",
-  "Thread starter (untrusted, for context):",
-  "Replied message (untrusted, for context):",
-  "Forwarded message context (untrusted metadata):",
-  "Chat history since last reply (untrusted, for context):",
-] as const;
-
-const AUTO_CAPTURE_SESSION_RESET_PREFIX =
-  "A new session was started via /new or /reset. Execute your Session Startup sequence now";
-const AUTO_CAPTURE_ADDRESSING_PREFIX_RE = /^(?:<@!?[0-9]+>|@[A-Za-z0-9_.-]+)\s*/;
 const AUTO_CAPTURE_MAP_MAX_ENTRIES = 2000;
 const AUTO_CAPTURE_EXPLICIT_REMEMBER_RE = /^(?:请|請)?(?:记住|記住|记一下|記一下|别忘了|別忘了)[。.!??!]*$/u;
 
-function isAutoCaptureInboundMetaSentinelLine(line: string): boolean {
-  const trimmed = line.trim();
-  return AUTO_CAPTURE_INBOUND_META_SENTINELS.some((sentinel) => sentinel === trimmed);
-}
-
-function stripLeadingInboundMetadata(text: string): string {
-  if (!text || !AUTO_CAPTURE_INBOUND_META_SENTINELS.some((sentinel) => text.includes(sentinel))) {
-    return text;
-  }
-
-  const lines = text.split("\n");
-  let index = 0;
-  while (index < lines.length && lines[index].trim() === "") {
-    index++;
-  }
-
-  while (index < lines.length && isAutoCaptureInboundMetaSentinelLine(lines[index])) {
-    index++;
-    if (index < lines.length && lines[index].trim() === "```json") {
-      index++;
-      while (index < lines.length && lines[index].trim() !== "```") {
-        index++;
-      }
-      if (index < lines.length && lines[index].trim() === "```") {
-        index++;
-      }
-    } else {
-      // Sentinel line not followed by a ```json fenced block — unexpected format.
-      // Log and return original text to avoid lossy stripping.
-      _autoCaptureDebugLog(
-        `memory-lancedb-pro: stripLeadingInboundMetadata: sentinel line not followed by json fenced block at line ${index}, returning original text`,
-      );
-      return text;
-    }
-
-    while (index < lines.length && lines[index].trim() === "") {
-      index++;
-    }
-  }
-
-  return lines.slice(index).join("\n").trim();
-}
-
 /**
  * Prune a Map to stay within the given maximum number of entries.
  * Deletes the oldest (earliest-inserted) keys when over the limit.
@@ -722,28 +760,6 @@ function pruneMapIfOver<K, V>(map: Map<K, V>, maxEntries: number): void {
   }
 }
 
-function stripAutoCaptureSessionResetPrefix(text: string): string {
-  const trimmed = text.trim();
-  if (!trimmed.startsWith(AUTO_CAPTURE_SESSION_RESET_PREFIX)) {
-    return trimmed;
-  }
-
-  const blankLineIndex = trimmed.indexOf("\n\n");
-  if (blankLineIndex >= 0) {
-    return trimmed.slice(blankLineIndex + 2).trim();
-  }
-
-  const lines = trimmed.split("\n");
-  if (lines.length <= 2) {
-    return "";
-  }
-  return lines.slice(2).join("\n").trim();
-}
-
-function stripAutoCaptureAddressingPrefix(text: string): string {
-  return text.replace(AUTO_CAPTURE_ADDRESSING_PREFIX_RE, "").trim();
-}
-
 function isExplicitRememberCommand(text: string): boolean {
   return AUTO_CAPTURE_EXPLICIT_REMEMBER_RE.test(text.trim());
 }
@@ -773,34 +789,6 @@ function buildAutoCaptureConversationKeyFromSessionKey(sessionKey: string): stri
   return suffix || null;
 }
 
-function stripAutoCaptureInjectedPrefix(role: string, text: string): string {
-  if (role !== "user") {
-    return text.trim();
-  }
-
-  let normalized = text.trim();
-  normalized = normalized.replace(/^\s*<relevant-memories>[\s\S]*?<\/relevant-memories>\s*/i, "");
-  normalized = normalized.replace(
-    /^\[UNTRUSTED DATA[^\n]*\][\s\S]*?\[END UNTRUSTED DATA\]\s*/i,
-    "",
-  );
-  normalized = stripAutoCaptureSessionResetPrefix(normalized);
-  normalized = stripLeadingInboundMetadata(normalized);
-  normalized =
stripAutoCaptureAddressingPrefix(normalized);
-  return normalized.trim();
-}
-
-/** Module-level debug logger for auto-capture helpers; set during plugin registration. */
-let _autoCaptureDebugLog: (msg: string) => void = () => { };
-
-function normalizeAutoCaptureText(role: unknown, text: string): string | null {
-  if (typeof role !== "string") return null;
-  const normalized = stripAutoCaptureInjectedPrefix(role, text);
-  if (!normalized) return null;
-  if (shouldSkipReflectionMessage(role, normalized)) return null;
-  return normalized;
-}
-
 function redactSecrets(text: string): string {
   const patterns: RegExp[] = [
     /Bearer\s+[A-Za-z0-9\-._~+/]+=*/g,
@@ -888,31 +876,48 @@ function extractTextFromToolResult(result: unknown): string {
   }
 }
 
+function summarizeRecentConversationMessages(
+  messages: readonly unknown[],
+  messageCount: number,
+): string | null {
+  if (!Array.isArray(messages) || messages.length === 0) return null;
+
+  const recent: string[] = [];
+  for (let index = messages.length - 1; index >= 0 && recent.length < messageCount; index--) {
+    const raw = messages[index];
+    if (!raw || typeof raw !== "object") continue;
+
+    const msg = raw as Record<string, unknown>;
+    const role = typeof msg.role === "string" ?
msg.role : "";
+    if (role !== "user" && role !== "assistant") continue;
+
+    const text = extractTextContent(msg.content);
+    if (!text || shouldSkipReflectionMessage(role, text)) continue;
+
+    recent.push(`${role}: ${redactSecrets(text)}`);
+  }
+
+  if (recent.length === 0) return null;
+  recent.reverse();
+  return recent.join("\n");
+}
+
 async function readSessionConversationForReflection(filePath: string, messageCount: number): Promise<string | null> {
   try {
     const lines = (await readFile(filePath, "utf-8")).trim().split("\n");
-    const messages: string[] = [];
+    const messages: unknown[] = [];
 
     for (const line of lines) {
       try {
         const entry = JSON.parse(line);
         if (entry?.type !== "message" || !entry?.message) continue;
-
-        const msg = entry.message as Record<string, unknown>;
-        const role = typeof msg.role === "string" ? msg.role : "";
-        if (role !== "user" && role !== "assistant") continue;
-
-        const text = extractTextContent(msg.content);
-        if (!text || shouldSkipReflectionMessage(role, text)) continue;
-
-        messages.push(`${role}: ${redactSecrets(text)}`);
+        messages.push(entry.message);
       } catch {
         // ignore JSON parse errors
       }
     }
 
-    if (messages.length === 0) return null;
-    return messages.slice(-messageCount).join("\n");
+    return summarizeRecentConversationMessages(messages, messageCount);
   } catch {
     return null;
   }
@@ -1552,6 +1557,36 @@ function createMdMirrorWriter(
   };
 }
 
+// ============================================================================
+// Admission Control Audit Writer
+// ============================================================================
+
+function createAdmissionRejectionAuditWriter(
+  config: PluginConfig,
+  resolvedDbPath: string,
+  api: OpenClawPluginApi,
+): ((entry: AdmissionRejectionAuditEntry) => Promise<void>) | null {
+  if (
+    config.admissionControl?.enabled !== true ||
+    config.admissionControl.persistRejectedAudits !== true
+  ) {
+    return null;
+  }
+
+  const filePath = api.resolvePath(
+    resolveRejectedAuditFilePath(resolvedDbPath, config.admissionControl),
+  );
+
+  return async (entry: AdmissionRejectionAuditEntry) => {
+    try {
+      await mkdir(dirname(filePath), { recursive: true });
+      await appendFile(filePath, `${JSON.stringify(entry)}\n`, "utf8");
+    } catch (err) {
+      api.logger.warn(`memory-lancedb-pro: admission rejection audit write failed: ${String(err)}`);
+    }
+  };
+}
+
 // ============================================================================
 // Version
 // ============================================================================
@@ -1611,6 +1646,7 @@ const memoryLanceDBProPlugin = {
         model: config.embedding.model || "text-embedding-3-small",
         baseURL: config.embedding.baseURL,
         dimensions: config.embedding.dimensions,
+        omitDimensions: config.embedding.omitDimensions,
         taskQuery: config.embedding.taskQuery,
         taskPassage: config.embedding.taskPassage,
         normalized: config.embedding.normalized,
@@ -1635,25 +1671,48 @@ const memoryLanceDBProPlugin = {
       { decayEngine },
     );
     const scopeManager = createScopeManager(config.scopes);
+
+    // ClawTeam integration: extend accessible scopes via env var
+    const clawteamScopes = parseClawteamScopes(process.env.CLAWTEAM_MEMORY_SCOPE);
+    if (clawteamScopes.length > 0) {
+      applyClawteamScopes(scopeManager, clawteamScopes);
+      api.logger.info(`memory-lancedb-pro: CLAWTEAM_MEMORY_SCOPE added scopes: ${clawteamScopes.join(", ")}`);
+    }
+
     const migrator = createMigrator(store);
 
     // Initialize smart extraction
     let smartExtractor: SmartExtractor | null = null;
     if (config.smartExtraction !== false) {
      try {
-        const llmApiKey = config.llm?.apiKey
-          ? resolveEnvVars(config.llm.apiKey)
-          : resolveEnvVars(config.embedding.apiKey);
-        const llmBaseURL = config.llm?.baseURL
-          ? resolveEnvVars(config.llm.baseURL)
-          : config.embedding.baseURL;
+        const llmAuth = config.llm?.auth || "api-key";
+        const llmApiKey = llmAuth === "oauth"
+          ? undefined
+          : config.llm?.apiKey
+            ? resolveEnvVars(config.llm.apiKey)
+            : resolveFirstApiKey(config.embedding.apiKey);
+        const llmBaseURL = llmAuth === "oauth"
+          ?
(config.llm?.baseURL ? resolveEnvVars(config.llm.baseURL) : undefined)
+          : config.llm?.baseURL
+            ? resolveEnvVars(config.llm.baseURL)
+            : config.embedding.baseURL;
         const llmModel = config.llm?.model || "openai/gpt-oss-120b";
+        const llmOauthPath = llmAuth === "oauth"
+          ? resolveOptionalPathWithEnv(api, config.llm?.oauthPath, ".memory-lancedb-pro/oauth.json")
+          : undefined;
+        const llmOauthProvider = llmAuth === "oauth"
+          ? config.llm?.oauthProvider
+          : undefined;
+        const llmTimeoutMs = resolveLlmTimeoutMs(config);
 
         const llmClient = createLlmClient({
+          auth: llmAuth,
           apiKey: llmApiKey,
           model: llmModel,
           baseURL: llmBaseURL,
-          timeoutMs: 30000,
+          oauthProvider: llmOauthProvider,
+          oauthPath: llmOauthPath,
+          timeoutMs: llmTimeoutMs,
           log: (msg: string) => api.logger.debug(msg),
         });
@@ -1665,22 +1724,43 @@ const memoryLanceDBProPlugin = {
           api.logger.debug(`memory-lancedb-pro: noise bank init: ${String(err)}`),
         );
 
+        const admissionRejectionAuditWriter = createAdmissionRejectionAuditWriter(
+          config,
+          resolvedDbPath,
+          api,
+        );
+
         smartExtractor = new SmartExtractor(store, embedder, llmClient, {
           user: "User",
-          extractMinMessages: config.extractMinMessages ?? 2,
+          extractMinMessages: config.extractMinMessages ?? 4,
           extractMaxChars: config.extractMaxChars ?? 8000,
           defaultScope: config.scopes?.default ?? "global",
+          workspaceBoundary: config.workspaceBoundary,
+          admissionControl: config.admissionControl,
+          onAdmissionRejected: admissionRejectionAuditWriter ?? undefined,
           log: (msg: string) => api.logger.info(msg),
           debugLog: (msg: string) => api.logger.debug(msg),
           noiseBank,
         });
 
-        api.logger.info("memory-lancedb-pro: smart extraction enabled (LLM model: " + llmModel + ", noise bank: ON)");
+        (isCliMode() ?
api.logger.debug : api.logger.info)( + "memory-lancedb-pro: smart extraction enabled (LLM model: " + + llmModel + + ", timeoutMs: " + + llmTimeoutMs + + ", noise bank: ON)", + ); } catch (err) { api.logger.warn(`memory-lancedb-pro: smart extraction init failed, falling back to regex: ${String(err)}`); } } + // Extraction rate limiter (Feature 7: Adaptive Extraction Throttling) + // NOTE: This rate limiter is global — shared across all agents in multi-agent setups. + const extractionRateLimiter = createExtractionRateLimiter({ + maxExtractionsPerHour: config.extractionThrottle?.maxExtractionsPerHour, + }); + async function sleep(ms: number): Promise { await new Promise(resolve => setTimeout(resolve, ms)); } @@ -1701,7 +1781,7 @@ const memoryLanceDBProPlugin = { async function runRecallLifecycle( results: Array<{ entry: { id: string; text: string; category: "preference" | "fact" | "decision" | "entity" | "other"; scope: string; importance: number; timestamp: number; metadata?: string } }>, - scopeFilter: string[], + scopeFilter?: string[], ): Promise> { const now = Date.now(); type LifecycleEntry = { @@ -1732,11 +1812,15 @@ const memoryLanceDBProPlugin = { ); try { - const recentEntries = await store.list(scopeFilter, undefined, 100, 0); - for (const entry of recentEntries) { - if (!lifecycleEntries.has(entry.id)) { - lifecycleEntries.set(entry.id, entry); + if (scopeFilter !== undefined) { + const recentEntries = await store.list(scopeFilter, undefined, 100, 0); + for (const entry of recentEntries) { + if (!lifecycleEntries.has(entry.id)) { + lifecycleEntries.set(entry.id, entry); + } } + } else { + api.logger.debug(`memory-lancedb-pro: skipping tier maintenance preload for bypass scope filter`); } } catch (err) { api.logger.warn(`memory-lancedb-pro: tier maintenance preload failed: ${String(err)}`); @@ -1850,17 +1934,40 @@ const memoryLanceDBProPlugin = { return clipped; }; - const loadAgentReflectionSlices = async (agentId: string, scopeFilter: string[]) => { - 
const cacheKey = `${agentId}::${[...scopeFilter].sort().join(",")}`; + const loadAgentReflectionSlices = async (agentId: string, scopeFilter?: string[]) => { + const scopeKey = Array.isArray(scopeFilter) + ? `scopes:${[...scopeFilter].sort().join(",")}` + : ""; + const cacheKey = `${agentId}::${scopeKey}`; const cached = reflectionByAgentCache.get(cacheKey); if (cached && Date.now() - cached.updatedAt < 15_000) return cached; - const entries = await store.list(scopeFilter, undefined, 120, 0); - const { invariants, derived } = loadAgentReflectionSlicesFromEntries({ + // Prefer reflection-category rows to avoid full-table reads on bypass callers. + // Fall back to an uncategorized scan only when the category query produced no + // agent-owned reflection slices, preserving backward compatibility with mixed-schema stores. + let entries = await store.list(scopeFilter, "reflection", 240, 0); + let slices = loadAgentReflectionSlicesFromEntries({ entries, agentId, deriveMaxAgeMs: DEFAULT_REFLECTION_DERIVED_MAX_AGE_MS, }); + if (slices.invariants.length === 0 && slices.derived.length === 0) { + const legacyEntries = await store.list(scopeFilter, undefined, 240, 0); + entries = legacyEntries.filter((entry) => { + try { + const metadata = parseReflectionMetadata(entry.metadata); + return isReflectionMetadataType(metadata.type) && isOwnedByAgent(metadata, agentId); + } catch { + return false; + } + }); + slices = loadAgentReflectionSlicesFromEntries({ + entries, + agentId, + deriveMaxAgeMs: DEFAULT_REFLECTION_DERIVED_MAX_AGE_MS, + }); + } + const { invariants, derived } = slices; const next = { updatedAt: Date.now(), invariants, derived }; reflectionByAgentCache.set(cacheKey, next); return next; @@ -1880,20 +1987,18 @@ const memoryLanceDBProPlugin = { const autoCapturePendingIngressTexts = new Map(); const autoCaptureRecentTexts = new Map(); - // Wire up the module-level debug logger for pure helper functions. 
- _autoCaptureDebugLog = (msg: string) => api.logger.debug(msg); - - api.logger.info( + const logReg = isCliMode() ? api.logger.debug : api.logger.info; + logReg( `memory-lancedb-pro@${pluginVersion}: plugin registered (db: ${resolvedDbPath}, model: ${config.embedding.model || "text-embedding-3-small"}, smartExtraction: ${smartExtractor ? 'ON' : 'OFF'})` ); - api.logger.info(`memory-lancedb-pro: diagnostic build tag loaded (${DIAG_BUILD_TAG})`); + logReg(`memory-lancedb-pro: diagnostic build tag loaded (${DIAG_BUILD_TAG})`); - api.on("message_received", (event, ctx) => { + api.on("message_received", (event: any, ctx: any) => { const conversationKey = buildAutoCaptureConversationKeyFromIngress( ctx.channelId, ctx.conversationId, ); - const normalized = normalizeAutoCaptureText("user", event.content); + const normalized = normalizeAutoCaptureText("user", event.content, shouldSkipReflectionMessage); if (conversationKey && normalized) { const queue = autoCapturePendingIngressTexts.get(conversationKey) || []; queue.push(normalized); @@ -1905,7 +2010,7 @@ const memoryLanceDBProPlugin = { ); }); - api.on("before_message_write", (event, ctx) => { + api.on("before_message_write", (event: any, ctx: any) => { const message = event.message as Record | undefined; const role = message && typeof message.role === "string" && message.role.trim().length > 0 @@ -1939,6 +2044,7 @@ const memoryLanceDBProPlugin = { agentId: undefined, // Will be determined at runtime from context workspaceDir: getDefaultWorkspaceDir(), mdMirror, + workspaceBoundary: config.workspaceBoundary, }, { enableManagementTools: config.enableManagementTools, @@ -1946,6 +2052,128 @@ const memoryLanceDBProPlugin = { } ); + // ======================================================================== + // Memory Compaction (Progressive Summarization) + // ======================================================================== + + if (config.enableManagementTools) { + api.registerTool({ + name: "memory_compact", + 
description: + "Consolidate semantically similar old memories into refined single entries " + + "(progressive summarization). Reduces noise and improves retrieval quality over time. " + + "Use dry_run:true first to preview the compaction plan without making changes.", + inputSchema: { + type: "object" as const, + properties: { + dry_run: { + type: "boolean", + description: "Preview clusters without writing changes. Default: false.", + }, + min_age_days: { + type: "number", + description: "Only compact memories at least this many days old. Default: 7.", + }, + similarity_threshold: { + type: "number", + description: "Cosine similarity threshold for clustering [0-1]. Default: 0.88.", + }, + scopes: { + type: "array", + items: { type: "string" }, + description: "Scope filter. Omit to compact all scopes.", + }, + }, + required: [], + }, + execute: async (args: Record) => { + const compactionCfg: CompactionConfig = { + enabled: true, + minAgeDays: + typeof args.min_age_days === "number" + ? args.min_age_days + : (config.memoryCompaction?.minAgeDays ?? 7), + similarityThreshold: + typeof args.similarity_threshold === "number" + ? Math.max(0, Math.min(1, args.similarity_threshold)) + : (config.memoryCompaction?.similarityThreshold ?? 0.88), + minClusterSize: config.memoryCompaction?.minClusterSize ?? 2, + maxMemoriesToScan: config.memoryCompaction?.maxMemoriesToScan ?? 200, + dryRun: args.dry_run === true, + cooldownHours: config.memoryCompaction?.cooldownHours ?? 24, + }; + const scopes = + Array.isArray(args.scopes) && args.scopes.length > 0 + ? (args.scopes as string[]) + : undefined; + + const result = await runCompaction( + store, + embedder, + compactionCfg, + scopes, + api.logger, + ); + + return { + content: [ + { + type: "text", + text: JSON.stringify( + { + scanned: result.scanned, + clustersFound: result.clustersFound, + memoriesDeleted: result.memoriesDeleted, + memoriesCreated: result.memoriesCreated, + dryRun: result.dryRun, + summary: result.dryRun + ? 
`Dry run: found ${result.clustersFound} cluster(s) in ${result.scanned} memories — no changes made.` + : `Compacted ${result.memoriesDeleted} memories into ${result.memoriesCreated} consolidated entries.`, + }, + null, + 2, + ), + }, + ], + }; + }, + }); + } + + // Auto-compaction at gateway_start (if enabled, respects cooldown) + if (config.memoryCompaction?.enabled) { + api.on("gateway_start", () => { + const compactionStateFile = join( + dirname(resolvedDbPath), + ".compaction-state.json", + ); + const compactionCfg: CompactionConfig = { + enabled: true, + minAgeDays: config.memoryCompaction!.minAgeDays ?? 7, + similarityThreshold: config.memoryCompaction!.similarityThreshold ?? 0.88, + minClusterSize: config.memoryCompaction!.minClusterSize ?? 2, + maxMemoriesToScan: config.memoryCompaction!.maxMemoriesToScan ?? 200, + dryRun: false, + cooldownHours: config.memoryCompaction!.cooldownHours ?? 24, + }; + + shouldRunCompaction(compactionStateFile, compactionCfg.cooldownHours) + .then(async (should) => { + if (!should) return; + await recordCompactionRun(compactionStateFile); + const result = await runCompaction(store, embedder, compactionCfg, undefined, api.logger); + if (result.clustersFound > 0) { + api.logger.info( + `memory-compactor [auto]: compacted ${result.memoriesDeleted} → ${result.memoriesCreated} entries`, + ); + } + }) + .catch((err) => { + api.logger.warn(`memory-compactor [auto]: failed: ${String(err)}`); + }); + }); + } + // ======================================================================== // Register CLI Commands // ======================================================================== @@ -1959,17 +2187,33 @@ const memoryLanceDBProPlugin = { embedder, llmClient: smartExtractor ? (() => { try { - const llmApiKey = config.llm?.apiKey - ? resolveEnvVars(config.llm.apiKey) - : resolveEnvVars(config.embedding.apiKey); - const llmBaseURL = config.llm?.baseURL - ? 
resolveEnvVars(config.llm.baseURL) - : config.embedding.baseURL; + const llmAuth = config.llm?.auth || "api-key"; + const llmApiKey = llmAuth === "oauth" + ? undefined + : config.llm?.apiKey + ? resolveEnvVars(config.llm.apiKey) + : resolveFirstApiKey(config.embedding.apiKey); + const llmBaseURL = llmAuth === "oauth" + ? (config.llm?.baseURL ? resolveEnvVars(config.llm.baseURL) : undefined) + : config.llm?.baseURL + ? resolveEnvVars(config.llm.baseURL) + : config.embedding.baseURL; + const llmOauthPath = llmAuth === "oauth" + ? resolveOptionalPathWithEnv(api, config.llm?.oauthPath, ".memory-lancedb-pro/oauth.json") + : undefined; + const llmOauthProvider = llmAuth === "oauth" + ? config.llm?.oauthProvider + : undefined; + const llmTimeoutMs = resolveLlmTimeoutMs(config); return createLlmClient({ + auth: llmAuth, apiKey: llmApiKey, model: config.llm?.model || "openai/gpt-oss-120b", baseURL: llmBaseURL, - timeoutMs: 30000, + oauthProvider: llmOauthProvider, + oauthPath: llmOauthPath, + timeoutMs: llmTimeoutMs, + log: (msg: string) => api.logger.debug(msg), }); } catch { return undefined; } })() : undefined, @@ -1983,46 +2227,107 @@ const memoryLanceDBProPlugin = { // Auto-recall: inject relevant memories before agent starts // Default is OFF to prevent the model from accidentally echoing injected context. - if (config.autoRecall === true) { - api.on("before_agent_start", async (event, ctx) => { + // recallMode: "full" (default when autoRecall=true) | "summary" (L0 only) | "adaptive" (intent-based) | "off" + const recallMode = config.recallMode || "full"; + if (config.autoRecall === true && recallMode !== "off") { + // Cache the most recent raw user message per session so the + // before_prompt_build gating can check the *user* text, not the full + // assembled prompt (which includes system instructions and is too long + // for the short-message skip heuristic in shouldSkipRetrieval). 
+ const lastRawUserMessage = new Map(); + api.on("message_received", (event: any, ctx: any) => { + // Both message_received and before_prompt_build have channelId in ctx, + // so use it as the shared cache key for raw user message gating. + const cacheKey = ctx?.channelId || ctx?.conversationId || "default"; + const raw = typeof event.content === "string" ? event.content.trim() : ""; + // Strip leading bot mentions (@BotName or <@id>) so gating sees the + // actual user intent, not the mention prefix. + const text = raw.replace(/^(?:@\S+\s*|<@!?\d+>\s*)+/, "").trim(); + if (text) lastRawUserMessage.set(cacheKey, text); + }); + + const AUTO_RECALL_TIMEOUT_MS = parsePositiveInt(config.autoRecallTimeoutMs) ?? 5_000; // configurable; default raised from 3s to 5s for remote embedding APIs behind proxies + api.on("before_prompt_build", async (event: any, ctx: any) => { + // Manually increment turn counter for this session + const sessionId = ctx?.sessionId || "default"; + + // Use cached raw user message for gating (short-message skip, greeting + // detection, etc.). Fall back to event.prompt if no cached message is + // available (e.g. first message or non-channel triggers). + const cacheKey = ctx?.channelId || sessionId; + const gatingText = lastRawUserMessage.get(cacheKey) || event.prompt || ""; if ( !event.prompt || - shouldSkipRetrieval(event.prompt, config.autoRecallMinLength) + shouldSkipRetrieval(gatingText, config.autoRecallMinLength) ) { return; } - - // Manually increment turn counter for this session - const sessionId = ctx?.sessionId || "default"; const currentTurn = (turnCounter.get(sessionId) || 0) + 1; turnCounter.set(sessionId, currentTurn); - try { + // Wrap the entire recall pipeline in a timeout so slow embedding/rerank + // API calls cannot stall agent startup indefinitely. 
Without this guard + // the session lock is held for the full duration of the retrieval chain + // (embedding → rerank → lifecycle), which can silently drop messages on + // channels like Telegram when subsequent requests hit lock timeouts. + // See: https://github.com/CortexReach/memory-lancedb-pro/issues/253 + const recallWork = async (): Promise<{ prependContext: string } | undefined> => { // Determine agent ID and accessible scopes const agentId = resolveHookAgentId(ctx?.agentId, (event as any).sessionKey); - const accessibleScopes = scopeManager.getAccessibleScopes(agentId); + const accessibleScopes = resolveScopeFilter(scopeManager, agentId); + + // FR-04: Truncate long prompts (e.g. file attachments) before embedding. + // Auto-recall only needs the user's intent, not full attachment text. + const MAX_RECALL_QUERY_LENGTH = 1_000; + let recallQuery = event.prompt; + if (recallQuery.length > MAX_RECALL_QUERY_LENGTH) { + const originalLength = recallQuery.length; + recallQuery = recallQuery.slice(0, MAX_RECALL_QUERY_LENGTH); + api.logger.info( + `memory-lancedb-pro: auto-recall query truncated from ${originalLength} to ${MAX_RECALL_QUERY_LENGTH} chars` + ); + } + + const configMaxItems = clampInt(config.autoRecallMaxItems ?? 3, 1, 20); + const maxPerTurn = clampInt(config.maxRecallPerTurn ?? 10, 1, 50); + // maxRecallPerTurn acts as a hard ceiling on top of autoRecallMaxItems (#345) + const autoRecallMaxItems = Math.min(configMaxItems, maxPerTurn); + const autoRecallMaxChars = clampInt(config.autoRecallMaxChars ?? 600, 64, 8000); + const autoRecallPerItemMaxChars = clampInt(config.autoRecallPerItemMaxChars ?? 180, 32, 1000); + const retrieveLimit = clampInt(Math.max(autoRecallMaxItems * 2, autoRecallMaxItems), 1, 20); + + // Adaptive intent analysis (zero-LLM-cost pattern matching) + const intent = recallMode === "adaptive" ? 
analyzeIntent(recallQuery) : undefined; + if (intent) { + api.logger.debug?.( + `memory-lancedb-pro: adaptive recall intent=${intent.label} depth=${intent.depth} confidence=${intent.confidence} categories=[${intent.categories.join(",")}]`, + ); + } - const results = await retrieveWithRetry({ - query: event.prompt, - limit: 3, + const results = filterUserMdExclusiveRecallResults(await retrieveWithRetry({ + query: recallQuery, + limit: retrieveLimit, scopeFilter: accessibleScopes, source: "auto-recall", - }); + }), config.workspaceBoundary); if (results.length === 0) { return; } - const tierOverrides = await runRecallLifecycle(results, accessibleScopes); + // Apply intent-based category boost for adaptive mode + const rankedResults = intent ? applyCategoryBoost(results, intent) : results; + // Filter out redundant memories based on session history - const minRepeated = config.autoRecallMinRepeated ?? 0; + const minRepeated = config.autoRecallMinRepeated ?? 8; + let dedupFilteredCount = 0; // Only enable dedup logic when minRepeated > 0 - let finalResults = results; + let finalResults = rankedResults; if (minRepeated > 0) { const sessionHistory = recallHistory.get(sessionId) || new Map(); - const filteredResults = results.filter((r) => { + const filteredResults = rankedResults.filter((r) => { const lastTurn = sessionHistory.get(r.entry.id) ?? 
-999; const diff = currentTurn - lastTurn; const isRedundant = diff < minRepeated; @@ -2032,6 +2337,7 @@ const memoryLanceDBProPlugin = { `memory-lancedb-pro: skipping redundant memory ${r.entry.id.slice(0, 8)} (last seen at turn ${lastTurn}, current turn ${currentTurn}, min ${minRepeated})`, ); } + if (isRedundant) dedupFilteredCount++; return !isRedundant; }); @@ -2044,28 +2350,156 @@ const memoryLanceDBProPlugin = { return; } - // Update history with successfully injected memories - for (const r of filteredResults) { - sessionHistory.set(r.entry.id, currentTurn); + finalResults = filteredResults; + } + + let stateFilteredCount = 0; + let suppressedFilteredCount = 0; + const governanceEligible = finalResults.filter((r) => { + const meta = parseSmartMetadata(r.entry.metadata, r.entry); + if (meta.state !== "confirmed") { + stateFilteredCount++; + return false; } - recallHistory.set(sessionId, sessionHistory); + if (meta.memory_layer === "archive" || meta.memory_layer === "reflection") { + stateFilteredCount++; + return false; + } + if (meta.suppressed_until_turn > 0 && currentTurn <= meta.suppressed_until_turn) { + suppressedFilteredCount++; + return false; + } + return true; + }); - finalResults = filteredResults; + if (governanceEligible.length === 0) { + api.logger.info?.( + `memory-lancedb-pro: auto-recall skipped after governance filters (hits=${results.length}, dedupFiltered=${dedupFilteredCount}, stateFiltered=${stateFilteredCount}, suppressedFiltered=${suppressedFilteredCount})`, + ); + return; } - const memoryContext = finalResults - .map((r) => { - const metaObj = parseSmartMetadata(r.entry.metadata, r.entry); - const displayCategory = metaObj.memory_category || r.entry.category; - const displayTier = tierOverrides.get(r.entry.id) || metaObj.tier || ""; - const tierPrefix = displayTier ? 
`[${displayTier.charAt(0).toUpperCase()}]` : ""; - const abstract = metaObj.l0_abstract || r.entry.text; - return `- ${tierPrefix}[${displayCategory}:${r.entry.scope}] ${sanitizeForContext(abstract)}`; - }) - .join("\n"); + // Determine effective per-item char limit based on recall mode and intent depth + const effectivePerItemMaxChars = (() => { + if (recallMode === "summary") return Math.min(autoRecallPerItemMaxChars, 80); // L0 only + if (!intent) return autoRecallPerItemMaxChars; // "full" mode + // Adaptive mode: depth determines char budget + switch (intent.depth) { + case "l0": return Math.min(autoRecallPerItemMaxChars, 80); + case "l1": return autoRecallPerItemMaxChars; // default budget + case "full": return Math.min(autoRecallPerItemMaxChars * 3, 1000); + } + })(); + + const preBudgetCandidates = governanceEligible.map((r) => { + const metaObj = parseSmartMetadata(r.entry.metadata, r.entry); + const displayCategory = metaObj.memory_category || r.entry.category; + const displayTier = metaObj.tier || ""; + const tierPrefix = displayTier ? `[${displayTier.charAt(0).toUpperCase()}]` : ""; + // Select content tier based on recallMode/intent depth + const contentText = recallMode === "summary" + ? (metaObj.l0_abstract || r.entry.text) + : intent?.depth === "full" + ? 
(r.entry.text) // full text for deep queries + : (metaObj.l0_abstract || r.entry.text); // L0/L1 default + const summary = sanitizeForContext(contentText).slice(0, effectivePerItemMaxChars); + return { + id: r.entry.id, + prefix: `${tierPrefix}[${displayCategory}:${r.entry.scope}]`, + summary, + chars: summary.length, + meta: metaObj, + }; + }); + + const preBudgetItems = preBudgetCandidates.length; + const preBudgetChars = preBudgetCandidates.reduce((sum, item) => sum + item.chars, 0); + const selected = []; + let usedChars = 0; + + for (const candidate of preBudgetCandidates) { + if (selected.length >= autoRecallMaxItems) break; + const remaining = autoRecallMaxChars - usedChars; + if (remaining <= 0) break; + + if (candidate.chars <= remaining) { + selected.push({ + id: candidate.id, + line: `- ${candidate.prefix} ${candidate.summary}`, + chars: candidate.chars, + meta: candidate.meta, + }); + usedChars += candidate.chars; + continue; + } + + const shortened = candidate.summary.slice(0, remaining).trim(); + if (!shortened) continue; + const line = `- ${candidate.prefix} ${shortened}`; + selected.push({ + id: candidate.id, + line, + chars: shortened.length, + meta: candidate.meta, + }); + usedChars += shortened.length; + break; + } + + if (selected.length === 0) { + api.logger.info?.( + `memory-lancedb-pro: auto-recall skipped injection after budgeting (hits=${results.length}, dedupFiltered=${dedupFilteredCount}, maxItems=${autoRecallMaxItems}, maxChars=${autoRecallMaxChars})`, + ); + return; + } + + if (minRepeated > 0) { + const sessionHistory = recallHistory.get(sessionId) || new Map(); + for (const item of selected) { + sessionHistory.set(item.id, currentTurn); + } + recallHistory.set(sessionId, sessionHistory); + } + + const injectedAt = Date.now(); + await Promise.allSettled( + selected.map(async (item) => { + const meta = item.meta; + const staleInjected = + typeof meta.last_injected_at === "number" && + meta.last_injected_at > 0 && + ( + typeof 
meta.last_confirmed_use_at !== "number" || + meta.last_confirmed_use_at < meta.last_injected_at + ); + const nextBadRecallCount = staleInjected + ? meta.bad_recall_count + 1 + : meta.bad_recall_count; + const shouldSuppress = nextBadRecallCount >= 3 && minRepeated > 0; + await store.patchMetadata( + item.id, + { + injected_count: meta.injected_count + 1, + last_injected_at: injectedAt, + bad_recall_count: nextBadRecallCount, + suppressed_until_turn: shouldSuppress + ? Math.max(meta.suppressed_until_turn, currentTurn + minRepeated) + : meta.suppressed_until_turn, + }, + accessibleScopes, + ); + }), + ); + + const memoryContext = selected.map((item) => item.line).join("\n"); + + const injectedIds = selected.map((item) => item.id).join(",") || "(none)"; + api.logger.debug?.( + `memory-lancedb-pro: auto-recall stats hits=${results.length}, dedupFiltered=${dedupFilteredCount}, stateFiltered=${stateFilteredCount}, suppressedFiltered=${suppressedFilteredCount}, preBudgetItems=${preBudgetItems}, preBudgetChars=${preBudgetChars}, postBudgetItems=${selected.length}, postBudgetChars=${usedChars}, maxItems=${autoRecallMaxItems}, maxChars=${autoRecallMaxChars}, perItemMaxChars=${autoRecallPerItemMaxChars}, injectedIds=${injectedIds}`, + ); api.logger.info?.( - `memory-lancedb-pro: injecting ${finalResults.length} memories into context for agent ${agentId}`, + `memory-lancedb-pro: injecting ${selected.length} memories into context for agent ${agentId}`, ); return { @@ -2075,25 +2509,83 @@ const memoryLanceDBProPlugin = { `${memoryContext}\n` + `[END UNTRUSTED DATA]\n` + ``, + // Mark as ephemeral so the host framework's compaction logic can + // safely discard injected memory blocks instead of persisting them + // into the session transcript (#345). 
+ ephemeral: true, }; + }; + + let timeoutId: ReturnType | undefined; + try { + const result = await Promise.race([ + recallWork().then((r) => { clearTimeout(timeoutId); return r; }), + new Promise((resolve) => { + timeoutId = setTimeout(() => { + api.logger.warn( + `memory-lancedb-pro: auto-recall timed out after ${AUTO_RECALL_TIMEOUT_MS}ms; skipping memory injection to avoid stalling agent startup`, + ); + resolve(undefined); + }, AUTO_RECALL_TIMEOUT_MS); + }), + ]); + return result; } catch (err) { + clearTimeout(timeoutId); api.logger.warn(`memory-lancedb-pro: recall failed: ${String(err)}`); } - }); + }, { priority: 10 }); + + // Clean up auto-recall session state on session end to prevent unbounded + // growth of recallHistory and turnCounter Maps (#345). + api.on("session_end", (_event: any, ctx: any) => { + const sessionId = ctx?.sessionId || ""; + if (sessionId) { + recallHistory.delete(sessionId); + turnCounter.delete(sessionId); + lastRawUserMessage.delete(sessionId); + } + // Also clean by channelId/conversationId if present (shared cache key) + const cacheKey = ctx?.channelId || ctx?.conversationId || ""; + if (cacheKey && cacheKey !== sessionId) { + lastRawUserMessage.delete(cacheKey); + } + }, { priority: 10 }); } // Auto-capture: analyze and store important information after agent ends if (config.autoCapture !== false) { - api.on("agent_end", async (event, ctx) => { + type AgentEndAutoCaptureHook = { + (event: any, ctx: any): void; + __lastRun?: Promise; + }; + + const agentEndAutoCaptureHook: AgentEndAutoCaptureHook = (event, ctx) => { if (!event.success || !event.messages || event.messages.length === 0) { return; } + // Fire-and-forget: run capture work in the background so the hook + // returns immediately and does not hold the session lock. Blocking + // here causes downstream channel deliveries (e.g. Telegram) to be + // silently dropped when the session store lock times out. 
+ // See: https://github.com/CortexReach/memory-lancedb-pro/issues/260 + const backgroundRun = (async () => { try { + // Feature 7: Check extraction rate limit before any work + if (extractionRateLimiter.isRateLimited()) { + api.logger.debug( + `memory-lancedb-pro: auto-capture skipped (rate limited: ${extractionRateLimiter.getRecentCount()} extractions in last hour)`, + ); + return; + } + // Determine agent ID and default scope const agentId = resolveHookAgentId(ctx?.agentId, (event as any).sessionKey); - const accessibleScopes = scopeManager.getAccessibleScopes(agentId); - const defaultScope = scopeManager.getDefaultScope(agentId); + const accessibleScopes = resolveScopeFilter(scopeManager, agentId); + const defaultScope = isSystemBypassId(agentId) + ? config.scopes?.default ?? "global" + : scopeManager.getDefaultScope(agentId); const sessionKey = ctx?.sessionKey || (event as any).sessionKey || "unknown"; api.logger.debug( @@ -2121,7 +2613,7 @@ const memoryLanceDBProPlugin = { const content = msgObj.content; if (typeof content === "string") { - const normalized = normalizeAutoCaptureText(role, content); + const normalized = normalizeAutoCaptureText(role, content, shouldSkipReflectionMessage); if (!normalized) { skippedAutoCaptureTexts++; } else { @@ -2141,7 +2633,7 @@ const memoryLanceDBProPlugin = { typeof (block as Record).text === "string" ) { const text = (block as Record).text as string; - const normalized = normalizeAutoCaptureText(role, text); + const normalized = normalizeAutoCaptureText(role, text, shouldSkipReflectionMessage); if (!normalized) { skippedAutoCaptureTexts++; } else { @@ -2185,7 +2677,7 @@ const memoryLanceDBProPlugin = { pruneMapIfOver(autoCaptureRecentTexts, AUTO_CAPTURE_MAP_MAX_ENTRIES); } - const minMessages = config.extractMinMessages ?? 2; + const minMessages = config.extractMinMessages ?? 
4; if (skippedAutoCaptureTexts > 0) { api.logger.debug( `memory-lancedb-pro: auto-capture skipped ${skippedAutoCaptureTexts} injected/system text block(s) for agent ${agentId}`, @@ -2216,8 +2708,39 @@ const memoryLanceDBProPlugin = { ); } + // ---------------------------------------------------------------- + // Feature 7: Skip low-value conversations + // ---------------------------------------------------------------- + if (config.extractionThrottle?.skipLowValue === true) { + const conversationValue = estimateConversationValue(texts); + if (conversationValue < 0.2) { + api.logger.debug( + `memory-lancedb-pro: auto-capture skipped for agent ${agentId} (low conversation value: ${conversationValue.toFixed(2)})`, + ); + return; + } + } + + // ---------------------------------------------------------------- + // Feature 1: Session compression — prioritize high-signal texts + // ---------------------------------------------------------------- + if (config.sessionCompression?.enabled === true && texts.length > 0) { + const maxChars = config.extractMaxChars ?? 8000; + const compressed = compressTexts(texts, maxChars, { + minScoreToKeep: config.sessionCompression?.minScoreToKeep, + }); + if (compressed.dropped > 0) { + api.logger.debug( + `memory-lancedb-pro: session compression for agent ${agentId}: dropped ${compressed.dropped}/${texts.length} texts (${compressed.totalChars} chars kept)`, + ); + texts = compressed.texts; + } + } + // ---------------------------------------------------------------- // Smart Extraction (Phase 1: LLM-powered 6-category extraction) + // Rate limiter charged AFTER successful extraction, not before, + // so no-op sessions don't consume the hourly quota. 
// ---------------------------------------------------------------- if (smartExtractor) { // Pre-filter: embedding-based noise detection (language-agnostic) @@ -2237,6 +2760,8 @@ const memoryLanceDBProPlugin = { conversationText, sessionKey, { scope: defaultScope, scopeFilter: accessibleScopes }, ); + // Charge rate limiter only after successful extraction + extractionRateLimiter.recordExtraction(); if (stats.created > 0 || stats.merged > 0) { api.logger.info( `memory-lancedb-pro: smart-extracted ${stats.created} created, ${stats.merged} merged, ${stats.skipped} skipped for agent ${agentId}` @@ -2244,6 +2769,12 @@ const memoryLanceDBProPlugin = { return; // Smart extraction handled everything } + if ((stats.boundarySkipped ?? 0) > 0) { + api.logger.info( + `memory-lancedb-pro: smart extraction skipped ${stats.boundarySkipped} USER.md-exclusive candidate(s) for agent ${agentId}; continuing to regex fallback for non-boundary texts`, + ); + } + api.logger.info( `memory-lancedb-pro: smart extraction produced no persisted memories for agent ${agentId} (created=${stats.created}, merged=${stats.merged}, skipped=${stats.skipped}); falling back to regex capture`, ); @@ -2278,9 +2809,16 @@ const memoryLanceDBProPlugin = { `memory-lancedb-pro: regex fallback found ${toCapture.length} capturable text(s) for agent ${agentId}`, ); - // Store each capturable piece (limit to 3 per conversation) + // Store each capturable piece (limit to 2 per conversation) let stored = 0; - for (const text of toCapture.slice(0, 3)) { + for (const text of toCapture.slice(0, 2)) { + if (isUserMdExclusiveMemory({ text }, config.workspaceBoundary)) { + api.logger.info( + `memory-lancedb-pro: skipped USER.md-exclusive auto-capture text for agent ${agentId}`, + ); + continue; + } + const category = detectCategory(text); const vector = await embedder.embedPassage(text); @@ -2297,7 +2835,7 @@ const memoryLanceDBProPlugin = { ); } - if (existing.length > 0 && existing[0].score > 0.95) { + if 
(existing.length > 0 && existing[0].score > 0.90) { continue; } @@ -2319,6 +2857,16 @@ const memoryLanceDBProPlugin = { l1_overview: `- ${text}`, l2_content: text, source_session: (event as any).sessionKey || "unknown", + source: "auto-capture", + // Write "confirmed" so auto-recall governance filter accepts + // these memories immediately. Previously "pending" caused a + // deadlock where auto-captured memories could never be + // auto-recalled (see #350). + state: "confirmed", + memory_layer: "working", + injected_count: 0, + bad_recall_count: 0, + suppressed_until_turn: 0, }, ), ), @@ -2342,7 +2890,12 @@ const memoryLanceDBProPlugin = { } catch (err) { api.logger.warn(`memory-lancedb-pro: capture failed: ${String(err)}`); } - }); + })(); + agentEndAutoCaptureHook.__lastRun = backgroundRun; + void backgroundRun; + }; + + api.on("agent_end", agentEndAutoCaptureHook); } // ======================================================================== @@ -2446,7 +2999,9 @@ const memoryLanceDBProPlugin = { }); } - api.logger.info("self-improvement: integrated hooks registered (agent:bootstrap, command:new, command:reset)"); + (isCliMode() ? api.logger.debug : api.logger.info)( + "self-improvement: integrated hooks registered (agent:bootstrap, command:new, command:reset)" + ); } // ======================================================================== @@ -2481,7 +3036,7 @@ const memoryLanceDBProPlugin = { return sourceAgentId; }; - api.on("after_tool_call", (event, ctx) => { + api.on("after_tool_call", (event: any, ctx: any) => { const sessionKey = typeof ctx.sessionKey === "string" ? ctx.sessionKey : ""; if (isInternalReflectionSessionKey(sessionKey)) return; if (!sessionKey) return; @@ -2517,14 +3072,17 @@ const memoryLanceDBProPlugin = { } }, { priority: 15 }); - api.on("before_agent_start", async (_event, ctx) => { + api.on("before_prompt_build", async (_event: any, ctx: any) => { const sessionKey = typeof ctx.sessionKey === "string" ? 
ctx.sessionKey : ""; if (isInternalReflectionSessionKey(sessionKey)) return; if (reflectionInjectMode !== "inheritance-only" && reflectionInjectMode !== "inheritance+derived") return; try { pruneReflectionSessionState(); - const agentId = typeof ctx.agentId === "string" && ctx.agentId.trim() ? ctx.agentId.trim() : "main"; - const scopes = scopeManager.getAccessibleScopes(agentId); + const agentId = resolveHookAgentId( + typeof ctx.agentId === "string" ? ctx.agentId : undefined, + sessionKey, + ); + const scopes = resolveScopeFilter(scopeManager, agentId); const slices = await loadAgentReflectionSlices(agentId, scopes); if (slices.invariants.length === 0) return; const body = slices.invariants.slice(0, 6).map((line, i) => `${i + 1}. ${line}`).join("\n"); @@ -2541,16 +3099,19 @@ const memoryLanceDBProPlugin = { } }, { priority: 12 }); - api.on("before_prompt_build", async (_event, ctx) => { + api.on("before_prompt_build", async (_event: any, ctx: any) => { const sessionKey = typeof ctx.sessionKey === "string" ? ctx.sessionKey : ""; if (isInternalReflectionSessionKey(sessionKey)) return; - const agentId = typeof ctx.agentId === "string" && ctx.agentId.trim() ? ctx.agentId.trim() : "main"; + const agentId = resolveHookAgentId( + typeof ctx.agentId === "string" ? ctx.agentId : undefined, + sessionKey, + ); pruneReflectionSessionState(); const blocks: string[] = []; if (reflectionInjectMode === "inheritance+derived") { try { - const scopes = scopeManager.getAccessibleScopes(agentId); + const scopes = resolveScopeFilter(scopeManager, agentId); const derivedCache = sessionKey ? reflectionDerivedBySession.get(sessionKey) : null; const derivedLines = derivedCache?.derived?.length ? derivedCache.derived @@ -2589,7 +3150,7 @@ const memoryLanceDBProPlugin = { return { prependContext: blocks.join("\n\n") }; }, { priority: 15 }); - api.on("session_end", (_event, ctx) => { + api.on("session_end", (_event: any, ctx: any) => { const sessionKey = typeof ctx.sessionKey === "string" ? 
ctx.sessionKey.trim() : ""; if (!sessionKey) return; reflectionErrorStateBySession.delete(sessionKey); @@ -2671,7 +3232,9 @@ const memoryLanceDBProPlugin = { const timeHms = timeIso.split(".")[0]; const timeCompact = timeIso.replace(/[:.]/g, ""); const reflectionRunAgentId = resolveReflectionRunAgentId(cfg, sourceAgentId); - const targetScope = scopeManager.getDefaultScope(sourceAgentId); + const targetScope = isSystemBypassId(sourceAgentId) + ? config.scopes?.default ?? "global" + : scopeManager.getDefaultScope(sourceAgentId); const toolErrorSignals = sessionKey ? (reflectionErrorStateBySession.get(sessionKey)?.entries ?? []).slice(-reflectionErrorReminderMaxEntries) : []; @@ -2869,128 +3432,119 @@ const memoryLanceDBProPlugin = { name: "memory-lancedb-pro.memory-reflection.command-reset", description: "Generate reflection log before /reset", }); - api.logger.info("memory-reflection: integrated hooks registered (command:new, command:reset, after_tool_call, before_agent_start, before_prompt_build)"); + (isCliMode() ? api.logger.debug : api.logger.info)( + "memory-reflection: integrated hooks registered (command:new, command:reset, after_tool_call, before_prompt_build, session_end)" + ); } if (config.sessionStrategy === "systemSessionMemory") { const sessionMessageCount = config.sessionMemory?.messageCount ?? 15; - api.registerHook("command:new", async (event) => { - try { - api.logger.debug("session-memory: hook triggered for /new command"); - - const context = (event.context || {}) as Record<string, unknown>; - const sessionKey = typeof event.sessionKey === "string" ?
event.sessionKey : ""; - const agentId = resolveHookAgentId( - (event.agentId as string) || (context.agentId as string) || undefined, - sessionKey || (context.sessionKey as string) || undefined, - ); - const defaultScope = scopeManager.getDefaultScope(agentId); - const workspaceDir = resolveWorkspaceDirFromContext(context); - const cfg = context.cfg; - const sessionEntry = (context.previousSessionEntry || context.sessionEntry || {}) as Record<string, unknown>; - const currentSessionId = typeof sessionEntry.sessionId === "string" ? sessionEntry.sessionId : "unknown"; - let currentSessionFile = typeof sessionEntry.sessionFile === "string" ? sessionEntry.sessionFile : undefined; - const source = typeof context.commandSource === "string" ? context.commandSource : "unknown"; - - if (!currentSessionFile || currentSessionFile.includes(".reset.")) { - const searchDirs = resolveReflectionSessionSearchDirs({ - context, - cfg, - workspaceDir, - currentSessionFile, - sourceAgentId: agentId, - }); + const storeSystemSessionSummary = async (params: { + agentId: string; + defaultScope: string; + sessionKey: string; + sessionId: string; + source: string; + sessionContent: string; + timestampMs?: number; + }) => { + const now = new Date(params.timestampMs ??
Date.now()); + const dateStr = now.toISOString().split("T")[0]; + const timeStr = now.toISOString().split("T")[1].split(".")[0]; + const memoryText = [ + `Session: ${dateStr} ${timeStr} UTC`, + `Session Key: ${params.sessionKey}`, + `Session ID: ${params.sessionId}`, + `Source: ${params.source}`, + "", + "Conversation Summary:", + params.sessionContent, + ].join("\n"); + + const vector = await embedder.embedPassage(memoryText); + await store.store({ + text: memoryText, + vector, + category: "fact", + scope: params.defaultScope, + importance: 0.5, + metadata: stringifySmartMetadata( + buildSmartMetadata( + { + text: `Session summary for ${dateStr}`, + category: "fact", + importance: 0.5, + timestamp: Date.now(), + }, + { + l0_abstract: `Session summary for ${dateStr}`, + l1_overview: `- Session summary saved for ${params.sessionId}`, + l2_content: memoryText, + memory_category: "patterns", + tier: "peripheral", + confidence: 0.5, + type: "session-summary", + sessionKey: params.sessionKey, + sessionId: params.sessionId, + date: dateStr, + agentId: params.agentId, + scope: params.defaultScope, + }, + ), + ), + }); - for (const sessionsDir of searchDirs) { - const recovered = await findPreviousSessionFile( - sessionsDir, - currentSessionFile, - currentSessionId, - ); - if (recovered) { - currentSessionFile = recovered; - api.logger.debug(`session-memory: recovered session file: ${recovered}`); - break; - } - } - } + api.logger.info( + `session-memory: stored session summary for ${params.sessionId} (agent: ${params.agentId}, scope: ${params.defaultScope})` + ); + }; - if (!currentSessionFile) { - api.logger.debug("session-memory: no session file found, skipping"); - return; - } + api.on("before_reset", async (event, ctx) => { + if (event.reason !== "new") return; - const sessionContent = await readSessionConversationWithResetFallback( - currentSessionFile, - sessionMessageCount, + try { + const sessionKey = typeof ctx.sessionKey === "string" ? 
ctx.sessionKey : ""; + const agentId = resolveHookAgentId( + typeof ctx.agentId === "string" ? ctx.agentId : undefined, + sessionKey, ); + const defaultScope = isSystemBypassId(agentId) + ? config.scopes?.default ?? "global" + : scopeManager.getDefaultScope(agentId); + const currentSessionId = + typeof ctx.sessionId === "string" && ctx.sessionId.trim().length > 0 + ? ctx.sessionId + : "unknown"; + const source = resolveSourceFromSessionKey(sessionKey); + const sessionContent = + summarizeRecentConversationMessages(event.messages ?? [], sessionMessageCount) ?? + (typeof event.sessionFile === "string" + ? await readSessionConversationWithResetFallback(event.sessionFile, sessionMessageCount) + : null); + if (!sessionContent) { api.logger.debug("session-memory: no session content found, skipping"); return; } - const now = new Date(typeof event.timestamp === "number" ? event.timestamp : Date.now()); - const dateStr = now.toISOString().split("T")[0]; - const timeStr = now.toISOString().split("T")[1].split(".")[0]; - const memoryText = [ - `Session: ${dateStr} ${timeStr} UTC`, - `Session Key: ${sessionKey}`, - `Session ID: ${currentSessionId}`, - `Source: ${source}`, - "", - "Conversation Summary:", + await storeSystemSessionSummary({ + agentId, + defaultScope, + sessionKey, + sessionId: currentSessionId, + source, sessionContent, - ].join("\n"); - - const vector = await embedder.embedPassage(memoryText); - await store.store({ - text: memoryText, - vector, - category: "fact", - scope: defaultScope, - importance: 0.5, - metadata: stringifySmartMetadata( - buildSmartMetadata( - { - text: `Session summary for ${dateStr}`, - category: "fact", - importance: 0.5, - timestamp: Date.now(), - }, - { - l0_abstract: `Session summary for ${dateStr}`, - l1_overview: `- Session summary saved for ${currentSessionId}`, - l2_content: memoryText, - memory_category: "patterns", - tier: "peripheral", - confidence: 0.5, - type: "session-summary", - sessionKey, - sessionId: currentSessionId, - 
date: dateStr, - agentId, - scope: defaultScope, - }, - ), - ), }); - - api.logger.info( - `session-memory: stored session summary for ${currentSessionId} (agent: ${agentId}, scope: ${defaultScope})` - ); } catch (err) { api.logger.warn(`session-memory: failed to save: ${String(err)}`); } - }, { - name: "memory-lancedb-pro-session-memory", - description: "Store /new session summaries in LanceDB memory", }); - api.logger.info("session-memory: hook registered for command:new as memory-lancedb-pro-session-memory"); + (isCliMode() ? api.logger.debug : api.logger.info)("session-memory: typed before_reset hook registered for /new session summaries"); } if (config.sessionStrategy === "none") { - api.logger.info("session-strategy: using none (plugin memory-reflection hooks disabled)"); + (isCliMode() ? api.logger.debug : api.logger.info)("session-strategy: using none (plugin memory-reflection hooks disabled)"); } // ======================================================================== @@ -3192,6 +3746,12 @@ export function parsePluginConfig(value: unknown): PluginConfig { const sessionMemoryRaw = typeof cfg.sessionMemory === "object" && cfg.sessionMemory !== null ? cfg.sessionMemory as Record<string, unknown> : null; + const workspaceBoundaryRaw = typeof cfg.workspaceBoundary === "object" && cfg.workspaceBoundary !== null + ? cfg.workspaceBoundary as Record<string, unknown> + : null; + const userMdExclusiveRaw = typeof workspaceBoundaryRaw?.userMdExclusive === "object" && workspaceBoundaryRaw.userMdExclusive !== null + ? workspaceBoundaryRaw.userMdExclusive as Record<string, unknown> + : null; const sessionStrategyRaw = cfg.sessionStrategy; const legacySessionMemoryEnabled = typeof sessionMemoryRaw?.enabled === "boolean" ? sessionMemoryRaw.enabled @@ -3227,6 +3787,10 @@ export function parsePluginConfig(value: unknown): PluginConfig { // Accept number, numeric string, or env-var string (e.g. "${EMBED_DIM}"). // Also accept legacy top-level `dimensions` for convenience.
dimensions: parsePositiveInt(embedding.dimensions ?? cfg.dimensions), + omitDimensions: + typeof embedding.omitDimensions === "boolean" + ? embedding.omitDimensions + : undefined, taskQuery: typeof embedding.taskQuery === "string" ? embedding.taskQuery @@ -3249,7 +3813,11 @@ export function parsePluginConfig(value: unknown): PluginConfig { // Default OFF: only enable when explicitly set to true. autoRecall: cfg.autoRecall === true, autoRecallMinLength: parsePositiveInt(cfg.autoRecallMinLength), - autoRecallMinRepeated: parsePositiveInt(cfg.autoRecallMinRepeated), + autoRecallMinRepeated: parsePositiveInt(cfg.autoRecallMinRepeated) ?? 8, + autoRecallMaxItems: parsePositiveInt(cfg.autoRecallMaxItems) ?? 3, + autoRecallMaxChars: parsePositiveInt(cfg.autoRecallMaxChars) ?? 600, + autoRecallPerItemMaxChars: parsePositiveInt(cfg.autoRecallPerItemMaxChars) ?? 180, + maxRecallPerTurn: parsePositiveInt(cfg.maxRecallPerTurn) ?? 10, captureAssistant: cfg.captureAssistant === true, retrieval: typeof cfg.retrieval === "object" && cfg.retrieval !== null ? cfg.retrieval as any : undefined, decay: typeof cfg.decay === "object" && cfg.decay !== null ? cfg.decay as any : undefined, @@ -3257,7 +3825,7 @@ export function parsePluginConfig(value: unknown): PluginConfig { // Smart extraction config (Phase 1) smartExtraction: cfg.smartExtraction !== false, // Default ON llm: typeof cfg.llm === "object" && cfg.llm !== null ? cfg.llm as any : undefined, - extractMinMessages: parsePositiveInt(cfg.extractMinMessages) ?? 2, + extractMinMessages: parsePositiveInt(cfg.extractMinMessages) ?? 4, extractMaxChars: parsePositiveInt(cfg.extractMaxChars) ?? 8000, scopes: typeof cfg.scopes === "object" && cfg.scopes !== null ? 
cfg.scopes as any : undefined, enableManagementTools: cfg.enableManagementTools === true, @@ -3270,10 +3838,10 @@ export function parsePluginConfig(value: unknown): PluginConfig { ensureLearningFiles: (cfg.selfImprovement as Record<string, unknown>).ensureLearningFiles !== false, } : { - enabled: false, - beforeResetNote: false, - skipSubagentBootstrap: false, - ensureLearningFiles: false, + enabled: true, + beforeResetNote: true, + skipSubagentBootstrap: true, + ensureLearningFiles: true, }, memoryReflection: memoryReflectionRaw ? { @@ -3330,6 +3898,61 @@ export function parsePluginConfig(value: unknown): PluginConfig { : undefined, } : undefined, + workspaceBoundary: + workspaceBoundaryRaw + ? { + userMdExclusive: userMdExclusiveRaw + ? { + enabled: userMdExclusiveRaw.enabled === true, + routeProfile: userMdExclusiveRaw.routeProfile !== false, + routeCanonicalName: userMdExclusiveRaw.routeCanonicalName !== false, + routeCanonicalAddressing: userMdExclusiveRaw.routeCanonicalAddressing !== false, + filterRecall: userMdExclusiveRaw.filterRecall !== false, + } + : undefined, + } + : undefined, + admissionControl: normalizeAdmissionControlConfig(cfg.admissionControl), + memoryCompaction: (() => { + const raw = + typeof cfg.memoryCompaction === "object" && cfg.memoryCompaction !== null + ? (cfg.memoryCompaction as Record<string, unknown>) + : null; + if (!raw) return undefined; + return { + enabled: raw.enabled === true, + minAgeDays: parsePositiveInt(raw.minAgeDays) ?? 7, + similarityThreshold: + typeof raw.similarityThreshold === "number" + ? Math.max(0, Math.min(1, raw.similarityThreshold)) + : 0.88, + minClusterSize: parsePositiveInt(raw.minClusterSize) ?? 2, + maxMemoriesToScan: parsePositiveInt(raw.maxMemoriesToScan) ?? 200, + cooldownHours: parsePositiveInt(raw.cooldownHours) ?? 24, + }; + })(), + sessionCompression: + typeof cfg.sessionCompression === "object" && cfg.sessionCompression !== null + ?
{ + enabled: + (cfg.sessionCompression as Record<string, unknown>).enabled === true, + minScoreToKeep: + typeof (cfg.sessionCompression as Record<string, unknown>).minScoreToKeep === "number" + ? ((cfg.sessionCompression as Record<string, unknown>).minScoreToKeep as number) + : 0.3, + } + : { enabled: false, minScoreToKeep: 0.3 }, + extractionThrottle: + typeof cfg.extractionThrottle === "object" && cfg.extractionThrottle !== null + ? { + skipLowValue: + (cfg.extractionThrottle as Record<string, unknown>).skipLowValue === true, + maxExtractionsPerHour: + typeof (cfg.extractionThrottle as Record<string, unknown>).maxExtractionsPerHour === "number" + ? ((cfg.extractionThrottle as Record<string, unknown>).maxExtractionsPerHour as number) + : 30, + } + : { skipLowValue: false, maxExtractionsPerHour: 30 }, }; } diff --git a/openclaw.plugin.json b/openclaw.plugin.json index 96d9e3ee..a2cfb1f5 100644 --- a/openclaw.plugin.json +++ b/openclaw.plugin.json @@ -2,7 +2,7 @@ "id": "memory-lancedb-pro", "name": "Memory (LanceDB Pro)", "description": "Enhanced LanceDB-backed long-term memory with hybrid retrieval, multi-scope isolation, long-context chunking, and management CLI", - "version": "1.1.0-beta.8", + "version": "1.1.0-beta.10", "kind": "memory", "configSchema": { "type": "object", @@ -14,7 +14,10 @@ "properties": { "provider": { "type": "string", - "const": "openai-compatible" + "enum": [ + "openai-compatible", + "azure-openai" + ] }, "apiKey": { "oneOf": [ @@ -43,6 +46,10 @@ "type": "integer", "minimum": 1 }, + "omitDimensions": { + "type": "boolean", + "description": "When true, omit the dimensions parameter from embedding requests even if dimensions is configured" + }, "taskQuery": { "type": "string", "description": "Embedding task for queries (provider-specific, e.g. Jina: retrieval.query)" @@ -59,6 +66,10 @@ "type": "boolean", "default": true, "description": "Enable automatic chunking for documents exceeding embedding context limits" + }, + "apiVersion": { + "type": "string", + "description": "API version for Azure OpenAI (e.g. 2024-02-01).
Only used when provider is azure-openai." } }, "required": [ @@ -102,8 +113,42 @@ "type": "integer", "minimum": 0, "maximum": 100, - "default": 0, - "description": "Minimum number of turns before the same memory can be recalled again in the same session. Set to 0 to disable deduplication (default behavior: inject all memories)." + "default": 8, + "description": "Minimum number of turns before the same memory can be recalled again in the same session. Set to 0 to disable deduplication." + }, + "autoRecallMaxItems": { + "type": "integer", + "minimum": 1, + "maximum": 20, + "default": 3, + "description": "Maximum number of memories auto-injected per turn." + }, + "autoRecallMaxChars": { + "type": "integer", + "minimum": 64, + "maximum": 8000, + "default": 600, + "description": "Maximum total character budget for auto-injected memory summaries." + }, + "autoRecallPerItemMaxChars": { + "type": "integer", + "minimum": 32, + "maximum": 1000, + "default": 180, + "description": "Maximum character budget per auto-injected memory summary." + }, + "maxRecallPerTurn": { + "type": "integer", + "minimum": 1, + "maximum": 50, + "default": 10, + "description": "Hard per-turn injection cap applied after dedup. Acts as a safety ceiling on top of autoRecallMaxItems to prevent context inflation. Default: 10." + }, + "recallMode": { + "type": "string", + "enum": ["full", "summary", "adaptive", "off"], + "default": "full", + "description": "Auto-recall depth mode. 'full': inject with configured per-item budget. 'summary': L0 abstracts only (compact). 'adaptive': analyze query intent to auto-select category and depth. 'off': disable auto-recall injection." }, "captureAssistant": { "type": "boolean" @@ -117,7 +162,7 @@ "type": "integer", "minimum": 1, "maximum": 100, - "default": 2, + "default": 4, "description": "Minimum conversation messages required before smart extraction runs." 
}, "extractMaxChars": { @@ -127,6 +172,100 @@ "default": 8000, "description": "Maximum conversation characters sent to the smart extraction LLM." }, + "admissionControl": { + "type": "object", + "additionalProperties": false, + "description": "A-MAC-style admission governance on the smart-extraction write path. Rejects low-value candidates before persistence while preserving downstream dedup behavior for admitted candidates.", + "properties": { + "enabled": { + "type": "boolean", + "default": false + }, + "preset": { + "type": "string", + "enum": [ + "balanced", + "conservative", + "high-recall" + ], + "default": "balanced", + "description": "Named admission tuning preset. Explicit admissionControl fields still override the selected preset." + }, + "utilityMode": { + "type": "string", + "enum": [ + "standalone", + "off" + ], + "default": "standalone" + }, + "rejectThreshold": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.45 + }, + "admitThreshold": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.6 + }, + "noveltyCandidatePoolSize": { + "type": "integer", + "minimum": 1, + "maximum": 20, + "default": 8 + }, + "auditMetadata": { + "type": "boolean", + "default": true + }, + "persistRejectedAudits": { + "type": "boolean", + "default": true + }, + "rejectedAuditFilePath": { + "type": "string", + "description": "Optional JSONL file path for durable admission reject audit records. Defaults to a file beside the plugin memory data directory." 
+ }, + "recency": { + "type": "object", + "additionalProperties": false, + "properties": { + "halfLifeDays": { + "type": "integer", + "minimum": 1, + "maximum": 365, + "default": 14 + } + } + }, + "weights": { + "type": "object", + "additionalProperties": false, + "properties": { + "utility": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.1 }, + "confidence": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.1 }, + "novelty": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.1 }, + "recency": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.1 }, + "typePrior": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.6 } + } + }, + "typePriors": { + "type": "object", + "additionalProperties": false, + "properties": { + "profile": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.95 }, + "preferences": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.9 }, + "entities": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.75 }, + "events": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.45 }, + "cases": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.8 }, + "patterns": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.85 } + } + } + } + }, "retrieval": { "type": "object", "additionalProperties": false, @@ -178,7 +317,7 @@ "rerankEndpoint": { "type": "string", "default": "https://api.jina.ai/v1/rerank", - "description": "Reranker API endpoint URL. Compatible with Jina, SiliconFlow, Pinecone, or any service with a similar interface." + "description": "Reranker API endpoint URL. Compatible with Jina-compatible endpoints and dedicated adapters such as TEI, SiliconFlow, Voyage, Pinecone, and DashScope." }, "rerankProvider": { "type": "string", @@ -187,10 +326,11 @@ "siliconflow", "voyage", "pinecone", - "dashscope" + "dashscope", + "tei" ], "default": "jina", - "description": "Reranker provider format. Determines request/response shape and auth header. 
DashScope uses gte-rerank-v2 with endpoint https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank." + "description": "Reranker provider format. Determines request/response shape and auth header. Use tei for Hugging Face Text Embeddings Inference /rerank endpoints. DashScope uses gte-rerank-v2 with endpoint https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank." }, "candidatePoolSize": { "type": "integer", @@ -258,32 +398,132 @@ "type": "object", "additionalProperties": false, "properties": { - "recencyHalfLifeDays": { "type": "number", "minimum": 1, "maximum": 365, "default": 30 }, - "recencyWeight": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.4 }, - "frequencyWeight": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.3 }, - "intrinsicWeight": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.3 }, - "staleThreshold": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.3 }, - "searchBoostMin": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.3 }, - "importanceModulation": { "type": "number", "minimum": 0, "maximum": 10, "default": 1.5 }, - "betaCore": { "type": "number", "minimum": 0.1, "maximum": 5, "default": 0.8 }, - "betaWorking": { "type": "number", "minimum": 0.1, "maximum": 5, "default": 1.0 }, - "betaPeripheral": { "type": "number", "minimum": 0.1, "maximum": 5, "default": 1.3 }, - "coreDecayFloor": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.9 }, - "workingDecayFloor": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.7 }, - "peripheralDecayFloor": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.5 } + "recencyHalfLifeDays": { + "type": "number", + "minimum": 1, + "maximum": 365, + "default": 30 + }, + "recencyWeight": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.4 + }, + "frequencyWeight": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.3 + }, + "intrinsicWeight": { + 
"type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.3 + }, + "staleThreshold": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.3 + }, + "searchBoostMin": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.3 + }, + "importanceModulation": { + "type": "number", + "minimum": 0, + "maximum": 10, + "default": 1.5 + }, + "betaCore": { + "type": "number", + "minimum": 0.1, + "maximum": 5, + "default": 0.8 + }, + "betaWorking": { + "type": "number", + "minimum": 0.1, + "maximum": 5, + "default": 1 + }, + "betaPeripheral": { + "type": "number", + "minimum": 0.1, + "maximum": 5, + "default": 1.3 + }, + "coreDecayFloor": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.9 + }, + "workingDecayFloor": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.7 + }, + "peripheralDecayFloor": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.5 + } } }, "tier": { "type": "object", "additionalProperties": false, "properties": { - "coreAccessThreshold": { "type": "integer", "minimum": 1, "maximum": 1000, "default": 10 }, - "coreCompositeThreshold": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.7 }, - "coreImportanceThreshold": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.8 }, - "peripheralCompositeThreshold": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.15 }, - "peripheralAgeDays": { "type": "integer", "minimum": 1, "maximum": 3650, "default": 60 }, - "workingAccessThreshold": { "type": "integer", "minimum": 1, "maximum": 1000, "default": 3 }, - "workingCompositeThreshold": { "type": "number", "minimum": 0, "maximum": 1, "default": 0.4 } + "coreAccessThreshold": { + "type": "integer", + "minimum": 1, + "maximum": 1000, + "default": 10 + }, + "coreCompositeThreshold": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.7 + }, + "coreImportanceThreshold": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 
0.8 + }, + "peripheralCompositeThreshold": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.15 + }, + "peripheralAgeDays": { + "type": "integer", + "minimum": 1, + "maximum": 3650, + "default": 60 + }, + "workingAccessThreshold": { + "type": "integer", + "minimum": 1, + "maximum": 1000, + "default": 3 + }, + "workingCompositeThreshold": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.4 + } } }, "sessionMemory": { @@ -426,6 +666,15 @@ "type": "object", "additionalProperties": false, "properties": { + "auth": { + "type": "string", + "enum": [ + "api-key", + "oauth" + ], + "default": "api-key", + "description": "LLM authentication mode. oauth uses the local Codex/ChatGPT login cache instead of llm.apiKey." + }, "apiKey": { "type": "string" }, @@ -435,6 +684,19 @@ }, "baseURL": { "type": "string" + }, + "oauthProvider": { + "type": "string", + "description": "OAuth provider id for llm.auth=oauth. Currently supported: openai-codex." + }, + "oauthPath": { + "type": "string", + "description": "OAuth token file for llm.auth=oauth. Defaults to ~/.openclaw/.memory-lancedb-pro/oauth.json." + }, + "timeoutMs": { + "type": "integer", + "minimum": 1, + "default": 30000 } } }, @@ -452,6 +714,124 @@ "description": "Fallback directory for Markdown mirror files when agent workspace is unknown" } } + }, + "workspaceBoundary": { + "type": "object", + "additionalProperties": false, + "properties": { + "userMdExclusive": { + "type": "object", + "additionalProperties": false, + "properties": { + "enabled": { + "type": "boolean", + "default": false, + "description": "Do not store USER.md-exclusive facts in LanceDB." + }, + "routeProfile": { + "type": "boolean", + "default": true, + "description": "Treat extracted profile memories as USER.md-exclusive." + }, + "routeCanonicalName": { + "type": "boolean", + "default": true, + "description": "Treat canonical name facts as USER.md-exclusive." 
+ }, + "routeCanonicalAddressing": { + "type": "boolean", + "default": true, + "description": "Treat canonical addressing facts as USER.md-exclusive." + }, + "filterRecall": { + "type": "boolean", + "default": true, + "description": "Filter USER.md-exclusive facts out of plugin recall results." + } + } + } + } + }, + "memoryCompaction": { + "type": "object", + "additionalProperties": false, + "description": "Progressive summarization: periodically consolidate semantically similar old memories into refined single entries, reducing noise and improving retrieval quality over time.", + "properties": { + "enabled": { + "type": "boolean", + "default": false, + "description": "Enable automatic compaction at gateway startup (respects cooldownHours)" + }, + "minAgeDays": { + "type": "integer", + "default": 7, + "minimum": 1, + "description": "Only compact memories at least this many days old" + }, + "similarityThreshold": { + "type": "number", + "default": 0.88, + "minimum": 0, + "maximum": 1, + "description": "Cosine similarity threshold for clustering. Higher = more conservative merges." + }, + "minClusterSize": { + "type": "integer", + "default": 2, + "minimum": 2, + "description": "Minimum cluster size required to trigger a merge" + }, + "maxMemoriesToScan": { + "type": "integer", + "default": 200, + "minimum": 1, + "description": "Maximum number of memories to scan per compaction run" + }, + "cooldownHours": { + "type": "integer", + "default": 24, + "minimum": 1, + "description": "Minimum hours between automatic compaction runs" + } + } + }, + "sessionCompression": { + "type": "object", + "additionalProperties": false, + "description": "Session compression settings for auto-capture. 
Scores and compresses conversation texts to prioritize high-signal content.", + "properties": { + "enabled": { + "type": "boolean", + "default": false, + "description": "Enable session compression before auto-capture extraction" + }, + "minScoreToKeep": { + "type": "number", + "minimum": 0, + "maximum": 1, + "default": 0.3, + "description": "Minimum score threshold. If all texts score below this, fallback to keeping at least the last few texts." + } + } + }, + "extractionThrottle": { + "type": "object", + "additionalProperties": false, + "description": "Adaptive extraction throttling to reduce LLM cost on low-value or rapid-fire sessions.", + "properties": { + "skipLowValue": { + "type": "boolean", + "default": false, + "description": "Skip extraction for conversations with estimated value < 0.2" + }, + "maxExtractionsPerHour": { + "type": "integer", + "minimum": 1, + "maximum": 200, + "default": 30, + "description": "Maximum number of auto-capture extractions allowed per hour" + } + } } }, "required": [ @@ -482,6 +862,11 @@ "help": "Override vector dimensions for custom models not in the built-in lookup table", "advanced": true }, + "embedding.omitDimensions": { + "label": "Omit Request Dimensions", + "help": "Do not send the dimensions parameter to the embedding API even if embedding.dimensions is configured. Useful for local models like Qwen3-Embedding that reject the field.", + "advanced": true + }, "embedding.taskQuery": { "label": "Query Task", "placeholder": "retrieval.query", @@ -517,17 +902,18 @@ "llm.apiKey": { "label": "LLM API Key", "sensitive": true, - "help": "API key for smart extraction LLM (defaults to embedding apiKey)", - "advanced": true + "placeholder": "sk-... 
or ${GROQ_API_KEY}", + "help": "API key for LLM used by smart memory extraction (defaults to embedding.apiKey if omitted)" }, "llm.model": { "label": "LLM Model", - "help": "Model for memory extraction and dedup (default: gpt-4o-mini)", - "advanced": true + "placeholder": "openai/gpt-oss-120b", + "help": "OpenAI-compatible chat model for memory extraction/summary" }, "llm.baseURL": { "label": "LLM Base URL", - "help": "Base URL for LLM API (defaults to embedding baseURL)", + "placeholder": "https://api.groq.com/openai/v1", + "help": "OpenAI-compatible base URL for LLM (defaults to embedding.baseURL if omitted)", "advanced": true }, "extractMinMessages": { @@ -540,6 +926,66 @@ "help": "Maximum conversation characters to process for extraction", "advanced": true }, + "admissionControl.enabled": { + "label": "Admission Control", + "help": "Enable A-MAC-style admission scoring before downstream dedup.", + "advanced": true + }, + "admissionControl.preset": { + "label": "Admission Preset", + "help": "balanced is the default; conservative favors precision; high-recall favors recall. 
Explicit admissionControl fields override the preset.", + "advanced": true + }, + "admissionControl.utilityMode": { + "label": "Admission Utility Mode", + "help": "standalone adds a separate LLM utility scoring call; off disables that feature.", + "advanced": true + }, + "admissionControl.rejectThreshold": { + "label": "Admission Reject Threshold", + "help": "Candidates below this weighted score are rejected before persistence.", + "advanced": true + }, + "admissionControl.admitThreshold": { + "label": "Admission Admit Threshold", + "help": "Higher-scoring admitted candidates are labeled as likely add cases in audit metadata; all admitted candidates still go through downstream dedup.", + "advanced": true + }, + "admissionControl.noveltyCandidatePoolSize": { + "label": "Admission Novelty Pool", + "help": "Number of nearby memories to compare for novelty scoring.", + "advanced": true + }, + "admissionControl.auditMetadata": { + "label": "Admission Audit Metadata", + "help": "Persist per-memory admission scores and reasons in metadata for debugging.", + "advanced": true + }, + "admissionControl.persistRejectedAudits": { + "label": "Persist Reject Audits", + "help": "Write rejected admission decisions to a JSONL audit log for later review.", + "advanced": true + }, + "admissionControl.rejectedAuditFilePath": { + "label": "Reject Audit File", + "help": "Optional JSONL path for rejected admission audit records. 
Defaults beside the plugin memory data directory.", + "advanced": true + }, + "admissionControl.recency.halfLifeDays": { + "label": "Admission Recency Half-Life", + "help": "Controls how quickly recency rises as similar memories get older.", + "advanced": true + }, + "admissionControl.weights": { + "label": "Admission Weights", + "help": "Feature weights are normalized at runtime before scoring.", + "advanced": true + }, + "admissionControl.typePriors": { + "label": "Admission Type Priors", + "help": "Category priors for long-term retention likelihood.", + "advanced": true + }, "autoCapture": { "label": "Auto-Capture", "help": "Automatically capture important information from conversations (enabled by default)" @@ -558,6 +1004,31 @@ "help": "Minimum number of conversation turns before a specific memory can be re-injected in the same session.", "advanced": true }, + "autoRecallMaxItems": { + "label": "Auto-Recall Max Items", + "help": "Maximum memories that auto-recall can inject in one turn.", + "advanced": true + }, + "autoRecallMaxChars": { + "label": "Auto-Recall Max Chars", + "help": "Maximum total characters injected by auto-recall in one turn.", + "advanced": true + }, + "autoRecallPerItemMaxChars": { + "label": "Auto-Recall Per-Item Max Chars", + "help": "Maximum characters per injected memory summary.", + "advanced": true + }, + "recallMode": { + "label": "Recall Mode", + "help": "Auto-recall depth: full (default), summary (L0 only), adaptive (intent-based category routing), off.", + "advanced": false + }, + "maxRecallPerTurn": { + "label": "Max Recall Per Turn", + "help": "Hard per-turn injection cap. Acts as a safety ceiling on top of Auto-Recall Max Items. 
Default: 10.", + "advanced": true + }, "captureAssistant": { "label": "Capture Assistant Messages", "help": "Also auto-capture assistant messages (default false to reduce memory pollution)", @@ -609,7 +1080,7 @@ }, "retrieval.rerankProvider": { "label": "Reranker Provider", - "help": "Provider format: jina (default), siliconflow, voyage, pinecone, or dashscope", + "help": "Provider format: jina (default), siliconflow, voyage, pinecone, dashscope, or tei", "advanced": true }, "retrieval.candidatePoolSize": { @@ -795,21 +1266,86 @@ "help": "Fallback directory when agent workspace mapping is unavailable", "advanced": true }, - "llm.apiKey": { - "label": "LLM API Key", - "sensitive": true, - "placeholder": "sk-... or ${GROQ_API_KEY}", - "help": "API key for LLM used by smart memory extraction (defaults to embedding.apiKey if omitted)" + "workspaceBoundary.userMdExclusive.enabled": { + "label": "USER.md Exclusive Facts", + "help": "Skip storing USER.md-owned facts in LanceDB and keep them out of plugin recall." 
}, - "llm.model": { - "label": "LLM Model", - "placeholder": "openai/gpt-oss-120b", - "help": "OpenAI-compatible chat model for memory extraction/summary" + "workspaceBoundary.userMdExclusive.routeProfile": { + "label": "Exclude Profile Memories", + "help": "Treat extracted profile memories as USER.md-only facts.", + "advanced": true }, - "llm.baseURL": { - "label": "LLM Base URL", - "placeholder": "https://api.groq.com/openai/v1", - "help": "OpenAI-compatible base URL for LLM (defaults to embedding.baseURL if omitted)", + "workspaceBoundary.userMdExclusive.routeCanonicalName": { + "label": "Exclude Canonical Name", + "help": "Treat canonical name facts as USER.md-only facts.", + "advanced": true + }, + "workspaceBoundary.userMdExclusive.routeCanonicalAddressing": { + "label": "Exclude Canonical Addressing", + "help": "Treat canonical addressing facts as USER.md-only facts.", + "advanced": true + }, + "workspaceBoundary.userMdExclusive.filterRecall": { + "label": "Filter USER.md Facts From Recall", + "help": "Hide USER.md-exclusive facts from plugin auto-recall and memory_recall output.", + "advanced": true + }, + "llm.auth": { + "label": "LLM Auth", + "help": "api-key uses llm.apiKey or embedding.apiKey. oauth uses a plugin-scoped OAuth token file by default.", + "advanced": true + }, + "llm.oauthProvider": { + "label": "LLM OAuth Provider", + "help": "OAuth provider id used when llm.auth=oauth. Currently supported: openai-codex.", + "advanced": true + }, + "llm.oauthPath": { + "label": "LLM OAuth File", + "help": "OAuth token file used when llm.auth=oauth. Default: ~/.openclaw/.memory-lancedb-pro/oauth.json", + "advanced": true + }, + "llm.timeoutMs": { + "label": "LLM Timeout (ms)", + "placeholder": "30000", + "help": "Request timeout for the smart-extraction / upgrade LLM in milliseconds", + "advanced": true + }, + "memoryCompaction.enabled": { + "label": "Auto Compaction", + "help": "Automatically consolidate similar old memories at gateway startup. 
Also available on-demand via the memory_compact tool (requires enableManagementTools)." + }, + "memoryCompaction.minAgeDays": { + "label": "Min Age (days)", + "help": "Memories younger than this are never touched by compaction", + "advanced": true + }, + "memoryCompaction.similarityThreshold": { + "label": "Similarity Threshold", + "help": "How similar two memories must be to merge (0–1). 0.88 is a good starting point; raise to 0.92+ for conservative merges.", + "advanced": true + }, + "memoryCompaction.cooldownHours": { + "label": "Cooldown (hours)", + "help": "Minimum gap between automatic compaction runs", + "advanced": true + }, + "sessionCompression.enabled": { + "label": "Session Compression", + "help": "Score and compress conversation texts before auto-capture to prioritize high-signal content (corrections, decisions, tool calls)" + }, + "sessionCompression.minScoreToKeep": { + "label": "Compression Min Score", + "help": "Minimum text score threshold. If all texts score below this, keep at least the last few texts as fallback.", + "advanced": true + }, + "extractionThrottle.skipLowValue": { + "label": "Skip Low-Value Conversations", + "help": "Skip auto-capture for conversations estimated to have low memory value (< 0.2)" + }, + "extractionThrottle.maxExtractionsPerHour": { + "label": "Max Extractions Per Hour", + "help": "Rate limit for auto-capture extractions. 
Prevents excessive LLM calls during rapid-fire sessions.", "advanced": true } } diff --git a/package-lock.json b/package-lock.json index 1a43bb5c..fcbf1b04 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,17 +1,20 @@ { "name": "memory-lancedb-pro", - "version": "1.1.0-beta.8", + "version": "1.1.0-beta.9", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "memory-lancedb-pro", - "version": "1.1.0-beta.8", + "version": "1.1.0-beta.9", "license": "MIT", "dependencies": { "@lancedb/lancedb": "^0.26.2", "@sinclair/typebox": "0.34.48", - "openai": "^6.21.0" + "apache-arrow": "18.1.0", + "json5": "^2.2.3", + "openai": "^6.21.0", + "proper-lockfile": "^4.1.2" }, "devDependencies": { "commander": "^14.0.0", @@ -175,7 +178,6 @@ "resolved": "https://registry.npmjs.org/@swc/helpers/-/helpers-0.5.18.tgz", "integrity": "sha512-TXTnIcNJQEKwThMMqBXsZ4VGAza6bvN4pa41Rkqoio6QBKMvo+5lexeTMScGCIxtzgQJzElcvIltani+adC5PQ==", "license": "Apache-2.0", - "peer": true, "dependencies": { "tslib": "^2.8.0" } @@ -184,22 +186,19 @@ "version": "5.2.3", "resolved": "https://registry.npmjs.org/@types/command-line-args/-/command-line-args-5.2.3.tgz", "integrity": "sha512-uv0aG6R0Y8WHZLTamZwtfsDLVRnOa+n+n5rEvFWL5Na5gZ8V2Teab/duDPFzIIIhs9qizDpcavCusCLJZu62Kw==", - "license": "MIT", - "peer": true + "license": "MIT" }, "node_modules/@types/command-line-usage": { "version": "5.0.4", "resolved": "https://registry.npmjs.org/@types/command-line-usage/-/command-line-usage-5.0.4.tgz", "integrity": "sha512-BwR5KP3Es/CSht0xqBcUXS3qCAUVXwpRKsV2+arxeb65atasuXG9LykC9Ab10Cw3s2raH92ZqOeILaQbsB2ACg==", - "license": "MIT", - "peer": true + "license": "MIT" }, "node_modules/@types/node": { "version": "20.19.33", "resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.33.tgz", "integrity": "sha512-Rs1bVAIdBs5gbTIKza/tgpMuG1k3U/UMJLWecIMxNdJFDMzcM5LOiLVRYh3PilWEYDIeUDv7bpiHPLPsbydGcw==", "license": "MIT", - "peer": true, "dependencies": { "undici-types": "~6.21.0" } @@ -209,7 
+208,6 @@ "resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-4.3.0.tgz", "integrity": "sha512-zbB9rCJAT1rbjiVDb2hqKFHNYLxgtk8NURxZ3IZwD3F6NtxbXZQCnnSi1Lkx+IDohdPlFp222wVALIheZJQSEg==", "license": "MIT", - "peer": true, "dependencies": { "color-convert": "^2.0.1" }, @@ -225,7 +223,6 @@ "resolved": "https://registry.npmjs.org/apache-arrow/-/apache-arrow-18.1.0.tgz", "integrity": "sha512-v/ShMp57iBnBp4lDgV8Jx3d3Q5/Hac25FWmQ98eMahUiHPXcvwIMKJD0hBIgclm/FCG+LwPkAKtkRO1O/W0YGg==", "license": "Apache-2.0", - "peer": true, "dependencies": { "@swc/helpers": "^0.5.11", "@types/command-line-args": "^5.2.3", @@ -246,7 +243,6 @@ "resolved": "https://registry.npmjs.org/array-back/-/array-back-3.1.0.tgz", "integrity": "sha512-TkuxA4UCOvxuDK6NZYXCalszEzj+TLszyASooky+i742l9TqsOdYCMJJupxRic61hwquNtppB3hgcuq9SVSH1Q==", "license": "MIT", - "peer": true, "engines": { "node": ">=6" } @@ -256,7 +252,6 @@ "resolved": "https://registry.npmjs.org/chalk/-/chalk-4.1.2.tgz", "integrity": "sha512-oKnbhFyRIXpUuez8iBMmyEa4nbj4IOQyuhc/wy9kY7/WVPcwIO9VA668Pu8RkO7+0G76SLROeyw9CpQ061i4mA==", "license": "MIT", - "peer": true, "dependencies": { "ansi-styles": "^4.1.0", "supports-color": "^7.1.0" @@ -273,7 +268,6 @@ "resolved": "https://registry.npmjs.org/chalk-template/-/chalk-template-0.4.0.tgz", "integrity": "sha512-/ghrgmhfY8RaSdeo43hNXxpoHAtxdbskUHjPpfqUWGttFgycUhYPGx3YZBCnUCvOa7Doivn1IZec3DEGFoMgLg==", "license": "MIT", - "peer": true, "dependencies": { "chalk": "^4.1.2" }, @@ -289,7 +283,6 @@ "resolved": "https://registry.npmjs.org/color-convert/-/color-convert-2.0.1.tgz", "integrity": "sha512-RRECPsj7iu/xb5oKYcsFHSppFNnsj/52OVTRKb4zP5onXwVF3zVmmToNcOfGC+CRDpfK/U584fMg38ZHCaElKQ==", "license": "MIT", - "peer": true, "dependencies": { "color-name": "~1.1.4" }, @@ -301,15 +294,13 @@ "version": "1.1.4", "resolved": "https://registry.npmjs.org/color-name/-/color-name-1.1.4.tgz", "integrity": 
"sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA==", - "license": "MIT", - "peer": true + "license": "MIT" }, "node_modules/command-line-args": { "version": "5.2.1", "resolved": "https://registry.npmjs.org/command-line-args/-/command-line-args-5.2.1.tgz", "integrity": "sha512-H4UfQhZyakIjC74I9d34fGYDwk3XpSr17QhEd0Q3I9Xq1CETHo4Hcuo87WyWHpAF1aSLjLRf5lD9ZGX2qStUvg==", "license": "MIT", - "peer": true, "dependencies": { "array-back": "^3.1.0", "find-replace": "^3.0.0", @@ -325,7 +316,6 @@ "resolved": "https://registry.npmjs.org/command-line-usage/-/command-line-usage-7.0.3.tgz", "integrity": "sha512-PqMLy5+YGwhMh1wS04mVG44oqDsgyLRSKJBdOo1bnYhMKBW65gZF1dRp2OZRhiTjgUHljy99qkO7bsctLaw35Q==", "license": "MIT", - "peer": true, "dependencies": { "array-back": "^6.2.2", "chalk-template": "^0.4.0", @@ -341,7 +331,6 @@ "resolved": "https://registry.npmjs.org/array-back/-/array-back-6.2.2.tgz", "integrity": "sha512-gUAZ7HPyb4SJczXAMUXMGAvI976JoK3qEx9v1FTmeYuJj0IBiaKttG1ydtGKdkfqWkIkouke7nG8ufGy77+Cvw==", "license": "MIT", - "peer": true, "engines": { "node": ">=12.17" } @@ -351,7 +340,6 @@ "resolved": "https://registry.npmjs.org/typical/-/typical-7.3.0.tgz", "integrity": "sha512-ya4mg/30vm+DOWfBg4YK3j2WD6TWtRkCbasOJr40CseYENzCUby/7rIvXA99JGsQHeNxLbnXdyLLxKSv3tauFw==", "license": "MIT", - "peer": true, "engines": { "node": ">=12.17" } @@ -371,7 +359,6 @@ "resolved": "https://registry.npmjs.org/find-replace/-/find-replace-3.0.0.tgz", "integrity": "sha512-6Tb2myMioCAgv5kfvP5/PkZZ/ntTpVK39fHY7WkWBgvbeE+VHd/tZuZ4mrC+bxh4cfOZeYKVPaJIZtZXV7GNCQ==", "license": "MIT", - "peer": true, "dependencies": { "array-back": "^3.0.1" }, @@ -383,15 +370,19 @@ "version": "24.12.23", "resolved": "https://registry.npmjs.org/flatbuffers/-/flatbuffers-24.12.23.tgz", "integrity": "sha512-dLVCAISd5mhls514keQzmEG6QHmUUsNuWsb4tFafIUwvvgDjXhtfAYSKOzt5SWOy+qByV5pbsDZ+Vb7HUOBEdA==", - "license": "Apache-2.0", - "peer": true + "license": "Apache-2.0" + }, + 
"node_modules/graceful-fs": { + "version": "4.2.11", + "resolved": "https://registry.npmjs.org/graceful-fs/-/graceful-fs-4.2.11.tgz", + "integrity": "sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ==", + "license": "ISC" }, "node_modules/has-flag": { "version": "4.0.0", "resolved": "https://registry.npmjs.org/has-flag/-/has-flag-4.0.0.tgz", "integrity": "sha512-EykJT/Q1KjTWctppgIAgfSO0tKVuZUjhgMr17kqTumMl6Afv3EISleU7qZUzoXDFTAHTDC4NOoG/ZxU3EvlMPQ==", "license": "MIT", - "peer": true, "engines": { "node": ">=8" } @@ -410,17 +401,27 @@ "version": "0.0.3", "resolved": "https://registry.npmjs.org/json-bignum/-/json-bignum-0.0.3.tgz", "integrity": "sha512-2WHyXj3OfHSgNyuzDbSxI1w2jgw5gkWSWhS7Qg4bWXx1nLk3jnbwfUeS0PSba3IzpTUWdHxBieELUzXRjQB2zg==", - "peer": true, "engines": { "node": ">=0.8" } }, + "node_modules/json5": { + "version": "2.2.3", + "resolved": "https://registry.npmjs.org/json5/-/json5-2.2.3.tgz", + "integrity": "sha512-XmOWe7eyHYH14cLdVPoyg+GOH3rYX++KpzrylJwSW98t3Nk+U8XOl8FWKOgwtzdb8lXGf6zYwDUzeHMWfxasyg==", + "license": "MIT", + "bin": { + "json5": "lib/cli.js" + }, + "engines": { + "node": ">=6" + } + }, "node_modules/lodash.camelcase": { "version": "4.3.0", "resolved": "https://registry.npmjs.org/lodash.camelcase/-/lodash.camelcase-4.3.0.tgz", "integrity": "sha512-TwuEnCnxbc3rAvhf/LbG7tJUDzhqXyFnv3dtzLOPgCG/hODL7WFnsbwktkD7yUV0RrreP/l1PALq/YSg6VvjlA==", - "license": "MIT", - "peer": true + "license": "MIT" }, "node_modules/openai": { "version": "6.22.0", @@ -443,18 +444,43 @@ } } }, + "node_modules/proper-lockfile": { + "version": "4.1.2", + "resolved": "https://registry.npmjs.org/proper-lockfile/-/proper-lockfile-4.1.2.tgz", + "integrity": "sha512-TjNPblN4BwAWMXU8s9AEz4JmQxnD1NNL7bNOY/AKUzyamc379FWASUhc/K1pL2noVb+XmZKLL68cjzLsiOAMaA==", + "license": "MIT", + "dependencies": { + "graceful-fs": "^4.2.4", + "retry": "^0.12.0", + "signal-exit": "^3.0.2" + } + }, "node_modules/reflect-metadata": { "version": 
"0.2.2", "resolved": "https://registry.npmjs.org/reflect-metadata/-/reflect-metadata-0.2.2.tgz", "integrity": "sha512-urBwgfrvVP/eAyXx4hluJivBKzuEbSQs9rKWCrCkbSxNv8mxPcUZKeuoF3Uy4mJl3Lwprp6yy5/39VWigZ4K6Q==", "license": "Apache-2.0" }, + "node_modules/retry": { + "version": "0.12.0", + "resolved": "https://registry.npmjs.org/retry/-/retry-0.12.0.tgz", + "integrity": "sha512-9LkiTwjUh6rT555DtE9rTX+BKByPfrMzEAtnlEtdEwr3Nkffwiihqe2bWADg+OQRjt9gl6ICdmB/ZFDCGAtSow==", + "license": "MIT", + "engines": { + "node": ">= 4" + } + }, + "node_modules/signal-exit": { + "version": "3.0.7", + "resolved": "https://registry.npmjs.org/signal-exit/-/signal-exit-3.0.7.tgz", + "integrity": "sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ==", + "license": "ISC" + }, "node_modules/supports-color": { "version": "7.2.0", "resolved": "https://registry.npmjs.org/supports-color/-/supports-color-7.2.0.tgz", "integrity": "sha512-qpCAvRl9stuOHveKsn7HncJRvv501qIacKzQlO/+Lwxc9+0q2wLyv4Dfvt80/DPn2pqOBsJdDiogXGR9+OvwRw==", "license": "MIT", - "peer": true, "dependencies": { "has-flag": "^4.0.0" }, @@ -467,7 +493,6 @@ "resolved": "https://registry.npmjs.org/table-layout/-/table-layout-4.1.1.tgz", "integrity": "sha512-iK5/YhZxq5GO5z8wb0bY1317uDF3Zjpha0QFFLA8/trAoiLbQD0HUbMesEaxyzUgDxi2QlcbM8IvqOlEjgoXBA==", "license": "MIT", - "peer": true, "dependencies": { "array-back": "^6.2.2", "wordwrapjs": "^5.1.0" @@ -481,7 +506,6 @@ "resolved": "https://registry.npmjs.org/array-back/-/array-back-6.2.2.tgz", "integrity": "sha512-gUAZ7HPyb4SJczXAMUXMGAvI976JoK3qEx9v1FTmeYuJj0IBiaKttG1ydtGKdkfqWkIkouke7nG8ufGy77+Cvw==", "license": "MIT", - "peer": true, "engines": { "node": ">=12.17" } @@ -490,8 +514,7 @@ "version": "2.8.1", "resolved": "https://registry.npmjs.org/tslib/-/tslib-2.8.1.tgz", "integrity": "sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w==", - "license": "0BSD", - "peer": true + "license": "0BSD" }, 
"node_modules/typescript": { "version": "5.9.3", @@ -512,7 +535,6 @@ "resolved": "https://registry.npmjs.org/typical/-/typical-4.0.0.tgz", "integrity": "sha512-VAH4IvQ7BDFYglMd7BPRDfLgxZZX4O4TFcRDA6EN5X7erNJJq+McIEp8np9aVtxrCJ6qx4GTYVfOWNjcqwZgRw==", "license": "MIT", - "peer": true, "engines": { "node": ">=8" } @@ -521,15 +543,13 @@ "version": "6.21.0", "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz", "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==", - "license": "MIT", - "peer": true + "license": "MIT" }, "node_modules/wordwrapjs": { "version": "5.1.1", "resolved": "https://registry.npmjs.org/wordwrapjs/-/wordwrapjs-5.1.1.tgz", "integrity": "sha512-0yweIbkINJodk27gX9LBGMzyQdBDan3s/dEAiwBOj+Mf0PPyWL6/rikalkv8EeD0E8jm4o5RXEOrFTP3NXbhJg==", "license": "MIT", - "peer": true, "engines": { "node": ">=12.17" } diff --git a/package.json b/package.json index dda72582..cfd47cd0 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "memory-lancedb-pro", - "version": "1.1.0-beta.8", + "version": "1.1.0-beta.10", "description": "OpenClaw enhanced LanceDB memory plugin with hybrid retrieval (Vector + BM25), cross-encoder rerank, multi-scope isolation, long-context chunking, and management CLI", "type": "module", "main": "index.ts", @@ -28,7 +28,9 @@ "@lancedb/lancedb": "^0.26.2", "@sinclair/typebox": "0.34.48", "apache-arrow": "18.1.0", - "openai": "^6.21.0" + "json5": "^2.2.3", + "openai": "^6.21.0", + "proper-lockfile": "^4.1.2" }, "openclaw": { "extensions": [ @@ -36,7 +38,7 @@ ] }, "scripts": { - "test": "node test/embedder-error-hints.test.mjs && node test/migrate-legacy-schema.test.mjs && node --test test/config-session-strategy-migration.test.mjs && node --test test/recall-text-cleanup.test.mjs && node test/update-consistency-lancedb.test.mjs && node test/cli-smoke.mjs && node test/functional-e2e.mjs && node test/retriever-rerank-regression.mjs && node 
test/smart-memory-lifecycle.mjs && node test/smart-extractor-branches.mjs && node test/plugin-manifest-regression.mjs && node --test test/sync-plugin-version.test.mjs && node test/smart-metadata-v2.mjs && node test/vector-search-cosine.test.mjs && node test/context-support-e2e.mjs && node test/temporal-facts.test.mjs && node test/memory-update-supersede.test.mjs", + "test": "node test/embedder-error-hints.test.mjs && node test/cjk-recursion-regression.test.mjs && node test/migrate-legacy-schema.test.mjs && node --test test/config-session-strategy-migration.test.mjs && node --test test/scope-access-undefined.test.mjs && node --test test/reflection-bypass-hook.test.mjs && node --test test/smart-extractor-scope-filter.test.mjs && node --test test/store-empty-scope-filter.test.mjs && node --test test/recall-text-cleanup.test.mjs && node test/update-consistency-lancedb.test.mjs && node --test test/strip-envelope-metadata.test.mjs && node test/cli-smoke.mjs && node test/functional-e2e.mjs && node test/retriever-rerank-regression.mjs && node test/smart-memory-lifecycle.mjs && node test/smart-extractor-branches.mjs && node test/plugin-manifest-regression.mjs && node --test test/session-summary-before-reset.test.mjs && node --test test/sync-plugin-version.test.mjs && node test/smart-metadata-v2.mjs && node test/vector-search-cosine.test.mjs && node test/context-support-e2e.mjs && node test/temporal-facts.test.mjs && node test/memory-update-supersede.test.mjs && node test/memory-upgrader-diagnostics.test.mjs && node --test test/llm-api-key-client.test.mjs && node --test test/llm-oauth-client.test.mjs && node --test test/cli-oauth-login.test.mjs && node --test test/workflow-fork-guards.test.mjs && node --test test/clawteam-scope.test.mjs && node --test test/cross-process-lock.test.mjs && node --test test/preference-slots.test.mjs", "test:openclaw-host": "node test/openclaw-host-functional.mjs", "version": "node scripts/sync-plugin-version.mjs openclaw.plugin.json package.json 
&& git add openclaw.plugin.json" }, diff --git a/scripts/governance-maintenance.mjs b/scripts/governance-maintenance.mjs new file mode 100755 index 00000000..7eb4a73c --- /dev/null +++ b/scripts/governance-maintenance.mjs @@ -0,0 +1,130 @@ +#!/usr/bin/env node +import { resolve } from "node:path"; +import jitiFactory from "jiti"; + +const jiti = jitiFactory(import.meta.url, { interopDefault: true }); +const { MemoryStore } = jiti("../src/store.ts"); +const { parseSmartMetadata, stringifySmartMetadata } = jiti("../src/smart-metadata.ts"); + +function parseArgs(argv) { + const args = { + dbPath: process.env.MEMORY_DB_PATH || "", + vectorDim: Number(process.env.MEMORY_VECTOR_DIM || "1536"), + scope: undefined, + apply: false, + pendingDays: 30, + limit: 1000, + }; + for (let i = 2; i < argv.length; i++) { + const a = argv[i]; + if (a === "--db-path") args.dbPath = argv[++i] || ""; + else if (a === "--vector-dim") args.vectorDim = Number(argv[++i] || "1536"); + else if (a === "--scope") args.scope = argv[++i] || undefined; + else if (a === "--apply") args.apply = true; + else if (a === "--pending-days") args.pendingDays = Number(argv[++i] || "30"); + else if (a === "--limit") args.limit = Number(argv[++i] || "1000"); + } + return args; +} + +async function loadAllEntries(store, scopeFilter, limit) { + const out = []; + let offset = 0; + const pageSize = 200; + while (out.length < limit) { + const page = await store.list(scopeFilter, undefined, Math.min(pageSize, limit - out.length), offset); + if (!page.length) break; + out.push(...page); + offset += page.length; + if (page.length < pageSize) break; + } + return out; +} + +function normalizeKey(text) { + return text.toLowerCase().replace(/\s+/g, " ").trim(); +} + +async function run() { + const args = parseArgs(process.argv); + if (!args.dbPath) throw new Error("Missing --db-path (or MEMORY_DB_PATH)"); + + const store = new MemoryStore({ + dbPath: resolve(args.dbPath), + vectorDim: Number.isFinite(args.vectorDim) ? 
args.vectorDim : 1536, + }); + const scopeFilter = args.scope ? [args.scope] : undefined; + const entries = await loadAllEntries(store, scopeFilter, args.limit); + + const now = Date.now(); + const pendingCutoff = now - Math.max(1, args.pendingDays) * 24 * 60 * 60 * 1000; + + const toArchivePending = []; + const canonicalByKey = new Map(); + const duplicateCandidates = []; + + for (const entry of entries) { + const meta = parseSmartMetadata(entry.metadata, entry); + + if (meta.state === "pending" && entry.timestamp < pendingCutoff) { + toArchivePending.push(entry.id); + } + + if (meta.state === "archived") continue; + const key = `${meta.memory_category}:${normalizeKey(meta.l0_abstract || entry.text)}`; + const existing = canonicalByKey.get(key); + if (!existing) { + canonicalByKey.set(key, entry); + continue; + } + const keep = existing.timestamp >= entry.timestamp ? existing : entry; + const drop = keep.id === existing.id ? entry : existing; + canonicalByKey.set(key, keep); + duplicateCandidates.push({ duplicateId: drop.id, canonicalId: keep.id }); + } + + if (!args.apply) { + console.log(`Dry run summary:`); + console.log(`- scanned: ${entries.length}`); + console.log(`- stale pending -> archive: ${toArchivePending.length}`); + console.log(`- duplicate compact candidates: ${duplicateCandidates.length}`); + return; + } + + let archivedPending = 0; + for (const id of toArchivePending) { + const existing = await store.getById(id, scopeFilter); + if (!existing) continue; + const meta = parseSmartMetadata(existing.metadata, existing); + meta.state = "archived"; + meta.memory_layer = "archive"; + meta.archive_reason = "pending_timeout"; + meta.archived_at = now; + await store.update(id, { metadata: stringifySmartMetadata(meta) }, scopeFilter); + archivedPending++; + } + + let compacted = 0; + for (const row of duplicateCandidates) { + const existing = await store.getById(row.duplicateId, scopeFilter); + if (!existing) continue; + const meta = 
parseSmartMetadata(existing.metadata, existing); + meta.state = "archived"; + meta.memory_layer = "archive"; + meta.archive_reason = "compact_duplicate"; + meta.canonical_id = row.canonicalId; + meta.archived_at = now; + await store.update(row.duplicateId, { metadata: stringifySmartMetadata(meta) }, scopeFilter); + compacted++; + } + + console.log(`Maintenance complete:`); + console.log(`- scanned: ${entries.length}`); + console.log(`- archived pending: ${archivedPending}`); + console.log(`- compacted duplicates: ${compacted}`); +} + +run().catch((err) => { + console.error(err instanceof Error ? err.message : String(err)); + process.exit(1); +}); diff --git a/scripts/migrate-governance-metadata.mjs b/scripts/migrate-governance-metadata.mjs new file mode 100755 index 00000000..fafb70d9 --- /dev/null +++ b/scripts/migrate-governance-metadata.mjs @@ -0,0 +1,110 @@ +#!/usr/bin/env node +import { createWriteStream, readFileSync } from "node:fs"; +import { resolve } from "node:path"; +import { fileURLToPath } from "node:url"; +import jitiFactory from "jiti"; + +const jiti = jitiFactory(import.meta.url, { interopDefault: true }); +const { MemoryStore } = jiti("../src/store.ts"); +const { buildSmartMetadata, stringifySmartMetadata } = jiti("../src/smart-metadata.ts"); + +function parseArgs(argv) { + const args = { + dbPath: process.env.MEMORY_DB_PATH || "", + vectorDim: Number(process.env.MEMORY_VECTOR_DIM || "1536"), + scope: undefined, + apply: false, + limit: 1000, + rollbackFile: "", + }; + + for (let i = 2; i < argv.length; i++) { + const a = argv[i]; + if (a === "--db-path") args.dbPath = argv[++i] || ""; + else if (a === "--vector-dim") args.vectorDim = Number(argv[++i] || "1536"); + else if (a === "--scope") args.scope = argv[++i] || undefined; + else if (a === "--apply") args.apply = true; + else if (a === "--limit") args.limit = Number(argv[++i] || "1000"); + else if (a === "--rollback") args.rollbackFile = argv[++i] || ""; + } + + return args; +} + +async 
function loadAllEntries(store, scopeFilter, limit) { + const out = []; + let offset = 0; + const pageSize = 200; + while (out.length < limit) { + const page = await store.list(scopeFilter, undefined, Math.min(pageSize, limit - out.length), offset); + if (!page.length) break; + out.push(...page); + offset += page.length; + if (page.length < pageSize) break; + } + return out; +} + +async function run() { + const args = parseArgs(process.argv); + if (!args.dbPath) { + throw new Error("Missing --db-path (or MEMORY_DB_PATH)"); + } + + const store = new MemoryStore({ + dbPath: resolve(args.dbPath), + vectorDim: Number.isFinite(args.vectorDim) ? args.vectorDim : 1536, + }); + + const scopeFilter = args.scope ? [args.scope] : undefined; + + if (args.rollbackFile) { + const raw = readFileSync(resolve(args.rollbackFile), "utf8"); + const lines = raw.split(/\r?\n/).filter(Boolean); + let restored = 0; + for (const line of lines) { + const row = JSON.parse(line); + await store.update(row.id, { metadata: row.metadata }, scopeFilter); + restored++; + } + console.log(`Rollback complete. Restored ${restored} metadata entries.`); + return; + } + + const entries = await loadAllEntries(store, scopeFilter, args.limit); + const changed = []; + + for (const entry of entries) { + const normalized = buildSmartMetadata(entry, {}); + const next = stringifySmartMetadata(normalized); + const prev = typeof entry.metadata === "string" ? entry.metadata : "{}"; + if (next !== prev) { + changed.push({ id: entry.id, prev, next }); + } + } + + if (!args.apply) { + console.log(`Dry run complete. 
scanned=${entries.length} pending_updates=${changed.length}`); + return; + } + + const ts = new Date().toISOString().replace(/[:.]/g, "-"); + const backupPath = resolve(`governance-migration-backup-${ts}.jsonl`); + const backup = createWriteStream(backupPath, { flags: "wx" }); + + let applied = 0; + for (const row of changed) { + backup.write(`${JSON.stringify({ id: row.id, metadata: row.prev })}\n`); + await store.update(row.id, { metadata: row.next }, scopeFilter); + applied++; + } + backup.end(); + + console.log(`Migration complete. scanned=${entries.length} updated=${applied}`); + console.log(`Rollback file: ${backupPath}`); +} + +run().catch((err) => { + console.error(err instanceof Error ? err.message : String(err)); + process.exit(1); +}); diff --git a/src/admission-control.ts b/src/admission-control.ts new file mode 100644 index 00000000..eee44d35 --- /dev/null +++ b/src/admission-control.ts @@ -0,0 +1,748 @@ +import { join } from "node:path"; +import type { LlmClient } from "./llm-client.js"; +import type { CandidateMemory, MemoryCategory } from "./memory-categories.js"; +import type { MemorySearchResult, MemoryStore } from "./store.js"; +import { parseSmartMetadata } from "./smart-metadata.js"; + +export interface AdmissionWeights { + utility: number; + confidence: number; + novelty: number; + recency: number; + typePrior: number; +} + +export interface AdmissionTypePriors { + profile: number; + preferences: number; + entities: number; + events: number; + cases: number; + patterns: number; +} + +export interface AdmissionRecencyConfig { + halfLifeDays: number; +} + +export type AdmissionControlPreset = + | "balanced" + | "conservative" + | "high-recall"; + +export interface AdmissionControlConfig { + preset: AdmissionControlPreset; + enabled: boolean; + utilityMode: "standalone" | "off"; + weights: AdmissionWeights; + rejectThreshold: number; + admitThreshold: number; + noveltyCandidatePoolSize: number; + recency: AdmissionRecencyConfig; + typePriors: 
AdmissionTypePriors; + auditMetadata: boolean; + persistRejectedAudits: boolean; + rejectedAuditFilePath?: string; +} + +export interface AdmissionFeatureScores { + utility: number; + confidence: number; + novelty: number; + recency: number; + typePrior: number; +} + +export interface AdmissionAuditRecord { + version: "amac-v1"; + decision: "reject" | "pass_to_dedup"; + hint?: "add" | "update_or_merge"; + score: number; + reason: string; + utility_reason?: string; + thresholds: { + reject: number; + admit: number; + }; + weights: AdmissionWeights; + feature_scores: AdmissionFeatureScores; + matched_existing_memory_ids: string[]; + compared_existing_memory_ids: string[]; + max_similarity: number; + evaluated_at: number; +} + +export interface AdmissionEvaluation { + decision: "reject" | "pass_to_dedup"; + hint?: "add" | "update_or_merge"; + audit: AdmissionAuditRecord; +} + +export interface AdmissionRejectionAuditEntry { + version: "amac-v1"; + rejected_at: number; + session_key: string; + target_scope: string; + scope_filter: string[]; + candidate: CandidateMemory; + audit: AdmissionAuditRecord & { decision: "reject" }; + conversation_excerpt: string; +} + +export interface ConfidenceSupportBreakdown { + score: number; + bestSupport: number; + coverage: number; + unsupportedRatio: number; +} + +export interface NoveltyBreakdown { + score: number; + maxSimilarity: number; + matchedIds: string[]; + comparedIds: string[]; +} + +const DEFAULT_WEIGHTS: AdmissionWeights = { + utility: 0.1, + confidence: 0.1, + novelty: 0.1, + recency: 0.1, + typePrior: 0.6, +}; + +const DEFAULT_TYPE_PRIORS: AdmissionTypePriors = { + profile: 0.95, + preferences: 0.9, + entities: 0.75, + events: 0.45, + cases: 0.8, + patterns: 0.85, +}; + +function cloneAdmissionControlConfig(config: AdmissionControlConfig): AdmissionControlConfig { + return { + ...config, + recency: { ...config.recency }, + weights: { ...config.weights }, + typePriors: { ...config.typePriors }, + }; +} + +export const 
ADMISSION_CONTROL_PRESETS: Record<AdmissionControlPreset, AdmissionControlConfig> = {
+  balanced: {
+    preset: "balanced",
+    enabled: false,
+    utilityMode: "standalone",
+    weights: DEFAULT_WEIGHTS,
+    rejectThreshold: 0.45,
+    admitThreshold: 0.6,
+    noveltyCandidatePoolSize: 8,
+    recency: {
+      halfLifeDays: 14,
+    },
+    typePriors: DEFAULT_TYPE_PRIORS,
+    auditMetadata: true,
+    persistRejectedAudits: true,
+    rejectedAuditFilePath: undefined,
+  },
+  conservative: {
+    preset: "conservative",
+    enabled: false,
+    utilityMode: "standalone",
+    weights: {
+      utility: 0.16,
+      confidence: 0.16,
+      novelty: 0.18,
+      recency: 0.08,
+      typePrior: 0.42,
+    },
+    rejectThreshold: 0.52,
+    admitThreshold: 0.68,
+    noveltyCandidatePoolSize: 10,
+    recency: {
+      halfLifeDays: 10,
+    },
+    typePriors: {
+      profile: 0.98,
+      preferences: 0.94,
+      entities: 0.78,
+      events: 0.28,
+      cases: 0.78,
+      patterns: 0.8,
+    },
+    auditMetadata: true,
+    persistRejectedAudits: true,
+    rejectedAuditFilePath: undefined,
+  },
+  "high-recall": {
+    preset: "high-recall",
+    enabled: false,
+    utilityMode: "standalone",
+    weights: {
+      utility: 0.08,
+      confidence: 0.1,
+      novelty: 0.08,
+      recency: 0.14,
+      typePrior: 0.6,
+    },
+    rejectThreshold: 0.34,
+    admitThreshold: 0.52,
+    noveltyCandidatePoolSize: 6,
+    recency: {
+      halfLifeDays: 21,
+    },
+    typePriors: {
+      profile: 0.96,
+      preferences: 0.92,
+      entities: 0.8,
+      events: 0.58,
+      cases: 0.84,
+      patterns: 0.88,
+    },
+    auditMetadata: true,
+    persistRejectedAudits: true,
+    rejectedAuditFilePath: undefined,
+  },
+};
+
+export const DEFAULT_ADMISSION_CONTROL_CONFIG =
+  ADMISSION_CONTROL_PRESETS.balanced;
+
+function parseAdmissionControlPreset(raw: unknown): AdmissionControlPreset {
+  switch (raw) {
+    case "conservative":
+    case "high-recall":
+    case "balanced":
+      return raw;
+    default:
+      return "balanced";
+  }
+}
+
+function clamp01(value: unknown, fallback: number): number {
+  const n = typeof value === "number" ? 
value : Number(value); + if (!Number.isFinite(n)) return fallback; + return Math.min(1, Math.max(0, n)); +} + +function clampPositiveInt(value: unknown, fallback: number, max: number): number { + const n = typeof value === "number" ? value : Number(value); + if (!Number.isFinite(n) || n <= 0) return fallback; + return Math.min(max, Math.max(1, Math.floor(n))); +} + +function normalizeWeights(raw: unknown, defaults: AdmissionWeights): AdmissionWeights { + if (!raw || typeof raw !== "object") { + return { ...defaults }; + } + + const obj = raw as Record; + const candidate: AdmissionWeights = { + utility: clamp01(obj.utility, defaults.utility), + confidence: clamp01(obj.confidence, defaults.confidence), + novelty: clamp01(obj.novelty, defaults.novelty), + recency: clamp01(obj.recency, defaults.recency), + typePrior: clamp01(obj.typePrior, defaults.typePrior), + }; + + const total = + candidate.utility + + candidate.confidence + + candidate.novelty + + candidate.recency + + candidate.typePrior; + + if (total <= 0) { + return { ...defaults }; + } + + return { + utility: candidate.utility / total, + confidence: candidate.confidence / total, + novelty: candidate.novelty / total, + recency: candidate.recency / total, + typePrior: candidate.typePrior / total, + }; +} + +function normalizeTypePriors(raw: unknown, defaults: AdmissionTypePriors): AdmissionTypePriors { + if (!raw || typeof raw !== "object") { + return { ...defaults }; + } + + const obj = raw as Record; + return { + profile: clamp01(obj.profile, defaults.profile), + preferences: clamp01(obj.preferences, defaults.preferences), + entities: clamp01(obj.entities, defaults.entities), + events: clamp01(obj.events, defaults.events), + cases: clamp01(obj.cases, defaults.cases), + patterns: clamp01(obj.patterns, defaults.patterns), + }; +} + +export function normalizeAdmissionControlConfig(raw: unknown): AdmissionControlConfig { + if (!raw || typeof raw !== "object") { + return 
cloneAdmissionControlConfig(DEFAULT_ADMISSION_CONTROL_CONFIG); + } + + const obj = raw as Record; + const preset = parseAdmissionControlPreset(obj.preset); + const base = cloneAdmissionControlConfig(ADMISSION_CONTROL_PRESETS[preset]); + const rejectThreshold = clamp01(obj.rejectThreshold, base.rejectThreshold); + const admitThreshold = clamp01(obj.admitThreshold, base.admitThreshold); + const normalizedAdmit = Math.max(admitThreshold, rejectThreshold); + const recencyRaw = + typeof obj.recency === "object" && obj.recency !== null + ? (obj.recency as Record) + : {}; + + return { + preset, + enabled: obj.enabled === true, + utilityMode: + obj.utilityMode === "off" + ? "off" + : obj.utilityMode === "standalone" + ? "standalone" + : base.utilityMode, + weights: normalizeWeights(obj.weights, base.weights), + rejectThreshold, + admitThreshold: normalizedAdmit, + noveltyCandidatePoolSize: clampPositiveInt( + obj.noveltyCandidatePoolSize, + base.noveltyCandidatePoolSize, + 20, + ), + recency: { + halfLifeDays: clampPositiveInt( + recencyRaw.halfLifeDays, + base.recency.halfLifeDays, + 365, + ), + }, + typePriors: normalizeTypePriors(obj.typePriors, base.typePriors), + auditMetadata: + typeof obj.auditMetadata === "boolean" + ? obj.auditMetadata + : base.auditMetadata, + persistRejectedAudits: + typeof obj.persistRejectedAudits === "boolean" + ? obj.persistRejectedAudits + : base.persistRejectedAudits, + rejectedAuditFilePath: + typeof obj.rejectedAuditFilePath === "string" && + obj.rejectedAuditFilePath.trim().length > 0 + ? 
obj.rejectedAuditFilePath.trim()
+        : undefined,
+  };
+}
+
+export function resolveRejectedAuditFilePath(
+  dbPath: string,
+  config?: Pick<AdmissionControlConfig, "rejectedAuditFilePath"> | null,
+): string {
+  const explicitPath = config?.rejectedAuditFilePath;
+  if (typeof explicitPath === "string" && explicitPath.trim().length > 0) {
+    return explicitPath.trim();
+  }
+  return join(dbPath, "..", "admission-audit", "rejections.jsonl");
+}
+
+function isHanChar(char: string): boolean {
+  return /\p{Script=Han}/u.test(char);
+}
+
+function isWordChar(char: string): boolean {
+  return /[\p{Letter}\p{Number}]/u.test(char);
+}
+
+function tokenizeText(value: string): string[] {
+  const normalized = value.toLowerCase().trim();
+  const tokens: string[] = [];
+  let current = "";
+
+  for (const char of normalized) {
+    if (isHanChar(char)) {
+      if (current) {
+        tokens.push(current);
+        current = "";
+      }
+      tokens.push(char);
+      continue;
+    }
+
+    if (isWordChar(char)) {
+      current += char;
+      continue;
+    }
+
+    if (current) {
+      tokens.push(current);
+      current = "";
+    }
+  }
+
+  if (current) {
+    tokens.push(current);
+  }
+
+  return tokens;
+}
+
+function lcsLength(left: string[], right: string[]): number {
+  if (left.length === 0 || right.length === 0) return 0;
+  const dp = Array.from({ length: left.length + 1 }, () =>
+    Array(right.length + 1).fill(0),
+  );
+
+  for (let i = 1; i <= left.length; i++) {
+    for (let j = 1; j <= right.length; j++) {
+      if (left[i - 1] === right[j - 1]) {
+        dp[i][j] = dp[i - 1][j - 1] + 1;
+      } else {
+        dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
+      }
+    }
+  }
+
+  return dp[left.length][right.length];
+}
+
+function rougeLikeF1(left: string[], right: string[]): number {
+  if (left.length === 0 || right.length === 0) return 0;
+  const lcs = lcsLength(left, right);
+  if (lcs === 0) return 0;
+  const precision = lcs / left.length;
+  const recall = lcs / right.length;
+  if (precision + recall === 0) return 0;
+  return (2 * precision * recall) / (precision + recall);
+}
+
+function 
splitSupportSpans(conversationText: string): string[] { + const spans = new Set(); + for (const line of conversationText.split(/\n+/)) { + const trimmed = line.trim(); + if (!trimmed) continue; + spans.add(trimmed); + for (const sentence of trimmed.split(/[。!?!?]+/)) { + const candidate = sentence.trim(); + if (candidate.length >= 4) { + spans.add(candidate); + } + } + } + return Array.from(spans); +} + +function cosineSimilarity(left: number[], right: number[]): number { + if (!Array.isArray(left) || !Array.isArray(right) || left.length === 0 || right.length === 0) { + return 0; + } + + const size = Math.min(left.length, right.length); + let dot = 0; + let leftNorm = 0; + let rightNorm = 0; + + for (let i = 0; i < size; i++) { + const l = Number(left[i]) || 0; + const r = Number(right[i]) || 0; + dot += l * r; + leftNorm += l * l; + rightNorm += r * r; + } + + if (leftNorm === 0 || rightNorm === 0) return 0; + return dot / (Math.sqrt(leftNorm) * Math.sqrt(rightNorm)); +} + +function buildUtilityPrompt(candidate: CandidateMemory, conversationText: string): string { + const excerpt = + conversationText.length > 3000 + ? conversationText.slice(-3000) + : conversationText; + + return `Evaluate whether this candidate memory is worth keeping for future cross-session interactions. + +Conversation excerpt: +${excerpt} + +Candidate memory: +- Category: ${candidate.category} +- Abstract: ${candidate.abstract} +- Overview: ${candidate.overview} +- Content: ${candidate.content} + +Score future usefulness on a 0.0-1.0 scale. + +Use higher scores for durable preferences, profile facts, reusable procedures, and long-lived project/entity state. +Use lower scores for one-off chatter, low-signal situational remarks, thin restatements, and low-value transient details. 
+ +Return JSON only: +{ + "utility": 0.0, + "reason": "short explanation" +}`; +} + +function buildReason(details: { + decision: "reject" | "pass_to_dedup"; + hint?: "add" | "update_or_merge"; + score: number; + rejectThreshold: number; + maxSimilarity: number; + utilityReason?: string; +}): string { + const scoreText = details.score.toFixed(3); + const similarityText = details.maxSimilarity.toFixed(3); + const utilityText = details.utilityReason ? ` Utility: ${details.utilityReason}` : ""; + if (details.decision === "reject") { + return `Admission rejected (${scoreText} < ${details.rejectThreshold.toFixed(3)}). maxSimilarity=${similarityText}.${utilityText}`.trim(); + } + const hintText = details.hint ? ` hint=${details.hint};` : ""; + return `Admission passed (${scoreText});${hintText} maxSimilarity=${similarityText}.${utilityText}`.trim(); +} + +export function scoreTypePrior( + category: MemoryCategory, + typePriors: AdmissionTypePriors, +): number { + return clamp01(typePriors[category], DEFAULT_TYPE_PRIORS[category]); +} + +export function scoreConfidenceSupport( + candidate: CandidateMemory, + conversationText: string, +): ConfidenceSupportBreakdown { + const candidateText = `${candidate.abstract}\n${candidate.content}`.trim(); + const candidateTokens = tokenizeText(candidateText); + if (candidateTokens.length === 0) { + return { score: 0, bestSupport: 0, coverage: 0, unsupportedRatio: 1 }; + } + + const spans = splitSupportSpans(conversationText); + const conversationTokens = new Set(tokenizeText(conversationText)); + let bestSupport = 0; + + for (const span of spans) { + const spanTokens = tokenizeText(span); + bestSupport = Math.max(bestSupport, rougeLikeF1(candidateTokens, spanTokens)); + } + + const uniqueCandidateTokens = Array.from(new Set(candidateTokens)); + const supportedTokenCount = uniqueCandidateTokens.filter((token) => conversationTokens.has(token)).length; + const coverage = uniqueCandidateTokens.length > 0 ? 
supportedTokenCount / uniqueCandidateTokens.length : 0; + const unsupportedRatio = uniqueCandidateTokens.length > 0 ? 1 - coverage : 1; + const score = clamp01((bestSupport * 0.7) + (coverage * 0.3) - (unsupportedRatio * 0.25), 0); + + return { score, bestSupport, coverage, unsupportedRatio }; +} + +export function scoreNoveltyFromMatches( + candidateVector: number[], + matches: MemorySearchResult[], +): NoveltyBreakdown { + if (!Array.isArray(candidateVector) || candidateVector.length === 0 || matches.length === 0) { + return { score: 1, maxSimilarity: 0, matchedIds: [], comparedIds: [] }; + } + + let maxSimilarity = 0; + const comparedIds: string[] = []; + const matchedIds: string[] = []; + + for (const match of matches) { + comparedIds.push(match.entry.id); + const similarity = Math.max(0, cosineSimilarity(candidateVector, match.entry.vector)); + if (similarity > maxSimilarity) { + maxSimilarity = similarity; + } + if (similarity >= 0.55) { + matchedIds.push(match.entry.id); + } + } + + return { + score: clamp01(1 - maxSimilarity, 1), + maxSimilarity, + matchedIds, + comparedIds, + }; +} + +export function scoreRecencyGap( + now: number, + matches: MemorySearchResult[], + halfLifeDays: number, +): number { + if (matches.length === 0 || halfLifeDays <= 0) { + return 1; + } + + const latestTimestamp = Math.max( + ...matches.map((match) => (Number.isFinite(match.entry.timestamp) ? 
match.entry.timestamp : 0)), + ); + if (!Number.isFinite(latestTimestamp) || latestTimestamp <= 0) { + return 1; + } + + const gapMs = Math.max(0, now - latestTimestamp); + const gapDays = gapMs / 86_400_000; + if (gapDays === 0) { + return 0; + } + + const lambda = Math.LN2 / halfLifeDays; + return clamp01(1 - Math.exp(-lambda * gapDays), 1); +} + +async function scoreUtility( + llm: LlmClient, + mode: AdmissionControlConfig["utilityMode"], + candidate: CandidateMemory, + conversationText: string, +): Promise<{ score: number; reason?: string }> { + if (mode === "off") { + return { score: 0.5, reason: "Utility scoring disabled" }; + } + + let response: { utility?: number; reason?: string } | null = null; + try { + response = await llm.completeJson<{ utility?: number; reason?: string }>( + buildUtilityPrompt(candidate, conversationText), + "admission-utility", + ); + } catch { + return { score: 0.5, reason: "Utility scoring failed" }; + } + + if (!response) { + return { score: 0.5, reason: "Utility scoring unavailable" }; + } + + return { + score: clamp01(response.utility, 0.5), + reason: typeof response.reason === "string" ? 
response.reason.trim() : undefined,
+  };
+}
+
+export class AdmissionController {
+  constructor(
+    private readonly store: MemoryStore,
+    private readonly llm: LlmClient,
+    private readonly config: AdmissionControlConfig,
+    private readonly debugLog: (msg: string) => void = () => {},
+  ) {}
+
+  private async loadRelevantMatches(
+    candidate: CandidateMemory,
+    candidateVector: number[],
+    scopeFilter: string[],
+  ): Promise<MemorySearchResult[]> {
+    if (!Array.isArray(candidateVector) || candidateVector.length === 0) {
+      return [];
+    }
+
+    const rawMatches = await this.store.vectorSearch(
+      candidateVector,
+      this.config.noveltyCandidatePoolSize,
+      0,
+      scopeFilter,
+    );
+
+    if (rawMatches.length === 0) {
+      return [];
+    }
+
+    const sameCategoryMatches = rawMatches.filter((match) => {
+      const metadata = parseSmartMetadata(match.entry.metadata, match.entry);
+      return metadata.memory_category === candidate.category;
+    });
+
+    return sameCategoryMatches.length > 0 ? sameCategoryMatches : rawMatches;
+  }
+
+  async evaluate(params: {
+    candidate: CandidateMemory;
+    candidateVector: number[];
+    conversationText: string;
+    scopeFilter: string[];
+    now?: number;
+  }): Promise<AdmissionEvaluation> {
+    const now = params.now ?? 
Date.now(); + const relevantMatches = await this.loadRelevantMatches( + params.candidate, + params.candidateVector, + params.scopeFilter, + ); + + const utility = await scoreUtility( + this.llm, + this.config.utilityMode, + params.candidate, + params.conversationText, + ); + const confidence = scoreConfidenceSupport(params.candidate, params.conversationText); + const novelty = scoreNoveltyFromMatches(params.candidateVector, relevantMatches); + const recency = scoreRecencyGap(now, relevantMatches, this.config.recency.halfLifeDays); + const typePrior = scoreTypePrior(params.candidate.category, this.config.typePriors); + + const featureScores: AdmissionFeatureScores = { + utility: utility.score, + confidence: confidence.score, + novelty: novelty.score, + recency, + typePrior, + }; + + const score = + (featureScores.utility * this.config.weights.utility) + + (featureScores.confidence * this.config.weights.confidence) + + (featureScores.novelty * this.config.weights.novelty) + + (featureScores.recency * this.config.weights.recency) + + (featureScores.typePrior * this.config.weights.typePrior); + + const decision = score < this.config.rejectThreshold ? "reject" : "pass_to_dedup"; + const hint = + decision === "reject" + ? undefined + : score >= this.config.admitThreshold && novelty.maxSimilarity < 0.55 + ? 
"add" + : "update_or_merge"; + + const reason = buildReason({ + decision, + hint, + score, + rejectThreshold: this.config.rejectThreshold, + maxSimilarity: novelty.maxSimilarity, + utilityReason: utility.reason, + }); + + const audit: AdmissionAuditRecord = { + version: "amac-v1", + decision, + hint, + score, + reason, + utility_reason: utility.reason, + thresholds: { + reject: this.config.rejectThreshold, + admit: this.config.admitThreshold, + }, + weights: this.config.weights, + feature_scores: featureScores, + matched_existing_memory_ids: novelty.matchedIds, + compared_existing_memory_ids: novelty.comparedIds, + max_similarity: novelty.maxSimilarity, + evaluated_at: now, + }; + + this.debugLog( + `memory-lancedb-pro: admission-control: decision=${audit.decision} hint=${audit.hint ?? "n/a"} score=${audit.score.toFixed(3)} candidate=${JSON.stringify(params.candidate.abstract.slice(0, 80))}`, + ); + + return { decision, hint, audit }; + } +} diff --git a/src/admission-stats.ts b/src/admission-stats.ts new file mode 100644 index 00000000..5dd60d8c --- /dev/null +++ b/src/admission-stats.ts @@ -0,0 +1,332 @@ +import { readFile } from "node:fs/promises"; +import type { AdmissionControlConfig, AdmissionRejectionAuditEntry } from "./admission-control.js"; +import { resolveRejectedAuditFilePath } from "./admission-control.js"; +import { parseSmartMetadata } from "./smart-metadata.js"; + +const DEFAULT_TOP_REJECTION_REASONS = 5; +const ADMISSION_WINDOWS = [ + { key: "last24h", durationMs: 24 * 60 * 60 * 1000 }, + { key: "last7d", durationMs: 7 * 24 * 60 * 60 * 1000 }, +] as const; + +export interface AdmissionAuditedMemoryLike { + metadata?: string; + timestamp?: number; + category?: string; + text?: string; + importance?: number; +} + +export interface AdmissionStatsStoreLike { + dbPath: string; + list?: ( + scopeFilter?: string[], + category?: string, + limit?: number, + offset?: number, + ) => Promise; +} + +export interface AdmissionCategoryBreakdown { + 
admittedCount: number | null;
+  rejectedCount: number;
+  totalObserved: number | null;
+  rejectRate: number | null;
+}
+
+export interface AdmissionWindowBreakdown {
+  admittedCount: number | null;
+  rejectedCount: number;
+  totalObserved: number | null;
+  rejectRate: number | null;
+}
+
+export interface AdmissionRejectionReasonCount {
+  label: string;
+  count: number;
+}
+
+export interface AdmissionRejectionSummary {
+  total: number;
+  latestRejectedAt: number | null;
+  byCategory: Record<string, number>;
+  byScope: Record<string, number>;
+  topReasons: AdmissionRejectionReasonCount[];
+}
+
+export interface AdmissionStatsSummary {
+  enabled: boolean;
+  auditMetadataEnabled: boolean;
+  rejectedAuditFilePath: string;
+  rejectedCount: number;
+  admittedCount: number | null;
+  totalObserved: number | null;
+  rejectRate: number | null;
+  latestRejectedAt: number | null;
+  rejectedByCategory: Record<string, number>;
+  rejectedByScope: Record<string, number>;
+  categoryBreakdown: Record<string, AdmissionCategoryBreakdown>;
+  topReasons: AdmissionRejectionReasonCount[];
+  windows: Record<string, AdmissionWindowBreakdown>;
+  observedAuditedMemories: number;
+}
+
+export async function readAdmissionRejectionAudits(
+  filePath: string,
+): Promise<AdmissionRejectionAuditEntry[]> {
+  try {
+    const raw = await readFile(filePath, "utf8");
+    const entries: AdmissionRejectionAuditEntry[] = [];
+    for (const rawLine of raw.split(/\r?\n/)) {
+      const line = rawLine.trim();
+      if (!line) continue;
+      try {
+        entries.push(JSON.parse(line) as AdmissionRejectionAuditEntry);
+      } catch {
+        // Skip corrupt JSONL lines (truncated writes, disk errors, etc.)
+ } + } + return entries; + } catch (error) { + const err = error as NodeJS.ErrnoException; + if (err?.code === "ENOENT") { + return []; + } + throw error; + } +} + +export function normalizeReasonKey(reason: string): string { + return reason + .toLowerCase() + .replace(/\d+(?:\.\d+)?/g, "#") + .replace(/\s+/g, " ") + .trim(); +} + +export function extractAdmissionReasonLabel(entry: AdmissionRejectionAuditEntry): string { + const utilityReason = entry.audit.utility_reason?.trim(); + if (utilityReason) { + return utilityReason; + } + return entry.audit.reason.trim(); +} + +export function summarizeAdmissionRejections( + entries: AdmissionRejectionAuditEntry[], +): AdmissionRejectionSummary { + const byCategory: Record = {}; + const byScope: Record = {}; + const reasonCounts = new Map(); + + for (const entry of entries) { + byCategory[entry.candidate.category] = (byCategory[entry.candidate.category] ?? 0) + 1; + byScope[entry.target_scope] = (byScope[entry.target_scope] ?? 0) + 1; + const label = extractAdmissionReasonLabel(entry); + const key = normalizeReasonKey(label); + const current = reasonCounts.get(key); + if (current) { + current.count += 1; + } else { + reasonCounts.set(key, { label, count: 1 }); + } + } + + const latestRejectedAt = entries.length > 0 + ? Math.max(...entries.map((entry) => entry.rejected_at)) + : null; + const topReasons = Array.from(reasonCounts.values()) + .sort((left, right) => right.count - left.count || left.label.localeCompare(right.label)) + .slice(0, DEFAULT_TOP_REJECTION_REASONS); + + return { + total: entries.length, + latestRejectedAt, + byCategory, + byScope, + topReasons, + }; +} + +export function getAdmissionAuditDecision( + entry: { metadata?: string }, +): "pass_to_dedup" | "reject" | null { + try { + const parsed = JSON.parse(entry.metadata || "{}") as Record; + const audit = parsed.admission_control as Record | undefined; + const decision = audit?.decision; + return decision === "pass_to_dedup" || decision === "reject" ? 
decision : null; + } catch { + return null; + } +} + +export function getAdmittedDecisionTimestamp( + entry: { metadata?: string; timestamp?: number }, +): number | null { + try { + const parsed = JSON.parse(entry.metadata || "{}") as Record; + const audit = parsed.admission_control as Record | undefined; + const evaluatedAt = Number(audit?.evaluated_at); + if (Number.isFinite(evaluatedAt) && evaluatedAt > 0) { + return evaluatedAt; + } + } catch { + // ignore + } + + const timestamp = Number(entry.timestamp); + if (Number.isFinite(timestamp) && timestamp > 0) { + return timestamp; + } + return null; +} + +export function getObservedAdmissionCategory( + entry: AdmissionAuditedMemoryLike, +): string { + return parseSmartMetadata(entry.metadata, entry).memory_category || entry.category || "patterns"; +} + +export function buildAdmissionCategoryBreakdown( + admittedCategories: string[] | null, + rejectedEntries: AdmissionRejectionAuditEntry[], +): Record { + const admittedCounts: Record | null = admittedCategories ? {} : null; + const rejectedCounts: Record = {}; + + if (admittedCategories) { + for (const category of admittedCategories) { + admittedCounts[category] = (admittedCounts[category] ?? 0) + 1; + } + } + + for (const entry of rejectedEntries) { + const category = entry.candidate.category; + rejectedCounts[category] = (rejectedCounts[category] ?? 0) + 1; + } + + const categories = Array.from( + new Set([ + ...Object.keys(rejectedCounts), + ...(admittedCounts ? Object.keys(admittedCounts) : []), + ]), + ).sort((left, right) => left.localeCompare(right)); + + const breakdown: Record = {}; + for (const category of categories) { + const admittedCount = admittedCounts ? (admittedCounts[category] ?? 0) : null; + const rejectedCount = rejectedCounts[category] ?? 0; + const totalObserved = admittedCount !== null ? admittedCount + rejectedCount : null; + const rejectRate = + totalObserved && totalObserved > 0 ? 
rejectedCount / totalObserved : null; + + breakdown[category] = { + admittedCount, + rejectedCount, + totalObserved, + rejectRate, + }; + } + + return breakdown; +} + +export function buildAdmissionWindowSummary( + admittedTimestamps: number[] | null, + rejectedEntries: AdmissionRejectionAuditEntry[], + now = Date.now(), +): Record { + const windows: Record = {}; + + for (const windowDef of ADMISSION_WINDOWS) { + const since = now - windowDef.durationMs; + const rejectedCount = rejectedEntries.filter((entry) => entry.rejected_at >= since).length; + const admittedCount = admittedTimestamps + ? admittedTimestamps.filter((ts) => ts >= since).length + : null; + const totalObserved = admittedCount !== null ? admittedCount + rejectedCount : null; + const rejectRate = + totalObserved && totalObserved > 0 ? rejectedCount / totalObserved : null; + + windows[windowDef.key] = { + admittedCount, + rejectedCount, + totalObserved, + rejectRate, + }; + } + + return windows; +} + +export async function buildAdmissionStats(params: { + store: AdmissionStatsStoreLike; + admissionControl?: AdmissionControlConfig; + scopeFilter?: string[]; + memoryTotalCount: number; +}): Promise { + const rejectionFilePath = resolveRejectedAuditFilePath( + params.store.dbPath, + params.admissionControl, + ); + let rejectionEntries = await readAdmissionRejectionAudits(rejectionFilePath); + if (params.scopeFilter && params.scopeFilter.length > 0) { + const scopeSet = new Set(params.scopeFilter); + rejectionEntries = rejectionEntries.filter((entry) => scopeSet.has(entry.target_scope)); + } + + const rejectionSummary = summarizeAdmissionRejections(rejectionEntries); + const auditMetadataEnabled = params.admissionControl?.auditMetadata !== false; + let admittedCount: number | null = null; + let admittedTimestamps: number[] | null = null; + let admittedCategories: string[] | null = null; + let observedAuditedMemories = 0; + + if (auditMetadataEnabled && typeof params.store.list === "function") { + const 
memories = await params.store.list( + params.scopeFilter, + undefined, + Math.max(params.memoryTotalCount, 1), + 0, + ); + admittedCount = 0; + admittedTimestamps = []; + admittedCategories = []; + for (const memory of memories) { + const decision = getAdmissionAuditDecision(memory); + if (decision === "pass_to_dedup") { + admittedCount += 1; + observedAuditedMemories += 1; + admittedCategories.push(getObservedAdmissionCategory(memory)); + const admittedAt = getAdmittedDecisionTimestamp(memory); + if (admittedAt !== null) { + admittedTimestamps.push(admittedAt); + } + } else if (decision === "reject") { + observedAuditedMemories += 1; + } + } + } + + const totalObserved = admittedCount !== null ? admittedCount + rejectionSummary.total : null; + const rejectRate = + totalObserved && totalObserved > 0 ? rejectionSummary.total / totalObserved : null; + + return { + enabled: params.admissionControl?.enabled === true, + auditMetadataEnabled, + rejectedAuditFilePath: rejectionFilePath, + rejectedCount: rejectionSummary.total, + admittedCount, + totalObserved, + rejectRate, + latestRejectedAt: rejectionSummary.latestRejectedAt, + rejectedByCategory: rejectionSummary.byCategory, + rejectedByScope: rejectionSummary.byScope, + categoryBreakdown: buildAdmissionCategoryBreakdown(admittedCategories, rejectionEntries), + topReasons: rejectionSummary.topReasons, + windows: buildAdmissionWindowSummary(admittedTimestamps, rejectionEntries), + observedAuditedMemories, + }; +} diff --git a/src/auto-capture-cleanup.ts b/src/auto-capture-cleanup.ts new file mode 100644 index 00000000..b677ed2d --- /dev/null +++ b/src/auto-capture-cleanup.ts @@ -0,0 +1,94 @@ +const AUTO_CAPTURE_INBOUND_META_SENTINELS = [ + "Conversation info (untrusted metadata):", + "Sender (untrusted metadata):", + "Thread starter (untrusted, for context):", + "Replied message (untrusted, for context):", + "Forwarded message context (untrusted metadata):", + "Chat history since last reply (untrusted, for context):", 
+] as const; + +const AUTO_CAPTURE_SESSION_RESET_PREFIX = + "A new session was started via /new or /reset. Execute your Session Startup sequence now"; +const AUTO_CAPTURE_ADDRESSING_PREFIX_RE = /^(?:<@!?[0-9]+>|@[A-Za-z0-9_.-]+)\s*/; +const AUTO_CAPTURE_SYSTEM_EVENT_LINE_RE = /^System:\s*\[[^\n]*?\]\s*Exec\s+(?:completed|failed|started)\b.*$/gim; + +function escapeRegExp(value: string): string { + return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); +} + +const AUTO_CAPTURE_INBOUND_META_BLOCK_RE = new RegExp( + String.raw`(?:^|\n)\s*(?:${AUTO_CAPTURE_INBOUND_META_SENTINELS.map((sentinel) => escapeRegExp(sentinel)).join("|")})\s*\n\`\`\`json[\s\S]*?\n\`\`\`\s*`, + "g", +); + +function stripLeadingInboundMetadata(text: string): string { + if (!text) { + return text; + } + + let normalized = text; + for (let i = 0; i < 6; i++) { + const before = normalized; + normalized = normalized.replace(AUTO_CAPTURE_SYSTEM_EVENT_LINE_RE, "\n"); + normalized = normalized.replace(AUTO_CAPTURE_INBOUND_META_BLOCK_RE, "\n"); + normalized = normalized.replace(/\n{3,}/g, "\n\n").trim(); + if (normalized === before.trim()) { + break; + } + } + + return normalized.trim(); +} + +function stripAutoCaptureSessionResetPrefix(text: string): string { + const trimmed = text.trim(); + if (!trimmed.startsWith(AUTO_CAPTURE_SESSION_RESET_PREFIX)) { + return trimmed; + } + + const blankLineIndex = trimmed.indexOf("\n\n"); + if (blankLineIndex >= 0) { + return trimmed.slice(blankLineIndex + 2).trim(); + } + + const lines = trimmed.split("\n"); + if (lines.length <= 2) { + return ""; + } + return lines.slice(2).join("\n").trim(); +} + +function stripAutoCaptureAddressingPrefix(text: string): string { + return text.replace(AUTO_CAPTURE_ADDRESSING_PREFIX_RE, "").trim(); +} + +export function stripAutoCaptureInjectedPrefix(role: string, text: string): string { + if (role !== "user") { + return text.trim(); + } + + let normalized = text.trim(); + normalized = 
normalized.replace(/\s*<relevant-memories>[\s\S]*?<\/relevant-memories>\s*/gi, "");
+  normalized = normalized.replace(
+    /\[UNTRUSTED DATA[^\n]*\][\s\S]*?\[END UNTRUSTED DATA\]\s*/gi,
+    "",
+  );
+  normalized = stripAutoCaptureSessionResetPrefix(normalized);
+  normalized = stripLeadingInboundMetadata(normalized);
+  normalized = stripAutoCaptureAddressingPrefix(normalized);
+  normalized = stripLeadingInboundMetadata(normalized);
+  normalized = normalized.replace(/\n{3,}/g, "\n\n");
+  return normalized.trim();
+}
+
+export function normalizeAutoCaptureText(
+  role: unknown,
+  text: string,
+  shouldSkipMessage?: (role: string, text: string) => boolean,
+): string | null {
+  if (typeof role !== "string") return null;
+  const normalized = stripAutoCaptureInjectedPrefix(role, text);
+  if (!normalized) return null;
+  if (shouldSkipMessage?.(role, normalized)) return null;
+  return normalized;
+}
diff --git a/src/batch-dedup.ts b/src/batch-dedup.ts
new file mode 100644
index 00000000..0cbd339f
--- /dev/null
+++ b/src/batch-dedup.ts
@@ -0,0 +1,146 @@
+/**
+ * Batch-Internal Dedup — Cosine similarity dedup within extraction batches
+ *
+ * Before running expensive per-candidate LLM dedup calls, this module
+ * checks all candidates against each other using cosine similarity
+ * on their embedded abstracts. Candidates with similarity > threshold
+ * are marked as batch duplicates and skipped.
+ *
+ * For n <= 5 candidates, O(n^2) pairwise comparison is trivial.
+ */ + +// ============================================================================ +// Types +// ============================================================================ + +export interface BatchDedupCandidate { + /** Unique index within the batch */ + index: number; + /** L0 abstract text used for embedding */ + abstract: string; + /** Embedded vector of the abstract */ + vector?: number[]; + /** Whether this candidate was marked as a batch duplicate */ + isBatchDuplicate: boolean; + /** If duplicate, index of the surviving candidate it duplicates */ + duplicateOf?: number; +} + +export interface BatchDedupResult { + /** Indices of candidates that survived (not duplicates) */ + survivingIndices: number[]; + /** Indices of candidates marked as batch duplicates */ + duplicateIndices: number[]; + /** Number of candidates before dedup */ + inputCount: number; + /** Number of candidates after dedup */ + outputCount: number; +} + +export interface ExtractionCostStats { + /** Candidates dropped by batch dedup */ + batchDeduped: number; + /** Total extraction wall time in ms */ + durationMs: number; + /** Count of LLM invocations */ + llmCalls: number; +} + +// ============================================================================ +// Cosine Similarity +// ============================================================================ + +function cosineSimilarity(a: number[], b: number[]): number { + if (a.length !== b.length || a.length === 0) return 0; + + let dotProduct = 0; + let normA = 0; + let normB = 0; + + for (let i = 0; i < a.length; i++) { + dotProduct += a[i] * b[i]; + normA += a[i] * a[i]; + normB += b[i] * b[i]; + } + + const norm = Math.sqrt(normA) * Math.sqrt(normB); + return norm === 0 ? 
+    0 : dotProduct / norm;
+}
+
+// ============================================================================
+// Batch Dedup
+// ============================================================================
+
+/**
+ * Perform batch-internal cosine dedup on candidate abstracts.
+ *
+ * @param abstracts - Array of L0 abstract strings from extracted candidates
+ * @param vectors - Parallel array of embedded vectors for each abstract
+ * @param threshold - Cosine similarity threshold above which candidates are considered duplicates (default: 0.85)
+ * @returns BatchDedupResult with surviving and duplicate indices
+ */
+export function batchDedup(
+  abstracts: string[],
+  vectors: number[][],
+  threshold = 0.85,
+): BatchDedupResult {
+  const n = abstracts.length;
+  if (n <= 1) {
+    return {
+      survivingIndices: n === 1 ? [0] : [],
+      duplicateIndices: [],
+      inputCount: n,
+      outputCount: n,
+    };
+  }
+
+  // Track which candidates are duplicates
+  const isDuplicate = new Array(n).fill(false);
+  const duplicateOf = new Array(n).fill(undefined);
+
+  // Pairwise comparison: O(n^2) but n <= 5 typically
+  for (let i = 0; i < n; i++) {
+    if (isDuplicate[i]) continue;
+    for (let j = i + 1; j < n; j++) {
+      if (isDuplicate[j]) continue;
+      if (!vectors[i] || !vectors[j]) continue;
+      if (vectors[i].length === 0 || vectors[j].length === 0) continue;
+
+      const sim = cosineSimilarity(vectors[i], vectors[j]);
+      if (sim > threshold) {
+        // Mark the later candidate as duplicate of the earlier one
+        isDuplicate[j] = true;
+        duplicateOf[j] = i;
+      }
+    }
+  }
+
+  const survivingIndices: number[] = [];
+  const duplicateIndices: number[] = [];
+
+  for (let i = 0; i < n; i++) {
+    if (isDuplicate[i]) {
+      duplicateIndices.push(i);
+    } else {
+      survivingIndices.push(i);
+    }
+  }
+
+  return {
+    survivingIndices,
+    duplicateIndices,
+    inputCount: n,
+    outputCount: survivingIndices.length,
+  };
+}
+
+/**
+ * Create a fresh ExtractionCostStats tracker.
+ */
+export function createExtractionCostStats(): ExtractionCostStats {
+  return {
+    batchDeduped: 0,
+    durationMs: 0,
+    llmCalls: 0,
+  };
+}
diff --git a/src/chunker.ts b/src/chunker.ts
index d1581237..8bb4dee6 100644
--- a/src/chunker.ts
+++ b/src/chunker.ts
@@ -162,6 +162,32 @@ function sliceTrimWithIndices(text: string, start: number, end: number): { chunk
   };
 }
 
+// ============================================================================
+// CJK Detection
+// ============================================================================
+
+// CJK Unicode ranges: Unified Ideographs, Extension A, Compatibility,
+// Hangul Syllables, Katakana, Hiragana
+const CJK_RE =
+  /[\u3040-\u309F\u30A0-\u30FF\u3400-\u4DBF\u4E00-\u9FFF\uAC00-\uD7AF\uF900-\uFAFF]/;
+
+/** Ratio of CJK characters to total non-whitespace characters. */
+function getCjkRatio(text: string): number {
+  let cjk = 0;
+  let total = 0;
+  for (const ch of text) {
+    if (/\s/.test(ch)) continue;
+    total++;
+    if (CJK_RE.test(ch)) cjk++;
+  }
+  return total === 0 ? 0 : cjk / total;
+}
+
+// CJK chars are ~2-3 tokens each. When text is predominantly CJK, we divide
+// char limits by this factor to stay within the model's token budget.
+const CJK_CHAR_TOKEN_DIVISOR = 2.5;
+const CJK_RATIO_THRESHOLD = 0.3;
+
 // ============================================================================
 // Chunking Core
 // ============================================================================
@@ -239,10 +265,15 @@ export function smartChunk(text: string, embedderModel?: string): ChunkResult {
   const limit = embedderModel ? EMBEDDING_CONTEXT_LIMITS[embedderModel] : undefined;
   const base = limit ?? 8192;
 
+  // CJK characters consume ~2-3 tokens each, so a char-based limit that works
+  // for Latin text will vastly overshoot the token budget for CJK-heavy text.
+  const cjkHeavy = getCjkRatio(text) > CJK_RATIO_THRESHOLD;
+  const divisor = cjkHeavy ?
+    CJK_CHAR_TOKEN_DIVISOR : 1;
+
   const config: ChunkerConfig = {
-    maxChunkSize: Math.max(1000, Math.floor(base * 0.7)),
-    overlapSize: Math.max(0, Math.floor(base * 0.05)),
-    minChunkSize: Math.max(100, Math.floor(base * 0.1)),
+    maxChunkSize: Math.max(200, Math.floor(base * 0.7 / divisor)),
+    overlapSize: Math.max(0, Math.floor(base * 0.05 / divisor)),
+    minChunkSize: Math.max(100, Math.floor(base * 0.1 / divisor)),
     semanticSplit: true,
     maxLinesPerChunk: 50,
   };
diff --git a/src/clawteam-scope.ts b/src/clawteam-scope.ts
new file mode 100644
index 00000000..fa74ffb8
--- /dev/null
+++ b/src/clawteam-scope.ts
@@ -0,0 +1,63 @@
+/**
+ * ClawTeam Shared Memory Scope Integration
+ *
+ * Provides env-var-driven scope extension for ClawTeam multi-agent setups.
+ * When CLAWTEAM_MEMORY_SCOPE is set, agents gain access to the specified
+ * team scopes in addition to their own default scopes.
+ *
+ * Note: this extends `getAccessibleScopes()`, which MemoryScopeManager's
+ * `isAccessible()` and `getScopeFilter()` both delegate to. So the extra
+ * scopes affect both read and write access checks. The default *write target*
+ * (getDefaultScope) is NOT changed — agents still write to their own scope
+ * unless they explicitly specify a team scope.
+ */
+
+import type { ScopeDefinition } from "./scopes.js";
+import type { MemoryScopeManager } from "./scopes.js";
+
+/**
+ * Parse the CLAWTEAM_MEMORY_SCOPE env var value into a list of scope names.
+ * Supports comma-separated values, trims whitespace, and filters empty strings.
+ */
+export function parseClawteamScopes(envValue: string | undefined): string[] {
+  if (!envValue) return [];
+  return envValue.split(",").map(s => s.trim()).filter(Boolean);
+}
+
+/**
+ * Register ClawTeam scopes and extend the scope manager's accessible scopes.
+ *
+ * 1. Registers scope definitions for any scopes not already defined.
+ * 2. Wraps `getAccessibleScopes()` to include the extra scopes for all agents.
+ *
+ * Designed for MemoryScopeManager specifically, where `isAccessible()` and
+ * `getScopeFilter()` delegate to `getAccessibleScopes()`. Custom ScopeManager
+ * implementations may need additional patching.
+ */
+export function applyClawteamScopes(
+  scopeManager: MemoryScopeManager,
+  scopes: string[],
+): void {
+  if (scopes.length === 0) return;
+
+  // Register scope definitions for unknown scopes
+  for (const scope of scopes) {
+    if (!scopeManager.getScopeDefinition(scope)) {
+      scopeManager.addScopeDefinition(scope, {
+        description: `ClawTeam shared scope: ${scope}`,
+      });
+    }
+  }
+
+  // Wrap getAccessibleScopes to include extra scopes
+  // Copy the base array to avoid mutating the manager's internal state
+  const originalGetAccessibleScopes = scopeManager.getAccessibleScopes.bind(scopeManager);
+  scopeManager.getAccessibleScopes = (agentId?: string): string[] => {
+    const base = originalGetAccessibleScopes(agentId);
+    const result = [...base];
+    for (const s of scopes) {
+      if (!result.includes(s)) result.push(s);
+    }
+    return result;
+  };
+}
diff --git a/src/embedder.ts b/src/embedder.ts
index 5009425e..497f68b7 100644
--- a/src/embedder.ts
+++ b/src/embedder.ts
@@ -84,7 +84,8 @@ class EmbeddingCache {
 // ============================================================================
 
 export interface EmbeddingConfig {
-  provider: "openai-compatible";
+  provider: "openai-compatible" | "azure-openai";
+  apiVersion?: string;
   /** Single API key or array of keys for round-robin rotation with failover. */
   apiKey: string | string[];
   model: string;
@@ -97,10 +98,40 @@ export interface EmbeddingConfig {
   taskPassage?: string;
   /** Optional flag to request normalized embeddings (provider-dependent, e.g. Jina v5) */
   normalized?: boolean;
+  /** When true, omit the dimensions parameter from embedding requests even if dimensions is set.
+   * Use this for local models that reject the dimensions parameter with "matryoshka representation" errors.
+   */
+  omitDimensions?: boolean;
   /** Enable automatic chunking for documents exceeding context limits (default: true) */
   chunking?: boolean;
 }
 
+type EmbeddingProviderProfile =
+  | "openai"
+  | "azure-openai"
+  | "jina"
+  | "voyage-compatible"
+  | "generic-openai-compatible";
+
+interface EmbeddingCapabilities {
+  /** Whether to send encoding_format: "float" */
+  encoding_format: boolean;
+  /** Whether to send normalized (Jina-style) */
+  normalized: boolean;
+  /**
+   * Field name to use for the task/input-type hint, or null if unsupported.
+   * e.g. "task" for Jina, "input_type" for Voyage, null for OpenAI/generic.
+   * If a taskValueMap is provided, task values are translated before sending.
+   */
+  taskField: string | null;
+  /** Optional value translation map for taskField (e.g. Voyage needs "retrieval.query" → "query") */
+  taskValueMap?: Record<string, string>;
+  /**
+   * Field name to use for the requested output dimension, or null if unsupported.
+   * e.g. "dimensions" for OpenAI, "output_dimension" for Voyage, null if not supported.
+   */
+  dimensionsField: string | null;
+}
+
 // Known embedding model dimensions
 const EMBEDDING_DIMENSIONS: Record<string, number> = {
   "text-embedding-3-small": 1536,
@@ -116,6 +147,16 @@ const EMBEDDING_DIMENSIONS: Record<string, number> = {
   // Jina v5
   "jina-embeddings-v5-text-small": 1024,
   "jina-embeddings-v5-text-nano": 768,
+
+  // Voyage recommended models
+  "voyage-4": 1024,
+  "voyage-4-lite": 1024,
+  "voyage-4-large": 1024,
+
+  // Voyage legacy models
+  "voyage-3": 1024,
+  "voyage-3-lite": 512,
+  "voyage-3-large": 1024,
 };
 
 // ============================================================================
@@ -159,12 +200,16 @@ function getErrorCode(error: unknown): string | undefined {
 }
 
 function getProviderLabel(baseURL: string | undefined, model: string): string {
+  const profile = detectEmbeddingProviderProfile(baseURL, model);
   const base = baseURL || "";
+
+  if (/localhost:11434|127\.0\.0\.1:11434|\/ollama\b/i.test(base)) return "Ollama";
+
   if (base) {
-    if (/api\.jina\.ai/i.test(base)) return "Jina";
-    if (/localhost:11434|127\.0\.0\.1:11434|\/ollama\b/i.test(base)) return "Ollama";
-    if (/api\.openai\.com/i.test(base)) return "OpenAI";
+    if (profile === "jina" && /api\.jina\.ai/i.test(base)) return "Jina";
+    if (profile === "voyage-compatible" && /api\.voyageai\.com/i.test(base)) return "Voyage";
+    if (profile === "openai" && /api\.openai\.com/i.test(base)) return "OpenAI";
+    if (profile === "azure-openai" || /\.openai\.azure\.com/i.test(base)) return "Azure OpenAI";
 
     try {
       return new URL(base).host;
@@ -173,9 +218,73 @@ function getProviderLabel(baseURL: string | undefined, model: string): string {
     }
   }
 
-  if (/^jina-/i.test(model)) return "Jina";
+  switch (profile) {
+    case "jina":
+      return "Jina";
+    case "voyage-compatible":
+      return "Voyage";
+    case "openai":
+    case "azure-openai":
+      return "OpenAI";
+    default:
+      return "embedding provider";
+  }
+}
+
+function detectEmbeddingProviderProfile(
+  baseURL: string | undefined,
+  model: string,
+): EmbeddingProviderProfile {
+  const base =
+    baseURL || "";
 
-  return "embedding provider";
+  if (/api\.openai\.com/i.test(base)) return "openai";
+  if (/\.openai\.azure\.com/i.test(base)) return "azure-openai";
+  if (/api\.jina\.ai/i.test(base) || /^jina-/i.test(model)) return "jina";
+  if (/api\.voyageai\.com/i.test(base) || /^voyage\b/i.test(model)) {
+    return "voyage-compatible";
+  }
+
+  return "generic-openai-compatible";
+}
+
+function getEmbeddingCapabilities(profile: EmbeddingProviderProfile): EmbeddingCapabilities {
+  switch (profile) {
+    case "openai":
+      return {
+        encoding_format: true,
+        normalized: false,
+        taskField: null,
+        dimensionsField: "dimensions",
+      };
+    case "jina":
+      return {
+        encoding_format: true,
+        normalized: true,
+        taskField: "task",
+        dimensionsField: "dimensions",
+      };
+    case "voyage-compatible":
+      return {
+        encoding_format: false,
+        normalized: false,
+        taskField: "input_type",
+        taskValueMap: {
+          "retrieval.query": "query",
+          "retrieval.passage": "document",
+          "query": "query",
+          "document": "document",
+        },
+        dimensionsField: "output_dimension",
+      };
+    case "generic-openai-compatible":
+    default:
+      return {
+        encoding_format: true,
+        normalized: false,
+        taskField: null,
+        dimensionsField: "dimensions",
+      };
+  }
+}
 
 function isAuthError(error: unknown): boolean {
@@ -226,7 +335,10 @@ export function formatEmbeddingProviderError(
 
   if (isAuthError(error)) {
     let hint = `Check embedding.apiKey and endpoint for ${provider}.`;
-    if (provider === "Jina") {
+    // Use profile rather than provider label so Jina-specific hint also fires
+    // when model is jina-* but baseURL is a proxy (not api.jina.ai).
+    const profile = detectEmbeddingProviderProfile(opts.baseURL, opts.model);
+    if (profile === "jina") {
       hint += " If your Jina key expired or lost access, replace the key or switch to a local OpenAI-compatible endpoint such as Ollama (for example baseURL http://127.0.0.1:11434/v1, with a matching model and embedding.dimensions).";
     } else if (provider === "Ollama") {
@@ -248,6 +360,22 @@
   return `${genericPrefix}${detailText}`;
 }
 
+// ============================================================================
+// Safety Constants
+// ============================================================================
+
+/** Maximum recursion depth for embedSingle chunking retries. */
+const MAX_EMBED_DEPTH = 3;
+
+/** Global timeout for a single embedding operation (ms). */
+const EMBED_TIMEOUT_MS = 10_000;
+
+/**
+ * Strictly decreasing character limit for forced truncation.
+ * Each recursion level MUST reduce input by this factor to guarantee progress.
+ */
+const STRICT_REDUCTION_FACTOR = 0.5; // Each retry must be at most 50% of previous
+
 export function getVectorDimensions(model: string, overrideDims?: number): number {
   if (overrideDims && overrideDims > 0) {
     return overrideDims;
@@ -281,9 +409,12 @@ export class Embedder {
   private readonly _taskQuery?: string;
   private readonly _taskPassage?: string;
   private readonly _normalized?: boolean;
+  private readonly _capabilities: EmbeddingCapabilities;
   /** Optional requested dimensions to pass through to the embedding provider (OpenAI-compatible). */
   private readonly _requestDimensions?: number;
+  /** When true, omit the dimensions parameter even if _requestDimensions is set.
+   */
+  private readonly _omitDimensions: boolean;
   /** Enable automatic chunking for long documents (default: true) */
   private readonly _autoChunk: boolean;
@@ -298,14 +429,44 @@
     this._taskPassage = config.taskPassage;
     this._normalized = config.normalized;
     this._requestDimensions = config.dimensions;
+    this._omitDimensions = config.omitDimensions === true;
     // Enable auto-chunking by default for better handling of long documents
     this._autoChunk = config.chunking !== false;
 
+    const profile = detectEmbeddingProviderProfile(this._baseURL, this._model);
+    this._capabilities = getEmbeddingCapabilities(profile);
+
+    // Warn if configured fields will be silently ignored by this provider profile
+    if (config.normalized !== undefined && !this._capabilities.normalized) {
+      console.debug(
+        `[memory-lancedb-pro] embedding.normalized is set but provider profile "${profile}" does not support it — value will be ignored`
+      );
+    }
+    if ((config.taskQuery || config.taskPassage) && !this._capabilities.taskField) {
+      console.debug(
+        `[memory-lancedb-pro] embedding.taskQuery/taskPassage is set but provider profile "${profile}" does not support task hints — values will be ignored`
+      );
+    }
 
     // Create a client pool — one OpenAI client per key
-    this.clients = resolvedKeys.map(key => new OpenAI({
-      apiKey: key,
-      ...(config.baseURL ? { baseURL: config.baseURL } : {}),
-    }));
+    this.clients = resolvedKeys.map(key => {
+      let defaultHeaders: Record<string, string> = {};
+      let baseURL = config.baseURL;
+
+      if (config.provider === "azure-openai" || profile === "azure-openai") {
+        defaultHeaders["api-key"] = key;
+        if (baseURL && config.apiVersion) {
+          const url = new URL(baseURL);
+          url.searchParams.set("api-version", config.apiVersion);
+          baseURL = url.toString();
+        }
+      }
+
+      return new OpenAI({
+        apiKey: key,
+        ...(baseURL ? { baseURL } : {}),
+        defaultHeaders: Object.keys(defaultHeaders).length > 0 ?
+          defaultHeaders : undefined,
+      });
+    });
 
     if (this.clients.length > 1) {
       console.log(`[memory-lancedb-pro] Initialized ${this.clients.length} API keys for round-robin rotation`);
@@ -350,19 +511,87 @@ export class Embedder {
     return /rate.limit|quota|too many requests|insufficient.*credit|429|503.*overload/i.test(msg);
   }
 
+  /**
+   * Detect if the configured baseURL points to a local Ollama instance.
+   * Ollama's HTTP server does not properly handle AbortController signals through
+   * the OpenAI SDK's HTTP client, causing long-lived sockets that don't close
+   * when the embedding pipeline times out. For Ollama we use native fetch instead.
+   */
+  private isOllamaProvider(): boolean {
+    if (!this._baseURL) return false;
+    return /localhost:11434|127\.0\.0\.1:11434|\/ollama\b/i.test(this._baseURL);
+  }
+
+  /**
+   * Call embeddings.create using native fetch (bypasses OpenAI SDK).
+   * Used exclusively for Ollama endpoints where AbortController must work
+   * correctly to avoid long-lived stalled sockets.
+   */
+  private async embedWithNativeFetch(payload: any, signal?: AbortSignal): Promise<any> {
+    if (!this._baseURL) {
+      throw new Error("embedWithNativeFetch requires a baseURL");
+    }
+    // Ollama's embeddings endpoint is at /v1/embeddings (OpenAI-compatible)
+    const endpoint = this._baseURL.replace(/\/$/, "") + "/embeddings";
+
+    const apiKey = this.clients[0]?.apiKey ?? "ollama";
+
+    const response = await fetch(endpoint, {
+      method: "POST",
+      headers: {
+        "Content-Type": "application/json",
+        "Authorization": `Bearer ${apiKey}`,
+      },
+      body: JSON.stringify(payload),
+      signal: signal,
+    });
+
+    if (!response.ok) {
+      const body = await response.text().catch(() => "");
+      throw new Error(`Ollama embedding failed: ${response.status} ${response.statusText} - ${body.slice(0, 200)}`);
+    }
+
+    const data = await response.json();
+    return data; // OpenAI-compatible shape: { data: [{ embedding: number[] }] }
+  }
+
   /**
    * Call embeddings.create with automatic key rotation on rate-limit errors.
   * Tries each key in the pool at most once before giving up.
+   * Accepts an optional AbortSignal to support true request cancellation.
+   *
+   * For Ollama endpoints, native fetch is used instead of the OpenAI SDK
+   * because AbortController does not reliably abort Ollama's HTTP connections
+   * through the SDK's HTTP client on Node.js.
    */
-  private async embedWithRetry(payload: any): Promise<any> {
+  private async embedWithRetry(payload: any, signal?: AbortSignal): Promise<any> {
+    // Use native fetch for Ollama to ensure proper AbortController support.
+    // Ollama errors (including AbortError) bubble up without retry, since
+    // Ollama doesn't rate-limit locally.
+    if (this.isOllamaProvider()) {
+      return await this.embedWithNativeFetch(payload, signal);
+    }
+
     const maxAttempts = this.clients.length;
     let lastError: Error | undefined;
 
     for (let attempt = 0; attempt < maxAttempts; attempt++) {
       const client = this.nextClient();
       try {
-        return await client.embeddings.create(payload);
+        // Pass signal to OpenAI SDK if provided (SDK v6+ supports this)
+        return await client.embeddings.create(payload, signal ? { signal } : undefined);
       } catch (error) {
+        // If aborted, re-throw immediately
+        if (error instanceof Error && error.name === 'AbortError') {
+          throw error;
+        }
+
         lastError = error instanceof Error ? error : new Error(String(error));
 
         if (this.isRateLimitError(error) && attempt < maxAttempts - 1) {
@@ -391,6 +620,33 @@
     return this.clients.length;
   }
 
+  /** Wrap a single embedding operation with a global timeout via AbortSignal. */
+  private withTimeout<T>(promiseFactory: (signal: AbortSignal) => Promise<T>, _label: string, externalSignal?: AbortSignal): Promise<T> {
+    const controller = new AbortController();
+    const timeoutId = setTimeout(() => controller.abort(), EMBED_TIMEOUT_MS);
+
+    // If caller passes an external signal, merge it with the internal timeout controller.
+    // Either signal aborting will cancel the promise.
+    let unsubscribe: (() => void) | undefined;
+    if (externalSignal) {
+      if (externalSignal.aborted) {
+        clearTimeout(timeoutId);
+        return Promise.reject(externalSignal.reason ?? new Error("aborted"));
+      }
+      const handler = () => {
+        controller.abort();
+        clearTimeout(timeoutId);
+      };
+      externalSignal.addEventListener("abort", handler, { once: true });
+      unsubscribe = () => externalSignal.removeEventListener("abort", handler);
+    }
+
+    return promiseFactory(controller.signal).finally(() => {
+      clearTimeout(timeoutId);
+      unsubscribe?.();
+    });
+  }
+
   // --------------------------------------------------------------------------
   // Backward-compatible API
   // --------------------------------------------------------------------------
@@ -414,20 +670,24 @@
   // Task-aware API
   // --------------------------------------------------------------------------
 
-  async embedQuery(text: string): Promise<number[]> {
-    return this.embedSingle(text, this._taskQuery);
+  async embedQuery(text: string, signal?: AbortSignal): Promise<number[]> {
+    return this.withTimeout((sig) => this.embedSingle(text, this._taskQuery, 0, sig), "embedQuery", signal);
   }
 
-  async embedPassage(text: string): Promise<number[]> {
-    return this.embedSingle(text, this._taskPassage);
+  async embedPassage(text: string, signal?: AbortSignal): Promise<number[]> {
+    return this.withTimeout((sig) => this.embedSingle(text, this._taskPassage, 0, sig), "embedPassage", signal);
   }
 
-  async embedBatchQuery(texts: string[]): Promise<number[][]> {
-    return this.embedMany(texts, this._taskQuery);
+  // Note: embedBatchQuery/embedBatchPassage are NOT wrapped with withTimeout because
+  // they handle multiple texts in a single API call. The timeout would fire after
+  // EMBED_TIMEOUT_MS regardless of how many texts succeed. Individual text embedding
+  // within the batch is protected by the SDK's own timeout handling.
+  async embedBatchQuery(texts: string[], signal?: AbortSignal): Promise<number[][]> {
+    return this.embedMany(texts, this._taskQuery, signal);
   }
 
-  async embedBatchPassage(texts: string[]): Promise<number[][]> {
-    return this.embedMany(texts, this._taskPassage);
+  async embedBatchPassage(texts: string[], signal?: AbortSignal): Promise<number[][]> {
+    return this.embedMany(texts, this._taskPassage, signal);
   }
 
   // --------------------------------------------------------------------------
@@ -449,34 +709,60 @@
     const payload: any = {
       model: this.model,
       input,
-      // Force float output to avoid SDK default base64 decoding path.
-      encoding_format: "float",
     };
 
-    if (task) payload.task = task;
-    if (this._normalized !== undefined) payload.normalized = this._normalized;
+    if (this._capabilities.encoding_format) {
+      // Force float output where providers explicitly support OpenAI-style formatting.
+      payload.encoding_format = "float";
+    }
+
+    if (this._capabilities.normalized && this._normalized !== undefined) {
+      payload.normalized = this._normalized;
+    }
+
+    // Task hint: field name and optional value translation are provider-defined.
+    if (this._capabilities.taskField && task) {
+      const cap = this._capabilities;
+      const value = cap.taskValueMap?.[task] ?? task;
+      payload[cap.taskField] = value;
+    }
 
-    // Some OpenAI-compatible providers support requesting a specific vector size.
-    // We only pass it through when explicitly configured to avoid breaking providers
-    // that reject unknown fields.
-    if (this._requestDimensions && this._requestDimensions > 0) {
-      payload.dimensions = this._requestDimensions;
+    // Output dimension: field name is provider-defined.
+    // Only sent when explicitly configured, unless omitDimensions is enabled for
+    // local or provider-compatible models that reject the dimensions field.
+    if (!this._omitDimensions && this._capabilities.dimensionsField && this._requestDimensions && this._requestDimensions > 0) {
+      payload[this._capabilities.dimensionsField] = this._requestDimensions;
     }
 
     return payload;
   }
 
-  private async embedSingle(text: string, task?: string): Promise<number[]> {
+  private async embedSingle(text: string, task?: string, depth: number = 0, signal?: AbortSignal): Promise<number[]> {
     if (!text || text.trim().length === 0) {
       throw new Error("Cannot embed empty text");
    }
 
+    // FR-01: Recursion depth limit — force truncate when too deep
+    if (depth >= MAX_EMBED_DEPTH) {
+      const safeLimit = Math.floor(text.length * STRICT_REDUCTION_FACTOR);
+      console.warn(
+        `[memory-lancedb-pro] Recursion depth ${depth} reached MAX_EMBED_DEPTH (${MAX_EMBED_DEPTH}), ` +
+        `force-truncating ${text.length} chars → ${safeLimit} chars (strict ${STRICT_REDUCTION_FACTOR * 100}% reduction)`
+      );
+      if (safeLimit < 100) {
+        throw new Error(
+          `[memory-lancedb-pro] Failed to embed: input too large for model context after ${MAX_EMBED_DEPTH} retries`
+        );
+      }
+      text = text.slice(0, safeLimit);
+    }
+
     // Check cache first
     const cached = this._cache.get(text, task);
     if (cached) return cached;
 
     try {
-      const response = await this.embedWithRetry(this.buildPayload(text, task));
+      const response = await this.embedWithRetry(this.buildPayload(text, task), signal);
       const embedding = response.data[0]?.embedding as number[] | undefined;
       if (!embedding) {
         throw new Error("No embedding returned from provider");
@@ -494,17 +780,40 @@
       try {
         console.log(`Document exceeded context limit (${errorMsg}), attempting chunking...`);
         const chunkResult = smartChunk(text, this._model);
-        
+
         if (chunkResult.chunks.length === 0) {
           throw new Error(`Failed to chunk document: ${errorMsg}`);
         }
 
+        // FR-03: Single chunk output detection — if smartChunk produced only
+        // one chunk that is nearly the same size as the original text, chunking
+        // did not actually reduce the problem.
+        // Force-truncate with STRICT
+        // reduction to guarantee progress.
+        if (
+          chunkResult.chunks.length === 1 &&
+          chunkResult.chunks[0].length > text.length * 0.9
+        ) {
+          // Use strict reduction factor to guarantee each retry makes progress
+          const safeLimit = Math.floor(text.length * STRICT_REDUCTION_FACTOR);
+          console.warn(
+            `[memory-lancedb-pro] smartChunk produced 1 chunk (${chunkResult.chunks[0].length} chars) ≈ original (${text.length} chars). ` +
+            `Force-truncating to ${safeLimit} chars (strict ${STRICT_REDUCTION_FACTOR * 100}% reduction) to avoid infinite recursion.`
+          );
+          if (safeLimit < 100) {
+            throw new Error(
+              `[memory-lancedb-pro] Failed to embed: chunking couldn't reduce input size enough for model context`
+            );
+          }
+          const truncated = text.slice(0, safeLimit);
+          return this.embedSingle(truncated, task, depth + 1, signal);
+        }
+
         // Embed all chunks in parallel
         console.log(`Split document into ${chunkResult.chunkCount} chunks for embedding`);
         const chunkEmbeddings = await Promise.all(
           chunkResult.chunks.map(async (chunk, idx) => {
             try {
-              const embedding = await this.embedSingle(chunk, task);
+              const embedding = await this.embedSingle(chunk, task, depth + 1, signal);
               return { embedding };
             } catch (chunkError) {
               console.warn(`Failed to embed chunk ${idx}:`, chunkError);
@@ -525,21 +834,16 @@
         );
         const finalEmbedding = avgEmbedding.map(v => v / chunkEmbeddings.length);
-        
+
         // Cache the result for the original text (using its hash)
         this._cache.set(text, task, finalEmbedding);
         console.log(`Successfully embedded long document as ${chunkEmbeddings.length} averaged chunks`);
-        
+
         return finalEmbedding;
       } catch (chunkError) {
-        // If chunking fails, throw the original error
-        console.warn(`Chunking failed, using original error:`, chunkError);
-        const friendly = formatEmbeddingProviderError(error, {
-          baseURL: this._baseURL,
-          model: this._model,
-          mode: "single",
-        });
-        throw new Error(friendly, { cause: error });
+        // Preserve and
+        // surface the more specific chunkError
+        console.warn(`Chunking failed:`, chunkError);
+        throw chunkError;
      }
    }
 
@@ -552,7 +856,7 @@
     }
   }
 
-  private async embedMany(texts: string[], task?: string): Promise<number[][]> {
+  private async embedMany(texts: string[], task?: string, signal?: AbortSignal): Promise<number[][]> {
     if (!texts || texts.length === 0) {
       return [];
     }
@@ -574,7 +878,8 @@
     try {
       const response = await this.embedWithRetry(
-        this.buildPayload(validTexts, task)
+        this.buildPayload(validTexts, task),
+        signal,
       );
 
       // Create result array with proper length
@@ -605,7 +910,7 @@
       if (isContextError && this._autoChunk) {
         try {
           console.log(`Batch embedding failed with context error, attempting chunking...`);
-          
+
           const chunkResults = await Promise.all(
             validTexts.map(async (text, idx) => {
               const chunkResult = smartChunk(text, this._model);
@@ -615,7 +920,7 @@
               // Embed all chunks in parallel, then average.
               const embeddings = await Promise.all(
-                chunkResult.chunks.map((chunk) => this.embedSingle(chunk, task))
+                chunkResult.chunks.map((chunk) => this.embedSingle(chunk, task, 0, signal))
               );
 
               const avgEmbedding = embeddings.reduce(
diff --git a/src/extraction-prompts.ts b/src/extraction-prompts.ts
index a8665ea2..6fe16180 100644
--- a/src/extraction-prompts.ts
+++ b/src/extraction-prompts.ts
@@ -28,6 +28,7 @@ ${conversationText}
 ## What is NOT worth remembering?
 - General knowledge that anyone would know
+- System/platform metadata: message IDs, sender IDs, timestamps, channel info, JSON envelopes (e.g.
"System: [timestamp] Feishu...", "message_id", "sender_id", "ou_xxx") — these are infrastructure noise, NEVER extract them - Temporary information: One-time questions or conversations - Vague information: "User has questions about a feature" (no specific details) - Tool output, error logs, or boilerplate diff --git a/src/identity-addressing.ts b/src/identity-addressing.ts new file mode 100644 index 00000000..5ac653f5 --- /dev/null +++ b/src/identity-addressing.ts @@ -0,0 +1,201 @@ +import type { CandidateMemory } from "./memory-categories.js"; + +export const CANONICAL_NAME_FACT_KEY = "entities:姓名"; +export const CANONICAL_ADDRESSING_FACT_KEY = "preferences:称呼偏好"; + +type IdentityKind = "name" | "addressing"; +export type IdentityAddressingSlot = "name" | "addressing"; + +type IdentityAddressingMemoryLike = { + factKey?: string; + text?: string; + abstract?: string; + overview?: string; + content?: string; +}; + +function trimCapturedValue(value: string): string { + return value + .replace(/^[\s"'“”‘’「」『』*`_]+/, "") + .replace(/[\s"'“”‘’「」『』*`_。!,、,.!?::;;]+$/u, "") + .trim(); +} + +function extractFirst(patterns: RegExp[], text: string): string | undefined { + for (const pattern of patterns) { + const match = pattern.exec(text); + const captured = match?.[1] ? 
+      trimCapturedValue(match[1]) : "";
+    if (captured) return captured;
+  }
+  return undefined;
+}
+
+function combineIdentityTextProbe(params: IdentityAddressingMemoryLike): string {
+  return [
+    params.text,
+    params.abstract,
+    params.overview,
+    params.content,
+  ]
+    .filter((value): value is string => typeof value === "string" && value.trim().length > 0)
+    .map((value) => value.trim())
+    .join("\n");
+}
+
+const NAME_PATTERNS = [
+  /(?:我的名字是|我(?:现在)?叫|本名是)\s*([^\s,。,.!!??"'“”‘’「」『』]+)/iu,
+  /calls?\s+themselves\s+['"]([^'"]+)['"]/i,
+  /name\s+is\s+['"]?([^'".,\n]+)['"]?/i,
+];
+
+const ADDRESSING_PATTERNS = [
+  /(?:以后你叫我|以后请叫我|请叫我|以后称呼我(?:为)?|称呼我(?:为)?|称呼其为|称呼他为)\s*([^\s,。,.!!??"'“”‘’「」『』]+)/iu,
+  /(?:希望(?:在[^\n。]{0,20})?(?:以后)?(?:你)?(?:被)?称呼(?:我|其|他)?为)\s*([^\s,。,.!!??"'“”‘’「」『』]+)/iu,
+  /(?:被称呼为|称呼偏好(?:是)?|Preferred address(?: is)?|be addressed as|addressed as)\s*['"]?([^'".,\n]+)['"]?/i,
+  /(?:addressive identifier is|preferred (?:and permanently assigned )?addressive identifier is)\s*['"]?([^'".,\n]+)['"]?/i,
+];
+
+const NAME_HINT_PATTERNS = [
+  /^姓名[::]/m,
+  /^## Identity$/m,
+  /(?:^|\n)-\s*Name:\s+/i,
+  /用户当前姓名\/自称为/u,
+];
+
+const ADDRESSING_HINT_PATTERNS = [
+  /^称呼偏好[::]/m,
+  /^## Addressing$/m,
+  /Preferred form of address/i,
+  /被称呼为/u,
+  /addressive identifier/i,
+];
+
+function makeCandidate(kind: IdentityKind, alias: string, sourceText: string): CandidateMemory {
+  if (kind === "name") {
+    return {
+      category: "entities",
+      abstract: `姓名:${alias}`,
+      overview: `## Identity\n- Name: ${alias}`,
+      content: `用户当前姓名/自称为“${alias}”。原始表述:${sourceText}`,
+    };
+  }
+
+  return {
+    category: "preferences",
+    abstract: `称呼偏好:${alias}`,
+    overview: `## Addressing\n- Preferred form of address: ${alias}`,
+    content: `用户希望以后被称呼为“${alias}”。原始表述:${sourceText}`,
+  };
+}
+
+export function createIdentityAndAddressingCandidates(text: string): CandidateMemory[] {
+  const sourceText = text.trim();
+  if (!sourceText) return [];
+
+  const name = extractFirst(NAME_PATTERNS,
sourceText);
+  const addressing = extractFirst(ADDRESSING_PATTERNS, sourceText);
+  const candidates: CandidateMemory[] = [];
+
+  if (name) {
+    candidates.push(makeCandidate("name", name, sourceText));
+  }
+  if (addressing) {
+    // A captured addressing alias is stored even when it equals the name:
+    // the two canonical slots are maintained independently.
+    candidates.push(makeCandidate("addressing", addressing, sourceText));
+  }
+
+  return candidates;
+}
+
+export function extractIdentityAndAddressingValues(text: string): {
+  name?: string;
+  addressing?: string;
+} {
+  const sourceText = text.trim();
+  if (!sourceText) return {};
+
+  return {
+    name: extractFirst(NAME_PATTERNS, sourceText),
+    addressing: extractFirst(ADDRESSING_PATTERNS, sourceText),
+  };
+}
+
+export function classifyIdentityAndAddressingMemory(
+  params: IdentityAddressingMemoryLike,
+): {
+  slots: Set<IdentityAddressingSlot>;
+  name?: string;
+  addressing?: string;
+} {
+  const slots = new Set<IdentityAddressingSlot>();
+
+  if (params.factKey === CANONICAL_NAME_FACT_KEY) {
+    slots.add("name");
+  }
+  if (params.factKey === CANONICAL_ADDRESSING_FACT_KEY) {
+    slots.add("addressing");
+  }
+
+  const probe = combineIdentityTextProbe(params);
+  if (!probe) {
+    return { slots };
+  }
+
+  const extracted = extractIdentityAndAddressingValues(probe);
+
+  if (extracted.name || NAME_HINT_PATTERNS.some((pattern) => pattern.test(probe))) {
+    slots.add("name");
+  }
+  if (
+    extracted.addressing ||
+    ADDRESSING_HINT_PATTERNS.some((pattern) => pattern.test(probe))
+  ) {
+    slots.add("addressing");
+  }
+
+  return {
+    slots,
+    name: extracted.name,
+    addressing: extracted.addressing,
+  };
+}
+
+export function canonicalizeIdentityAndAddressingCandidate(
+  candidate: CandidateMemory,
+): CandidateMemory {
+  const combined = [candidate.abstract, candidate.overview, candidate.content]
+    .filter(Boolean)
+    .join("\n");
+
+  if (candidate.category === "entities") {
+    const name = extractFirst(NAME_PATTERNS,
combined); + if (name) { + return makeCandidate("name", name, candidate.content || candidate.abstract); + } + const addressing = extractFirst(ADDRESSING_PATTERNS, combined); + if (addressing) { + return makeCandidate("addressing", addressing, candidate.content || candidate.abstract); + } + return candidate; + } + + const addressing = extractFirst(ADDRESSING_PATTERNS, combined); + if (addressing) { + return makeCandidate("addressing", addressing, candidate.content || candidate.abstract); + } + + const name = extractFirst(NAME_PATTERNS, combined); + if (name) { + return makeCandidate("name", name, candidate.content || candidate.abstract); + } + + return candidate; +} + +export function isCanonicalIdentityOrAddressingFactKey(factKey: string | undefined): boolean { + return factKey === CANONICAL_NAME_FACT_KEY || factKey === CANONICAL_ADDRESSING_FACT_KEY; +} diff --git a/src/intent-analyzer.ts b/src/intent-analyzer.ts new file mode 100644 index 00000000..58c34281 --- /dev/null +++ b/src/intent-analyzer.ts @@ -0,0 +1,259 @@ +/** + * Intent Analyzer for Adaptive Recall + * + * Lightweight, rule-based intent analysis that determines which memory categories + * are most relevant for a given query and what recall depth to use. + * + * Inspired by OpenViking's hierarchical retrieval intent routing, adapted for + * memory-lancedb-pro's flat category model. No LLM calls — pure pattern matching + * for minimal latency impact on auto-recall. + * + * @see https://github.com/volcengine/OpenViking — hierarchical_retriever.py intent analysis + */ + +// ============================================================================ +// Types +// ============================================================================ + +/** + * Intent categories map to actual stored MemoryEntry categories. + * Note: "event" is NOT a stored category — event queries route to + * entity + decision (the categories most likely to contain timeline data). 
+ */ +export type MemoryCategoryIntent = + | "preference" + | "fact" + | "decision" + | "entity" + | "other"; + +export type RecallDepth = "l0" | "l1" | "full"; + +export interface IntentSignal { + /** Categories to prioritize (ordered by relevance). */ + categories: MemoryCategoryIntent[]; + /** Recommended recall depth for this intent. */ + depth: RecallDepth; + /** Confidence level of the intent classification. */ + confidence: "high" | "medium" | "low"; + /** Short label for logging. */ + label: string; +} + +// ============================================================================ +// Intent Patterns +// ============================================================================ + +interface IntentRule { + label: string; + patterns: RegExp[]; + categories: MemoryCategoryIntent[]; + depth: RecallDepth; +} + +/** + * Intent rules ordered by specificity (most specific first). + * First match wins — keep high-confidence patterns at the top. + */ +const INTENT_RULES: IntentRule[] = [ + // --- Preference / Style queries --- + { + label: "preference", + patterns: [ + /\b(prefer|preference|style|convention|like|dislike|favorite|habit)\b/i, + /\b(how do (i|we) usually|what('s| is) (my|our) (style|convention|approach))\b/i, + /(偏好|喜欢|习惯|风格|惯例|常用|不喜欢|不要用|别用)/, + ], + categories: ["preference", "decision"], + depth: "l0", + }, + + // --- Decision / Rationale queries --- + { + label: "decision", + patterns: [ + /\b(why did (we|i)|decision|decided|chose|rationale|trade-?off|reason for)\b/i, + /\b(what was the (reason|rationale|decision))\b/i, + /(为什么选|决定|选择了|取舍|权衡|原因是|当时决定)/, + ], + categories: ["decision", "fact"], + depth: "l1", + }, + + // --- Entity / People / Project queries --- + // Narrowed patterns to avoid over-matching: require "who is" / "tell me about" + // style phrasing, not bare nouns like "tool" or "component". 
+ { + label: "entity", + patterns: [ + /\b(who is|who are|tell me about|info on|details about|contact info)\b/i, + /\b(who('s| is) (the|our|my)|what team|which (person|team))\b/i, + /(谁是|告诉我关于|详情|联系方式|哪个团队)/, + ], + categories: ["entity", "fact"], + depth: "l1", + }, + + // --- Event / Timeline queries --- + // Note: "event" is not a stored category. Route to entity + decision + // (the categories most likely to contain timeline/incident data). + { + label: "event", + patterns: [ + /\b(when did|what happened|timeline|incident|outage|deploy|release|shipped)\b/i, + /\b(last (week|month|time|sprint)|recently|yesterday|today)\b/i, + /(什么时候|发生了什么|时间线|事件|上线|部署|发布|上次|最近)/, + ], + categories: ["entity", "decision"], + depth: "full", + }, + + // --- Fact / Knowledge queries --- + { + label: "fact", + patterns: [ + /\b(how (does|do|to)|what (does|do|is)|explain|documentation|spec)\b/i, + /\b(config|configuration|setup|install|architecture|api|endpoint)\b/i, + /(怎么|如何|是什么|解释|文档|规范|配置|安装|架构|接口)/, + ], + categories: ["fact", "entity"], + depth: "l1", + }, +]; + +// ============================================================================ +// Analyzer +// ============================================================================ + +/** + * Analyze a query to determine which memory categories and recall depth + * are most appropriate. + * + * Returns a default "broad" signal if no specific intent is detected, + * so callers can always use the result without null checks. + */ +export function analyzeIntent(query: string): IntentSignal { + const trimmed = query.trim(); + if (!trimmed) { + return { + categories: [], + depth: "l0", + confidence: "low", + label: "empty", + }; + } + + for (const rule of INTENT_RULES) { + if (rule.patterns.some((p) => p.test(trimmed))) { + return { + categories: rule.categories, + depth: rule.depth, + confidence: "high", + label: rule.label, + }; + } + } + + // No specific intent detected — return broad signal. 
+ // All categories are eligible; use L0 to minimize token cost. + return { + categories: [], + depth: "l0", + confidence: "low", + label: "broad", + }; +} + +/** + * Apply intent-based category boost to retrieval results. + * + * Instead of filtering (which would lose potentially relevant results), + * this boosts scores of results matching the detected intent categories. + * Non-matching results are kept but ranked lower. + * + * @param results - Retrieval results with scores + * @param intent - Detected intent signal + * @param boostFactor - Score multiplier for matching categories (default: 1.15) + * @returns Results with adjusted scores, re-sorted + */ +export function applyCategoryBoost< + T extends { entry: { category: string }; score: number }, +>(results: T[], intent: IntentSignal, boostFactor = 1.15): T[] { + if (intent.categories.length === 0 || intent.confidence === "low") { + return results; // No intent signal — return as-is + } + + const prioritySet = new Set(intent.categories); + + const boosted = results.map((r) => { + if (prioritySet.has(r.entry.category)) { + return { ...r, score: Math.min(1, r.score * boostFactor) }; + } + return r; + }); + + return boosted.sort((a, b) => b.score - a.score); +} + +/** + * Format a memory entry for context injection at the specified depth level. + * + * - l0: One-line summary (category + scope + truncated text) + * - l1: Medium detail (category + scope + text up to ~300 chars) + * - full: Complete text (existing behavior) + */ +export function formatAtDepth( + entry: { text: string; category: string; scope: string }, + depth: RecallDepth, + score: number, + index: number, + extra?: { bm25Hit?: boolean; reranked?: boolean; sanitize?: (text: string) => string }, +): string { + const scoreStr = `${(score * 100).toFixed(0)}%`; + const sourceSuffix = [ + extra?.bm25Hit ? "vector+BM25" : null, + extra?.reranked ? "+reranked" : null, + ] + .filter(Boolean) + .join(""); + const sourceTag = sourceSuffix ? 
`, ${sourceSuffix}` : ""; + + // Apply sanitization if provided (prevents prompt injection from stored memories) + const safe = extra?.sanitize ? extra.sanitize(entry.text) : entry.text; + + switch (depth) { + case "l0": { + // Ultra-compact: first sentence or first 80 chars + const brief = extractFirstSentence(safe, 80); + return `- [${entry.category}] ${brief} (${scoreStr}${sourceTag})`; + } + case "l1": { + // Medium: up to 300 chars + const medium = + safe.length > 300 + ? safe.slice(0, 297) + "..." + : safe; + return `- [${entry.category}:${entry.scope}] ${medium} (${scoreStr}${sourceTag})`; + } + case "full": + default: + return `- [${entry.category}:${entry.scope}] ${safe} (${scoreStr}${sourceTag})`; + } +} + +// ============================================================================ +// Helpers +// ============================================================================ + +function extractFirstSentence(text: string, maxLen: number): string { + // Try to find a sentence boundary (CJK punctuation may not be followed by space) + const sentenceEnd = text.search(/[.!?]\s|[。!?]/); + if (sentenceEnd > 0 && sentenceEnd < maxLen) { + return text.slice(0, sentenceEnd + 1); + } + if (text.length <= maxLen) return text; + // Fall back to truncation at word boundary + const truncated = text.slice(0, maxLen); + const lastSpace = truncated.lastIndexOf(" "); + return (lastSpace > maxLen * 0.6 ? 
truncated.slice(0, lastSpace) : truncated) + "...";
+}
diff --git a/src/llm-client.ts b/src/llm-client.ts
index 8404d6aa..79182ede 100644
--- a/src/llm-client.ts
+++ b/src/llm-client.ts
@@ -4,11 +4,23 @@
  */
 
 import OpenAI from "openai";
+import {
+  buildOauthEndpoint,
+  extractOutputTextFromSse,
+  loadOAuthSession,
+  needsRefresh,
+  normalizeOauthModel,
+  refreshOAuthSession,
+  saveOAuthSession,
+} from "./llm-oauth.js";
 
 export interface LlmClientConfig {
-  apiKey: string;
+  apiKey?: string;
   model: string;
   baseURL?: string;
+  auth?: "api-key" | "oauth";
+  oauthPath?: string;
+  oauthProvider?: string;
   timeoutMs?: number;
   log?: (msg: string) => void;
 }
@@ -16,6 +28,8 @@
 export interface LlmClient {
   /** Send a prompt and parse the JSON response. Returns null on failure. */
   completeJson<T = unknown>(prompt: string, label?: string): Promise<T | null>;
+  /** Best-effort diagnostics for the most recent failure, if any. */
+  getLastError(): string | null;
 }
 
 /**
@@ -23,13 +37,11 @@ export interface LlmClient {
  * or contain surrounding text.
  */
 function extractJsonFromResponse(text: string): string | null {
-  // Try markdown code fence first (```json ... ``` or ``` ...
```) const fenceMatch = text.match(/```(?:json)?\s*\n?([\s\S]*?)```/); if (fenceMatch) { return fenceMatch[1].trim(); } - // Try balanced brace extraction const firstBrace = text.indexOf("{"); if (firstBrace === -1) return null; @@ -56,16 +68,125 @@ function previewText(value: string, maxLen = 200): string { return `${normalized.slice(0, maxLen - 3)}...`; } -export function createLlmClient(config: LlmClientConfig): LlmClient { +function nextNonWhitespaceChar(text: string, start: number): string | undefined { + for (let i = start; i < text.length; i++) { + const ch = text[i]; + if (!/\s/.test(ch)) return ch; + } + return undefined; +} + +/** + * Best-effort repair for common LLM JSON issues: + * - unescaped quotes inside string values + * - raw newlines / tabs inside strings + * - trailing commas before } or ] + */ +function repairCommonJson(text: string): string { + let result = ""; + let inString = false; + let escaped = false; + + for (let i = 0; i < text.length; i++) { + const ch = text[i]; + + if (escaped) { + result += ch; + escaped = false; + continue; + } + + if (inString) { + if (ch === "\\") { + result += ch; + escaped = true; + continue; + } + + if (ch === "\"") { + const nextCh = nextNonWhitespaceChar(text, i + 1); + if ( + nextCh === undefined || + nextCh === "," || + nextCh === "}" || + nextCh === "]" || + nextCh === ":" + ) { + result += ch; + inString = false; + } else { + result += "\\\""; + } + continue; + } + + if (ch === "\n") { + result += "\\n"; + continue; + } + if (ch === "\r") { + result += "\\r"; + continue; + } + if (ch === "\t") { + result += "\\t"; + continue; + } + + result += ch; + continue; + } + + if (ch === "\"") { + result += ch; + inString = true; + continue; + } + + if (ch === ",") { + const nextCh = nextNonWhitespaceChar(text, i + 1); + if (nextCh === "}" || nextCh === "]") { + continue; + } + } + + result += ch; + } + + return result; +} + +function looksLikeSseResponse(bodyText: string): boolean { + const trimmed = 
bodyText.trimStart();
+  return trimmed.startsWith("event:") || trimmed.startsWith("data:");
+}
+
+function createTimeoutSignal(timeoutMs?: number): { signal: AbortSignal; dispose: () => void } {
+  const effectiveTimeoutMs =
+    typeof timeoutMs === "number" && Number.isFinite(timeoutMs) && timeoutMs > 0 ? timeoutMs : 30_000;
+  const controller = new AbortController();
+  const timer = setTimeout(() => controller.abort(), effectiveTimeoutMs);
+  return {
+    signal: controller.signal,
+    dispose: () => clearTimeout(timer),
+  };
+}
+
+function createApiKeyClient(config: LlmClientConfig, log: (msg: string) => void): LlmClient {
+  if (!config.apiKey) {
+    throw new Error("LLM api-key mode requires llm.apiKey or embedding.apiKey");
+  }
   const client = new OpenAI({
     apiKey: config.apiKey,
     baseURL: config.baseURL,
     timeout: config.timeoutMs ?? 30000,
   });
 
-  const log = config.log ?? (() => {});
+  let lastError: string | null = null;
 
   return {
     async completeJson<T = unknown>(prompt: string, label = "generic"): Promise<T | null> {
+      lastError = null;
       try {
         const response = await client.chat.completions.create({
           model: config.model,
@@ -82,43 +203,219 @@ export function createLlmClient(config: LlmClientConfig): LlmClient {
         const raw = response.choices?.[0]?.message?.content;
 
         if (!raw) {
-          log(
-            `memory-lancedb-pro: llm-client [${label}] empty response content from model ${config.model}`,
-          );
+          lastError =
+            `memory-lancedb-pro: llm-client [${label}] empty response content from model ${config.model}`;
+          log(lastError);
           return null;
         }
 
         if (typeof raw !== "string") {
-          log(
-            `memory-lancedb-pro: llm-client [${label}] non-string response content type=${Array.isArray(raw) ?
"array" : typeof raw} from model ${config.model}`; + log(lastError); return null; } const jsonStr = extractJsonFromResponse(raw); if (!jsonStr) { - log( - `memory-lancedb-pro: llm-client [${label}] no JSON object found (chars=${raw.length}, preview=${JSON.stringify(previewText(raw))})`, - ); + lastError = + `memory-lancedb-pro: llm-client [${label}] no JSON object found (chars=${raw.length}, preview=${JSON.stringify(previewText(raw))})`; + log(lastError); return null; } try { return JSON.parse(jsonStr) as T; } catch (err) { - log( - `memory-lancedb-pro: llm-client [${label}] JSON.parse failed: ${err instanceof Error ? err.message : String(err)} (jsonChars=${jsonStr.length}, jsonPreview=${JSON.stringify(previewText(jsonStr))})`, - ); + const repairedJsonStr = repairCommonJson(jsonStr); + if (repairedJsonStr !== jsonStr) { + try { + const repaired = JSON.parse(repairedJsonStr) as T; + log( + `memory-lancedb-pro: llm-client [${label}] recovered malformed JSON via heuristic repair (jsonChars=${jsonStr.length})`, + ); + return repaired; + } catch (repairErr) { + lastError = + `memory-lancedb-pro: llm-client [${label}] JSON.parse failed: ${err instanceof Error ? err.message : String(err)}; repair failed: ${repairErr instanceof Error ? repairErr.message : String(repairErr)} (jsonChars=${jsonStr.length}, jsonPreview=${JSON.stringify(previewText(jsonStr))})`; + log(lastError); + return null; + } + } + lastError = + `memory-lancedb-pro: llm-client [${label}] JSON.parse failed: ${err instanceof Error ? err.message : String(err)} (jsonChars=${jsonStr.length}, jsonPreview=${JSON.stringify(previewText(jsonStr))})`; + log(lastError); return null; } } catch (err) { - // Graceful degradation — return null so caller can fall back - log( - `memory-lancedb-pro: llm-client [${label}] request failed for model ${config.model}: ${err instanceof Error ? 
err.message : String(err)}`,
-        );
+        lastError =
+          `memory-lancedb-pro: llm-client [${label}] request failed for model ${config.model}: ${err instanceof Error ? err.message : String(err)}`;
+        log(lastError);
+        return null;
+      }
+    },
+    getLastError(): string | null {
+      return lastError;
+    },
+  };
+}
+
+function createOauthClient(config: LlmClientConfig, log: (msg: string) => void): LlmClient {
+  if (!config.oauthPath) {
+    throw new Error("LLM oauth mode requires llm.oauthPath");
+  }
+
+  let cachedSessionPromise: Promise<Awaited<ReturnType<typeof loadOAuthSession>>> | null = null;
+  let lastError: string | null = null;
+
+  async function getSession() {
+    if (!cachedSessionPromise) {
+      cachedSessionPromise = loadOAuthSession(config.oauthPath!).catch((error) => {
+        cachedSessionPromise = null;
+        throw error;
+      });
+    }
+    let session = await cachedSessionPromise;
+    if (needsRefresh(session)) {
+      session = await refreshOAuthSession(session, config.timeoutMs);
+      await saveOAuthSession(config.oauthPath!, session);
+      cachedSessionPromise = Promise.resolve(session);
+    }
+    return session;
+  }
+
+  return {
+    async completeJson<T = unknown>(prompt: string, label = "generic"): Promise<T | null> {
+      lastError = null;
+      try {
+        const session = await getSession();
+        const { signal, dispose } = createTimeoutSignal(config.timeoutMs);
+        const endpoint = buildOauthEndpoint(config.baseURL, config.oauthProvider);
+        try {
+          const response = await fetch(endpoint, {
+            method: "POST",
+            headers: {
+              Authorization: `Bearer ${session.accessToken}`,
+              "Content-Type": "application/json",
+              Accept: "text/event-stream",
+              "OpenAI-Beta": "responses=experimental",
+              "chatgpt-account-id": session.accountId,
+              originator: "codex_cli_rs",
+            },
+            signal,
+            body: JSON.stringify({
+              model: normalizeOauthModel(config.model),
+              instructions:
+                "You are a memory extraction assistant. Always respond with valid JSON only.",
+              input: [
+                {
+                  role: "user",
+                  content: [
+                    {
+                      type: "input_text",
+                      text: prompt,
+                    },
+                  ],
+                },
+              ],
+              store: false,
+              stream: true,
+              text: {
+                format: { type: "text" },
+              },
+            }),
+          });
+
+          if (!response.ok) {
+            const detail = await response.text().catch(() => "");
+            throw new Error(`HTTP ${response.status} ${response.statusText}: ${detail.slice(0, 500)}`);
+          }
+
+          const bodyText = await response.text();
+          const raw = (
+            response.headers.get("content-type")?.includes("text/event-stream") ||
+            looksLikeSseResponse(bodyText)
+          )
+            ? extractOutputTextFromSse(bodyText)
+            : (() => {
+                try {
+                  const parsed = JSON.parse(bodyText) as Record<string, unknown>;
+                  const output = Array.isArray(parsed.output) ? parsed.output : [];
+                  const first = output.find(
+                    (item) =>
+                      item &&
+                      typeof item === "object" &&
+                      Array.isArray((item as Record<string, unknown>).content),
+                  ) as Record<string, unknown> | undefined;
+                  if (!first) return null;
+                  const content = (first.content as Array<Record<string, unknown>>).find(
+                    (part) => part?.type === "output_text" && typeof part.text === "string",
+                  );
+                  return typeof content?.text === "string" ?
content.text : null; + } catch { + return null; + } + })(); + + if (!raw) { + lastError = + `memory-lancedb-pro: llm-client [${label}] empty OAuth response content from model ${config.model}`; + log(lastError); + return null; + } + + const jsonStr = extractJsonFromResponse(raw); + if (!jsonStr) { + lastError = + `memory-lancedb-pro: llm-client [${label}] no JSON object found in OAuth response (chars=${raw.length}, preview=${JSON.stringify(previewText(raw))})`; + log(lastError); + return null; + } + + try { + return JSON.parse(jsonStr) as T; + } catch (err) { + const repairedJsonStr = repairCommonJson(jsonStr); + if (repairedJsonStr !== jsonStr) { + try { + const repaired = JSON.parse(repairedJsonStr) as T; + log( + `memory-lancedb-pro: llm-client [${label}] recovered malformed OAuth JSON via heuristic repair (jsonChars=${jsonStr.length})`, + ); + return repaired; + } catch (repairErr) { + lastError = + `memory-lancedb-pro: llm-client [${label}] OAuth JSON.parse failed: ${err instanceof Error ? err.message : String(err)}; repair failed: ${repairErr instanceof Error ? repairErr.message : String(repairErr)} (jsonChars=${jsonStr.length}, jsonPreview=${JSON.stringify(previewText(jsonStr))})`; + log(lastError); + return null; + } + } + lastError = + `memory-lancedb-pro: llm-client [${label}] OAuth JSON.parse failed: ${err instanceof Error ? err.message : String(err)} (jsonChars=${jsonStr.length}, jsonPreview=${JSON.stringify(previewText(jsonStr))})`; + log(lastError); + return null; + } + } finally { + dispose(); + } + } catch (err) { + lastError = + `memory-lancedb-pro: llm-client [${label}] OAuth request failed for model ${config.model}: ${err instanceof Error ? err.message : String(err)}`; + log(lastError); return null; } }, + getLastError(): string | null { + return lastError; + }, }; } -export { extractJsonFromResponse }; +export function createLlmClient(config: LlmClientConfig): LlmClient { + const log = config.log ?? 
(() => {});
+  if (config.auth === "oauth") {
+    return createOauthClient(config, log);
+  }
+  return createApiKeyClient(config, log);
+}
+
+export { extractJsonFromResponse, repairCommonJson };
diff --git a/src/llm-oauth.ts b/src/llm-oauth.ts
new file mode 100644
index 00000000..65bd650b
--- /dev/null
+++ b/src/llm-oauth.ts
@@ -0,0 +1,675 @@
+import { createHash, randomBytes } from "node:crypto";
+import { createServer } from "node:http";
+import { mkdir, readFile, writeFile } from "node:fs/promises";
+import { dirname } from "node:path";
+import { platform } from "node:os";
+import { spawn } from "node:child_process";
+
+export interface OAuthLoginOptions {
+  authPath: string;
+  timeoutMs?: number;
+  noBrowser?: boolean;
+  model?: string;
+  providerId?: string;
+  onOpenUrl?: (url: string) => void | Promise<void>;
+  onAuthorizeUrl?: (url: string) => void | Promise<void>;
+}
+const EXPIRY_SKEW_MS = 60_000;
+
+export type OAuthProviderId = "openai-codex";
+
+interface OAuthProviderDefinition {
+  id: OAuthProviderId;
+  label: string;
+  authorizeUrl: string;
+  tokenUrl: string;
+  clientId: string;
+  redirectUri: string;
+  scope: string;
+  accountIdClaim: string;
+  backendBaseUrl: string;
+  defaultModel: string;
+  modelPattern: RegExp;
+  extraAuthorizeParams?: Record<string, string>;
+}
+
+const DEFAULT_OAUTH_PROVIDER_ID: OAuthProviderId = "openai-codex";
+const OAUTH_PROVIDER_ALIASES: Record<string, OAuthProviderId> = {
+  openai: "openai-codex",
+  codex: "openai-codex",
+  "openai-codex": "openai-codex",
+};
+const OAUTH_PROVIDERS: Record<OAuthProviderId, OAuthProviderDefinition> = {
+  "openai-codex": {
+    id: "openai-codex",
+    label: "OpenAI Codex",
+    authorizeUrl: "https://auth.openai.com/oauth/authorize",
+    tokenUrl: "https://auth.openai.com/oauth/token",
+    clientId: "app_EMoamEEZ73f0CkXaXp7hrann",
+    redirectUri: "http://localhost:1455/auth/callback",
+    scope: "openid profile email offline_access",
+    accountIdClaim: "https://api.openai.com/auth",
+    backendBaseUrl: "https://chatgpt.com/backend-api",
+    defaultModel: "gpt-5.4",
+    modelPattern: /^(gpt-|o[1345]\b|o\d-mini\b|gpt-5|gpt-4|gpt-4o|gpt-5-codex|gpt-5\.1-codex)/i,
+    extraAuthorizeParams: {
+      id_token_add_organizations: "true",
+      codex_cli_simplified_flow: "true",
+      originator: "codex_cli_rs",
+    },
+  },
+};
+
+export interface OAuthSession {
+  accessToken: string;
+  refreshToken?: string;
+  expiresAt?: number;
+  accountId: string;
+  providerId: OAuthProviderId;
+  authPath: string;
+}
+
+interface TokenRefreshResponse {
+  access_token?: string;
+  refresh_token?: string;
+  expires_in?: number;
+}
+
+function parseNumericTimestamp(value: unknown): number | undefined {
+  if (typeof value === "number" && Number.isFinite(value) && value > 0) {
+    return value > 1_000_000_000_000 ? value : value * 1000;
+  }
+
+  if (typeof value === "string") {
+    const trimmed = value.trim();
+    if (!trimmed) return undefined;
+    const parsed = Number(trimmed);
+    if (Number.isFinite(parsed) && parsed > 0) {
+      return parsed > 1_000_000_000_000 ? parsed : parsed * 1000;
+    }
+  }
+
+  return undefined;
+}
+
+function toBase64Url(value: Buffer): string {
+  return value.toString("base64url");
+}
+
+function createState(): string {
+  return randomBytes(16).toString("hex");
+}
+
+function createPkceVerifier(): string {
+  return toBase64Url(randomBytes(32));
+}
+
+function createPkceChallenge(verifier: string): string {
+  return createHash("sha256").update(verifier).digest("base64url");
+}
+
+export function listOAuthProviders(): Array<Pick<OAuthProviderDefinition, "id" | "label" | "defaultModel">> {
+  return Object.values(OAUTH_PROVIDERS).map((provider) => ({
+    id: provider.id,
+    label: provider.label,
+    defaultModel: provider.defaultModel,
+  }));
+}
+
+export function normalizeOAuthProviderId(providerId?: string): OAuthProviderId {
+  const raw = providerId?.trim().toLowerCase();
+  if (!raw) return DEFAULT_OAUTH_PROVIDER_ID;
+  const resolved = OAUTH_PROVIDER_ALIASES[raw];
+  if (resolved) return resolved;
+  const available = listOAuthProviders().map((provider) => provider.id).join(", ");
+  throw new Error(`Unsupported OAuth provider
"${providerId}". Available providers: ${available}`); +} + +export function getOAuthProvider(providerId?: string): OAuthProviderDefinition { + return OAUTH_PROVIDERS[normalizeOAuthProviderId(providerId)]; +} + +export function getOAuthProviderLabel(providerId?: string): string { + return getOAuthProvider(providerId).label; +} + +export function getDefaultOauthModelForProvider(providerId?: string): string { + return getOAuthProvider(providerId).defaultModel; +} + +export function isOauthModelSupported(providerId: string | undefined, value: string | undefined): boolean { + if (!value || !value.trim()) return false; + const provider = getOAuthProvider(providerId); + const trimmed = value.trim(); + const slashIndex = trimmed.indexOf("/"); + if (slashIndex !== -1) { + const modelProvider = trimmed.slice(0, slashIndex).trim().toLowerCase(); + if (provider.id === "openai-codex" && modelProvider !== "openai" && modelProvider !== "openai-codex") { + return false; + } + } + + return provider.modelPattern.test(normalizeOauthModel(trimmed)); +} + +function resolveOauthClientId(providerId?: string): string { + return process.env.MEMORY_PRO_OAUTH_CLIENT_ID?.trim() || getOAuthProvider(providerId).clientId; +} + +function resolveOauthAuthorizeUrl(providerId?: string): string { + return process.env.MEMORY_PRO_OAUTH_AUTHORIZE_URL?.trim() || getOAuthProvider(providerId).authorizeUrl; +} + +function resolveOauthTokenUrl(providerId?: string): string { + return process.env.MEMORY_PRO_OAUTH_TOKEN_URL?.trim() || getOAuthProvider(providerId).tokenUrl; +} + +function resolveOauthRedirectUri(providerId?: string): string { + return process.env.MEMORY_PRO_OAUTH_REDIRECT_URI?.trim() || getOAuthProvider(providerId).redirectUri; +} + +function buildAuthorizationUrl(state: string, verifier: string, providerId?: string): string { + const provider = getOAuthProvider(providerId); + const url = new URL(resolveOauthAuthorizeUrl(provider.id)); + url.searchParams.set("response_type", "code"); + 
url.searchParams.set("client_id", resolveOauthClientId(provider.id)); + url.searchParams.set("redirect_uri", resolveOauthRedirectUri(provider.id)); + url.searchParams.set("scope", provider.scope); + url.searchParams.set("code_challenge", createPkceChallenge(verifier)); + url.searchParams.set("code_challenge_method", "S256"); + url.searchParams.set("state", state); + for (const [key, value] of Object.entries(provider.extraAuthorizeParams || {})) { + url.searchParams.set(key, value); + } + return url.toString(); +} + +function buildSuccessHtml(): string { + return [ + "", + "", + "

memory-pro OAuth complete

", + "

You can close this window and return to your terminal.

", + "", + ].join(""); +} + +function buildErrorHtml(message: string): string { + return [ + "", + "", + "

memory-pro OAuth failed

", + `

${message}

`, + "", + ].join(""); +} + +function decodeJwtPayload(token: string): Record | null { + try { + const parts = token.split("."); + if (parts.length !== 3) return null; + return JSON.parse(Buffer.from(parts[1], "base64").toString("utf8")) as Record; + } catch { + return null; + } +} + +function getJwtExpiry(token: string): number | undefined { + const payload = decodeJwtPayload(token); + return parseNumericTimestamp(payload?.exp); +} + +function getJwtAccountId(token: string, providerId?: string): string | undefined { + const provider = getOAuthProvider(providerId); + const payload = decodeJwtPayload(token); + const claims = payload?.[provider.accountIdClaim]; + if (!claims || typeof claims !== "object") return undefined; + + const accountId = (claims as Record).chatgpt_account_id; + return typeof accountId === "string" && accountId.trim() ? accountId : undefined; +} + +function pickString(container: Record, keys: string[]): string | undefined { + for (const key of keys) { + const value = container[key]; + if (typeof value === "string" && value.trim()) { + return value.trim(); + } + } + return undefined; +} + +function pickTimestamp(container: Record, keys: string[]): number | undefined { + for (const key of keys) { + const parsed = parseNumericTimestamp(container[key]); + if (parsed) return parsed; + } + return undefined; +} + +function extractSessionFromObject(source: Record, authPath: string): OAuthSession | null { + const scopes: Record[] = [ + source, + typeof source.tokens === "object" && source.tokens ? source.tokens as Record : {}, + typeof source.oauth === "object" && source.oauth ? source.oauth as Record : {}, + typeof source.openai === "object" && source.openai ? source.openai as Record : {}, + typeof source.chatgpt === "object" && source.chatgpt ? source.chatgpt as Record : {}, + typeof source.auth === "object" && source.auth ? source.auth as Record : {}, + typeof source.credentials === "object" && source.credentials ? 
source.credentials as Record : {}, + ]; + + let accessToken: string | undefined; + let refreshToken: string | undefined; + let expiresAt: number | undefined; + let accountId: string | undefined; + const providerRaw = pickString(source, ["provider", "oauth_provider", "oauthProvider"]); + let providerId: OAuthProviderId; + try { + providerId = normalizeOAuthProviderId(providerRaw); + } catch { + return null; + } + + for (const scope of scopes) { + accessToken ||= pickString(scope, ["access_token", "accessToken", "access", "token"]); + refreshToken ||= pickString(scope, ["refresh_token", "refreshToken", "refresh"]); + expiresAt ||= pickTimestamp(scope, ["expires_at", "expiresAt", "expires", "expires_on"]); + accountId ||= pickString(scope, ["account_id", "accountId", "chatgpt_account_id", "chatgptAccountId"]); + } + + const apiKey = pickString(source, ["OPENAI_API_KEY", "api_key", "apiKey"]); + if (!accessToken && apiKey) { + return null; + } + + if (!accessToken) return null; + + accountId ||= getJwtAccountId(accessToken, providerId); + if (!accountId) return null; + + expiresAt ||= getJwtExpiry(accessToken); + + return { + accessToken, + refreshToken, + expiresAt, + accountId, + providerId, + authPath, + }; +} + +export async function loadOAuthSession(authPath: string): Promise { + let raw: string; + try { + raw = await readFile(authPath, "utf8"); + } catch (err) { + const reason = err instanceof Error ? err.message : String(err); + throw new Error( + `LLM OAuth requires a project OAuth file. Expected ${authPath}. Read failed: ${reason}`, + ); + } + + let parsed: unknown; + try { + parsed = JSON.parse(raw); + } catch (err) { + const reason = err instanceof Error ? 
+      err.message : String(err);
+    throw new Error(`Invalid project OAuth JSON at ${authPath}: ${reason}`);
+  }
+
+  if (!parsed || typeof parsed !== "object") {
+    throw new Error(`Invalid project OAuth file at ${authPath}: expected a JSON object`);
+  }
+
+  const session = extractSessionFromObject(parsed as Record<string, unknown>, authPath);
+  if (!session) {
+    throw new Error(
+      `Project OAuth file at ${authPath} does not contain an OAuth access token and ChatGPT account id.`,
+    );
+  }
+
+  return session;
+}
+
+export function needsRefresh(session: OAuthSession): boolean {
+  return !!session.refreshToken && !!session.expiresAt && session.expiresAt - EXPIRY_SKEW_MS <= Date.now();
+}
+
+function createTimeoutSignal(timeoutMs?: number): { signal: AbortSignal; dispose: () => void } {
+  const effectiveTimeoutMs =
+    typeof timeoutMs === "number" && Number.isFinite(timeoutMs) && timeoutMs > 0 ? timeoutMs : 30_000;
+  const controller = new AbortController();
+  const timer = setTimeout(() => controller.abort(), effectiveTimeoutMs);
+  return {
+    signal: controller.signal,
+    dispose: () => clearTimeout(timer),
+  };
+}
+
+export async function refreshOAuthSession(session: OAuthSession, timeoutMs?: number): Promise<OAuthSession> {
+  if (!session.refreshToken) {
+    throw new Error(
+      `OAuth session from ${session.authPath} is expired and has no refresh token.
+Re-run \`codex login\`.`,
+    );
+  }
+
+  const { signal, dispose } = createTimeoutSignal(timeoutMs);
+  try {
+    const response = await fetch(resolveOauthTokenUrl(session.providerId), {
+      method: "POST",
+      headers: {
+        "Content-Type": "application/x-www-form-urlencoded",
+      },
+      body: new URLSearchParams({
+        grant_type: "refresh_token",
+        refresh_token: session.refreshToken,
+        client_id: resolveOauthClientId(session.providerId),
+      }),
+      signal,
+    });
+
+    if (!response.ok) {
+      const detail = await response.text().catch(() => "");
+      throw new Error(`OAuth refresh failed (${response.status}): ${detail.slice(0, 500)}`);
+    }
+
+    const payload = await response.json() as TokenRefreshResponse;
+    if (!payload.access_token) {
+      throw new Error("OAuth refresh returned no access token");
+    }
+
+    const accessToken = payload.access_token;
+    const refreshToken = payload.refresh_token || session.refreshToken;
+    const expiresAt =
+      typeof payload.expires_in === "number"
+        ? Date.now() + payload.expires_in * 1000
+        : getJwtExpiry(accessToken);
+    const accountId = getJwtAccountId(accessToken, session.providerId) || session.accountId;
+
+    if (!accountId) {
+      throw new Error("OAuth refresh returned a token without a ChatGPT account id");
+    }
+
+    return {
+      accessToken,
+      refreshToken,
+      expiresAt,
+      accountId,
+      providerId: session.providerId,
+      authPath: session.authPath,
+    };
+  } finally {
+    dispose();
+  }
+}
+
+async function exchangeAuthorizationCode(code: string, verifier: string, providerId?: string): Promise<OAuthSession> {
+  const resolvedProviderId = normalizeOAuthProviderId(providerId);
+  const response = await fetch(resolveOauthTokenUrl(resolvedProviderId), {
+    method: "POST",
+    headers: {
+      "Content-Type": "application/x-www-form-urlencoded",
+    },
+    body: new URLSearchParams({
+      grant_type: "authorization_code",
+      client_id: resolveOauthClientId(resolvedProviderId),
+      code,
+      code_verifier: verifier,
+      redirect_uri: resolveOauthRedirectUri(resolvedProviderId),
+    }),
+  });
+
+  if
+  (!response.ok) {
+    const detail = await response.text().catch(() => "");
+    throw new Error(`OAuth token exchange failed (${response.status}): ${detail.slice(0, 500)}`);
+  }
+
+  const payload = await response.json() as TokenRefreshResponse;
+  if (!payload.access_token) {
+    throw new Error("OAuth token exchange returned no access token");
+  }
+
+  const accountId = getJwtAccountId(payload.access_token, resolvedProviderId);
+  if (!accountId) {
+    throw new Error("OAuth token exchange returned a token without a ChatGPT account id");
+  }
+
+  return {
+    accessToken: payload.access_token,
+    refreshToken: payload.refresh_token,
+    expiresAt:
+      typeof payload.expires_in === "number"
+        ? Date.now() + payload.expires_in * 1000
+        : getJwtExpiry(payload.access_token),
+    accountId,
+    providerId: resolvedProviderId,
+    authPath: "",
+  };
+}
+
+export async function saveOAuthSession(authPath: string, session: OAuthSession): Promise<void> {
+  await mkdir(dirname(authPath), { recursive: true });
+  const payload = {
+    provider: session.providerId,
+    type: "oauth",
+    access_token: session.accessToken,
+    refresh_token: session.refreshToken,
+    expires_at: session.expiresAt,
+    account_id: session.accountId,
+    updated_at: new Date().toISOString(),
+  };
+  await writeFile(authPath, JSON.stringify(payload, null, 2) + "\n", {
+    encoding: "utf8",
+    mode: 0o600,
+  });
+}
+
+function tryOpenBrowser(url: string): void {
+  const targetPlatform = platform();
+  if (targetPlatform === "darwin") {
+    const child = spawn("open", [url], { detached: true, stdio: "ignore" });
+    child.unref();
+    return;
+  }
+
+  if (targetPlatform === "win32") {
+    const child = spawn("cmd", ["/c", "start", "", url], { detached: true, stdio: "ignore" });
+    child.unref();
+    return;
+  }
+
+  const child = spawn("xdg-open", [url], { detached: true, stdio: "ignore" });
+  child.unref();
+}
+
+async function waitForAuthorizationCode(state: string, timeoutMs: number, providerId?: string): Promise<string> {
+  const redirectUri = new
+  URL(resolveOauthRedirectUri(providerId));
+  const listenPort = Number(redirectUri.port || 80);
+  const callbackPath = redirectUri.pathname || "/";
+  const listenHost = resolveOAuthCallbackListenHost(redirectUri);
+
+  return await new Promise<string>((resolve, reject) => {
+    const timer = setTimeout(() => {
+      server.close();
+      reject(new Error(`Timed out waiting for OAuth callback on ${redirectUri.origin}${callbackPath}`));
+    }, timeoutMs);
+
+    const server = createServer((req, res) => {
+      if (!req.url) {
+        res.writeHead(400, { "Content-Type": "text/html; charset=utf-8" });
+        res.end(buildErrorHtml("Missing callback URL."));
+        return;
+      }
+
+      const url = new URL(req.url, redirectUri.origin);
+      if (url.pathname !== callbackPath) {
+        res.writeHead(404, { "Content-Type": "text/html; charset=utf-8" });
+        res.end(buildErrorHtml("Unknown callback path."));
+        return;
+      }
+
+      const returnedState = url.searchParams.get("state");
+      const code = url.searchParams.get("code");
+      const error = url.searchParams.get("error");
+
+      if (error) {
+        clearTimeout(timer);
+        server.close();
+        res.writeHead(400, { "Content-Type": "text/html; charset=utf-8" });
+        res.end(buildErrorHtml(`Authorization failed: ${error}`));
+        reject(new Error(`OAuth authorization failed: ${error}`));
+        return;
+      }
+
+      if (!code || returnedState !== state) {
+        clearTimeout(timer);
+        server.close();
+        res.writeHead(400, { "Content-Type": "text/html; charset=utf-8" });
+        res.end(buildErrorHtml("Invalid authorization callback."));
+        reject(new Error("OAuth callback did not include a valid code/state pair"));
+        return;
+      }
+
+      clearTimeout(timer);
+      server.close();
+      res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });
+      res.end(buildSuccessHtml());
+      resolve(code);
+    });
+
+    server.on("error", (err) => {
+      clearTimeout(timer);
+      reject(err);
+    });
+
+    server.listen(listenPort, listenHost);
+  });
+}
+
+export function resolveOAuthCallbackListenHost(redirectUri: URL | string): string {
+  const parsed = typeof
redirectUri === "string" ? new URL(redirectUri) : redirectUri; + const hostname = parsed.hostname.trim(); + if (!hostname) return "127.0.0.1"; + return hostname.startsWith("[") && hostname.endsWith("]") ? hostname.slice(1, -1) : hostname; +} + +export async function performOAuthLogin(options: OAuthLoginOptions): Promise<{ session: OAuthSession; authorizeUrl: string }> { + const provider = getOAuthProvider(options.providerId); + const verifier = createPkceVerifier(); + const state = createState(); + const authorizeUrl = buildAuthorizationUrl(state, verifier, provider.id); + + await options.onAuthorizeUrl?.(authorizeUrl); + if (!options.noBrowser) { + if (options.onOpenUrl) { + await options.onOpenUrl(authorizeUrl); + } else { + try { + tryOpenBrowser(authorizeUrl); + } catch { + // Browser opening is best-effort; caller still receives the URL. + } + } + } + + const code = await waitForAuthorizationCode(state, options.timeoutMs ?? 120_000, provider.id); + const session = await exchangeAuthorizationCode(code, verifier, provider.id); + session.authPath = options.authPath; + await saveOAuthSession(options.authPath, session); + return { session, authorizeUrl }; +} + +export function normalizeOauthModel(model: string): string { + const trimmed = model.trim(); + if (!trimmed) return trimmed; + + const slashIndex = trimmed.indexOf("/"); + if (slashIndex === -1) return trimmed; + + const provider = trimmed.slice(0, slashIndex).trim().toLowerCase(); + const modelName = trimmed.slice(slashIndex + 1).trim(); + if (!modelName) return trimmed; + + if (provider === "openai" || provider === "openai-codex") { + return modelName; + } + + return trimmed; +} + +export function buildOauthEndpoint(baseURL?: string, providerId?: string): string { + const root = (baseURL?.trim() || getOAuthProvider(providerId).backendBaseUrl).replace(/\/+$/, ""); + if (root.endsWith("/codex/responses")) return root; + if (root.endsWith("/responses")) return root.replace(/\/responses$/, "/codex/responses"); 
+  return `${root}/codex/responses`;
+}
+
+function extractOutputTextFromResponsePayload(payload: unknown): string | null {
+  if (!payload || typeof payload !== "object") return null;
+
+  const response = payload as Record<string, unknown>;
+  const output = Array.isArray(response.output) ? response.output : null;
+  if (!output) return null;
+
+  const texts: string[] = [];
+  for (const item of output) {
+    if (!item || typeof item !== "object") continue;
+    const content = Array.isArray((item as Record<string, unknown>).content)
+      ? (item as Record<string, unknown>).content as Array<Record<string, unknown>>
+      : [];
+    for (const part of content) {
+      if (part?.type === "output_text" && typeof part.text === "string") {
+        texts.push(part.text);
+      }
+    }
+  }
+
+  return texts.length ? texts.join("\n") : null;
+}
+
+export function extractOutputTextFromSse(bodyText: string): string | null {
+  const chunks = bodyText.split(/\r?\n\r?\n/);
+  let deltas = "";
+
+  for (const chunk of chunks) {
+    const dataLines = chunk
+      .split(/\r?\n/)
+      .filter((line) => line.startsWith("data:"))
+      .map((line) => line.slice(5).trim());
+
+    if (!dataLines.length) continue;
+
+    const data = dataLines.join("\n");
+    if (!data || data === "[DONE]") continue;
+
+    let payload: unknown;
+    try {
+      payload = JSON.parse(data);
+    } catch {
+      continue;
+    }
+
+    if (!payload || typeof payload !== "object") continue;
+
+    const event = payload as Record<string, unknown>;
+    if (event.type === "response.output_text.delta" && typeof event.delta === "string") {
+      deltas += event.delta;
+      continue;
+    }
+
+    if (event.type === "response.output_text.done" && typeof event.text === "string") {
+      return event.text;
+    }
+
+    const nested = typeof event.response === "object" && event.response
+      ?
extractOutputTextFromResponsePayload(event.response) + : null; + if (nested) return nested; + + const direct = extractOutputTextFromResponsePayload(event); + if (direct) return direct; + } + + return deltas || null; +} diff --git a/src/memory-categories.ts b/src/memory-categories.ts index 65fb314e..7edc7f53 100644 --- a/src/memory-categories.ts +++ b/src/memory-categories.ts @@ -70,6 +70,8 @@ export type ExtractionStats = { created: number; merged: number; skipped: number; + rejected?: number; // admission control rejections + boundarySkipped?: number; supported?: number; // context-aware support count superseded?: number; // temporal fact replacements }; diff --git a/src/memory-compactor.ts b/src/memory-compactor.ts new file mode 100644 index 00000000..1c0b1ead --- /dev/null +++ b/src/memory-compactor.ts @@ -0,0 +1,403 @@ +/** + * Memory Compactor — Progressive Summarization + * + * Identifies clusters of semantically similar memories older than a configured + * age threshold and merges each cluster into a single, higher-quality entry. + * + * Implements the "progressive summarization" pattern: memories get more refined + * over time as related fragments are consolidated, reducing noise and improving + * retrieval quality without requiring an external LLM call. + * + * Algorithm: + * 1. Load memories older than `minAgeDays` (with vectors). + * 2. Build similarity clusters using greedy cosine-similarity expansion. + * 3. For each cluster >= `minClusterSize`, merge into one entry: + * - text: deduplicated lines joined with newlines + * - importance: max of cluster members (never downgrade) + * - category: plurality vote + * - scope: shared scope (all members must share one) + * - metadata: marked { compacted: true, sourceCount: N } + * 4. Delete source entries, store merged entry. 
+ */ + +import type { MemoryEntry } from "./store.js"; + +// ============================================================================ +// Types +// ============================================================================ + +export interface CompactionConfig { + /** Enable automatic compaction. Default: false */ + enabled: boolean; + /** Only compact memories at least this many days old. Default: 7 */ + minAgeDays: number; + /** Cosine similarity threshold for clustering [0, 1]. Default: 0.88 */ + similarityThreshold: number; + /** Minimum number of memories in a cluster to trigger merge. Default: 2 */ + minClusterSize: number; + /** Maximum memories to scan per compaction run. Default: 200 */ + maxMemoriesToScan: number; + /** Report plan without writing changes. Default: false */ + dryRun: boolean; + /** Run at most once per N hours (gateway_start guard). Default: 24 */ + cooldownHours: number; +} + +export interface CompactionEntry { + id: string; + text: string; + vector: number[]; + category: MemoryEntry["category"]; + scope: string; + importance: number; + timestamp: number; + metadata: string; +} + +export interface ClusterPlan { + /** Indices into the input entries array */ + memberIndices: number[]; + /** Proposed merged entry (without id/vector — computed by caller) */ + merged: { + text: string; + importance: number; + category: MemoryEntry["category"]; + scope: string; + metadata: string; + }; +} + +export interface CompactionResult { + /** Memories scanned (limited by maxMemoriesToScan) */ + scanned: number; + /** Clusters found with >= minClusterSize members */ + clustersFound: number; + /** Source memories deleted (0 when dryRun) */ + memoriesDeleted: number; + /** Merged memories created (0 when dryRun) */ + memoriesCreated: number; + /** Whether this was a dry run */ + dryRun: boolean; +} + +// ============================================================================ +// Math helpers +// 
============================================================================ + +/** Dot product of two equal-length vectors. */ +function dot(a: number[], b: number[]): number { + let s = 0; + for (let i = 0; i < a.length; i++) s += a[i] * b[i]; + return s; +} + +/** L2 norm of a vector. */ +function norm(v: number[]): number { + return Math.sqrt(dot(v, v)); +} + +/** + * Cosine similarity in [0, 1]. + * Returns 0 if either vector has zero norm (avoids NaN). + */ +export function cosineSimilarity(a: number[], b: number[]): number { + if (a.length === 0 || a.length !== b.length) return 0; + const na = norm(a); + const nb = norm(b); + if (na === 0 || nb === 0) return 0; + return Math.max(0, Math.min(1, dot(a, b) / (na * nb))); +} + +// ============================================================================ +// Cluster building +// ============================================================================ + +/** + * Greedy cluster expansion. + * + * Sort entries by importance DESC so the most valuable memory seeds each + * cluster. Expand each seed by collecting every unassigned entry whose + * cosine similarity with the seed is >= threshold. + * + * Returns an array of index-arrays (each inner array = one cluster). + * Only clusters with >= minClusterSize entries are returned. 
+ */ +export function buildClusters( + entries: CompactionEntry[], + threshold: number, + minClusterSize: number, +): ClusterPlan[] { + if (entries.length < minClusterSize) return []; + + // Sort indices by importance desc (highest importance seeds first) + const order = entries + .map((_, i) => i) + .sort((a, b) => entries[b].importance - entries[a].importance); + + const assigned = new Uint8Array(entries.length); // 0 = unassigned + const plans: ClusterPlan[] = []; + + for (const seedIdx of order) { + if (assigned[seedIdx]) continue; + + const cluster: number[] = [seedIdx]; + assigned[seedIdx] = 1; + + const seedVec = entries[seedIdx].vector; + if (seedVec.length === 0) continue; // skip entries without vectors + + for (let j = 0; j < entries.length; j++) { + if (assigned[j]) continue; + const jVec = entries[j].vector; + if (jVec.length === 0) continue; + if (cosineSimilarity(seedVec, jVec) >= threshold) { + cluster.push(j); + assigned[j] = 1; + } + } + + if (cluster.length >= minClusterSize) { + const members = cluster.map((i) => entries[i]); + plans.push({ + memberIndices: cluster, + merged: buildMergedEntry(members), + }); + } + } + + return plans; +} + +// ============================================================================ +// Merge strategy +// ============================================================================ + +/** + * Merge a cluster of entries into a single proposed entry. + * + * Text strategy: deduplicate lines across all member texts, join with newline. + * This preserves all unique information while removing redundancy. + * + * Importance: max across cluster (never downgrade). + * Category: plurality vote; ties broken by member with highest importance. + * Scope: all members must share a scope (validated upstream). 
+ */
+export function buildMergedEntry(
+  members: CompactionEntry[],
+): ClusterPlan["merged"] {
+  // --- text: deduplicate lines ---
+  const seen = new Set<string>();
+  const lines: string[] = [];
+  for (const m of members) {
+    for (const line of m.text.split("\n")) {
+      const trimmed = line.trim();
+      if (trimmed && !seen.has(trimmed.toLowerCase())) {
+        seen.add(trimmed.toLowerCase());
+        lines.push(trimmed);
+      }
+    }
+  }
+  const text = lines.join("\n");
+
+  // --- importance: max ---
+  const importance = Math.min(
+    1.0,
+    Math.max(...members.map((m) => m.importance)),
+  );
+
+  // --- category: plurality vote ---
+  const counts = new Map<string, number>();
+  for (const m of members) {
+    counts.set(m.category, (counts.get(m.category) ?? 0) + 1);
+  }
+  let category: MemoryEntry["category"] = "other";
+  let best = 0;
+  for (const [cat, count] of counts) {
+    if (count > best) {
+      best = count;
+      category = cat as MemoryEntry["category"];
+    }
+  }
+
+  // --- scope: use the first (all should match) ---
+  const scope = members[0].scope;
+
+  // --- metadata ---
+  const metadata = JSON.stringify({
+    compacted: true,
+    sourceCount: members.length,
+    compactedAt: Date.now(),
+  });
+
+  return { text, importance, category, scope, metadata };
+}
+
+// ============================================================================
+// Minimal store interface (duck-typed so no circular import)
+// ============================================================================
+
+export interface CompactorStore {
+  fetchForCompaction(
+    maxTimestamp: number,
+    scopeFilter?: string[],
+    limit?: number,
+  ): Promise<CompactionEntry[]>;
+  store(entry: {
+    text: string;
+    vector: number[];
+    importance: number;
+    category: MemoryEntry["category"];
+    scope: string;
+    metadata?: string;
+  }): Promise<string>;
+  delete(id: string, scopeFilter?: string[]): Promise<boolean>;
+}
+
+export interface CompactorEmbedder {
+  embedPassage(text: string): Promise<number[]>;
+}
+
+export interface CompactorLogger {
+  info(msg: string): void;
+  warn(msg: string): void;
+}
+
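The clustering threshold above operates on plain cosine similarity, so near-duplicate memories score close to 1 and unrelated ones near 0. A minimal standalone sketch of that scoring (the `cosine` function below mirrors the `cosineSimilarity` helper in this file but is re-implemented here so the example runs on its own; it is illustrative, not an export of the module):

```typescript
// Standalone mirror of the module's cosineSimilarity helper (illustrative only).
function cosine(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) return 0;
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  if (na === 0 || nb === 0) return 0; // zero-norm guard avoids NaN
  return Math.max(0, Math.min(1, dot / (Math.sqrt(na) * Math.sqrt(nb))));
}

// Identical vectors cluster together at any threshold; orthogonal ones never do.
console.log(cosine([1, 0], [1, 0])); // 1
console.log(cosine([1, 0], [0, 1])); // 0
```

With a `similarityThreshold` of 0.88, only embeddings that point in almost the same direction end up in one cluster, which is why the greedy expansion can safely merge them.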
+// ============================================================================
+// Main runner
+// ============================================================================
+
+/**
+ * Run a single compaction pass over memories in the given scopes.
+ *
+ * @param store Storage backend (must support fetchForCompaction + store + delete)
+ * @param embedder Used to embed merged text before storage
+ * @param config Compaction configuration
+ * @param scopes Scope filter; undefined = all scopes
+ * @param logger Optional logger
+ */
+export async function runCompaction(
+  store: CompactorStore,
+  embedder: CompactorEmbedder,
+  config: CompactionConfig,
+  scopes?: string[],
+  logger?: CompactorLogger,
+): Promise<CompactionResult> {
+  const cutoff = Date.now() - config.minAgeDays * 24 * 60 * 60 * 1000;
+
+  const entries = await store.fetchForCompaction(
+    cutoff,
+    scopes,
+    config.maxMemoriesToScan,
+  );
+
+  if (entries.length === 0) {
+    return {
+      scanned: 0,
+      clustersFound: 0,
+      memoriesDeleted: 0,
+      memoriesCreated: 0,
+      dryRun: config.dryRun,
+    };
+  }
+
+  // Filter out entries without vectors (shouldn't happen but be safe)
+  const valid = entries.filter((e) => e.vector && e.vector.length > 0);
+
+  const plans = buildClusters(
+    valid,
+    config.similarityThreshold,
+    config.minClusterSize,
+  );
+
+  if (config.dryRun) {
+    logger?.info(
+      `memory-compactor [dry-run]: scanned=${valid.length} clusters=${plans.length}`,
+    );
+    return {
+      scanned: valid.length,
+      clustersFound: plans.length,
+      memoriesDeleted: 0,
+      memoriesCreated: 0,
+      dryRun: true,
+    };
+  }
+
+  let memoriesDeleted = 0;
+  let memoriesCreated = 0;
+
+  for (const plan of plans) {
+    const members = plan.memberIndices.map((i) => valid[i]);
+
+    try {
+      // Embed the merged text
+      const vector = await embedder.embedPassage(plan.merged.text);
+
+      // Store merged entry
+      await store.store({
+        text: plan.merged.text,
+        vector,
+        importance: plan.merged.importance,
+        category: plan.merged.category,
+        scope: plan.merged.scope,
+        metadata: plan.merged.metadata,
+      });
+      memoriesCreated++;
+
+      // Delete source entries
+      for (const m of members) {
+        const deleted = await store.delete(m.id);
+        if (deleted) memoriesDeleted++;
+      }
+    } catch (err) {
+      logger?.warn(
+        `memory-compactor: failed to merge cluster of ${members.length}: ${String(err)}`,
+      );
+    }
+  }
+
+  logger?.info(
+    `memory-compactor: scanned=${valid.length} clusters=${plans.length} ` +
+      `deleted=${memoriesDeleted} created=${memoriesCreated}`,
+  );
+
+  return {
+    scanned: valid.length,
+    clustersFound: plans.length,
+    memoriesDeleted,
+    memoriesCreated,
+    dryRun: false,
+  };
+}
+
+// ============================================================================
+// Cooldown helper
+// ============================================================================
+
+/**
+ * Check whether enough time has passed since the last compaction run.
+ * Uses a simple JSON file at `stateFile` to persist the last-run timestamp.
+ */
+export async function shouldRunCompaction(
+  stateFile: string,
+  cooldownHours: number,
+): Promise<boolean> {
+  try {
+    const { readFile } = await import("node:fs/promises");
+    const raw = await readFile(stateFile, "utf8");
+    const state = JSON.parse(raw) as { lastRunAt?: number };
+    if (typeof state.lastRunAt === "number") {
+      const elapsed = Date.now() - state.lastRunAt;
+      return elapsed >= cooldownHours * 60 * 60 * 1000;
+    }
+  } catch {
+    // File doesn't exist or is malformed — treat as never run
+  }
+  return true;
+}
+
+export async function recordCompactionRun(stateFile: string): Promise<void> {
+  const { writeFile, mkdir } = await import("node:fs/promises");
+  const { dirname } = await import("node:path");
+  await mkdir(dirname(stateFile), { recursive: true });
+  await writeFile(stateFile, JSON.stringify({ lastRunAt: Date.now() }), "utf8");
+}
diff --git a/src/memory-upgrader.ts b/src/memory-upgrader.ts
index 6ad550b9..c6421ed6 100644
--- a/src/memory-upgrader.ts
+++ b/src/memory-upgrader.ts
@@ -315,7 +315,8 @@ export class
MemoryUpgrader {
       }>(prompt);
 
       if (!llmResult) {
-        throw new Error("LLM returned null");
+        const detail = this.llm.getLastError();
+        throw new Error(detail || "LLM returned null");
       }
 
       enriched = {
diff --git a/src/preference-slots.ts b/src/preference-slots.ts
new file mode 100644
index 00000000..300c5d57
--- /dev/null
+++ b/src/preference-slots.ts
@@ -0,0 +1,76 @@
+const ROLE_PREFIX_RE = /^\[(用户|助手)\]\s*/gm;
+const PREFERENCE_SPLIT_RE = /(?:、|,|,|\/|以及|及|与|和| and | & )/iu;
+const PREFERENCE_CLAUSE_STOP_RE = /(?:因为|所以|但是|不过|if |when |because |but )/iu;
+const BRAND_ITEM_PREFERENCE_PATTERNS = [
+  /(?:^|[\s,,。;;!!??])(?:我|用户)?(?:很|更|还)?(?:喜欢|爱吃|偏爱|常吃|想吃)(?:吃|喝|用|买)?(?<brand>[\p{Script=Han}A-Za-z0-9&·'\-]{1,24})的(?<items>[\p{Script=Han}A-Za-z0-9&·'\-\s、,,和及与/]{1,80})/u,
+  /\b(?:i|user)?\s*(?:really\s+|still\s+|also\s+)?(?:like|love|prefer|enjoy)\s+(?<items>[a-z0-9'&\-\s]{1,80})\s+from\s+(?<brand>[a-z0-9'&\-\s]{1,40})/iu,
+] as const;
+
+export interface ParsedBrandItemPreference {
+  brand: string;
+  items: string[];
+  aggregate: boolean;
+}
+
+export interface AtomicBrandItemPreferenceSlot {
+  type: "brand-item";
+  brand: string;
+  item: string;
+}
+
+function normalizePreferenceText(value: string): string {
+  return value
+    .replace(ROLE_PREFIX_RE, "")
+    .replace(/\s+/g, " ")
+    .trim();
+}
+
+export function normalizePreferenceToken(value: string): string {
+  return normalizePreferenceText(value)
+    .replace(/^[“"'`‘’]+|[”"'`‘’。!!??,,;;::]+$/gu, "")
+    .replace(/\b(?:the|a|an)\s+/giu, "")
+    .replace(/\s+/g, "")
+    .toLowerCase();
+}
+
+function splitPreferenceItems(rawItems: string): string[] {
+  const trimmed = rawItems.split(PREFERENCE_CLAUSE_STOP_RE)[0] || rawItems;
+  return trimmed
+    .split(PREFERENCE_SPLIT_RE)
+    .map((item) => normalizePreferenceToken(item))
+    .filter((item) => item.length > 0);
+}
+
+export function parseBrandItemPreference(text: string): ParsedBrandItemPreference | null {
+  const normalizedText = normalizePreferenceText(text);
+
+  for (const pattern of
BRAND_ITEM_PREFERENCE_PATTERNS) { + const match = normalizedText.match(pattern); + if (!match?.groups) continue; + + const brand = normalizePreferenceToken(match.groups.brand || ""); + const items = splitPreferenceItems(match.groups.items || ""); + if (!brand || items.length === 0) continue; + + return { + brand, + items, + aggregate: items.length > 1, + }; + } + + return null; +} + +export function inferAtomicBrandItemPreferenceSlot(text: string): AtomicBrandItemPreferenceSlot | null { + const parsed = parseBrandItemPreference(text); + if (!parsed || parsed.aggregate || parsed.items.length !== 1) { + return null; + } + + return { + type: "brand-item", + brand: parsed.brand, + item: parsed.items[0], + }; +} diff --git a/src/retrieval-stats.ts b/src/retrieval-stats.ts new file mode 100644 index 00000000..60994040 --- /dev/null +++ b/src/retrieval-stats.ts @@ -0,0 +1,152 @@ +/** + * Retrieval Statistics — Aggregate query metrics + * + * Collects per-query traces and produces aggregate statistics + * for monitoring retrieval quality and performance. 
+ */
+
+import type { RetrievalTrace } from "./retrieval-trace.js";
+
+// ============================================================================
+// Types
+// ============================================================================
+
+export interface AggregateStats {
+  /** Total number of queries recorded */
+  totalQueries: number;
+  /** Number of queries that returned zero results */
+  zeroResultQueries: number;
+  /** Average latency across all queries (ms) */
+  avgLatencyMs: number;
+  /** 95th percentile latency (ms) */
+  p95LatencyMs: number;
+  /** Average number of results returned */
+  avgResultCount: number;
+  /** Number of queries where reranking was applied */
+  rerankUsed: number;
+  /** Number of queries where noise filter removed results */
+  noiseFiltered: number;
+  /** Query counts broken down by source */
+  queriesBySource: Record<string, number>;
+  /** Stages that drop the most entries across all queries */
+  topDropStages: { name: string; totalDropped: number }[];
+}
+
+// ============================================================================
+// RetrievalStatsCollector
+// ============================================================================
+
+interface QueryRecord {
+  trace: RetrievalTrace;
+  source: string;
+}
+
+export class RetrievalStatsCollector {
+  private _records: QueryRecord[] = [];
+  private readonly _maxRecords: number;
+
+  constructor(maxRecords = 1000) {
+    this._maxRecords = maxRecords;
+  }
+
+  /**
+   * Record a completed query trace.
+   * @param trace - The finalized retrieval trace
+   * @param source - Query source identifier (e.g. "manual", "auto-recall")
+   */
+  recordQuery(trace: RetrievalTrace, source: string): void {
+    this._records.push({ trace, source });
+    // Evict oldest if over capacity
+    if (this._records.length > this._maxRecords) {
+      this._records.shift();
+    }
+  }
+
+  /**
+   * Compute aggregate statistics from all recorded queries.
+   */
+  getStats(): AggregateStats {
+    const n = this._records.length;
+    if (n === 0) {
+      return {
+        totalQueries: 0,
+        zeroResultQueries: 0,
+        avgLatencyMs: 0,
+        p95LatencyMs: 0,
+        avgResultCount: 0,
+        rerankUsed: 0,
+        noiseFiltered: 0,
+        queriesBySource: {},
+        topDropStages: [],
+      };
+    }
+
+    let totalLatency = 0;
+    let totalResults = 0;
+    let zeroResultQueries = 0;
+    let rerankUsed = 0;
+    let noiseFiltered = 0;
+    const latencies: number[] = [];
+    const queriesBySource: Record<string, number> = {};
+    const dropsByStage: Record<string, number> = {};
+
+    for (const { trace, source } of this._records) {
+      totalLatency += trace.totalMs;
+      totalResults += trace.finalCount;
+      latencies.push(trace.totalMs);
+
+      if (trace.finalCount === 0) {
+        zeroResultQueries++;
+      }
+
+      queriesBySource[source] = (queriesBySource[source] || 0) + 1;
+
+      for (const stage of trace.stages) {
+        const dropped = stage.inputCount - stage.outputCount;
+        if (dropped > 0) {
+          dropsByStage[stage.name] = (dropsByStage[stage.name] || 0) + dropped;
+        }
+        if (stage.name === "rerank") {
+          rerankUsed++;
+        }
+        if (stage.name === "noise_filter" && dropped > 0) {
+          noiseFiltered++;
+        }
+      }
+    }
+
+    // Sort latencies for percentile calculation
+    latencies.sort((a, b) => a - b);
+    const p95Index = Math.min(Math.ceil(n * 0.95) - 1, n - 1);
+
+    // Top drop stages sorted by total dropped descending
+    const topDropStages = Object.entries(dropsByStage)
+      .map(([name, totalDropped]) => ({ name, totalDropped }))
+      .sort((a, b) => b.totalDropped - a.totalDropped)
+      .slice(0, 5);
+
+    return {
+      totalQueries: n,
+      zeroResultQueries,
+      avgLatencyMs: Math.round(totalLatency / n),
+      p95LatencyMs: latencies[p95Index],
+      avgResultCount: Math.round((totalResults / n) * 10) / 10,
+      rerankUsed,
+      noiseFiltered,
+      queriesBySource,
+      topDropStages,
+    };
+  }
+
+  /**
+   * Reset all collected statistics.
+   */
+  reset(): void {
+    this._records = [];
+  }
+
+  /** Number of recorded queries.
+ */
+  get count(): number {
+    return this._records.length;
+  }
+}
diff --git a/src/retrieval-trace.ts b/src/retrieval-trace.ts
new file mode 100644
index 00000000..a17af36e
--- /dev/null
+++ b/src/retrieval-trace.ts
@@ -0,0 +1,173 @@
+/**
+ * Retrieval Trace — Observable pipeline diagnostics
+ *
+ * Tracks entry IDs through each retrieval stage, computes drops,
+ * score ranges, and timing. Zero overhead when not used.
+ */
+
+// ============================================================================
+// Types
+// ============================================================================
+
+export interface RetrievalStageResult {
+  /** Stage name, e.g. "vector_search", "bm25_search", "rrf_fusion" */
+  name: string;
+  /** Number of entries entering this stage */
+  inputCount: number;
+  /** Number of entries surviving this stage */
+  outputCount: number;
+  /** IDs that were present in input but not in output */
+  droppedIds: string[];
+  /** [min, max] score range of surviving entries, null if no scores */
+  scoreRange: [number, number] | null;
+  /** Wall-clock duration of this stage in milliseconds */
+  durationMs: number;
+}
+
+export interface RetrievalTrace {
+  /** The original search query */
+  query: string;
+  /** Retrieval mode used */
+  mode: "hybrid" | "vector" | "bm25";
+  /** Timestamp when retrieval started (epoch ms) */
+  startedAt: number;
+  /** Per-stage results in pipeline order */
+  stages: RetrievalStageResult[];
+  /** Number of results after all stages */
+  finalCount: number;
+  /** Total wall-clock time in milliseconds */
+  totalMs: number;
+}
+
+// ============================================================================
+// TraceCollector
+// ============================================================================
+
+interface PendingStage {
+  name: string;
+  inputIds: Set<string>;
+  startTime: number;
+}
+
+export class TraceCollector {
+  private readonly _startTime: number;
+  private readonly _stages: RetrievalStageResult[] = [];
+  private
_pending: PendingStage | null = null; + + constructor() { + this._startTime = Date.now(); + } + + /** + * Begin tracking a pipeline stage. + * @param name - Stage identifier (e.g. "vector_search") + * @param entryIds - IDs of entries entering this stage + */ + startStage(name: string, entryIds: string[]): void { + // Auto-close any unclosed previous stage (defensive) + if (this._pending) { + this.endStage([...this._pending.inputIds]); + } + this._pending = { + name, + inputIds: new Set(entryIds), + startTime: Date.now(), + }; + } + + /** + * End the current stage. + * @param survivingIds - IDs of entries that survived this stage + * @param scores - Optional scores for surviving entries (parallel to survivingIds) + */ + endStage(survivingIds: string[], scores?: number[]): void { + if (!this._pending) return; + + const { name, inputIds, startTime } = this._pending; + const survivingSet = new Set(survivingIds); + + const droppedIds: string[] = []; + for (const id of inputIds) { + if (!survivingSet.has(id)) { + droppedIds.push(id); + } + } + + let scoreRange: [number, number] | null = null; + if (scores && scores.length > 0) { + let min = Infinity; + let max = -Infinity; + for (const s of scores) { + if (s < min) min = s; + if (s > max) max = s; + } + scoreRange = [min, max]; + } + + this._stages.push({ + name, + inputCount: inputIds.size, + outputCount: survivingIds.length, + droppedIds, + scoreRange, + durationMs: Date.now() - startTime, + }); + + this._pending = null; + } + + /** + * Finalize the trace and produce the complete RetrievalTrace object. + */ + finalize(query: string, mode: string): RetrievalTrace { + // Auto-close any unclosed stage + if (this._pending) { + this.endStage([...this._pending.inputIds]); + } + + const lastStage = this._stages[this._stages.length - 1]; + return { + query, + mode: mode as "hybrid" | "vector" | "bm25", + startedAt: this._startTime, + stages: this._stages, + finalCount: lastStage ? 
lastStage.outputCount : 0, + totalMs: Date.now() - this._startTime, + }; + } + + /** + * Produce a human-readable summary of the trace. + */ + summarize(): string { + const lines: string[] = []; + lines.push(`Retrieval trace (${this._stages.length} stages):`); + for (const stage of this._stages) { + const dropped = stage.inputCount - stage.outputCount; + const scoreStr = stage.scoreRange + ? ` scores=[${stage.scoreRange[0].toFixed(3)}, ${stage.scoreRange[1].toFixed(3)}]` + : ""; + lines.push( + ` ${stage.name}: ${stage.inputCount} -> ${stage.outputCount} (-${dropped}) ${stage.durationMs}ms${scoreStr}`, + ); + if (stage.droppedIds.length > 0 && stage.droppedIds.length <= 5) { + lines.push(` dropped: ${stage.droppedIds.join(", ")}`); + } else if (stage.droppedIds.length > 5) { + lines.push( + ` dropped: ${stage.droppedIds.slice(0, 5).join(", ")} (+${stage.droppedIds.length - 5} more)`, + ); + } + } + const lastStage = this._stages[this._stages.length - 1]; + const totalMs = Date.now() - this._startTime; + lines.push( + ` total: ${totalMs}ms, final count: ${lastStage ? lastStage.outputCount : 0}`, + ); + return lines.join("\n"); + } + + /** Access collected stages (read-only). 
*/ + get stages(): readonly RetrievalStageResult[] { + return this._stages; + } +} diff --git a/src/retriever.ts b/src/retriever.ts index 484d6a20..900db753 100644 --- a/src/retriever.ts +++ b/src/retriever.ts @@ -19,6 +19,8 @@ import { parseSmartMetadata, toLifecycleMemory, } from "./smart-metadata.js"; +import { TraceCollector, type RetrievalTrace } from "./retrieval-trace.js"; +import { RetrievalStatsCollector } from "./retrieval-stats.js"; // ============================================================================ // Types & Configuration @@ -47,8 +49,15 @@ export interface RetrievalConfig { * - "jina" (default): Authorization: Bearer, string[] documents, results[].relevance_score * - "siliconflow": same format as jina (alias, for clarity) * - "voyage": Authorization: Bearer, string[] documents, data[].relevance_score - * - "pinecone": Api-Key header, {text}[] documents, data[].score */ - rerankProvider?: "jina" | "siliconflow" | "voyage" | "pinecone" | "dashscope"; + * - "pinecone": Api-Key header, {text}[] documents, data[].score + * - "tei": Authorization: Bearer, string[] texts, top-level [{ index, score }] */ + rerankProvider?: + | "jina" + | "siliconflow" + | "voyage" + | "pinecone" + | "dashscope" + | "tei"; /** * Length normalization: penalize long entries that dominate via sheer keyword * density. Formula: score *= 1 / (1 + log2(charLen / anchor)). @@ -78,6 +87,10 @@ export interface RetrievalConfig { /** Maximum half-life multiplier from access reinforcement. * Prevents frequently accessed memories from becoming immortal. (default: 3) */ maxHalfLifeMultiplier: number; + /** Tag prefixes for exact-match queries (default: ["proj", "env", "team", "scope"]). + * Queries containing these prefixes (e.g. "proj:AIF") will use BM25-only + mustContain + * to avoid semantic false positives from vector search. 
*/ + tagPrefixes: string[]; } export interface RetrievalContext { @@ -119,6 +132,7 @@ export const DEFAULT_RETRIEVAL_CONFIG: RetrievalConfig = { timeDecayHalfLifeDays: 60, reinforcementFactor: 0.5, maxHalfLifeMultiplier: 3, + tagPrefixes: ["proj", "env", "team", "scope"], }; // ============================================================================ @@ -144,7 +158,13 @@ function clamp01WithFloor(value: number, floor: number): number { // Rerank Provider Adapters // ============================================================================ -type RerankProvider = "jina" | "siliconflow" | "voyage" | "pinecone" | "dashscope"; +type RerankProvider = + | "jina" + | "siliconflow" + | "voyage" + | "pinecone" + | "dashscope" + | "tei"; interface RerankItem { index: number; @@ -157,10 +177,21 @@ function buildRerankRequest( apiKey: string, model: string, query: string, - documents: string[], + candidates: string[], topN: number, ): { headers: Record<string, string>; body: Record<string, unknown> } { switch (provider) { + case "tei": + return { + headers: { + "Content-Type": "application/json", + Authorization: `Bearer ${apiKey}`, + }, + body: { + query, + texts: candidates, + }, + }; case "dashscope": // DashScope wraps query+documents under `input` and does not use top_n. // Endpoint: https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank @@ -173,7 +204,7 @@ function buildRerankRequest( model, input: { query, - documents, + documents: candidates, }, }, }; @@ -187,7 +218,7 @@ function buildRerankRequest( body: { model, query, - documents: documents.map((text) => ({ text })), + documents: candidates.map((text) => ({ text })), top_n: topN, rank_fields: ["text"], }, @@ -201,7 +232,7 @@ function buildRerankRequest( body: { model, query, - documents, + documents: candidates, // Voyage uses top_k (not top_n) to limit reranked outputs. 
top_k: topN, }, @@ -217,7 +248,7 @@ function buildRerankRequest( body: { model, query, - documents, + documents: candidates, top_n: topN, }, }; @@ -227,7 +258,7 @@ /** Parse provider-specific response into unified format */ function parseRerankResponse( provider: RerankProvider, - data: Record<string, unknown>, + data: unknown, ): RerankItem[] | null { const parseItems = ( items: unknown, @@ -253,31 +284,41 @@ } return parsed.length > 0 ? parsed : null; }; + const objectData = + data && typeof data === "object" && !Array.isArray(data) + ? (data as Record<string, unknown>) + : undefined; switch (provider) { + case "tei": + return ( + parseItems(data, ["score", "relevance_score"]) ?? + parseItems(objectData?.results, ["score", "relevance_score"]) ?? + parseItems(objectData?.data, ["score", "relevance_score"]) + ); case "dashscope": { // DashScope: { output: { results: [{ index, relevance_score }] } } - const output = data.output as Record<string, unknown> | undefined; + const output = objectData?.output as Record<string, unknown> | undefined; if (output) { return parseItems(output.results, ["relevance_score", "score"]); } // Fallback: try top-level results in case API format changes - return parseItems(data.results, ["relevance_score", "score"]); + return parseItems(objectData?.results, ["relevance_score", "score"]); } case "pinecone": { // Pinecone: usually { data: [{ index, score, ... }] } // Also tolerate results[] with score/relevance_score for robustness. return ( - parseItems(data.data, ["score", "relevance_score"]) ?? - parseItems(data.results, ["score", "relevance_score"]) + parseItems(objectData?.data, ["score", "relevance_score"]) ?? + parseItems(objectData?.results, ["score", "relevance_score"]) ); } case "voyage": { // Voyage: usually { data: [{ index, relevance_score }] } // Also tolerate results[] for compatibility across gateways. return ( - parseItems(data.data, ["relevance_score", "score"]) ?? 
- parseItems(data.results, ["relevance_score", "score"]) + parseItems(objectData?.data, ["relevance_score", "score"]) ?? + parseItems(objectData?.results, ["relevance_score", "score"]) ); } case "siliconflow": @@ -286,8 +327,8 @@ function parseRerankResponse( // Jina / SiliconFlow: usually { results: [{ index, relevance_score }] } // Also tolerate data[] for compatibility across gateways. return ( - parseItems(data.results, ["relevance_score", "score"]) ?? - parseItems(data.data, ["relevance_score", "score"]) + parseItems(objectData?.results, ["relevance_score", "score"]) ?? + parseItems(objectData?.data, ["relevance_score", "score"]) ); } } @@ -320,6 +361,7 @@ function cosineSimilarity(a: number[], b: number[]): number { export class MemoryRetriever { private accessTracker: AccessTracker | null = null; private tierManager: TierManager | null = null; + private _statsCollector: RetrievalStatsCollector | null = null; constructor( private store: MemoryStore, @@ -332,6 +374,16 @@ export class MemoryRetriever { this.accessTracker = tracker; } + /** Enable aggregate retrieval statistics collection. */ + setStatsCollector(collector: RetrievalStatsCollector): void { + this._statsCollector = collector; + } + + /** Get the stats collector (if set). */ + getStatsCollector(): RetrievalStatsCollector | null { + return this._statsCollector; + } + private filterActiveResults<T extends { entry: RetrievalResult["entry"] }>(results: T[]): T[] { return results.filter((result) => isMemoryActiveAt(parseSmartMetadata(result.entry.metadata, result.entry)), ); } @@ -342,23 +394,35 @@ const { query, limit, scopeFilter, category, source } = context; const safeLimit = clampInt(limit, 1, 20); + // Create trace only when stats collector is active (zero overhead otherwise) + const trace = this._statsCollector ? 
new TraceCollector() : undefined; + + // Check if query contains tag prefixes -> use BM25-only + mustContain + const tagTokens = this.extractTagTokens(query); let results: RetrievalResult[]; - if (this.config.mode === "vector" || !this.store.hasFtsSupport) { + + if (tagTokens.length > 0) { + results = await this.bm25OnlyRetrieval( + query, tagTokens, safeLimit, scopeFilter, category, trace, + ); + } else if (this.config.mode === "vector" || !this.store.hasFtsSupport) { results = await this.vectorOnlyRetrieval( - query, - safeLimit, - scopeFilter, - category, + query, safeLimit, scopeFilter, category, trace, ); } else { results = await this.hybridRetrieval( - query, - safeLimit, - scopeFilter, - category, + query, safeLimit, scopeFilter, category, trace, ); } + // Feed completed trace to stats collector + if (trace && this._statsCollector) { + const mode = tagTokens.length > 0 ? "bm25" + : (this.config.mode === "vector" || !this.store.hasFtsSupport) ? "vector" : "hybrid"; + const finalTrace = trace.finalize(query, mode); + this._statsCollector.recordQuery(finalTrace, source || "unknown"); + } + // Record access for reinforcement (manual recall only) if (this.accessTracker && source === "manual" && results.length > 0) { this.accessTracker.recordAccess(results.map((r) => r.entry.id)); @@ -367,121 +431,276 @@ export class MemoryRetriever { return results; } + /** + * Retrieve with full trace, used by the memory_debug tool. + * Always collects a trace regardless of stats collector state. 
+ */ + async retrieveWithTrace( + context: RetrievalContext, + ): Promise<{ results: RetrievalResult[]; trace: RetrievalTrace }> { + const { query, limit, scopeFilter, category, source } = context; + const safeLimit = clampInt(limit, 1, 20); + const trace = new TraceCollector(); + + const tagTokens = this.extractTagTokens(query); + let results: RetrievalResult[]; + + if (tagTokens.length > 0) { + results = await this.bm25OnlyRetrieval( + query, tagTokens, safeLimit, scopeFilter, category, trace, + ); + } else if (this.config.mode === "vector" || !this.store.hasFtsSupport) { + results = await this.vectorOnlyRetrieval( + query, safeLimit, scopeFilter, category, trace, + ); + } else { + results = await this.hybridRetrieval( + query, safeLimit, scopeFilter, category, trace, + ); + } + + const mode = tagTokens.length > 0 ? "bm25" + : (this.config.mode === "vector" || !this.store.hasFtsSupport) ? "vector" : "hybrid"; + const finalTrace = trace.finalize(query, mode); + + if (this._statsCollector) { + this._statsCollector.recordQuery(finalTrace, source || "debug"); + } + + if (this.accessTracker && source === "manual" && results.length > 0) { + this.accessTracker.recordAccess(results.map((r) => r.entry.id)); + } + + return { results, trace: finalTrace }; + } + + private extractTagTokens(query: string): string[] { + if (!this.config.tagPrefixes?.length) return []; + + const pattern = this.config.tagPrefixes.join("|"); + const regex = new RegExp(`(?:${pattern}):[\\w-]+`, "gi"); + const matches = query.match(regex); + return matches || []; + } + private async vectorOnlyRetrieval( query: string, limit: number, scopeFilter?: string[], category?: string, + trace?: TraceCollector, ): Promise<RetrievalResult[]> { const queryVector = await this.embedder.embedQuery(query); + + trace?.startStage("vector_search", []); const results = await this.store.vectorSearch( - queryVector, - limit, - this.config.minScore, - scopeFilter, - { excludeInactive: true }, + queryVector, limit, this.config.minScore, 
scopeFilter, { excludeInactive: true }, ); - - // Filter by category if specified const filtered = category - ? results.filter((r) => r.entry.category === category) - : results; - + ? results.filter((r) => r.entry.category === category) : results; const mapped = filtered.map( (result, index) => - ({ - ...result, - sources: { - vector: { score: result.score, rank: index + 1 }, - }, - }) as RetrievalResult, + ({ ...result, sources: { vector: { score: result.score, rank: index + 1 } } }) as RetrievalResult, ); + if (trace) { + trace.endStage(mapped.map((r) => r.entry.id), mapped.map((r) => r.score)); + } + + let weighted: RetrievalResult[]; + if (this.decayEngine) { + weighted = mapped; + } else { + trace?.startStage("recency_boost", mapped.map((r) => r.entry.id)); + const boosted = this.applyRecencyBoost(mapped); + trace?.endStage(boosted.map((r) => r.entry.id), boosted.map((r) => r.score)); - const weighted = this.decayEngine ? mapped : this.applyImportanceWeight(this.applyRecencyBoost(mapped)); + trace?.startStage("importance_weight", boosted.map((r) => r.entry.id)); + weighted = this.applyImportanceWeight(boosted); + trace?.endStage(weighted.map((r) => r.entry.id), weighted.map((r) => r.score)); + } + + trace?.startStage("length_normalization", weighted.map((r) => r.entry.id)); const lengthNormalized = this.applyLengthNormalization(weighted); + trace?.endStage(lengthNormalized.map((r) => r.entry.id), lengthNormalized.map((r) => r.score)); + + trace?.startStage("hard_cutoff", lengthNormalized.map((r) => r.entry.id)); const hardFiltered = lengthNormalized.filter(r => r.score >= this.config.hardMinScore); + trace?.endStage(hardFiltered.map((r) => r.entry.id), hardFiltered.map((r) => r.score)); + + const decayStageName = this.decayEngine ? "decay_boost" : "time_decay"; + trace?.startStage(decayStageName, hardFiltered.map((r) => r.entry.id)); const lifecycleRanked = this.decayEngine ? 
this.applyDecayBoost(hardFiltered) : this.applyTimeDecay(hardFiltered); + trace?.endStage(lifecycleRanked.map((r) => r.entry.id), lifecycleRanked.map((r) => r.score)); + + trace?.startStage("noise_filter", lifecycleRanked.map((r) => r.entry.id)); const denoised = this.config.filterNoise ? filterNoise(lifecycleRanked, r => r.entry.text) : lifecycleRanked; + trace?.endStage(denoised.map((r) => r.entry.id), denoised.map((r) => r.score)); - // MMR deduplication: avoid top-k filled with near-identical memories + trace?.startStage("mmr_diversity", denoised.map((r) => r.entry.id)); const deduplicated = this.applyMMRDiversity(denoised); + const finalResults = deduplicated.slice(0, limit); + trace?.endStage(finalResults.map((r) => r.entry.id), finalResults.map((r) => r.score)); - return deduplicated.slice(0, limit); + return finalResults; } - private async hybridRetrieval( + private async bm25OnlyRetrieval( query: string, + tagTokens: string[], limit: number, scopeFilter?: string[], category?: string, + trace?: TraceCollector, ): Promise<RetrievalResult[]> { - const candidatePoolSize = Math.max( - this.config.candidatePoolSize, - limit * 2, + const candidatePoolSize = Math.max(this.config.candidatePoolSize, limit * 2); + + trace?.startStage("bm25_search", []); + const bm25Results = await this.store.bm25Search( + query, candidatePoolSize, scopeFilter, { excludeInactive: true }, + ); + const categoryFiltered = category + ? 
bm25Results.filter((r) => r.entry.category === category) : bm25Results; + const mustContainFiltered = categoryFiltered.filter((r) => { + const textLower = r.entry.text.toLowerCase(); + return tagTokens.every((t) => textLower.includes(t.toLowerCase())); + }); + const mapped = mustContainFiltered.map( + (result, index) => + ({ ...result, sources: { bm25: { score: result.score, rank: index + 1 } } }) as RetrievalResult, ); + trace?.endStage(mapped.map((r) => r.entry.id), mapped.map((r) => r.score)); - // Compute query embedding once, reuse for vector search + reranking + let temporallyRanked: RetrievalResult[]; + if (this.decayEngine) { + temporallyRanked = mapped; + } else { + trace?.startStage("recency_boost", mapped.map((r) => r.entry.id)); + const boosted = this.applyRecencyBoost(mapped); + trace?.endStage(boosted.map((r) => r.entry.id), boosted.map((r) => r.score)); + + trace?.startStage("importance_weight", boosted.map((r) => r.entry.id)); + temporallyRanked = this.applyImportanceWeight(boosted); + trace?.endStage(temporallyRanked.map((r) => r.entry.id), temporallyRanked.map((r) => r.score)); + } + + trace?.startStage("length_normalization", temporallyRanked.map((r) => r.entry.id)); + const lengthNormalized = this.applyLengthNormalization(temporallyRanked); + trace?.endStage(lengthNormalized.map((r) => r.entry.id), lengthNormalized.map((r) => r.score)); + + trace?.startStage("hard_cutoff", lengthNormalized.map((r) => r.entry.id)); + const hardFiltered = lengthNormalized.filter(r => r.score >= this.config.hardMinScore); + trace?.endStage(hardFiltered.map((r) => r.entry.id), hardFiltered.map((r) => r.score)); + + const decayStageName = this.decayEngine ? "decay_boost" : "time_decay"; + trace?.startStage(decayStageName, hardFiltered.map((r) => r.entry.id)); + const lifecycleRanked = this.decayEngine + ? 
this.applyDecayBoost(hardFiltered) : this.applyTimeDecay(hardFiltered); + trace?.endStage(lifecycleRanked.map((r) => r.entry.id), lifecycleRanked.map((r) => r.score)); + + trace?.startStage("noise_filter", lifecycleRanked.map((r) => r.entry.id)); + const denoised = this.config.filterNoise + ? filterNoise(lifecycleRanked, r => r.entry.text) : lifecycleRanked; + trace?.endStage(denoised.map((r) => r.entry.id), denoised.map((r) => r.score)); + + trace?.startStage("mmr_diversity", denoised.map((r) => r.entry.id)); + const deduplicated = this.applyMMRDiversity(denoised); + const finalResults = deduplicated.slice(0, limit); + trace?.endStage(finalResults.map((r) => r.entry.id), finalResults.map((r) => r.score)); + + return finalResults; + } + + private async hybridRetrieval( + query: string, + limit: number, + scopeFilter?: string[], + category?: string, + trace?: TraceCollector, + ): Promise<RetrievalResult[]> { + const candidatePoolSize = Math.max(this.config.candidatePoolSize, limit * 2); const queryVector = await this.embedder.embedQuery(query); - // Run vector and BM25 searches in parallel + // Run vector and BM25 searches in parallel. + // Trace as a single "parallel_search" stage since both run concurrently — + // splitting into separate sequential stages would misrepresent timing. 
+ trace?.startStage("parallel_search", []); const [vectorResults, bm25Results] = await Promise.all([ - this.runVectorSearch( - queryVector, - candidatePoolSize, - scopeFilter, - category, - ), + this.runVectorSearch(queryVector, candidatePoolSize, scopeFilter, category), this.runBM25Search(query, candidatePoolSize, scopeFilter, category), ]); + if (trace) { + const allSearchIds = [ + ...new Set([...vectorResults.map((r) => r.entry.id), ...bm25Results.map((r) => r.entry.id)]), + ]; + const allScores = [...vectorResults.map((r) => r.score), ...bm25Results.map((r) => r.score)]; + trace.endStage(allSearchIds, allScores); + } - // Fuse results using RRF (async: validates BM25-only entries exist in store) + // Fuse results using RRF + const allInputIds = [ + ...new Set([...vectorResults.map((r) => r.entry.id), ...bm25Results.map((r) => r.entry.id)]), + ]; + trace?.startStage("rrf_fusion", allInputIds); const fusedResults = await this.fuseResults(vectorResults, bm25Results); + trace?.endStage(fusedResults.map((r) => r.entry.id), fusedResults.map((r) => r.score)); // Apply minimum score threshold - const filtered = fusedResults.filter( - (r) => r.score >= this.config.minScore, - ); + trace?.startStage("min_score_filter", fusedResults.map((r) => r.entry.id)); + const filtered = fusedResults.filter((r) => r.score >= this.config.minScore); + trace?.endStage(filtered.map((r) => r.entry.id), filtered.map((r) => r.score)); + + // Rerank if enabled — only emit trace stage when rerank actually runs + let reranked: RetrievalResult[]; + if (this.config.rerank !== "none") { + trace?.startStage("rerank", filtered.map((r) => r.entry.id)); + reranked = await this.rerankResults(query, queryVector, filtered.slice(0, limit * 2)); + trace?.endStage(reranked.map((r) => r.entry.id), reranked.map((r) => r.score)); + } else { + reranked = filtered; + } - // Rerank if enabled - const reranked = - this.config.rerank !== "none" - ? 
await this.rerankResults( - query, - queryVector, - filtered.slice(0, limit * 2), - ) - : filtered; + let temporallyRanked: RetrievalResult[]; + if (this.decayEngine) { + temporallyRanked = reranked; + } else { + trace?.startStage("recency_boost", reranked.map((r) => r.entry.id)); + const boosted = this.applyRecencyBoost(reranked); + trace?.endStage(boosted.map((r) => r.entry.id), boosted.map((r) => r.score)); - const temporallyRanked = this.decayEngine - ? reranked - : this.applyImportanceWeight(this.applyRecencyBoost(reranked)); + trace?.startStage("importance_weight", boosted.map((r) => r.entry.id)); + temporallyRanked = this.applyImportanceWeight(boosted); + trace?.endStage(temporallyRanked.map((r) => r.entry.id), temporallyRanked.map((r) => r.score)); + } - // Apply length normalization (penalize long entries dominating via keyword density) + trace?.startStage("length_normalization", temporallyRanked.map((r) => r.entry.id)); const lengthNormalized = this.applyLengthNormalization(temporallyRanked); + trace?.endStage(lengthNormalized.map((r) => r.entry.id), lengthNormalized.map((r) => r.score)); - // Hard minimum score cutoff should be based on semantic / lexical relevance. - // Lifecycle decay and time-decay are used for re-ranking, not for dropping - // otherwise relevant fresh memories. + trace?.startStage("hard_cutoff", lengthNormalized.map((r) => r.entry.id)); const hardFiltered = lengthNormalized.filter(r => r.score >= this.config.hardMinScore); + trace?.endStage(hardFiltered.map((r) => r.entry.id), hardFiltered.map((r) => r.score)); - // Apply lifecycle-aware decay or legacy time decay after thresholding + const decayStageName = this.decayEngine ? "decay_boost" : "time_decay"; + trace?.startStage(decayStageName, hardFiltered.map((r) => r.entry.id)); const lifecycleRanked = this.decayEngine - ? this.applyDecayBoost(hardFiltered) - : this.applyTimeDecay(hardFiltered); + ? 
this.applyDecayBoost(hardFiltered) : this.applyTimeDecay(hardFiltered); + trace?.endStage(lifecycleRanked.map((r) => r.entry.id), lifecycleRanked.map((r) => r.score)); - // Filter noise + trace?.startStage("noise_filter", lifecycleRanked.map((r) => r.entry.id)); const denoised = this.config.filterNoise - ? filterNoise(lifecycleRanked, r => r.entry.text) - : lifecycleRanked; + ? filterNoise(lifecycleRanked, r => r.entry.text) : lifecycleRanked; + trace?.endStage(denoised.map((r) => r.entry.id), denoised.map((r) => r.score)); - // MMR deduplication: avoid top-k filled with near-identical memories + trace?.startStage("mmr_diversity", denoised.map((r) => r.entry.id)); const deduplicated = this.applyMMRDiversity(denoised); + const finalResults = deduplicated.slice(0, limit); + trace?.endStage(finalResults.map((r) => r.entry.id), finalResults.map((r) => r.score)); - return deduplicated.slice(0, limit); + return finalResults; } private async runVectorSearch( @@ -653,7 +872,7 @@ export class MemoryRetriever { clearTimeout(timeout); if (response.ok) { - const data = (await response.json()) as Record<string, unknown>; + const data: unknown = await response.json(); // Parse provider-specific response into unified format const parsed = parseRerankResponse(provider, data); diff --git a/src/scopes.ts b/src/scopes.ts index ef37a6b1..5e3e1071 100644 --- a/src/scopes.ts +++ b/src/scopes.ts @@ -19,7 +19,21 @@ export interface ScopeConfig { } export interface ScopeManager { + /** + * Enumerate known scopes for the caller. + * + * Note: this is an enumeration API, not a full description of every syntactically-valid built-in + * pattern accepted by `validateScope()` / `isAccessible()`. In particular, bypass callers may still + * validate built-in scope patterns that are not explicitly registered in `definitions`. + */ getAccessibleScopes(agentId?: string): string[]; + /** + * Optional store-layer filter hook. 
+ * Return `undefined` only for intentional full-bypass callers (for example internal system tasks). + * Custom implementations should keep this distinct from `getAccessibleScopes()`, which is an + * enumeration API and should remain consistent with `isAccessible()`. + */ + getScopeFilter?(agentId?: string): string[] | undefined; getDefaultScope(agentId?: string): string; isAccessible(scope: string, agentId?: string): boolean; validateScope(scope: string): boolean; @@ -49,10 +63,63 @@ const SCOPE_PATTERNS = { GLOBAL: "global", AGENT: (agentId: string) => `agent:${agentId}`, CUSTOM: (name: string) => `custom:${name}`, + REFLECTION: (agentId: string) => `reflection:agent:${agentId}`, PROJECT: (projectId: string) => `project:${projectId}`, USER: (userId: string) => `user:${userId}`, }; +const SYSTEM_BYPASS_IDS = new Set(["system", "undefined"]); +const warnedLegacyFallbackBypassIds = new Set<string>(); + +export function isSystemBypassId(agentId?: string): boolean { + return typeof agentId === "string" && SYSTEM_BYPASS_IDS.has(agentId); +} + +/** @internal Exported for testing only — resets the legacy warning throttle. */ +export function _resetLegacyFallbackWarningState(): void { + warnedLegacyFallbackBypassIds.clear(); +} + +/** + * Extract agentId from an OpenClaw session key. + * Supports both formats: + * - "agent:main:discord:channel:123" (with trailing segments) + * - "agent:main" (two-segment, no trailing colon) + * Returns undefined for missing keys, non-agent keys, or reserved bypass IDs. + * This is the single canonical implementation — do not duplicate inline. + */ +export function parseAgentIdFromSessionKey(sessionKey: string | undefined): string | undefined { + if (!sessionKey) return undefined; + const sk = sessionKey.trim(); + // Match "agent:" with or without trailing segments + if (!sk.startsWith("agent:")) return undefined; + const rest = sk.slice("agent:".length); + const colonIdx = rest.indexOf(":"); + const candidate = (colonIdx === -1 ? 
rest : rest.slice(0, colonIdx)).trim(); + if (!candidate || isSystemBypassId(candidate)) { + return undefined; + } + return candidate; +} + +function withOwnReflectionScope(scopes: string[], agentId: string): string[] { + const reflectionScope = SCOPE_PATTERNS.REFLECTION(agentId); + return scopes.includes(reflectionScope) ? [...scopes] : [...scopes, reflectionScope]; +} + +function normalizeAgentAccessMap( + agentAccess: Record<string, string[]> | undefined, +): Record<string, string[]> { + const normalized: Record<string, string[]> = {}; + if (!agentAccess) return normalized; + for (const [rawAgentId, scopes] of Object.entries(agentAccess)) { + const agentId = rawAgentId.trim(); + if (!agentId) continue; + normalized[agentId] = Array.isArray(scopes) ? [...scopes] : []; + } + return normalized; +} + // ============================================================================ // Scope Manager Implementation // ============================================================================ @@ -68,8 +135,8 @@ export class MemoryScopeManager implements ScopeManager { ...config.definitions, }, agentAccess: { - ...DEFAULT_SCOPE_CONFIG.agentAccess, - ...config.agentAccess, + ...normalizeAgentAccessMap(DEFAULT_SCOPE_CONFIG.agentAccess), + ...normalizeAgentAccessMap(config.agentAccess), }, }; @@ -89,8 +156,16 @@ export class MemoryScopeManager implements ScopeManager { throw new Error(`Default scope '${this.config.default}' not found in definitions`); } - // Validate agent access scopes exist in definitions + // Validate agent access scopes exist in definitions + reject reserved bypass IDs for (const [agentId, scopes] of Object.entries(this.config.agentAccess)) { + // Trim before checking to prevent space-padded bypass IDs like " system " + const trimmedAgentId = agentId.trim(); + if (isSystemBypassId(trimmedAgentId)) { + throw new Error( + `Reserved bypass agent ID '${trimmedAgentId}' cannot have explicit access configured. 
` +
+          `This is rejected in both constructor and importConfig paths.`
+      );
+    }
     for (const scope of scopes) {
       if (!this.config.definitions[scope] && !this.isBuiltInScope(scope)) {
         console.warn(`Agent '${agentId}' has access to undefined scope '${scope}'`);
@@ -105,38 +180,64 @@ export class MemoryScopeManager implements ScopeManager {
       scope.startsWith("agent:") ||
       scope.startsWith("custom:") ||
       scope.startsWith("project:") ||
-      scope.startsWith("user:")
+      scope.startsWith("user:") ||
+      scope.startsWith("reflection:")
     );
   }
 
   getAccessibleScopes(agentId?: string): string[] {
-    if (!agentId) {
-      // No agent specified, return all scopes
+    if (isSystemBypassId(agentId) || !agentId) {
+      // Keep enumeration semantics consistent for callers that inspect the list.
+      // This enumerates registered scopes, not every valid built-in pattern.
       return this.getAllScopes();
     }
 
-    // Check explicit agent access configuration
-    const explicitAccess = this.config.agentAccess[agentId];
+    // Explicit ACLs still inherit the agent's own reflection scope.
+    const normalizedAgentId = agentId.trim();
+    const explicitAccess = this.config.agentAccess[normalizedAgentId];
     if (explicitAccess) {
-      return explicitAccess;
+      return withOwnReflectionScope(explicitAccess, normalizedAgentId);
     }
 
-    // Default access: global + agent-specific scope
-    const defaultScopes = ["global"];
-    const agentScope = SCOPE_PATTERNS.AGENT(agentId);
+    // Agent and reflection scopes are built-in and provisioned implicitly.
+    return withOwnReflectionScope([
+      "global",
+      SCOPE_PATTERNS.AGENT(normalizedAgentId),
+    ], normalizedAgentId);
+  }
 
-    // Only include agent scope if it already exists — don't mutate config as a side effect
-    if (this.config.definitions[agentScope] || this.isBuiltInScope(agentScope)) {
-      defaultScopes.push(agentScope);
+  /**
+   * Store-layer scope filter semantics:
+   *
+   * | Return value        | Store behavior                          | When                                   |
+   * |---------------------|-----------------------------------------|----------------------------------------|
+   * | `undefined`         | No scope filtering (full bypass)        | Reserved bypass ids (system/undefined) |
+   * | `[]`                | Deny all reads / match nothing          | Explicit empty filter                  |
+   * | `["global", ...]`   | Restrict reads to listed scopes         | Normal agent with explicit access      |
+   *
+   * IMPORTANT: Returning `[]` is now an explicit deny-all signal.
+   * Custom ScopeManager implementations should return `undefined` for bypass
+   * and `[]` only when they intend reads to match nothing.
+   */
+  getScopeFilter(agentId?: string): string[] | undefined {
+    if (!agentId || isSystemBypassId(agentId)) {
+      // No agent specified or internal system tasks bypass store-level scope
+      // filtering entirely. This aligns with isAccessible(scope, undefined)
+      // which also uses bypass semantics for missing agentId.
+      return undefined;
     }
-
-    return defaultScopes;
+    return this.getAccessibleScopes(agentId);
   }
 
   getDefaultScope(agentId?: string): string {
     if (!agentId) {
       return this.config.default;
     }
+    if (isSystemBypassId(agentId)) {
+      throw new Error(
+        `Reserved bypass agent ID '${agentId}' must provide an explicit write scope instead of using getDefaultScope().`,
+      );
+    }
 
     // For agents, default to their private scope if they have access to it
     const agentScope = SCOPE_PATTERNS.AGENT(agentId);
@@ -150,8 +251,8 @@ export class MemoryScopeManager implements ScopeManager {
   }
 
   isAccessible(scope: string, agentId?: string): boolean {
-    if (!agentId) {
-      // No agent specified, allow access to all valid scopes
+    if (!agentId || isSystemBypassId(agentId)) {
+      // No agent specified, or internal bypass identifier: allow any valid scope.
       return this.validateScope(scope);
     }
@@ -217,6 +318,16 @@
     if (!agentId || typeof agentId !== "string") {
       throw new Error("Invalid agent ID");
     }
+    const normalizedAgentId = agentId.trim();
+    if (!normalizedAgentId) {
+      throw new Error("Invalid agent ID");
+    }
+    if (isSystemBypassId(normalizedAgentId)) {
+      throw new Error(`Reserved bypass agent ID cannot have explicit access configured: ${agentId}`);
+    }
+
+    // Note: an agent's own reflection scope is still auto-granted by getAccessibleScopes().
+    // This setter can add access, but it does not revoke `reflection:agent:${normalizedAgentId}`.
 
     // Validate all scopes
     for (const scope of scopes) {
@@ -225,15 +336,16 @@ export class MemoryScopeManager implements ScopeManager {
       }
     }
 
-    this.config.agentAccess[agentId] = [...scopes];
+    this.config.agentAccess[normalizedAgentId] = [...scopes];
   }
 
   removeAgentAccess(agentId: string): boolean {
-    if (!this.config.agentAccess[agentId]) {
+    const normalizedAgentId = agentId.trim();
+    if (!this.config.agentAccess[normalizedAgentId]) {
       return false;
     }
-    delete this.config.agentAccess[agentId];
+    delete this.config.agentAccess[normalizedAgentId];
     return true;
   }
@@ -261,19 +373,35 @@ export class MemoryScopeManager implements ScopeManager {
   }
 
   importConfig(config: Partial<ScopeConfig>): void {
-    this.config = {
-      default: config.default || this.config.default,
+    const previous = this.config;
+    const next: ScopeConfig = {
+      default: config.default || previous.default,
       definitions: {
-        ...this.config.definitions,
+        ...previous.definitions,
         ...config.definitions,
       },
       agentAccess: {
-        ...this.config.agentAccess,
-        ...config.agentAccess,
+        ...normalizeAgentAccessMap(previous.agentAccess),
+        ...normalizeAgentAccessMap(config.agentAccess),
       },
     };
-    this.validateConfiguration();
+    // Suppress warnings until validation succeeds
+    const originalWarn = console.warn;
+    const warnings: string[] = [];
+    console.warn = (msg: string) => warnings.push(msg);
+
+    this.config = next;
+    try {
+      this.validateConfiguration();
+      // Emit warnings only after successful validation
+      warnings.forEach(w => originalWarn(w));
+    } catch (err) {
+      this.config = previous;
+      throw err;
+    } finally {
+      console.warn = originalWarn;
+    }
   }
 
   // Statistics
@@ -302,7 +430,9 @@ export class MemoryScopeManager implements ScopeManager {
       scopesByType.custom++;
     } else if (scope.startsWith("project:")) {
       scopesByType.project++;
-    } else if (scope.startsWith("user:")) {
+    } else if (scope.startsWith("user:") || scope.startsWith("reflection:")) {
+      // TODO: add a dedicated `reflection` bucket once downstream dashboards accept it.
+      // For now, reflection scopes are counted under `user` for schema compatibility.
       scopesByType.user++;
     } else {
       scopesByType.other++;
     }
@@ -365,6 +495,41 @@ export function isScopeAccessible(scope: string, allowedScopes: string[]): boole
   return allowedScopes.includes(scope);
 }
 
+export function resolveScopeFilter(
+  scopeManager: Pick<ScopeManager, "getAccessibleScopes"> & {
+    getScopeFilter?: (agentId?: string) => string[] | undefined;
+  },
+  agentId?: string,
+): string[] | undefined {
+  if (typeof scopeManager.getScopeFilter === "function") {
+    return scopeManager.getScopeFilter(agentId);
+  }
+  // Legacy/custom managers without getScopeFilter fall back to enumeration semantics.
+  // For reserved bypass IDs, any array return is treated as a legacy bypass encoding and
+  // normalized to undefined so callers see a consistent explicit-bypass contract.
+  const fallbackScopes = scopeManager.getAccessibleScopes(agentId);
+  if (!isSystemBypassId(agentId) && Array.isArray(fallbackScopes) && fallbackScopes.length === 0) {
+    console.warn(
+      "resolveScopeFilter: non-bypass agent resolved to an empty scope list; downstream store reads will deny all access.",
+    );
+    return [];
+  }
+  if (isSystemBypassId(agentId) && Array.isArray(fallbackScopes)) {
+    const key = String(agentId);
+    if (!warnedLegacyFallbackBypassIds.has(key)) {
+      warnedLegacyFallbackBypassIds.add(key);
+      const shape = fallbackScopes.length === 0 ? "[]" : `[${fallbackScopes.join(", ")}]`;
+      console.warn(
+        `resolveScopeFilter: legacy ScopeManager returned ${shape} for reserved bypass id '${key}'. ` +
          "Implement getScopeFilter() to make store-level bypass semantics explicit. " +
          "Normalizing legacy array return to undefined for bypass consistency.",
+      );
+    }
+    return undefined;
+  }
+  return fallbackScopes;
+}
+
 export function filterScopesForAgent(scopes: string[], agentId?: string, scopeManager?: ScopeManager): string[] {
   if (!scopeManager || !agentId) {
     return scopes;
diff --git a/src/session-compressor.ts b/src/session-compressor.ts
new file mode 100644
index 00000000..769904ce
--- /dev/null
+++ b/src/session-compressor.ts
@@ -0,0 +1,331 @@
+/**
+ * Session Compressor
+ *
+ * Scores and compresses conversation texts before memory extraction.
+ * Prioritizes high-signal content (tool calls, corrections, decisions) over
+ * low-signal content (greetings, acknowledgments) so that the fixed extraction
+ * budget captures the most important parts of a conversation.
+ */
+
+// ---------------------------------------------------------------------------
+// Types
+// ---------------------------------------------------------------------------
+
+export interface ScoredText {
+  /** Original index in the texts array */
+  index: number;
+  /** The text content */
+  text: string;
+  /** Score from 0.0 (noise) to 1.0 (high value) */
+  score: number;
+  /** Human-readable reason for the score */
+  reason: string;
+}
+
+export interface CompressResult {
+  /** Selected texts in chronological order */
+  texts: string[];
+  /** Detailed scoring for all input texts */
+  scored: ScoredText[];
+  /** Number of texts dropped */
+  dropped: number;
+  /** Total chars in output */
+  totalChars: number;
+}
+
+// ---------------------------------------------------------------------------
+// Indicator patterns
+// ---------------------------------------------------------------------------
+
+const TOOL_CALL_INDICATORS = [
+  /\btool_use\b/i,
+  /\btool_result\b/i,
+  /\bfunction_call\b/i,
+  /\b(memory_store|memory_recall|memory_forget|memory_update)\b/i,
+  // Removed over-broad patterns: fenced code blocks and "$ " matched normal pasted code
+];
+
+const CORRECTION_INDICATORS = [
+  /^no[,.\s]/i,
+  /\bactually\b/i,
+  /\binstead\b/i,
+  /\bwrong\b/i,
+  /\bcorrect(ion)?\b/i,
+  /\bfix\b/i,
+  /不对/,
+  /应该是/,
+  /應該是/,
+  /错了/,
+  /錯了/,
+  /改成/,
+  /不是.*而是/,
+];
+
+const DECISION_INDICATORS = [
+  /\blet'?s go with\b/i,
+  /\bconfirmed?\b/i,
+  /\bapproved?\b/i,
+  /\bdecided?\b/i,
+  /\bwe'?ll use\b/i,
+  /\bgoing forward\b/i,
+  /\bfrom now on\b/i,
+  /\bagreed\b/i,
+  /决定/,
+  /決定/,
+  /确认/,
+  /確認/,
+  /选择了/,
+  /選擇了/,
+  /就这样/,
+  /就這樣/,
+];
+
+const ACKNOWLEDGMENT_PATTERNS = [
+  /^(ok|okay|k|sure|fine|thanks|thank you|thx|ty|got it|understood|cool|nice|great|good|perfect|awesome|alright|yep|yup|yeah|right)\s*[.!]?$/i,
+  /^好的?\s*[。!]?$/,
+  /^嗯\s*[。]?$/,
+  /^收到\s*[。!]?$/,
+  /^了解\s*[。!]?$/,
+  /^明白\s*[。!]?$/,
+  /^谢谢\s*[。!]?$/,
+  /^感谢\s*[。!]?$/,
+  /^👍\s*$/,
+];
+
+// ---------------------------------------------------------------------------
+// Scoring
+// ---------------------------------------------------------------------------
+
+/**
+ * Score a single text segment by its information density.
+ */
+export function scoreText(text: string, index: number): ScoredText {
+  const trimmed = text.trim();
+
+  // Empty / whitespace-only
+  if (trimmed.length === 0) {
+    return { index, text, score: 0.0, reason: "empty" };
+  }
+
+  // Tool call indicators → highest value
+  if (TOOL_CALL_INDICATORS.some((p) => p.test(trimmed))) {
+    return { index, text, score: 1.0, reason: "tool_call" };
+  }
+
+  // Corrections → very high value (user correcting agent = strong signal)
+  if (CORRECTION_INDICATORS.some((p) => p.test(trimmed))) {
+    return { index, text, score: 0.95, reason: "correction" };
+  }
+
+  // Decisions / confirmations → high value
+  if (DECISION_INDICATORS.some((p) => p.test(trimmed))) {
+    return { index, text, score: 0.85, reason: "decision" };
+  }
+
+  // Acknowledgments → very low value
+  if (ACKNOWLEDGMENT_PATTERNS.some((p) => p.test(trimmed))) {
+    return { index, text, score: 0.1, reason: "acknowledgment" };
+  }
+
+  // Substantive content vs short questions
+  // CJK characters carry ~2-3x more meaning per character, so use a lower
+  // threshold (same approach as adaptive-retrieval.ts).
+  const hasCJK = /[\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff\uac00-\ud7af]/.test(trimmed);
+  const substantiveMinLength = hasCJK ? 30 : 80;
+  if (trimmed.length > substantiveMinLength) {
+    // Check for boilerplate (XML tags, system messages)
+    if (/^<[a-z-]+>/.test(trimmed) && /<\/[a-z-]+>\s*$/.test(trimmed)) {
+      return { index, text, score: 0.3, reason: "system_xml" };
+    }
+    return { index, text, score: 0.7, reason: "substantive" };
+  }
+
+  // Short questions
+  if (trimmed.includes("?") || trimmed.includes("\uff1f")) {
+    return { index, text, score: 0.5, reason: "short_question" };
+  }
+
+  // Short but not a question and not an acknowledgment
+  return { index, text, score: 0.4, reason: "short_statement" };
+}
+
+// ---------------------------------------------------------------------------
+// Compression
+// ---------------------------------------------------------------------------
+
+/** Default minimum texts to keep even if all score low */
+const DEFAULT_MIN_TEXTS = 3;
+
+/**
+ * Compress an array of text segments to fit within a character budget.
+ *
+ * Strategy:
+ * 1. Score all texts
+ * 2. Always include first and last text (session boundaries)
+ * 3. Sort remaining by score descending
+ * 4. Greedily select until budget exhausted
+ * 5. Handle paired texts (tool call + result: indices i, i+1)
+ * 6. Re-sort selected by original index
+ * 7. If all texts score < threshold, keep at least minTexts
+ */
+export function compressTexts(
+  texts: string[],
+  maxChars: number,
+  options: { minTexts?: number; minScoreToKeep?: number } = {},
+): CompressResult {
+  const minTexts = options.minTexts ?? DEFAULT_MIN_TEXTS;
+  const minScoreToKeep = options.minScoreToKeep ?? 0.3;
+
+  if (texts.length === 0) {
+    return { texts: [], scored: [], dropped: 0, totalChars: 0 };
+  }
+
+  // Score everything
+  const scored = texts.map((t, i) => scoreText(t, i));
+
+  // Total chars of all texts
+  const allChars = texts.reduce((sum, t) => sum + t.length, 0);
+
+  // If already within budget, return all
+  if (allChars <= maxChars) {
+    return {
+      texts: [...texts],
+      scored,
+      dropped: 0,
+      totalChars: allChars,
+    };
+  }
+
+  // Build selected set starting with first and last
+  const selectedIndices = new Set<number>();
+  let usedChars = 0;
+
+  const addIndex = (idx: number): boolean => {
+    if (selectedIndices.has(idx) || idx < 0 || idx >= texts.length) return false;
+    const len = texts[idx].length;
+    if (usedChars + len > maxChars) {
+      // Hard cap: even the first/last text cannot exceed budget
+      return false;
+    }
+    selectedIndices.add(idx);
+    usedChars += len;
+    return true;
+  };
+
+  // Always keep first and last
+  addIndex(0);
+  if (texts.length > 1) {
+    addIndex(texts.length - 1);
+  }
+
+  // Build candidate list excluding first/last, sorted by score desc (stable by index asc on tie)
+  const candidates = scored
+    .filter((s) => s.index !== 0 && s.index !== texts.length - 1)
+    .sort((a, b) => b.score - a.score || a.index - b.index);
+
+  // Identify paired indices (tool call at i → result at i+1).
+  // Only pair from a tool_call line, NOT from tool_result — a result line
+  // should not pull in the next unrelated line as its "partner".
+  const pairedWith = new Map<number, number>();
+  for (const s of scored) {
+    if (
+      s.reason === "tool_call" &&
+      s.index + 1 < texts.length &&
+      !pairedWith.has(s.index) && // not already claimed
+      !pairedWith.has(s.index + 1) // partner not already claimed
+    ) {
+      pairedWith.set(s.index, s.index + 1);
+      pairedWith.set(s.index + 1, s.index);
+    }
+  }
+
+  // Greedily add candidates
+  for (const candidate of candidates) {
+    if (usedChars >= maxChars) break;
+
+    const added = addIndex(candidate.index);
+    if (added) {
+      // If this is part of a pair, try to add the partner
+      const partner = pairedWith.get(candidate.index);
+      if (partner !== undefined) {
+        addIndex(partner);
+      }
+    }
+  }
+
+  // All-low-score fallback: if everything scored below threshold, ensure
+  // we keep at least minTexts (the last N by original order)
+  const allLow = scored.every((s) => s.score < minScoreToKeep);
+  if (allLow && selectedIndices.size < Math.min(minTexts, texts.length)) {
+    // Add from the end (most recent = most relevant for low-value sessions)
+    for (let i = texts.length - 1; i >= 0 && selectedIndices.size < Math.min(minTexts, texts.length); i--) {
+      addIndex(i);
+    }
+  }
+
+  // Re-sort selected by original index to preserve chronological order
+  const sortedIndices = [...selectedIndices].sort((a, b) => a - b);
+  const resultTexts = sortedIndices.map((i) => texts[i]);
+  const totalChars = resultTexts.reduce((sum, t) => sum + t.length, 0);
+
+  return {
+    texts: resultTexts,
+    scored,
+    dropped: texts.length - sortedIndices.length,
+    totalChars,
+  };
+}
+
+// ---------------------------------------------------------------------------
+// Conversation Value Estimation (for Feature 7: Adaptive Throttling)
+// ---------------------------------------------------------------------------
+
+/**
+ * Estimate the overall value of a conversation for memory extraction.
+ * Returns a number between 0.0 and 1.0.
+ *
+ * Used by the adaptive extraction throttle to skip low-value conversations.
+ */
+export function estimateConversationValue(texts: string[]): number {
+  if (texts.length === 0) return 0;
+
+  let value = 0;
+
+  const joined = texts.join(" ");
+
+  // Has explicit memory intent? (e.g. "remember this", "记住") +0.5
+  // These should NEVER be skipped by the low-value gate.
+  const MEMORY_INTENT = /\b(remember|recall|don'?t forget|note that|keep in mind)\b/i;
+  const MEMORY_INTENT_CJK = /(记住|記住|别忘|不要忘|记一下|記一下)/;
+  if (MEMORY_INTENT.test(joined) || MEMORY_INTENT_CJK.test(joined)) {
+    value += 0.5;
+  }
+
+  // Has tool calls? +0.4
+  if (TOOL_CALL_INDICATORS.some((p) => p.test(joined))) {
+    value += 0.4;
+  }
+
+  // Has corrections or decisions? +0.3
+  const hasCorrectionOrDecision =
+    CORRECTION_INDICATORS.some((p) => p.test(joined)) ||
+    DECISION_INDICATORS.some((p) => p.test(joined));
+  if (hasCorrectionOrDecision) {
+    value += 0.3;
+  }
+
+  // Total substantive text > 200 chars? +0.2
+  const substantiveChars = texts
+    .filter((t) => t.trim().length > 20) // skip very short lines
+    .reduce((sum, t) => sum + t.length, 0);
+  if (substantiveChars > 200) {
+    value += 0.2;
+  }
+
+  // Has multi-turn exchanges (>6 texts)? +0.1
+  if (texts.length > 6) {
+    value += 0.1;
+  }
+
+  return Math.min(value, 1.0);
+}
diff --git a/src/smart-extractor.ts b/src/smart-extractor.ts
index b24f4b4f..81a00576 100644
--- a/src/smart-extractor.ts
+++ b/src/smart-extractor.ts
@@ -14,6 +14,12 @@ import {
   buildDedupPrompt,
   buildMergePrompt,
 } from "./extraction-prompts.js";
+import {
+  AdmissionController,
+  type AdmissionAuditRecord,
+  type AdmissionControlConfig,
+  type AdmissionRejectionAuditEntry,
+} from "./admission-control.js";
 import {
   type CandidateMemory,
   type DedupDecision,
@@ -38,6 +44,57 @@ import {
   parseSupportInfo,
   updateSupportStats,
 } from "./smart-metadata.js";
+import {
+  isUserMdExclusiveMemory,
+  type WorkspaceBoundaryConfig,
+} from "./workspace-boundary.js";
+import { inferAtomicBrandItemPreferenceSlot } from "./preference-slots.js";
+import { batchDedup } from "./batch-dedup.js";
+
+// ============================================================================
+// Envelope Metadata Stripping
+// ============================================================================
+
+/**
+ * Strip platform envelope metadata injected by OpenClaw channels before
+ * the conversation text reaches the extraction LLM. These envelopes contain
+ * message IDs, sender IDs, timestamps, and JSON metadata blocks that have
+ * zero informational value for memory extraction but get stored verbatim
+ * by weaker LLMs (e.g. qwen) that can't distinguish metadata from content.
+ *
+ * Targets:
+ * - "System: [YYYY-MM-DD HH:MM:SS GMT+N] Channel[account] ..." header lines
+ * - "Conversation info (untrusted metadata):" + JSON code blocks
+ * - "Sender (untrusted metadata):" + JSON code blocks
+ * - "Replied message (untrusted, for context):" + JSON code blocks
+ * - Standalone JSON blocks containing message_id/sender_id fields
+ */
+export function stripEnvelopeMetadata(text: string): string {
+  // 1. Strip "System: [timestamp] Channel..." lines
+  let cleaned = text.replace(
+    /^System:\s*\[[\d\-: +GMT]+\]\s+\S+\[.*?\].*$/gm,
+    "",
+  );
+
+  // 2. Strip labeled metadata sections with their JSON code blocks
+  //    e.g. "Conversation info (untrusted metadata):\n```json\n{...}\n```"
+  cleaned = cleaned.replace(
+    /(?:Conversation info|Sender|Replied message)\s*\(untrusted[^)]*\):\s*```json\s*\{[\s\S]*?\}\s*```/g,
+    "",
+  );
+
+  // 3. Strip any remaining JSON blocks that look like envelope metadata
+  //    (contain message_id and sender_id fields)
+  cleaned = cleaned.replace(
+    /```json\s*\{[^}]*"message_id"\s*:[^}]*"sender_id"\s*:[^}]*\}\s*```/g,
+    "",
+  );
+
+  // 4. Collapse excessive blank lines left by removals
+  cleaned = cleaned.replace(/\n{3,}/g, "\n\n");
+
+  return cleaned.trim();
+}
 
 // ============================================================================
 // Constants
@@ -75,18 +132,33 @@ export interface SmartExtractorConfig {
   debugLog?: (msg: string) => void;
   /** Optional embedding-based noise prototype bank for language-agnostic noise filtering. */
   noiseBank?: NoisePrototypeBank;
+  /** Facts reserved for workspace-managed USER.md should never enter LanceDB. */
+  workspaceBoundary?: WorkspaceBoundaryConfig;
+  /** Optional admission-control governance layer before downstream dedup/persistence. */
+  admissionControl?: AdmissionControlConfig;
+  /** Optional sink for durable reject-audit logging. */
+  onAdmissionRejected?: (entry: AdmissionRejectionAuditEntry) => Promise<void> | void;
 }
 
 export interface ExtractPersistOptions {
   /** Target scope for newly created memories. */
   scope?: string;
-  /** Scopes visible to the current agent for dedup/merge. */
+  /**
+   * Optional store-layer scope filter override used for dedup/merge reads.
+   * - omit the field to default reads to `[scope ?? defaultScope]`
+   * - set `undefined` explicitly to preserve trusted full-bypass callers
+   * - pass `[]` to force deny-all reads (match nothing)
+   * - pass a non-empty array to restrict reads to those scopes
+   */
   scopeFilter?: string[];
 }
 
 export class SmartExtractor {
   private log: (msg: string) => void;
   private debugLog: (msg: string) => void;
+  private admissionController: AdmissionController | null;
+  private persistAdmissionAudit: boolean;
+  private onAdmissionRejected?: (entry: AdmissionRejectionAuditEntry) => Promise<void> | void;
 
   constructor(
     private store: MemoryStore,
@@ -96,6 +168,19 @@
   ) {
     this.log = config.log ?? ((msg: string) => console.log(msg));
     this.debugLog = config.debugLog ?? (() => { });
+    this.persistAdmissionAudit =
+      config.admissionControl?.enabled === true &&
+      config.admissionControl.auditMetadata !== false;
+    this.onAdmissionRejected = config.onAdmissionRejected;
+    this.admissionController =
+      config.admissionControl?.enabled === true
+        ? new AdmissionController(
+            this.store,
+            this.llm,
+            config.admissionControl,
+            this.debugLog,
+          )
+        : null;
   }
 
   // --------------------------------------------------------------------------
@@ -111,12 +196,16 @@
     sessionKey: string = "unknown",
     options: ExtractPersistOptions = {},
   ): Promise<ExtractionStats> {
-    const stats: ExtractionStats = { created: 0, merged: 0, skipped: 0 };
+    const stats: ExtractionStats = { created: 0, merged: 0, skipped: 0, boundarySkipped: 0 };
     const targetScope = options.scope ?? this.config.defaultScope ?? "global";
-    const scopeFilter =
-      options.scopeFilter && options.scopeFilter.length > 0
-        ? options.scopeFilter
-        : [targetScope];
+    // Distinguish "no override supplied" from explicit bypass/override values.
+    // - omitted `scopeFilter` => default to `[targetScope]`
+    // - explicit `undefined` => preserve full-bypass semantics for trusted callers
+    // - explicit `[]` or non-empty array => pass through unchanged
+    const hasExplicitScopeFilter = "scopeFilter" in options;
+    const scopeFilter = hasExplicitScopeFilter
+      ? options.scopeFilter
+      : [targetScope];
 
     // Step 1: LLM extraction
     const candidates = await this.extractCandidates(conversationText);
@@ -132,11 +221,53 @@
       `memory-pro: smart-extractor: extracted ${candidates.length} candidate(s)`,
     );
 
-    // Step 2: Process each candidate through dedup pipeline
-    for (const candidate of candidates.slice(0, MAX_MEMORIES_PER_EXTRACTION)) {
+    // Step 1b: Batch-internal dedup — embed candidate abstracts and remove near-duplicates
+    // before expensive per-candidate LLM dedup calls (see src/batch-dedup.ts)
+    const capped = candidates.slice(0, MAX_MEMORIES_PER_EXTRACTION);
+    let survivingCandidates = capped;
+    try {
+      const abstracts = capped.map((c) => c.abstract);
+      const vectors = await Promise.all(
+        abstracts.map((a) => this.embedder.embed(a).catch(() => [] as number[])),
+      );
+      const dedupResult = batchDedup(abstracts, vectors);
+      if (dedupResult.duplicateIndices.length > 0) {
+        survivingCandidates = dedupResult.survivingIndices.map((i) => capped[i]);
+        stats.skipped += dedupResult.duplicateIndices.length;
+        this.log(
+          `memory-pro: smart-extractor: batchDedup dropped ${dedupResult.duplicateIndices.length} near-duplicate(s), ${survivingCandidates.length} survivor(s)`,
+        );
+      }
+    } catch (err) {
+      this.log(
+        `memory-pro: smart-extractor: batchDedup failed, proceeding without batch dedup: ${String(err)}`,
+      );
+    }
+
+    // Step 2: Process each surviving candidate through dedup pipeline
+    for (const candidate of survivingCandidates) {
+      if (
+        isUserMdExclusiveMemory(
+          {
+            memoryCategory: candidate.category,
+            abstract: candidate.abstract,
+            content: candidate.content,
+          },
+          this.config.workspaceBoundary,
+        )
+      ) {
+        stats.skipped += 1;
+        stats.boundarySkipped = (stats.boundarySkipped ?? 0) + 1;
+        this.log(
+          `memory-pro: smart-extractor: skipped USER.md-exclusive [${candidate.category}] ${candidate.abstract.slice(0, 60)}`,
+        );
+        continue;
+      }
+
       try {
         await this.processCandidate(
           candidate,
+          conversationText,
           sessionKey,
           stats,
           targetScope,
@@ -230,8 +361,13 @@ export class SmartExtractor {
       ? conversationText.slice(-maxChars)
       : conversationText;
 
+    // Strip platform envelope metadata injected by OpenClaw channels
+    // (e.g. "System: [2026-03-18 14:21:36 GMT+8] Feishu[default] DM | ou_...")
+    // These pollute extraction if treated as conversation content.
+    const cleaned = stripEnvelopeMetadata(truncated);
+
     const user = this.config.user ?? "User";
-    const prompt = buildExtractionPrompt(truncated, user);
+    const prompt = buildExtractionPrompt(cleaned, user);
 
     const result = await this.llm.completeJson<{
       memories: Array<{
@@ -313,20 +449,28 @@ export class SmartExtractor {
   */
   private async processCandidate(
     candidate: CandidateMemory,
+    conversationText: string,
     sessionKey: string,
     stats: ExtractionStats,
     targetScope: string,
-    scopeFilter: string[],
+    scopeFilter?: string[],
   ): Promise<void> {
-    // Profile always merges (skip dedup)
+    // Profile always merges (skip dedup — admission control still applies)
     if (ALWAYS_MERGE_CATEGORIES.has(candidate.category)) {
-      await this.handleProfileMerge(
+      const profileResult = await this.handleProfileMerge(
         candidate,
+        conversationText,
         sessionKey,
         targetScope,
         scopeFilter,
       );
-      stats.merged++;
+      if (profileResult === "rejected") {
+        stats.rejected = (stats.rejected ?? 0) + 1;
+      } else if (profileResult === "created") {
+        stats.created++;
+      } else {
+        stats.merged++;
+      }
       return;
     }
@@ -340,12 +484,38 @@
       return;
     }
 
+    // Admission control gate (before dedup)
+    const admission = this.admissionController
+      ? await this.admissionController.evaluate({
+          candidate,
+          candidateVector: vector,
+          conversationText,
+          scopeFilter: scopeFilter ?? [targetScope],
+        })
+      : undefined;
+
+    if (admission?.decision === "reject") {
+      stats.rejected = (stats.rejected ?? 0) + 1;
+      this.log(
+        `memory-pro: smart-extractor: admission rejected [${candidate.category}] ${candidate.abstract.slice(0, 60)} — ${admission.audit.reason}`,
+      );
+      await this.recordRejectedAdmission(
+        candidate,
+        conversationText,
+        sessionKey,
+        targetScope,
+        scopeFilter ?? [targetScope],
+        admission.audit as AdmissionAuditRecord & { decision: "reject" },
+      );
+      return;
+    }
+
     // Dedup pipeline
     const dedupResult = await this.deduplicate(candidate, vector, scopeFilter);
 
     switch (dedupResult.decision) {
       case "create":
-        await this.storeCandidate(candidate, vector, sessionKey, targetScope);
+        await this.storeCandidate(candidate, vector, sessionKey, targetScope, admission?.audit);
         stats.created++;
         break;
@@ -357,14 +527,15 @@
         await this.handleMerge(
           candidate,
           dedupResult.matchId,
-          scopeFilter,
           targetScope,
+          scopeFilter,
           dedupResult.contextLabel,
+          admission?.audit,
         );
         stats.merged++;
       } else {
         // Category doesn't support merge → create instead
-        await this.storeCandidate(candidate, vector, sessionKey, targetScope);
+        await this.storeCandidate(candidate, vector, sessionKey, targetScope, admission?.audit);
         stats.created++;
       }
       break;
@@ -388,31 +559,32 @@
           sessionKey,
           targetScope,
           scopeFilter,
+          admission?.audit,
         );
         stats.created++;
         stats.superseded = (stats.superseded ?? 0) + 1;
       } else {
-        await this.storeCandidate(candidate, vector, sessionKey, targetScope);
+        await this.storeCandidate(candidate, vector, sessionKey, targetScope, admission?.audit);
        stats.created++;
      }
      break;
 
      case "support":
        if (dedupResult.matchId) {
-          await this.handleSupport(dedupResult.matchId, scopeFilter, { session: sessionKey, timestamp: Date.now() }, dedupResult.reason, dedupResult.contextLabel);
+          await this.handleSupport(dedupResult.matchId, { session: sessionKey, timestamp: Date.now() }, dedupResult.reason, dedupResult.contextLabel, scopeFilter, admission?.audit);
          stats.supported = (stats.supported ?? 0) + 1;
        } else {
-          await this.storeCandidate(candidate, vector, sessionKey, targetScope);
+          await this.storeCandidate(candidate, vector, sessionKey, targetScope, admission?.audit);
          stats.created++;
        }
        break;
 
      case "contextualize":
        if (dedupResult.matchId) {
-          await this.handleContextualize(candidate, vector, dedupResult.matchId, sessionKey, targetScope, scopeFilter, dedupResult.contextLabel);
+          await this.handleContextualize(candidate, vector, dedupResult.matchId, sessionKey, targetScope, scopeFilter, dedupResult.contextLabel, admission?.audit);
          stats.created++;
        } else {
-          await this.storeCandidate(candidate, vector, sessionKey, targetScope);
+          await this.storeCandidate(candidate, vector, sessionKey, targetScope, admission?.audit);
          stats.created++;
        }
        break;
@@ -430,15 +602,16 @@
            sessionKey,
            targetScope,
            scopeFilter,
+            admission?.audit,
          );
          stats.created++;
          stats.superseded = (stats.superseded ?? 0) + 1;
        } else {
-          await this.handleContradict(candidate, vector, dedupResult.matchId, sessionKey, targetScope, scopeFilter, dedupResult.contextLabel);
+          await this.handleContradict(candidate, vector, dedupResult.matchId, sessionKey, targetScope, scopeFilter, dedupResult.contextLabel, admission?.audit);
          stats.created++;
        }
      } else {
-        await this.storeCandidate(candidate, vector, sessionKey, targetScope);
+        await this.storeCandidate(candidate, vector, sessionKey, targetScope, admission?.audit);
        stats.created++;
      }
      break;
@@ -455,7 +628,7 @@
   private async deduplicate(
     candidate: CandidateMemory,
     candidateVector: number[],
-    scopeFilter: string[],
+    scopeFilter?: string[],
   ): Promise<DedupDecision> {
     // Stage 1: Vector pre-filter — find similar active memories.
     // excludeInactive ensures the store over-fetches to fill N active slots,
@@ -472,6 +645,26 @@
       return { decision: "create", reason: "No similar memories found" };
     }
 
+    // Stage 1.5: Preference slot guard — same brand but different item
+    // should always be stored as a new memory, not merged/skipped.
+    // Example: "喜欢麦当劳的板烧鸡腿堡" and "喜欢麦当劳的麦辣鸡翅" are
+    // different preferences even though they share the same brand.
+    if (candidate.category === "preferences") {
+      const candidateSlot = inferAtomicBrandItemPreferenceSlot(candidate.content);
+      if (candidateSlot) {
+        const allDifferentItem = activeSimilar.every((r) => {
+          const existingSlot = inferAtomicBrandItemPreferenceSlot(r.entry.text);
+          // If existing is not a brand-item preference, let LLM decide
+          if (!existingSlot) return false;
+          // Same brand, different item → should not be deduped
+          return existingSlot.brand === candidateSlot.brand && existingSlot.item !== candidateSlot.item;
+        });
+        if (allDifferentItem) {
+          return { decision: "create", reason: "Same brand but different item-level preference (preference-slot guard)" };
+        }
+      }
+    }
+
     // Stage 2: LLM decision
     return this.llmDedupDecision(candidate, activeSimilar);
   }
@@ -567,14 +760,34 @@
   */
   private async handleProfileMerge(
     candidate: CandidateMemory,
+    conversationText: string,
     sessionKey: string,
     targetScope: string,
-    scopeFilter: string[],
-  ): Promise<void> {
+    scopeFilter?: string[],
+    admissionAudit?: AdmissionAuditRecord,
+  ): Promise<"merged" | "created" | "rejected"> {
     // Find existing profile memory by category
     const embeddingText = `${candidate.abstract} ${candidate.content}`;
     const vector = await this.embedder.embed(embeddingText);
 
+    // Run admission control for profile candidates (they skip the main dedup path)
+    if (!admissionAudit && this.admissionController && vector && vector.length > 0) {
+      const profileAdmission = await this.admissionController.evaluate({
+        candidate,
+        candidateVector: vector,
+        conversationText,
+        scopeFilter: scopeFilter ?? [targetScope],
+      });
+      if (profileAdmission.decision === "reject") {
+        this.log(
+          `memory-pro: smart-extractor: admission rejected profile [${candidate.abstract.slice(0, 60)}] — ${profileAdmission.audit.reason}`,
+        );
+        await this.recordRejectedAdmission(candidate, conversationText, sessionKey, targetScope, scopeFilter ?? [targetScope], profileAdmission.audit as AdmissionAuditRecord & { decision: "reject" });
+        return "rejected";
+      }
+      admissionAudit = profileAdmission.audit;
+    }
+
     // Search for existing profile memories
     const existing = await this.store.vectorSearch(
       vector || [],
@@ -595,12 +808,16 @@
       await this.handleMerge(
         candidate,
         profileMatch.entry.id,
-        scopeFilter,
         targetScope,
+        scopeFilter,
+        undefined,
+        admissionAudit,
       );
+      return "merged";
     } else {
       // No existing profile — create new
-      await this.storeCandidate(candidate, vector || [], sessionKey, targetScope);
+      await this.storeCandidate(candidate, vector || [], sessionKey, targetScope, admissionAudit);
+      return "created";
     }
   }
@@ -610,9 +827,10 @@
   private async handleMerge(
     candidate: CandidateMemory,
     matchId: string,
-    scopeFilter: string[],
     targetScope: string,
+    scopeFilter?: string[],
     contextLabel?: string,
+    admissionAudit?: AdmissionAuditRecord,
   ): Promise<void> {
     let existingAbstract = "";
     let existingOverview = "";
@@ -672,14 +890,17 @@
     // Update existing memory via store.update()
     const existing = await this.store.getById(matchId, scopeFilter);
     const metadata = stringifySmartMetadata(
-      buildSmartMetadata(existing ?? { text: merged.abstract }, {
-        l0_abstract: merged.abstract,
-        l1_overview: merged.overview,
-        l2_content: merged.content,
-        memory_category: candidate.category,
-        tier: "working",
-        confidence: 0.8,
-      }),
+      this.withAdmissionAudit(
+        buildSmartMetadata(existing ?? { text: merged.abstract }, {
+          l0_abstract: merged.abstract,
+          l1_overview: merged.overview,
+          l2_content: merged.content,
+          memory_category: candidate.category,
+          tier: "working",
+          confidence: 0.8,
+        }),
+        admissionAudit,
+      ),
     );
 
     await this.store.update(
@@ -722,6 +943,7 @@
     sessionKey: string,
     targetScope: string,
     scopeFilter: string[],
+    admissionAudit?: AdmissionAuditRecord,
   ): Promise<void> {
     const existing = await this.store.getById(matchId, scopeFilter);
     if (!existing) {
@@ -755,6 +977,12 @@
       access_count: 0,
       confidence: 0.7,
       source_session: sessionKey,
+      source: "auto-capture",
+      state: "confirmed", // #350: write confirmed to unblock auto-recall
+      memory_layer: "working",
+      injected_count: 0,
+      bad_recall_count: 0,
+      suppressed_until_turn: 0,
       valid_from: now,
       fact_key: factKey,
       supersedes: matchId,
@@ -797,10 +1025,11 @@
   */
   private async handleSupport(
     matchId: string,
-    scopeFilter: string[],
     source: { session: string; timestamp: number },
     reason: string,
     contextLabel?: string,
+    scopeFilter?: string[],
+    admissionAudit?: AdmissionAuditRecord,
   ): Promise<void> {
     const existing = await this.store.getById(matchId, scopeFilter);
     if (!existing) return;
@@ -812,7 +1041,7 @@
     await this.store.update(
       matchId,
-      { metadata: stringifySmartMetadata(meta) },
+      { metadata: stringifySmartMetadata(this.withAdmissionAudit(meta, admissionAudit)) },
       scopeFilter,
     );
@@ -831,11 +1060,12 @@
     matchId: string,
     sessionKey: string,
     targetScope: string,
-    scopeFilter: string[],
+    scopeFilter?: string[],
     contextLabel?: string,
+    admissionAudit?: AdmissionAuditRecord,
   ): Promise<void> {
     const storeCategory = this.mapToStoreCategory(candidate.category);
-    const metadata = stringifySmartMetadata({
+    const metadata = stringifySmartMetadata(this.withAdmissionAudit({
       l0_abstract: candidate.abstract,
       l1_overview: candidate.overview,
       l2_content: candidate.content,
@@ -845,9 +1075,15 @@
       confidence: 0.7,
       last_accessed_at: Date.now(),
       source_session: sessionKey,
+      source: "auto-capture" as const,
+      state: "confirmed" as const, // #350: write confirmed to unblock auto-recall
+      memory_layer: "working" as const,
+      injected_count: 0,
+      bad_recall_count: 0,
+      suppressed_until_turn: 0,
       contexts: contextLabel ? [contextLabel] : [],
       relations: [{ type: "contextualizes", targetId: matchId }],
-    });
+    }, admissionAudit));
 
     await this.store.store({
       text: candidate.abstract,
@@ -873,8 +1109,9 @@
     matchId: string,
     sessionKey: string,
     targetScope: string,
-    scopeFilter: string[],
+    scopeFilter?: string[],
     contextLabel?: string,
+    admissionAudit?: AdmissionAuditRecord,
   ): Promise<void> {
     // 1. Record contradiction on the existing memory
     const existing = await this.store.getById(matchId, scopeFilter);
@@ -892,7 +1129,7 @@
     // 2. Store the contradicting entry as a new memory
     const storeCategory = this.mapToStoreCategory(candidate.category);
-    const metadata = stringifySmartMetadata({
+    const metadata = stringifySmartMetadata(this.withAdmissionAudit({
       l0_abstract: candidate.abstract,
       l1_overview: candidate.overview,
       l2_content: candidate.content,
@@ -902,9 +1139,15 @@
       confidence: 0.7,
       last_accessed_at: Date.now(),
       source_session: sessionKey,
+      source: "auto-capture" as const,
+      state: "confirmed" as const, // #350: write confirmed to unblock auto-recall
+      memory_layer: "working" as const,
+      injected_count: 0,
+      bad_recall_count: 0,
+      suppressed_until_turn: 0,
      contexts: contextLabel ?
 [contextLabel] : [],
       relations: [{ type: "contradicts", targetId: matchId }],
-    });
+    }, admissionAudit));
 
     await this.store.store({
       text: candidate.abstract,
@@ -932,6 +1175,7 @@ export class SmartExtractor {
     vector: number[],
     sessionKey: string,
     targetScope: string,
+    admissionAudit?: AdmissionAuditRecord,
   ): Promise<void> {
     // Map 6-category to existing store categories for backward compatibility
     const storeCategory = this.mapToStoreCategory(candidate.category);
@@ -951,6 +1195,12 @@ export class SmartExtractor {
           access_count: 0,
           confidence: 0.7,
           source_session: sessionKey,
+          source: "auto-capture",
+          state: "confirmed", // #350: write confirmed to unblock auto-recall
+          memory_layer: "working",
+          injected_count: 0,
+          bad_recall_count: 0,
+          suppressed_until_turn: 0,
         },
       ),
     );
@@ -1014,4 +1264,108 @@ export class SmartExtractor {
       return 0.5;
     }
   }
+
+  // --------------------------------------------------------------------------
+  // Admission Control Helpers
+  // --------------------------------------------------------------------------
+
+  /**
+   * Embed admission audit record into metadata if audit persistence is enabled.
+   */
+  private withAdmissionAudit<T extends Record<string, unknown>>(
+    metadata: T,
+    admissionAudit?: AdmissionAuditRecord,
+  ): T & { admission_control?: AdmissionAuditRecord } {
+    if (!admissionAudit || !this.persistAdmissionAudit) {
+      return metadata as T & { admission_control?: AdmissionAuditRecord };
+    }
+    return { ...metadata, admission_control: admissionAudit };
+  }
+
+  /**
+   * Record a rejected admission to the durable audit log.
+   */
+  private async recordRejectedAdmission(
+    candidate: CandidateMemory,
+    conversationText: string,
+    sessionKey: string,
+    targetScope: string,
+    scopeFilter: string[],
+    audit: AdmissionAuditRecord & { decision: "reject" },
+  ): Promise<void> {
+    if (!this.onAdmissionRejected) {
+      return;
+    }
+    try {
+      await this.onAdmissionRejected({
+        version: "amac-v1",
+        rejected_at: Date.now(),
+        session_key: sessionKey,
+        target_scope: targetScope,
+        scope_filter: scopeFilter,
+        candidate,
+        audit,
+        conversation_excerpt: conversationText.slice(-1200),
+      });
+    } catch (err) {
+      this.log(
+        `memory-lancedb-pro: smart-extractor: rejected admission audit write failed: ${String(err)}`,
+      );
+    }
+  }
+}
+
+// ============================================================================
+// Extraction Rate Limiter (Feature 7: Adaptive Extraction Throttling)
+// ============================================================================
+
+const ONE_HOUR_MS = 60 * 60 * 1000;
+
+export interface ExtractionRateLimiterOptions {
+  /** Maximum number of extractions allowed per hour (default: 30) */
+  maxExtractionsPerHour?: number;
+}
+
+export interface ExtractionRateLimiter {
+  /** Check whether the current rate would exceed the limit */
+  isRateLimited(): boolean;
+  /** Record a new extraction timestamp */
+  recordExtraction(): void;
+  /** Get the number of extractions in the current window */
+  getRecentCount(): number;
+}
+
+/**
+ * Create an extraction rate limiter that tracks timestamps in a sliding
+ * one-hour window.
+ */
+export function createExtractionRateLimiter(
+  options: ExtractionRateLimiterOptions = {},
+): ExtractionRateLimiter {
+  const maxPerHour = options.maxExtractionsPerHour ??
30; + const timestamps: number[] = []; + + function pruneOld(): void { + const cutoff = Date.now() - ONE_HOUR_MS; + while (timestamps.length > 0 && timestamps[0] < cutoff) { + timestamps.shift(); + } + } + + return { + isRateLimited(): boolean { + pruneOld(); + return timestamps.length >= maxPerHour; + }, + + recordExtraction(): void { + pruneOld(); + timestamps.push(Date.now()); + }, + + getRecentCount(): number { + pruneOld(); + return timestamps.length; + }, + }; } diff --git a/src/smart-metadata.ts b/src/smart-metadata.ts index 41e8ce3d..559eab4d 100644 --- a/src/smart-metadata.ts +++ b/src/smart-metadata.ts @@ -26,6 +26,15 @@ export interface MemoryRelation { targetId: string; } +export type MemoryState = "pending" | "confirmed" | "archived"; +export type MemoryLayer = "durable" | "working" | "reflection" | "archive"; +export type MemorySource = + | "manual" + | "auto-capture" + | "reflection" + | "session-summary" + | "legacy"; + export interface SmartMemoryMetadata { l0_abstract: string; l1_overview: string; @@ -42,6 +51,15 @@ export interface SmartMemoryMetadata { superseded_by?: string; relations?: MemoryRelation[]; source_session?: string; + state: MemoryState; + source: MemorySource; + memory_layer: MemoryLayer; + injected_count: number; + last_injected_at?: number; + last_confirmed_use_at?: number; + bad_recall_count: number; + suppressed_until_turn: number; + canonical_id?: string; [key: string]: unknown; } @@ -78,6 +96,59 @@ function normalizeTier(value: unknown): MemoryTier { } } +function normalizeState(value: unknown): MemoryState { + switch (value) { + case "pending": + case "confirmed": + case "archived": + return value; + default: + return "confirmed"; + } +} + +function normalizeSource(value: unknown): MemorySource { + switch (value) { + case "manual": + case "auto-capture": + case "reflection": + case "session-summary": + case "legacy": + return value; + default: + return "legacy"; + } +} + +function normalizeLayer(value: unknown): MemoryLayer 
{ + switch (value) { + case "durable": + case "working": + case "reflection": + case "archive": + return value; + default: + return "working"; + } +} + +function deriveDefaultLayer( + source: MemorySource, + memoryCategory: MemoryCategory, + state: MemoryState, +): MemoryLayer { + if (source === "reflection" || source === "session-summary") return "reflection"; + if (state === "archived") return "archive"; + if ( + memoryCategory === "profile" || + memoryCategory === "preferences" || + memoryCategory === "events" + ) { + return "durable"; + } + return "working"; +} + export function reverseMapLegacyCategory( oldCategory: LegacyStoreCategory | undefined, text = "", @@ -190,6 +261,19 @@ export function parseSmartMetadata( const l2 = normalizeText(parsed.l2_content, text); const validFrom = normalizeTimestamp(parsed.valid_from, timestamp); const invalidatedAt = normalizeOptionalTimestamp(parsed.invalidated_at); + const fallbackSource = + parsed.type === "session-summary" + ? "session-summary" + : parsed.type === "memory-reflection" || parsed.type === "memory-reflection-item" + ? "reflection" + : "legacy"; + const source = normalizeSource(parsed.source ?? fallbackSource); + const defaultState = + source === "session-summary" ? "archived" : "confirmed"; + const state = normalizeState(parsed.state ?? defaultState); + const memoryLayer = normalizeLayer( + parsed.memory_layer ?? deriveDefaultLayer(source, memoryCategory, state), + ); const normalized: SmartMemoryMetadata = { ...parsed, l0_abstract: l0, @@ -218,6 +302,15 @@ export function parseSmartMetadata( superseded_by: normalizeOptionalString(parsed.superseded_by), source_session: typeof parsed.source_session === "string" ? 
parsed.source_session : undefined, + state, + source, + memory_layer: memoryLayer, + injected_count: clampCount(parsed.injected_count, 0), + last_injected_at: normalizeOptionalTimestamp(parsed.last_injected_at), + last_confirmed_use_at: normalizeOptionalTimestamp(parsed.last_confirmed_use_at), + bad_recall_count: clampCount(parsed.bad_recall_count, 0), + suppressed_until_turn: clampCount(parsed.suppressed_until_turn, 0), + canonical_id: normalizeOptionalString(parsed.canonical_id), }; return normalized; @@ -233,6 +326,14 @@ export function buildSmartMetadata( typeof patch.memory_category === "string" ? patch.memory_category : base.memory_category; + const nextSource = + patch.source !== undefined ? normalizeSource(patch.source) : base.source; + const nextState = + patch.state !== undefined ? normalizeState(patch.state) : base.state; + const nextLayer = + patch.memory_layer !== undefined + ? normalizeLayer(patch.memory_layer) + : base.memory_layer; const validFrom = normalizeTimestamp(patch.valid_from, base.valid_from); const invalidatedAt = patch.invalidated_at === undefined @@ -271,6 +372,27 @@ export function buildSmartMetadata( typeof patch.source_session === "string" ? patch.source_session : base.source_session, + source: nextSource, + state: nextState, + memory_layer: nextLayer, + injected_count: clampCount(patch.injected_count, base.injected_count), + last_injected_at: + patch.last_injected_at === undefined + ? base.last_injected_at + : normalizeOptionalTimestamp(patch.last_injected_at), + last_confirmed_use_at: + patch.last_confirmed_use_at === undefined + ? base.last_confirmed_use_at + : normalizeOptionalTimestamp(patch.last_confirmed_use_at), + bad_recall_count: clampCount(patch.bad_recall_count, base.bad_recall_count), + suppressed_until_turn: clampCount( + patch.suppressed_until_turn, + base.suppressed_until_turn, + ), + canonical_id: + patch.canonical_id === undefined + ? 
base.canonical_id
+          : normalizeOptionalString(patch.canonical_id),
   };
 }
diff --git a/src/store.ts b/src/store.ts
index afeab6aa..ce80034c 100644
--- a/src/store.ts
+++ b/src/store.ts
@@ -12,7 +12,7 @@ import {
   realpathSync,
   lstatSync,
 } from "node:fs";
-import { dirname } from "node:path";
+import { dirname, join } from "node:path";
 import { buildSmartMetadata, isMemoryActiveAt, parseSmartMetadata, stringifySmartMetadata } from "./smart-metadata.js";
 
 // ============================================================================
@@ -51,11 +51,25 @@ export interface MetadataPatch {
 
 let lancedbImportPromise: Promise<typeof import("@lancedb/lancedb")> | null = null;
 
+// =========================================================================
+// Cross-Process File Lock (proper-lockfile)
+// =========================================================================
+
+let lockfileModule: any = null;
+
+async function loadLockfile(): Promise<any> {
+  if (!lockfileModule) {
+    lockfileModule = await import("proper-lockfile");
+  }
+  return lockfileModule;
+}
+
 export const loadLanceDB = async (): Promise<
   typeof import("@lancedb/lancedb")
 > => {
   if (!lancedbImportPromise) {
-    lancedbImportPromise = import("@lancedb/lancedb");
+    // Use require() for CommonJS modules on Windows to avoid ESM URL scheme issues
+    lancedbImportPromise = Promise.resolve(require("@lancedb/lancedb"));
   }
   try {
     return await lancedbImportPromise;
@@ -84,6 +98,10 @@ function normalizeSearchText(value: string): string {
   return value.toLowerCase().trim();
 }
 
+function isExplicitDenyAllScopeFilter(scopeFilter?: string[]): boolean {
+  return Array.isArray(scopeFilter) && scopeFilter.length === 0;
+}
+
 function scoreLexicalHit(query: string, candidates: Array<{ text: string; weight: number }>): number {
   const normalizedQuery = normalizeSearchText(query);
   if (!normalizedQuery) return 0;
@@ -184,6 +202,20 @@ export class MemoryStore {
   constructor(private readonly config: StoreConfig) { }
 
+  private async runWithFileLock<T>(fn: () => Promise<T>): Promise<T> {
+ const lockfile = await loadLockfile(); + const lockPath = join(this.config.dbPath, ".memory-write.lock"); + if (!existsSync(lockPath)) { + try { mkdirSync(dirname(lockPath), { recursive: true }); } catch {} + try { const { writeFileSync } = await import("node:fs"); writeFileSync(lockPath, "", { flag: "wx" }); } catch {} + } + const release = await lockfile.lock(lockPath, { + retries: { retries: 5, factor: 2, minTimeout: 100, maxTimeout: 2000 }, + stale: 10000, + }); + try { return await fn(); } finally { await release(); } + } + get dbPath(): string { return this.config.dbPath; } @@ -226,16 +258,39 @@ export class MemoryStore { try { table = await db.openTable(TABLE_NAME); - // Check if we need to add scope column for backward compatibility + // Migrate legacy tables: add missing columns for backward compatibility try { - const sample = await table.query().limit(1).toArray(); - if (sample.length > 0 && !("scope" in sample[0])) { + const schema = await table.schema(); + const fieldNames = new Set(schema.fields.map((f: { name: string }) => f.name)); + + const missingColumns: Array<{ name: string; valueSql: string }> = []; + if (!fieldNames.has("scope")) { + missingColumns.push({ name: "scope", valueSql: "'global'" }); + } + if (!fieldNames.has("timestamp")) { + missingColumns.push({ name: "timestamp", valueSql: "CAST(0 AS DOUBLE)" }); + } + if (!fieldNames.has("metadata")) { + missingColumns.push({ name: "metadata", valueSql: "'{}'" }); + } + + if (missingColumns.length > 0) { console.warn( - "Adding scope column for backward compatibility with existing data", + `memory-lancedb-pro: migrating legacy table — adding columns: ${missingColumns.map((c) => c.name).join(", ")}`, + ); + await table.addColumns(missingColumns); + console.log( + `memory-lancedb-pro: migration complete — ${missingColumns.length} column(s) added`, ); } } catch (err) { - console.warn("Could not check table schema:", err); + const msg = String(err); + if (msg.includes("already exists")) { + // 
Concurrent initialization race — another process already added the columns
+        console.log("memory-lancedb-pro: migration columns already exist (concurrent init)");
+      } else {
+        console.warn("memory-lancedb-pro: could not check/migrate table schema:", err);
+      }
     }
   } catch (_openErr) {
     // Table doesn't exist yet — create it
@@ -329,16 +384,18 @@ export class MemoryStore {
       metadata: entry.metadata || "{}",
     };
 
-    try {
-      await this.table!.add([fullEntry]);
-    } catch (err: any) {
-      const code = err.code || "";
-      const message = err.message || String(err);
-      throw new Error(
-        `Failed to store memory in "${this.config.dbPath}": ${code} ${message}`,
-      );
-    }
-    return fullEntry;
+    return this.runWithFileLock(async () => {
+      try {
+        await this.table!.add([fullEntry]);
+      } catch (err: any) {
+        const code = err.code || "";
+        const message = err.message || String(err);
+        throw new Error(
+          `Failed to store memory in "${this.config.dbPath}": ${code} ${message}`,
+        );
+      }
+      return fullEntry;
+    });
   }
 
   /**
@@ -370,8 +427,10 @@ export class MemoryStore {
       metadata: entry.metadata || "{}",
     };
 
-    await this.table!.add([full]);
-    return full;
+    return this.runWithFileLock(async () => {
+      await this.table!.add([full]);
+      return full;
+    });
   }
 
   async hasId(id: string): Promise<boolean> {
@@ -388,6 +447,8 @@
   async getById(id: string, scopeFilter?: string[]): Promise<MemoryEntry | null> {
     await this.ensureInitialized();
 
+    if (isExplicitDenyAllScopeFilter(scopeFilter)) return null;
+
     const safeId = escapeSqlLiteral(id);
     const rows = await this.table!
.query() @@ -418,6 +479,8 @@ export class MemoryStore { async vectorSearch(vector: number[], limit = 5, minScore = 0.3, scopeFilter?: string[], options?: { excludeInactive?: boolean }): Promise { await this.ensureInitialized(); + if (isExplicitDenyAllScopeFilter(scopeFilter)) return []; + const safeLimit = clampInt(limit, 1, 20); // Over-fetch more aggressively when filtering inactive records, // because superseded historical rows can crowd out active ones. @@ -487,6 +550,8 @@ export class MemoryStore { ): Promise { await this.ensureInitialized(); + if (isExplicitDenyAllScopeFilter(scopeFilter)) return []; + const safeLimit = clampInt(limit, 1, 20); const inactiveFilter = options?.excludeInactive ?? false; // Over-fetch when filtering inactive records to avoid crowding @@ -563,6 +628,8 @@ export class MemoryStore { } private async lexicalFallbackSearch(query: string, limit: number, scopeFilter?: string[], options?: { excludeInactive?: boolean }): Promise { + if (isExplicitDenyAllScopeFilter(scopeFilter)) return []; + const trimmedQuery = query.trim(); if (!trimmedQuery) return []; @@ -630,6 +697,10 @@ export class MemoryStore { async delete(id: string, scopeFilter?: string[]): Promise { await this.ensureInitialized(); + if (isExplicitDenyAllScopeFilter(scopeFilter)) { + throw new Error(`Memory ${id} is outside accessible scopes`); + } + // Support both full UUID and short prefix (8+ hex chars) const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i; @@ -676,8 +747,10 @@ export class MemoryStore { throw new Error(`Memory ${resolvedId} is outside accessible scopes`); } - await this.table!.delete(`id = '${resolvedId}'`); - return true; + return this.runWithFileLock(async () => { + await this.table!.delete(`id = '${resolvedId}'`); + return true; + }); } async list( @@ -688,6 +761,8 @@ export class MemoryStore { ): Promise { await this.ensureInitialized(); + if (isExplicitDenyAllScopeFilter(scopeFilter)) return []; + let query = 
this.table!.query(); // Build where conditions @@ -745,6 +820,14 @@ export class MemoryStore { }> { await this.ensureInitialized(); + if (isExplicitDenyAllScopeFilter(scopeFilter)) { + return { + totalCount: 0, + scopeCounts: {}, + categoryCounts: {}, + }; + } + let query = this.table!.query(); if (scopeFilter && scopeFilter.length > 0) { @@ -787,7 +870,11 @@ export class MemoryStore { ): Promise { await this.ensureInitialized(); - return this.runSerializedUpdate(async () => { + if (isExplicitDenyAllScopeFilter(scopeFilter)) { + throw new Error(`Memory ${id} is outside accessible scopes`); + } + + return this.runWithFileLock(() => this.runSerializedUpdate(async () => { // Support both full UUID and short prefix (8+ hex chars), same as delete() const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i; @@ -902,7 +989,7 @@ export class MemoryStore { } return updated; - }); + })); } private async runSerializedUpdate(action: () => Promise): Promise { @@ -961,16 +1048,18 @@ export class MemoryStore { const whereClause = conditions.join(" AND "); - // Count first - const countResults = await this.table!.query().where(whereClause).toArray(); - const deleteCount = countResults.length; + return this.runWithFileLock(async () => { + // Count first + const countResults = await this.table!.query().where(whereClause).toArray(); + const deleteCount = countResults.length; - // Then delete - if (deleteCount > 0) { - await this.table!.delete(whereClause); - } + // Then delete + if (deleteCount > 0) { + await this.table!.delete(whereClause); + } - return deleteCount; + return deleteCount; + }); } get hasFtsSupport(): boolean { @@ -1019,4 +1108,49 @@ export class MemoryStore { return { success: false, error: msg }; } } + + /** + * Fetch memories older than `maxTimestamp` including their raw vectors. 
+   * Used exclusively by the memory compactor; vectors are intentionally
+   * omitted from `list()` for performance, but compaction needs them for
+   * cosine-similarity clustering.
+   */
+  async fetchForCompaction(
+    maxTimestamp: number,
+    scopeFilter?: string[],
+    limit = 200,
+  ): Promise<MemoryEntry[]> {
+    await this.ensureInitialized();
+
+    const conditions: string[] = [`timestamp < ${maxTimestamp}`];
+
+    if (scopeFilter && scopeFilter.length > 0) {
+      const scopeConditions = scopeFilter
+        .map((scope) => `scope = '${escapeSqlLiteral(scope)}'`)
+        .join(" OR ");
+      conditions.push(`((${scopeConditions}) OR scope IS NULL)`);
+    }
+
+    const whereClause = conditions.join(" AND ");
+
+    const results = await this.table!
+      .query()
+      .where(whereClause)
+      .toArray();
+
+    return results
+      .slice(0, limit)
+      .map(
+        (row): MemoryEntry => ({
+          id: row.id as string,
+          text: row.text as string,
+          vector: Array.isArray(row.vector) ? (row.vector as number[]) : [],
+          category: row.category as MemoryEntry["category"],
+          scope: (row.scope as string | undefined) ??
"global", + importance: Number(row.importance), + timestamp: Number(row.timestamp), + metadata: (row.metadata as string) || "{}", + }), + ); + } } diff --git a/src/tools.ts b/src/tools.ts index 892bbea4..3f587a30 100644 --- a/src/tools.ts +++ b/src/tools.ts @@ -11,7 +11,7 @@ import { join } from "node:path"; import type { MemoryRetriever, RetrievalResult } from "./retriever.js"; import type { MemoryStore } from "./store.js"; import { isNoise } from "./noise-filter.js"; -import type { MemoryScopeManager } from "./scopes.js"; +import { isSystemBypassId, resolveScopeFilter, parseAgentIdFromSessionKey, type MemoryScopeManager } from "./scopes.js"; import type { Embedder } from "./embedder.js"; import { appendRelation, @@ -23,6 +23,12 @@ import { import { TEMPORAL_VERSIONED_CATEGORIES } from "./memory-categories.js"; import { appendSelfImprovementEntry, ensureSelfImprovementLearningFiles } from "./self-improvement-files.js"; import { getDisplayCategoryTag } from "./reflection-metadata.js"; +import type { RetrievalTrace } from "./retrieval-trace.js"; +import { + filterUserMdExclusiveRecallResults, + isUserMdExclusiveMemory, + type WorkspaceBoundaryConfig, +} from "./workspace-boundary.js"; // ============================================================================ // Types @@ -56,6 +62,7 @@ interface ToolContext { agentId?: string; workspaceDir?: string; mdMirror?: MdMirrorWriter | null; + workspaceBoundary?: WorkspaceBoundaryConfig; } function resolveAgentId(runtimeAgentId: unknown, fallback?: string): string | undefined { @@ -78,6 +85,23 @@ function clamp01(value: number, fallback = 0.7): number { return Math.min(1, Math.max(0, value)); } +function normalizeInlineText(text: string): string { + return text.replace(/[\r\n]+/g, " ").replace(/\s+/g, " ").trim(); +} + +function truncateText(text: string, maxChars: number): string { + if (text.length <= maxChars) return text; + const clipped = text.slice(0, Math.max(1, maxChars - 1)).trimEnd(); + return `${clipped}…`; +} 
+
+function deriveManualMemoryLayer(category: string): "durable" | "working" {
+  if (category === "preference" || category === "decision" || category === "fact") {
+    return "durable";
+  }
+  return "working";
+}
+
 function sanitizeMemoryForSerialization(results: RetrievalResult[]) {
   return results.map((r) => ({
     id: r.entry.id,
@@ -91,21 +115,40 @@
-function parseAgentIdFromSessionKey(sessionKey: string | undefined): string | undefined {
-  if (!sessionKey) return undefined;
-  const m = /^agent:([^:]+):/.exec(sessionKey);
-  return m?.[1];
+const _warnedMissingAgentId = new Set<string>();
+
+/** @internal Exported for testing only — resets the missing-agent warning throttle. */
+export function _resetWarnedMissingAgentIdState(): void {
+  _warnedMissingAgentId.clear();
 }
 
 function resolveRuntimeAgentId(
   staticAgentId: string | undefined,
   runtimeCtx: unknown,
-): string | undefined {
-  if (!runtimeCtx || typeof runtimeCtx !== "object") return staticAgentId;
+): string {
+  if (!runtimeCtx || typeof runtimeCtx !== "object") {
+    const fallback = staticAgentId?.trim();
+    if (!fallback && !_warnedMissingAgentId.has("no-context")) {
+      _warnedMissingAgentId.add("no-context");
+      console.warn(
+        "resolveRuntimeAgentId: no runtime context or static agentId, defaulting to 'main'. " +
+          "Tool callers without explicit agentId will be scoped to agent:main + global + reflection:agent:main."
+      );
+    }
+    return fallback || "main";
+  }
   const ctx = runtimeCtx as Record<string, unknown>;
   const ctxAgentId = typeof ctx.agentId === "string" ? ctx.agentId : undefined;
   const ctxSessionKey = typeof ctx.sessionKey === "string" ? ctx.sessionKey : undefined;
-  return ctxAgentId || parseAgentIdFromSessionKey(ctxSessionKey) || staticAgentId;
+  const resolved = ctxAgentId || parseAgentIdFromSessionKey(ctxSessionKey) || staticAgentId;
+  const trimmed = resolved?.trim();
+  if (!trimmed && !_warnedMissingAgentId.has("empty-resolved")) {
+    _warnedMissingAgentId.add("empty-resolved");
+    console.warn(
+      "resolveRuntimeAgentId: resolved agentId is empty after trim, defaulting to 'main'."
+    );
+  }
+  return trimmed ? trimmed : "main";
 }
 
 function resolveToolContext(
@@ -139,6 +182,60 @@ async function retrieveWithRetry(
   return results;
 }
 
+async function resolveMemoryId(
+  context: ToolContext,
+  memoryRef: string,
+  scopeFilter: string[],
+): Promise<
+  | { ok: true; id: string }
+  | { ok: false; message: string; details?: Record<string, unknown> }
+> {
+  const trimmed = memoryRef.trim();
+  if (!trimmed) {
+    return {
+      ok: false,
+      message: "memoryId/query must not be empty.",
+      details: { error: "empty_memory_ref" },
+    };
+  }
+
+  const uuidLike = /^[0-9a-f]{8}(-[0-9a-f]{4}){0,4}/i.test(trimmed);
+  if (uuidLike) {
+    return { ok: true, id: trimmed };
+  }
+
+  const results = await retrieveWithRetry(context.retriever, {
+    query: trimmed,
+    limit: 5,
+    scopeFilter,
+  });
+  if (results.length === 0) {
+    return {
+      ok: false,
+      message: `No memory found matching "${trimmed}".`,
+      details: { error: "not_found", query: trimmed },
+    };
+  }
+  if (results.length === 1 || results[0].score > 0.85) {
+    return { ok: true, id: results[0].entry.id };
+  }
+
+  const list = results
+    .map(
+      (r) =>
+        `- [${r.entry.id.slice(0, 8)}] ${r.entry.text.slice(0, 60)}${r.entry.text.length > 60 ? "..." : ""}`,
+    )
+    .join("\n");
+  return {
+    ok: false,
+    message: `Multiple matches.
Specify memoryId:\n${list}`, + details: { + action: "candidates", + candidates: sanitizeMemoryForSerialization(results), + }, + }; +} + function resolveWorkspaceDir(toolCtx: unknown, fallback?: string): string { const runtime = toolCtx as Record | undefined; const runtimePath = typeof runtime?.workspaceDir === "string" ? runtime.workspaceDir.trim() : ""; @@ -408,7 +505,17 @@ export function registerMemoryRecallTool( }), limit: Type.Optional( Type.Number({ - description: "Max results to return (default: 5, max: 20)", + description: "Max results to return (default: 3, max: 20; summary mode soft max: 6)", + }), + ), + includeFullText: Type.Optional( + Type.Boolean({ + description: "Return full memory text when true (default: false returns summary previews)", + }), + ), + maxCharsPerItem: Type.Optional( + Type.Number({ + description: "Maximum characters per returned memory in summary mode (default: 180)", }), ), scope: Type.Optional( @@ -421,22 +528,29 @@ export function registerMemoryRecallTool( async execute(_toolCallId, params) { const { query, - limit = 5, + limit = 3, + includeFullText = false, + maxCharsPerItem = 180, scope, category, } = params as { query: string; limit?: number; + includeFullText?: boolean; + maxCharsPerItem?: number; scope?: string; category?: string; }; try { - const safeLimit = clampInt(limit, 1, 20); + const safeLimit = includeFullText + ? 
clampInt(limit, 1, 20) + : clampInt(limit, 1, 6); + const safeCharsPerItem = clampInt(maxCharsPerItem, 60, 1000); const agentId = runtimeContext.agentId; // Determine accessible scopes - let scopeFilter = runtimeContext.scopeManager.getAccessibleScopes(agentId); + let scopeFilter = resolveScopeFilter(runtimeContext.scopeManager, agentId); if (scope) { if (runtimeContext.scopeManager.isAccessible(scope, agentId)) { scopeFilter = [scope]; @@ -453,13 +567,13 @@ export function registerMemoryRecallTool( } } - const results = await retrieveWithRetry(runtimeContext.retriever, { + const results = filterUserMdExclusiveRecallResults(await retrieveWithRetry(runtimeContext.retriever, { query, limit: safeLimit, scopeFilter, category, source: "manual", - }); + }), runtimeContext.workspaceBoundary); if (results.length === 0) { return { @@ -477,6 +591,9 @@ export function registerMemoryRecallTool( { access_count: meta.access_count + 1, last_accessed_at: now, + last_confirmed_use_at: now, + bad_recall_count: 0, + suppressed_until_turn: 0, }, scopeFilter, ); @@ -486,10 +603,27 @@ export function registerMemoryRecallTool( const text = results .map((r, i) => { const categoryTag = getDisplayCategoryTag(r.entry); - return `${i + 1}. [${r.entry.id}] [${categoryTag}] ${r.entry.text}`; + const metadata = parseSmartMetadata(r.entry.metadata, r.entry); + const base = includeFullText + ? (metadata.l2_content || metadata.l1_overview || r.entry.text) + : (metadata.l0_abstract || r.entry.text); + const inline = normalizeInlineText(base); + const rendered = includeFullText + ? inline + : truncateText(inline, safeCharsPerItem); + return `${i + 1}. 
[${r.entry.id}] [${categoryTag}] ${rendered}`; }) .join("\n"); + const serializedMemories = sanitizeMemoryForSerialization(results); + if (includeFullText) { + for (let i = 0; i < results.length; i++) { + const metadata = parseSmartMetadata(results[i].entry.metadata, results[i].entry); + (serializedMemories[i] as Record).fullText = + metadata.l2_content || metadata.l1_overview || results[i].entry.text; + } + } + return { content: [ { @@ -499,7 +633,7 @@ export function registerMemoryRecallTool( ], details: { count: results.length, - memories: sanitizeMemoryForSerialization(results), + memories: serializedMemories, query, scopes: scopeFilter, retrievalMode: runtimeContext.retriever.getConfig().mode, @@ -563,7 +697,24 @@ export function registerMemoryStoreTool( try { const agentId = runtimeContext.agentId; // Determine target scope - let targetScope = scope || runtimeContext.scopeManager.getDefaultScope(agentId); + let targetScope = scope; + if (!targetScope) { + if (isSystemBypassId(agentId)) { + return { + content: [ + { + type: "text", + text: "Reserved bypass agent IDs must provide an explicit scope for memory_store writes.", + }, + ], + details: { + error: "explicit_scope_required", + agentId, + }, + }; + } + targetScope = runtimeContext.scopeManager.getDefaultScope(agentId); + } // Validate scope access if (!runtimeContext.scopeManager.isAccessible(targetScope, agentId)) { @@ -594,6 +745,26 @@ export function registerMemoryStoreTool( }; } + if ( + isUserMdExclusiveMemory( + { text }, + runtimeContext.workspaceBoundary, + ) + ) { + return { + content: [ + { + type: "text", + text: "Skipped: this fact belongs in USER.md, not plugin memory.", + }, + ], + details: { + action: "skipped_by_workspace_boundary", + boundary: "user_md_exclusive", + }, + }; + } + const safeImportance = clamp01(importance, 0.7); const vector = await runtimeContext.embedder.embedPassage(text); @@ -646,6 +817,12 @@ export function registerMemoryStoreTool( l0_abstract: text, l1_overview: `- 
${text}`, l2_content: text, + source: "manual", + state: "confirmed", + memory_layer: deriveManualMemoryLayer(category as string), + last_confirmed_use_at: Date.now(), + bad_recall_count: 0, + suppressed_until_turn: 0, }, ), ), @@ -698,7 +875,7 @@ export function registerMemoryForgetTool( ) { api.registerTool( (toolCtx) => { - const agentId = resolveAgentId((toolCtx as any)?.agentId, context.agentId) ?? "main"; + const runtimeContext = resolveToolContext(context, toolCtx); return { name: "memory_forget", label: "Memory Forget", @@ -725,11 +902,11 @@ export function registerMemoryForgetTool( }; try { - const agentId = resolveRuntimeAgentId(context.agentId, runtimeCtx); + const agentId = resolveRuntimeAgentId(runtimeContext.agentId, runtimeCtx); // Determine accessible scopes - let scopeFilter = context.scopeManager.getAccessibleScopes(agentId); + let scopeFilter = resolveScopeFilter(runtimeContext.scopeManager, agentId); if (scope) { - if (context.scopeManager.isAccessible(scope, agentId)) { + if (runtimeContext.scopeManager.isAccessible(scope, agentId)) { scopeFilter = [scope]; } else { return { @@ -858,7 +1035,7 @@ export function registerMemoryUpdateTool( ) { api.registerTool( (toolCtx) => { - const agentId = resolveAgentId((toolCtx as any)?.agentId, context.agentId) ?? 
"main"; + const runtimeContext = resolveToolContext(context, toolCtx); return { name: "memory_update", label: "Memory Update", @@ -901,8 +1078,8 @@ export function registerMemoryUpdateTool( } // Determine accessible scopes - const agentId = resolveRuntimeAgentId(context.agentId, runtimeCtx); - const scopeFilter = context.scopeManager.getAccessibleScopes(agentId); + const agentId = resolveRuntimeAgentId(runtimeContext.agentId, runtimeCtx); + const scopeFilter = resolveScopeFilter(runtimeContext.scopeManager, agentId); // Resolve memoryId: if it doesn't look like a UUID, try search let resolvedId = memoryId; @@ -1122,7 +1299,7 @@ export function registerMemoryStatsTool( ) { api.registerTool( (toolCtx) => { - const agentId = resolveAgentId((toolCtx as any)?.agentId, context.agentId) ?? "main"; + const runtimeContext = resolveToolContext(context, toolCtx); return { name: "memory_stats", label: "Memory Statistics", @@ -1138,9 +1315,9 @@ export function registerMemoryStatsTool( const { scope } = params as { scope?: string }; try { - const agentId = resolveRuntimeAgentId(context.agentId, runtimeCtx); + const agentId = resolveRuntimeAgentId(runtimeContext.agentId, runtimeCtx); // Determine accessible scopes - let scopeFilter = context.scopeManager.getAccessibleScopes(agentId); + let scopeFilter = resolveScopeFilter(context.scopeManager, agentId); if (scope) { if (context.scopeManager.isAccessible(scope, agentId)) { scopeFilter = [scope]; @@ -1161,23 +1338,48 @@ export function registerMemoryStatsTool( const scopeManagerStats = context.scopeManager.getStats(); const retrievalConfig = context.retriever.getConfig(); - const text = [ + const textLines = [ `Memory Statistics:`, - `• Total memories: ${stats.totalCount}`, - `• Available scopes: ${scopeManagerStats.totalScopes}`, - `• Retrieval mode: ${retrievalConfig.mode}`, - `• FTS support: ${context.store.hasFtsSupport ? 
"Yes" : "No"}`, + `\u2022 Total memories: ${stats.totalCount}`, + `\u2022 Available scopes: ${scopeManagerStats.totalScopes}`, + `\u2022 Retrieval mode: ${retrievalConfig.mode}`, + `\u2022 FTS support: ${context.store.hasFtsSupport ? "Yes" : "No"}`, ``, `Memories by scope:`, ...Object.entries(stats.scopeCounts).map( - ([s, count]) => ` • ${s}: ${count}`, + ([s, count]) => ` \u2022 ${s}: ${count}`, ), ``, `Memories by category:`, ...Object.entries(stats.categoryCounts).map( - ([c, count]) => ` • ${c}: ${count}`, + ([c, count]) => ` \u2022 ${c}: ${count}`, ), - ].join("\n"); + ]; + + // Include retrieval quality metrics if stats collector is available + const statsCollector = context.retriever.getStatsCollector(); + let retrievalStats = undefined; + if (statsCollector && statsCollector.count > 0) { + retrievalStats = statsCollector.getStats(); + textLines.push( + ``, + `Retrieval Quality (last ${retrievalStats.totalQueries} queries):`, + ` \u2022 Zero-result queries: ${retrievalStats.zeroResultQueries}`, + ` \u2022 Avg latency: ${retrievalStats.avgLatencyMs}ms`, + ` \u2022 P95 latency: ${retrievalStats.p95LatencyMs}ms`, + ` \u2022 Avg result count: ${retrievalStats.avgResultCount}`, + ` \u2022 Rerank used: ${retrievalStats.rerankUsed}`, + ` \u2022 Noise filtered: ${retrievalStats.noiseFiltered}`, + ); + if (retrievalStats.topDropStages.length > 0) { + textLines.push(` Top drop stages:`); + for (const ds of retrievalStats.topDropStages) { + textLines.push(` \u2022 ${ds.name}: ${ds.totalDropped} dropped`); + } + } + } + + const text = textLines.join("\n"); return { content: [{ type: "text", text }], @@ -1189,6 +1391,7 @@ export function registerMemoryStatsTool( rerankApiKey: retrievalConfig.rerankApiKey ? 
"***" : undefined, }, hasFtsSupport: context.store.hasFtsSupport, + retrievalStats, }, }; } catch (error) { @@ -1209,13 +1412,126 @@ export function registerMemoryStatsTool( ); } -export function registerMemoryListTool( +export function registerMemoryDebugTool( api: OpenClawPluginApi, context: ToolContext, ) { api.registerTool( (toolCtx) => { const agentId = resolveAgentId((toolCtx as any)?.agentId, context.agentId) ?? "main"; + return { + name: "memory_debug", + label: "Memory Debug", + description: + "Debug memory retrieval: search with full pipeline trace showing per-stage drop info, score ranges, and timing.", + parameters: Type.Object({ + query: Type.String({ description: "Search query to debug" }), + limit: Type.Optional( + Type.Number({ description: "Max results to return (default: 5, max: 20)" }), + ), + scope: Type.Optional( + Type.String({ description: "Specific memory scope to search in (optional)" }), + ), + }), + async execute(_toolCallId, params) { + const { query, limit = 5, scope } = params as { + query: string; limit?: number; scope?: string; + }; + try { + const safeLimit = clampInt(limit, 1, 20); + let scopeFilter = resolveScopeFilter(context.scopeManager, agentId); + if (scope) { + if (context.scopeManager.isAccessible(scope, agentId)) { + scopeFilter = [scope]; + } else { + return { + content: [{ type: "text", text: `Access denied to scope: ${scope}` }], + details: { error: "scope_access_denied", requestedScope: scope }, + }; + } + } + + const { results, trace } = await context.retriever.retrieveWithTrace({ + query, limit: safeLimit, scopeFilter, source: "manual", + }); + + const traceLines: string[] = [ + `Retrieval Debug Trace:`, + ` Mode: ${trace.mode}`, + ` Total: ${trace.totalMs}ms`, + ` Stages:`, + ]; + for (const stage of trace.stages) { + const dropped = Math.max(0, stage.inputCount - stage.outputCount); + const scoreStr = stage.scoreRange + ? 
` scores=[${stage.scoreRange[0].toFixed(3)}, ${stage.scoreRange[1].toFixed(3)}]` + : ""; + // For search stages (input=0), show "found N" instead of "dropped -N" + const dropStr = stage.inputCount === 0 + ? `found ${stage.outputCount}` + : `${stage.inputCount} -> ${stage.outputCount} (-${dropped})`; + traceLines.push( + ` ${stage.name}: ${dropStr} ${stage.durationMs}ms${scoreStr}`, + ); + if (stage.droppedIds.length > 0 && stage.droppedIds.length <= 3) { + traceLines.push(` dropped: ${stage.droppedIds.join(", ")}`); + } else if (stage.droppedIds.length > 3) { + traceLines.push( + ` dropped: ${stage.droppedIds.slice(0, 3).join(", ")} (+${stage.droppedIds.length - 3} more)`, + ); + } + } + + if (results.length === 0) { + traceLines.push(``, `No results survived the pipeline.`); + return { + content: [{ type: "text", text: traceLines.join("\n") }], + details: { count: 0, query, trace }, + }; + } + + const resultLines = results.map((r, i) => { + const sources: string[] = []; + if (r.sources.vector) sources.push("vector"); + if (r.sources.bm25) sources.push("BM25"); + if (r.sources.reranked) sources.push("reranked"); + const categoryTag = getDisplayCategoryTag(r.entry); + return `${i + 1}. [${r.entry.id}] [${categoryTag}] ${r.entry.text.slice(0, 120)}${r.entry.text.length > 120 ? "..." : ""} (${(r.score * 100).toFixed(1)}%${sources.length > 0 ? `, ${sources.join("+")}` : ""})`; + }); + + const text = [...traceLines, ``, `Results (${results.length}):`, ...resultLines].join("\n"); + return { + content: [{ type: "text", text }], + details: { + count: results.length, + memories: sanitizeMemoryForSerialization(results), + query, + trace, + }, + }; + } catch (error) { + return { + content: [{ + type: "text", + text: `Memory debug failed: ${error instanceof Error ? 
error.message : String(error)}`, + }], + details: { error: "debug_failed", message: String(error) }, + }; + } + }, + }; + }, + { name: "memory_debug" }, + ); +} + +export function registerMemoryListTool( + api: OpenClawPluginApi, + context: ToolContext, +) { + api.registerTool( + (toolCtx) => { + const runtimeContext = resolveToolContext(context, toolCtx); return { name: "memory_list", label: "Memory List", @@ -1253,10 +1569,10 @@ export function registerMemoryListTool( try { const safeLimit = clampInt(limit, 1, 50); const safeOffset = clampInt(offset, 0, 1000); - const agentId = resolveRuntimeAgentId(context.agentId, runtimeCtx); + const agentId = resolveRuntimeAgentId(runtimeContext.agentId, runtimeCtx); // Determine accessible scopes - let scopeFilter = context.scopeManager.getAccessibleScopes(agentId); + let scopeFilter = resolveScopeFilter(context.scopeManager, agentId); if (scope) { if (context.scopeManager.isAccessible(scope, agentId)) { scopeFilter = [scope]; @@ -1349,6 +1665,390 @@ export function registerMemoryListTool( ); } +export function registerMemoryPromoteTool( + api: OpenClawPluginApi, + context: ToolContext, +) { + api.registerTool( + (toolCtx) => { + const runtimeContext = resolveToolContext(context, toolCtx); + return { + name: "memory_promote", + label: "Memory Promote", + description: + "Promote a memory into confirmed/durable governance state so it can participate in conservative auto-recall.", + parameters: Type.Object({ + memoryId: Type.Optional( + Type.String({ description: "Memory id (UUID/prefix). Optional when query is provided." }), + ), + query: Type.Optional( + Type.String({ description: "Search query to locate a memory when memoryId is omitted." }), + ), + scope: Type.Optional(Type.String({ description: "Optional scope filter." 
})), + state: Type.Optional(Type.Union([ + Type.Literal("pending"), + Type.Literal("confirmed"), + Type.Literal("archived"), + ])), + layer: Type.Optional(Type.Union([ + Type.Literal("durable"), + Type.Literal("working"), + Type.Literal("reflection"), + Type.Literal("archive"), + ])), + }), + async execute(_toolCallId, params, _signal, _onUpdate, runtimeCtx) { + const { + memoryId, + query, + scope, + state = "confirmed", + layer = "durable", + } = params as { + memoryId?: string; + query?: string; + scope?: string; + state?: "pending" | "confirmed" | "archived"; + layer?: "durable" | "working" | "reflection" | "archive"; + }; + + if (!memoryId && !query) { + return { + content: [{ type: "text", text: "Provide memoryId or query." }], + details: { error: "missing_selector" }, + }; + } + + const agentId = resolveRuntimeAgentId(runtimeContext.agentId, runtimeCtx); + let scopeFilter = resolveScopeFilter(context.scopeManager, agentId); + if (scope) { + if (!context.scopeManager.isAccessible(scope, agentId)) { + return { + content: [{ type: "text", text: `Access denied to scope: ${scope}` }], + details: { error: "scope_access_denied", requestedScope: scope }, + }; + } + scopeFilter = [scope]; + } + + const resolved = await resolveMemoryId( + runtimeContext, + memoryId ?? query ?? "", + scopeFilter, + ); + if (!resolved.ok) { + return { + content: [{ type: "text", text: resolved.message }], + details: resolved.details ?? { error: "resolve_failed" }, + }; + } + + const before = await runtimeContext.store.getById(resolved.id, scopeFilter); + if (!before) { + return { + content: [{ type: "text", text: `Memory ${resolved.id.slice(0, 8)} not found.` }], + details: { error: "not_found", id: resolved.id }, + }; + } + + const now = Date.now(); + const updated = await runtimeContext.store.patchMetadata( + resolved.id, + { + source: "manual", + state, + memory_layer: layer, + last_confirmed_use_at: state === "confirmed" ? 
now : undefined, + bad_recall_count: 0, + suppressed_until_turn: 0, + }, + scopeFilter, + ); + if (!updated) { + return { + content: [{ type: "text", text: `Failed to promote memory ${resolved.id.slice(0, 8)}.` }], + details: { error: "promote_failed", id: resolved.id }, + }; + } + + return { + content: [{ + type: "text", + text: `Promoted memory ${resolved.id.slice(0, 8)} to state=${state}, layer=${layer}.`, + }], + details: { + action: "promoted", + id: resolved.id, + state, + layer, + }, + }; + }, + }; + }, + { name: "memory_promote" }, + ); +} + +export function registerMemoryArchiveTool( + api: OpenClawPluginApi, + context: ToolContext, +) { + api.registerTool( + (toolCtx) => { + const runtimeContext = resolveToolContext(context, toolCtx); + return { + name: "memory_archive", + label: "Memory Archive", + description: + "Archive a memory to remove it from default auto-recall while preserving history.", + parameters: Type.Object({ + memoryId: Type.Optional(Type.String({ description: "Memory id (UUID/prefix)." })), + query: Type.Optional(Type.String({ description: "Search query when memoryId is omitted." })), + scope: Type.Optional(Type.String({ description: "Optional scope filter." })), + reason: Type.Optional(Type.String({ description: "Archive reason for audit trail." })), + }), + async execute(_toolCallId, params, _signal, _onUpdate, runtimeCtx) { + const { memoryId, query, scope, reason = "manual_archive" } = params as { + memoryId?: string; + query?: string; + scope?: string; + reason?: string; + }; + if (!memoryId && !query) { + return { + content: [{ type: "text", text: "Provide memoryId or query." 
}], + details: { error: "missing_selector" }, + }; + } + + const agentId = resolveRuntimeAgentId(runtimeContext.agentId, runtimeCtx); + let scopeFilter = resolveScopeFilter(context.scopeManager, agentId); + if (scope) { + if (!context.scopeManager.isAccessible(scope, agentId)) { + return { + content: [{ type: "text", text: `Access denied to scope: ${scope}` }], + details: { error: "scope_access_denied", requestedScope: scope }, + }; + } + scopeFilter = [scope]; + } + + const resolved = await resolveMemoryId( + runtimeContext, + memoryId ?? query ?? "", + scopeFilter, + ); + if (!resolved.ok) { + return { + content: [{ type: "text", text: resolved.message }], + details: resolved.details ?? { error: "resolve_failed" }, + }; + } + + const patch = { + state: "archived" as const, + memory_layer: "archive" as const, + archive_reason: reason, + archived_at: Date.now(), + }; + const updated = await runtimeContext.store.patchMetadata(resolved.id, patch, scopeFilter); + if (!updated) { + return { + content: [{ type: "text", text: `Failed to archive memory ${resolved.id.slice(0, 8)}.` }], + details: { error: "archive_failed", id: resolved.id }, + }; + } + + return { + content: [{ type: "text", text: `Archived memory ${resolved.id.slice(0, 8)}.` }], + details: { action: "archived", id: resolved.id, reason }, + }; + }, + }; + }, + { name: "memory_archive" }, + ); +} + +export function registerMemoryCompactTool( + api: OpenClawPluginApi, + context: ToolContext, +) { + api.registerTool( + (toolCtx) => { + const runtimeContext = resolveToolContext(context, toolCtx); + return { + name: "memory_compact", + label: "Memory Compact", + description: + "Compact duplicate low-value memories by archiving redundant entries and linking them to a canonical memory.", + parameters: Type.Object({ + scope: Type.Optional(Type.String({ description: "Optional scope filter." })), + dryRun: Type.Optional(Type.Boolean({ description: "Preview compaction only (default true)." 
})), + limit: Type.Optional(Type.Number({ description: "Max entries to scan (default 200)." })), + }), + async execute(_toolCallId, params, _signal, _onUpdate, runtimeCtx) { + const { scope, dryRun = true, limit = 200 } = params as { + scope?: string; + dryRun?: boolean; + limit?: number; + }; + + const safeLimit = clampInt(limit, 20, 1000); + const agentId = resolveRuntimeAgentId(runtimeContext.agentId, runtimeCtx); + let scopeFilter = resolveScopeFilter(context.scopeManager, agentId); + if (scope) { + if (!context.scopeManager.isAccessible(scope, agentId)) { + return { + content: [{ type: "text", text: `Access denied to scope: ${scope}` }], + details: { error: "scope_access_denied", requestedScope: scope }, + }; + } + scopeFilter = [scope]; + } + + const entries = await runtimeContext.store.list(scopeFilter, undefined, safeLimit, 0); + const canonicalByKey = new Map(); + const duplicates: Array<{ duplicateId: string; canonicalId: string; key: string }> = []; + + for (const entry of entries) { + const meta = parseSmartMetadata(entry.metadata, entry); + if (meta.state === "archived") continue; + const key = `${meta.memory_category}:${normalizeInlineText(meta.l0_abstract).toLowerCase()}`; + const existing = canonicalByKey.get(key); + if (!existing) { + canonicalByKey.set(key, entry); + continue; + } + const keep = + existing.timestamp >= entry.timestamp ? existing : entry; + const drop = + keep.id === existing.id ? entry : existing; + canonicalByKey.set(key, keep); + duplicates.push({ duplicateId: drop.id, canonicalId: keep.id, key }); + } + + let archivedCount = 0; + if (!dryRun) { + for (const item of duplicates) { + await runtimeContext.store.patchMetadata( + item.duplicateId, + { + state: "archived", + memory_layer: "archive", + canonical_id: item.canonicalId, + archive_reason: "compact_duplicate", + archived_at: Date.now(), + }, + scopeFilter, + ); + archivedCount++; + } + } + + return { + content: [{ + type: "text", + text: dryRun + ? 
`Compaction preview: ${duplicates.length} duplicate(s) detected across ${entries.length} entries.` + : `Compaction complete: archived ${archivedCount} duplicate memory record(s).`, + }], + details: { + action: dryRun ? "compact_preview" : "compact_applied", + scanned: entries.length, + duplicates: duplicates.length, + archived: archivedCount, + sample: duplicates.slice(0, 20), + }, + }; + }, + }; + }, + { name: "memory_compact" }, + ); +} + +export function registerMemoryExplainRankTool( + api: OpenClawPluginApi, + context: ToolContext, +) { + api.registerTool( + (toolCtx) => { + const runtimeContext = resolveToolContext(context, toolCtx); + return { + name: "memory_explain_rank", + label: "Memory Explain Rank", + description: + "Run recall and explain why each memory was ranked, including governance metadata (state/layer/source/suppression).", + parameters: Type.Object({ + query: Type.String({ description: "Query used for ranking analysis." }), + limit: Type.Optional(Type.Number({ description: "How many items to explain (default 5)." })), + scope: Type.Optional(Type.String({ description: "Optional scope filter." 
})), + }), + async execute(_toolCallId, params, _signal, _onUpdate, runtimeCtx) { + const { query, limit = 5, scope } = params as { + query: string; + limit?: number; + scope?: string; + }; + + const safeLimit = clampInt(limit, 1, 20); + const agentId = resolveRuntimeAgentId(runtimeContext.agentId, runtimeCtx); + let scopeFilter = resolveScopeFilter(context.scopeManager, agentId); + if (scope) { + if (!context.scopeManager.isAccessible(scope, agentId)) { + return { + content: [{ type: "text", text: `Access denied to scope: ${scope}` }], + details: { error: "scope_access_denied", requestedScope: scope }, + }; + } + scopeFilter = [scope]; + } + + const results = await retrieveWithRetry(runtimeContext.retriever, { + query, + limit: safeLimit, + scopeFilter, + source: "manual", + }); + if (results.length === 0) { + return { + content: [{ type: "text", text: "No relevant memories found." }], + details: { action: "empty", query, scopeFilter }, + }; + } + + const lines = results.map((r, idx) => { + const meta = parseSmartMetadata(r.entry.metadata, r.entry); + const sourceBreakdown = []; + if (r.sources.vector) sourceBreakdown.push(`vec=${r.sources.vector.score.toFixed(3)}`); + if (r.sources.bm25) sourceBreakdown.push(`bm25=${r.sources.bm25.score.toFixed(3)}`); + if (r.sources.reranked) sourceBreakdown.push(`rerank=${r.sources.reranked.score.toFixed(3)}`); + return [ + `${idx + 1}. 
[${r.entry.id}] score=${r.score.toFixed(3)} ${sourceBreakdown.join(" ")}`.trim(), + ` state=${meta.state} layer=${meta.memory_layer} source=${meta.source} tier=${meta.tier}`, + ` access=${meta.access_count} injected=${meta.injected_count} badRecall=${meta.bad_recall_count} suppressedUntilTurn=${meta.suppressed_until_turn}`, + ` text=${truncateText(normalizeInlineText(meta.l0_abstract || r.entry.text), 180)}`, + ].join("\n"); + }); + + return { + content: [{ type: "text", text: lines.join("\n") }], + details: { + action: "explain_rank", + query, + count: results.length, + results: sanitizeMemoryForSerialization(results), + }, + }; + }, + }; + }, + { name: "memory_explain_rank" }, + ); +} + // ============================================================================ // Tool Registration Helper // ============================================================================ @@ -1370,7 +2070,12 @@ export function registerAllMemoryTools( // Management tools (optional) if (options.enableManagementTools) { registerMemoryStatsTool(api, context); + registerMemoryDebugTool(api, context); registerMemoryListTool(api, context); + registerMemoryPromoteTool(api, context); + registerMemoryArchiveTool(api, context); + registerMemoryCompactTool(api, context); + registerMemoryExplainRankTool(api, context); } if (options.enableSelfImprovementTools !== false) { registerSelfImprovementLogTool(api, context); diff --git a/src/workspace-boundary.ts b/src/workspace-boundary.ts new file mode 100644 index 00000000..06897ab9 --- /dev/null +++ b/src/workspace-boundary.ts @@ -0,0 +1,154 @@ +import { + classifyIdentityAndAddressingMemory, +} from "./identity-addressing.js"; +import { parseSmartMetadata } from "./smart-metadata.js"; + +export interface UserMdExclusiveConfig { + enabled?: boolean; + routeProfile?: boolean; + routeCanonicalName?: boolean; + routeCanonicalAddressing?: boolean; + filterRecall?: boolean; +} + +export interface WorkspaceBoundaryConfig { + userMdExclusive?: 
UserMdExclusiveConfig; +} + +export interface ResolvedUserMdExclusiveConfig { + enabled: boolean; + routeProfile: boolean; + routeCanonicalName: boolean; + routeCanonicalAddressing: boolean; + filterRecall: boolean; +} + +type UserMdExclusiveSlot = "profile" | "name" | "addressing"; + +type BoundaryEntryLike = { + text: string; + metadata?: string; + category?: "preference" | "fact" | "decision" | "entity" | "other" | "reflection"; + importance?: number; + timestamp?: number; +}; + +const PROFILE_HINT_PATTERNS = [ + /^User profile:/im, + /^##\s*(?:Background|Profile|Context)$/im, + /(?:^|\n)-\s*(?:Timezone|Pronouns?|Role|Language|Working style|Collaboration style)\s*:/i, + /(?:我的时区是|我的代词是|我是|我的身份是|my timezone is|my pronouns are|i am)\b/iu, + /(?:时区|代词|协作方式|工作方式|语言偏好)/u, +]; + +export function resolveUserMdExclusiveConfig( + workspaceBoundary?: WorkspaceBoundaryConfig | null, +): ResolvedUserMdExclusiveConfig { + const raw = workspaceBoundary?.userMdExclusive; + const enabled = raw?.enabled === true; + return { + enabled, + routeProfile: enabled && raw?.routeProfile !== false, + routeCanonicalName: enabled && raw?.routeCanonicalName !== false, + routeCanonicalAddressing: enabled && raw?.routeCanonicalAddressing !== false, + filterRecall: enabled && raw?.filterRecall !== false, + }; +} + +export function shouldFilterUserMdExclusiveRecall( + workspaceBoundary?: WorkspaceBoundaryConfig | null, +): boolean { + return resolveUserMdExclusiveConfig(workspaceBoundary).filterRecall; +} + +export function isUserMdExclusiveMemory( + params: { + memoryCategory?: string; + factKey?: string; + text?: string; + abstract?: string; + overview?: string; + content?: string; + }, + workspaceBoundary?: WorkspaceBoundaryConfig | null, +): boolean { + const config = resolveUserMdExclusiveConfig(workspaceBoundary); + if (!config.enabled) return false; + + const slots = new Set<UserMdExclusiveSlot>(); + if (params.memoryCategory === "profile") { + slots.add("profile"); + } + + const semantics = 
classifyIdentityAndAddressingMemory({ + factKey: params.factKey, + text: params.text, + abstract: params.abstract, + overview: params.overview, + content: params.content, + }); + + if (semantics.slots.has("name")) { + slots.add("name"); + } + if (semantics.slots.has("addressing")) { + slots.add("addressing"); + } + + const probe = [ + params.text, + params.abstract, + params.overview, + params.content, + ] + .filter((value): value is string => typeof value === "string" && value.trim().length > 0) + .map((value) => value.trim()) + .join("\n"); + + if (probe && PROFILE_HINT_PATTERNS.some((pattern) => pattern.test(probe))) { + slots.add("profile"); + } + + if (config.routeProfile && slots.has("profile")) { + return true; + } + + if (config.routeCanonicalName && slots.has("name")) { + return true; + } + + if (config.routeCanonicalAddressing && slots.has("addressing")) { + return true; + } + + return false; +} + +export function isUserMdExclusiveEntry( + entry: BoundaryEntryLike, + workspaceBoundary?: WorkspaceBoundaryConfig | null, +): boolean { + const meta = parseSmartMetadata(entry.metadata, entry); + return isUserMdExclusiveMemory( + { + memoryCategory: meta.memory_category, + factKey: meta.fact_key, + text: entry.text, + abstract: meta.l0_abstract, + overview: meta.l1_overview, + content: meta.l2_content, + }, + workspaceBoundary, + ); +} + +export function filterUserMdExclusiveRecallResults<T extends { entry: BoundaryEntryLike }>( + results: T[], + workspaceBoundary?: WorkspaceBoundaryConfig | null, +): T[] { + if (!shouldFilterUserMdExclusiveRecall(workspaceBoundary)) { + return results; + } + + return results.filter((result) => !isUserMdExclusiveEntry(result.entry, workspaceBoundary)); +} diff --git a/test/batch-dedup.test.mjs b/test/batch-dedup.test.mjs new file mode 100644 index 00000000..83d85e83 --- /dev/null +++ b/test/batch-dedup.test.mjs @@ -0,0 +1,196 @@ +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import jitiFactory from "jiti"; + +const jiti = 
jitiFactory(import.meta.url, { interopDefault: true }); + +const { batchDedup, createExtractionCostStats } = jiti( + "../src/batch-dedup.ts", +); + +// ============================================================================ +// Helpers +// ============================================================================ + +/** + * Create a normalized unit vector with slight variation. + * seed controls the angle: same seed = same direction = high cosine similarity. + */ +function makeVector(seed, dim = 128) { + const vec = new Array(dim).fill(0); + for (let i = 0; i < dim; i++) { + vec[i] = Math.sin(seed * (i + 1)) + Math.cos(seed * (i + 2)); + } + // Normalize + const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0)); + if (norm > 0) { + for (let i = 0; i < dim; i++) vec[i] /= norm; + } + return vec; +} + +/** + * Create a vector very similar to a base vector (add small noise). + */ +function makeSimilarVector(base, noise = 0.01) { + const vec = base.map((v) => v + (Math.random() - 0.5) * noise); + const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0)); + if (norm > 0) { + for (let i = 0; i < vec.length; i++) vec[i] /= norm; + } + return vec; +} + +// ============================================================================ +// batchDedup tests +// ============================================================================ + +describe("batchDedup", () => { + it("returns all indices when no duplicates", () => { + const v1 = makeVector(1.0); + const v2 = makeVector(5.0); + const v3 = makeVector(10.0); + + const result = batchDedup( + ["abstract A", "abstract B", "abstract C"], + [v1, v2, v3], + 0.85, + ); + + assert.equal(result.inputCount, 3); + assert.equal(result.outputCount, 3); + assert.deepEqual(result.survivingIndices, [0, 1, 2]); + assert.deepEqual(result.duplicateIndices, []); + }); + + it("marks similar candidates as duplicates", () => { + const v1 = makeVector(1.0); + const v2 = makeSimilarVector(v1, 0.001); // Very similar to v1 + 
const v3 = makeVector(10.0); // Very different + + const result = batchDedup( + ["similar abstract 1", "similar abstract 2", "different abstract"], + [v1, v2, v3], + 0.85, + ); + + assert.equal(result.inputCount, 3); + assert.equal(result.outputCount, 2); + assert.ok(result.survivingIndices.includes(0)); + assert.ok(result.survivingIndices.includes(2)); + assert.ok(result.duplicateIndices.includes(1)); + }); + + it("keeps first of duplicate pair", () => { + const v1 = makeVector(1.0); + const v2 = makeSimilarVector(v1, 0.0001); // Nearly identical + + const result = batchDedup( + ["abstract A", "abstract A (duplicate)"], + [v1, v2], + 0.85, + ); + + assert.equal(result.outputCount, 1); + assert.deepEqual(result.survivingIndices, [0]); + assert.deepEqual(result.duplicateIndices, [1]); + }); + + it("handles single candidate", () => { + const result = batchDedup( + ["only abstract"], + [makeVector(1.0)], + 0.85, + ); + + assert.equal(result.inputCount, 1); + assert.equal(result.outputCount, 1); + assert.deepEqual(result.survivingIndices, [0]); + assert.deepEqual(result.duplicateIndices, []); + }); + + it("handles empty input", () => { + const result = batchDedup([], [], 0.85); + + assert.equal(result.inputCount, 0); + assert.equal(result.outputCount, 0); + assert.deepEqual(result.survivingIndices, []); + assert.deepEqual(result.duplicateIndices, []); + }); + + it("respects threshold: low threshold drops more", () => { + const v1 = makeVector(1.0); + const v2 = makeVector(1.3); // Somewhat similar + const v3 = makeVector(10.0); // Very different + + const strictResult = batchDedup( + ["a", "b", "c"], + [v1, v2, v3], + 0.5, // Very low threshold - more aggressive dedup + ); + + const lenientResult = batchDedup( + ["a", "b", "c"], + [v1, v2, v3], + 0.99, // Very high threshold - almost no dedup + ); + + // Strict should drop more or equal candidates + assert.ok(strictResult.outputCount <= lenientResult.outputCount); + }); + + it("handles empty/missing vectors 
gracefully", () => { + const v1 = makeVector(1.0); + + const result = batchDedup( + ["abstract A", "abstract B", "abstract C"], + [v1, [], v1], // Second vector is empty + 0.85, + ); + + // Should not crash; candidates with empty vectors survive + assert.ok(result.outputCount >= 1); + }); + + it("deduplicates multiple similar pairs correctly", () => { + const v1 = makeVector(1.0); + const v1dup = makeSimilarVector(v1, 0.0001); + const v2 = makeVector(10.0); + const v2dup = makeSimilarVector(v2, 0.0001); + + const result = batchDedup( + ["topic A", "topic A copy", "topic B", "topic B copy"], + [v1, v1dup, v2, v2dup], + 0.85, + ); + + assert.equal(result.inputCount, 4); + assert.equal(result.outputCount, 2); + assert.ok(result.survivingIndices.includes(0)); + assert.ok(result.survivingIndices.includes(2)); + }); +}); + +// ============================================================================ +// ExtractionCostStats tests +// ============================================================================ + +describe("ExtractionCostStats", () => { + it("creates fresh stats with zero values", () => { + const stats = createExtractionCostStats(); + assert.equal(stats.batchDeduped, 0); + assert.equal(stats.durationMs, 0); + assert.equal(stats.llmCalls, 0); + }); + + it("tracks batch dedup count", () => { + const stats = createExtractionCostStats(); + stats.batchDeduped = 3; + stats.durationMs = 1500; + stats.llmCalls = 2; + + assert.equal(stats.batchDeduped, 3); + assert.equal(stats.durationMs, 1500); + assert.equal(stats.llmCalls, 2); + }); +}); diff --git a/test/cjk-recursion-regression.test.mjs b/test/cjk-recursion-regression.test.mjs new file mode 100644 index 00000000..247e7c14 --- /dev/null +++ b/test/cjk-recursion-regression.test.mjs @@ -0,0 +1,338 @@ +import assert from "node:assert/strict"; +import http from "node:http"; + +import jitiFactory from "jiti"; + +const jiti = jitiFactory(import.meta.url, { interopDefault: true }); +const { Embedder } = 
jiti("../src/embedder.ts"); +const { smartChunk } = jiti("../src/chunker.ts"); + +function generateCJKText(charCount) { + const chars = "中文字符测试数据内容关键词信息处理系统计算机软件硬件网络数据库服务器客户端浏览器应用程序编程语言算法数据结构人工智能机器学习深度学习神经网络。".split(""); + let text = ""; + for (let i = 0; i < charCount; i++) text += chars[i % chars.length]; + return text; +} + +function createJsonServer(handler) { + const server = http.createServer(async (req, res) => { + if (req.url !== "/v1/embeddings" || req.method !== "POST") { + res.writeHead(404); + res.end("not found"); + return; + } + + let body = ""; + req.on("data", (chunk) => { + body += chunk; + }); + req.on("end", async () => { + try { + await handler(JSON.parse(body || "{}"), req, res); + } catch (error) { + res.writeHead(500, { "content-type": "application/json" }); + res.end(JSON.stringify({ error: { message: String(error?.message || error), code: "test_handler_error" } })); + } + }); + }); + return server; +} + +async function withServer(handler, fn) { + const server = createJsonServer(handler); + await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve)); + const address = server.address(); + const port = typeof address === "object" && address ? address.port : 0; + const baseURL = `http://127.0.0.1:${port}/v1`; + try { + await fn({ baseURL }); + } finally { + await new Promise((resolve) => server.close(resolve)); + } +} + +async function testSingleChunkFallbackTerminates() { + console.log("Test 1: single-chunk fallback terminates instead of looping"); + + let callCount = 0; + await withServer((payload, _req, res) => { + callCount++; + const input = Array.isArray(payload.input) ? 
payload.input[0] : payload.input; + if (typeof input === "string" && input.length > 100) { + res.writeHead(400, { "content-type": "application/json" }); + res.end(JSON.stringify({ error: { message: "Input length exceeds maximum tokens (max 8192)", code: "context_length_exceeded" } })); + return; + } + + const dims = 1024; + res.writeHead(200, { "content-type": "application/json" }); + res.end(JSON.stringify({ data: [{ embedding: Array.from({ length: dims }, () => 1), index: 0 }] })); + }, async ({ baseURL }) => { + const embedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "mxbai-embed-large", + baseURL, + dimensions: 1024, + }); + + await assert.rejects( + () => embedder.embedPassage(generateCJKText(3000)), + (error) => { + assert.match(error.message, /Failed to embed: input too large for model context after 3 retries/i); + assert(callCount < 20, `Expected bounded retries, got ${callCount}`); + return true; + } + ); + }); + + console.log(` API calls before termination: ${callCount}`); + console.log(" PASSED\n"); +} + +async function testDepthLimitTermination() { + console.log("Test 2: depth limit terminates repeated forced reductions"); + + await withServer((_payload, _req, res) => { + res.writeHead(400, { "content-type": "application/json" }); + res.end(JSON.stringify({ error: { message: "Input length exceeds maximum tokens (max 8192)", code: "context_length_exceeded" } })); + }, async ({ baseURL }) => { + const embedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "mxbai-embed-large", + baseURL, + dimensions: 1024, + }); + + await assert.rejects( + () => embedder.embedPassage(generateCJKText(220)), + (error) => { + assert.match(error.message, /Failed to embed: input too large for model context after 3 retries|chunking couldn't reduce input size enough/i); + return true; + } + ); + }); + + console.log(" PASSED\n"); +} + +async function testCjkAwareChunkSizing() { + console.log("Test 3: 
CJK-aware chunk sizing produces more chunks than Latin text for same model budget"); + const cjkText = generateCJKText(5000); + const latinText = "english text sentence. ".repeat(220); + const cjkResult = smartChunk(cjkText, "mxbai-embed-large"); + const latinResult = smartChunk(latinText, "mxbai-embed-large"); + + assert(cjkResult.chunkCount > 1, "Expected multiple chunks for long CJK text"); + assert(cjkResult.chunks[0].length < latinResult.chunks[0].length, "Expected smaller CJK chunks than Latin chunks"); + console.log(` CJK first chunk: ${cjkResult.chunks[0].length} chars`); + console.log(` Latin first chunk: ${latinResult.chunks[0].length} chars`); + console.log(" PASSED\n"); +} + +async function testChunkErrorSurfaced() { + console.log("Test 4: chunkError is surfaced instead of generic context_length_exceeded wrapper"); + + await withServer((payload, _req, res) => { + const input = Array.isArray(payload.input) ? payload.input[0] : payload.input; + if (typeof input === "string" && input.length > 1500) { + res.writeHead(400, { "content-type": "application/json" }); + res.end(JSON.stringify({ error: { message: "Input length exceeds maximum tokens (max 8192)", code: "context_length_exceeded" } })); + return; + } + + res.writeHead(400, { "content-type": "application/json" }); + res.end(JSON.stringify({ error: { message: "chunk child failed with synthetic downstream error", code: "synthetic_chunk_failure" } })); + }, async ({ baseURL }) => { + const embedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "mxbai-embed-large", + baseURL, + dimensions: 1024, + }); + + await assert.rejects( + () => embedder.embedPassage(generateCJKText(5000)), + (error) => { + assert.match(error.message, /synthetic_chunk_failure|synthetic downstream error|chunk child failed/i); + assert.doesNotMatch(error.message, /context_length_exceeded/i); + return true; + } + ); + }); + + console.log(" PASSED\n"); +} + +async function testSmallContextChunking() { 
+ console.log("Test 5: small-context model no longer keeps a 1000-char hard floor"); + const text = generateCJKText(2000); + const result = smartChunk(text, "all-MiniLM-L6-v2"); + assert(result.chunkCount > 1, "Expected multiple chunks for small-context CJK text"); + const maxChunkLen = Math.max(...result.chunks.map((c) => c.length)); + assert(maxChunkLen <= 200, `Expected chunk size <= 200 chars after clamp, got ${maxChunkLen}`); + console.log(` Largest chunk: ${maxChunkLen} chars`); + console.log(" PASSED\n"); +} + +async function testTimeoutAbortPropagation() { + console.log("Test 6: timeout abort propagates to underlying request path"); + + await withServer(async (_payload, req, res) => { + await new Promise((resolve) => setTimeout(resolve, 11_000)); + if (req.aborted || req.destroyed) { + return; + } + const dims = 1024; + res.writeHead(200, { "content-type": "application/json" }); + res.end(JSON.stringify({ data: [{ embedding: Array.from({ length: dims }, () => 0), index: 0 }] })); + }, async ({ baseURL }) => { + const embedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "mxbai-embed-large", + baseURL, + dimensions: 1024, + }); + + await assert.rejects( + () => embedder.embedPassage("short timeout probe"), + (error) => { + assert.match(error.message, /aborted|abort|timed out|fetch failed/i); + return true; + } + ); + }); + + console.log(" PASSED\n"); +} + +async function testBatchEmbeddingStillWorks() { + console.log("Test 7: batch embedding still works without withTimeout wrapper"); + + await withServer((_payload, _req, res) => { + const dims = 1024; + res.writeHead(200, { "content-type": "application/json" }); + res.end(JSON.stringify({ + data: [0, 1, 2].map((index) => ({ embedding: Array.from({ length: dims }, () => index), index })), + })); + }, async ({ baseURL }) => { + const embedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "mxbai-embed-large", + baseURL, + dimensions: 1024, + 
}); + + const embeddings = await embedder.embedBatchPassage(["a", "b", "c"]); + assert.equal(embeddings.length, 3); + assert.equal(embeddings[0].length, 1024); + assert.equal(embeddings[2][0], 2); + }); + + console.log(" PASSED\n"); +} + +async function testOllamaAbortWithNativeFetch() { + console.log("Test 8: Ollama native fetch respects external AbortSignal (PR354 fix regression)"); + + // Author's analysis: the previous test used withServer() on a random port but hardcoded + // http://127.0.0.1:11434/v1 for the Embedder — so the request always hit "connection refused" + // immediately and never touched the slow handler. This test fixes that by: + // 1. Binding the mock server directly to 127.0.0.1:11434 (so isOllamaProvider() is true) + // 2. Delaying the response by 5 seconds + // 3. Passing an external AbortSignal that fires after 2 seconds + // 4. Asserting total time ≈ 2s (proving abort interrupted the slow request) + + const SLOW_DELAY_MS = 5_000; + const ABORT_AFTER_MS = 2_000; + const DIMS = 1024; + + const server = http.createServer((req, res) => { + if (req.url === "/v1/embeddings" && req.method === "POST") { + const timer = setTimeout(() => { + if (res.writableEnded) return; // already aborted + res.writeHead(200, { "Content-Type": "application/json" }); + res.end(JSON.stringify({ + data: [{ embedding: Array.from({ length: DIMS }, () => 0.1), index: 0 }] + })); + }, SLOW_DELAY_MS); + req.on("aborted", () => clearTimeout(timer)); + return; + } + res.writeHead(404); + res.end("not found"); + }); + + // Bind directly to 127.0.0.1:11434 so isOllamaProvider() returns true + await new Promise((resolve) => server.listen(11434, "127.0.0.1", resolve)); + + try { + const embedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "mxbai-embed-large", + baseURL: "http://127.0.0.1:11434/v1", + dimensions: DIMS, + }); + + assert.equal( + embedder.isOllamaProvider ? 
embedder.isOllamaProvider() : false, + true, + "isOllamaProvider should return true for 127.0.0.1:11434" + ); + + const start = Date.now(); + const controller = new AbortController(); + const abortTimer = setTimeout(() => controller.abort(), ABORT_AFTER_MS); + + let errorCaught; + try { + // Pass external AbortSignal — should interrupt the 5-second slow response at ~2s + await embedder.embedPassage("abort test probe", controller.signal); + } catch (e) { + errorCaught = e; + } + + clearTimeout(abortTimer); + const elapsed = Date.now() - start; + + assert.ok(errorCaught, "embedPassage should throw (abort or timeout)"); + const msg = errorCaught instanceof Error ? errorCaught.message : String(errorCaught); + assert.ok( + /timed out|abort|ollama|ECONNREFUSED/i.test(msg), + `Expected abort/timeout error, got: ${msg}` + ); + + // If abort works: elapsed ≈ 2000ms. If abort fails: elapsed ≈ 5000ms. + assert.ok( + elapsed < SLOW_DELAY_MS * 0.75, + `Expected abort ~${ABORT_AFTER_MS}ms, got ${elapsed}ms — abort did NOT interrupt slow request` + ); + + console.log(` PASSED (aborted in ${elapsed}ms < ${SLOW_DELAY_MS}ms threshold)\n`); + } finally { + await new Promise((resolve) => server.close(resolve)); + } +} + +async function run() { + console.log("Running regression tests for PR #238...\n"); + await testSingleChunkFallbackTerminates(); + await testDepthLimitTermination(); + await testCjkAwareChunkSizing(); + await testChunkErrorSurfaced(); + await testSmallContextChunking(); + await testTimeoutAbortPropagation(); + await testBatchEmbeddingStillWorks(); + await testOllamaAbortWithNativeFetch(); + console.log("All regression tests passed!"); +} + +run().catch((err) => { + console.error("Test failed:", err); + process.exit(1); +}); diff --git a/test/clawteam-scope.test.mjs b/test/clawteam-scope.test.mjs new file mode 100644 index 00000000..14759394 --- /dev/null +++ b/test/clawteam-scope.test.mjs @@ -0,0 +1,128 @@ +import { describe, it, beforeEach } from "node:test"; +import 
assert from "node:assert/strict"; +import jitiFactory from "jiti"; + +const jiti = jitiFactory(import.meta.url, { interopDefault: true }); +const { MemoryScopeManager, _resetLegacyFallbackWarningState } = jiti("../src/scopes.ts"); +const { parseClawteamScopes, applyClawteamScopes } = jiti("../src/clawteam-scope.ts"); + +describe("ClawTeam Scope Integration", () => { + let manager; + + beforeEach(() => { + manager = new MemoryScopeManager({ default: "global", agentAccess: {} }); + _resetLegacyFallbackWarningState(); + }); + + // ── parseClawteamScopes ────────────────────────────────────────────── + + describe("parseClawteamScopes", () => { + it("parses comma-separated scope names", () => { + assert.deepStrictEqual( + parseClawteamScopes("custom:team-a,custom:team-b"), + ["custom:team-a", "custom:team-b"], + ); + }); + + it("trims whitespace around scope names", () => { + assert.deepStrictEqual( + parseClawteamScopes(" custom:team-a , custom:team-b "), + ["custom:team-a", "custom:team-b"], + ); + }); + + it("returns empty array for undefined", () => { + assert.deepStrictEqual(parseClawteamScopes(undefined), []); + }); + + it("returns empty array for empty string", () => { + assert.deepStrictEqual(parseClawteamScopes(""), []); + }); + + it("filters out empty segments from trailing commas", () => { + assert.deepStrictEqual( + parseClawteamScopes("custom:team-a,,, "), + ["custom:team-a"], + ); + }); + + it("handles single scope without commas", () => { + assert.deepStrictEqual( + parseClawteamScopes("custom:team-demo"), + ["custom:team-demo"], + ); + }); + }); + + // ── applyClawteamScopes ────────────────────────────────────────────── + + describe("applyClawteamScopes", () => { + it("registers scope definitions for unknown scopes", () => { + assert.strictEqual(manager.getScopeDefinition("custom:team-x"), undefined); + + applyClawteamScopes(manager, ["custom:team-x"]); + + const def = manager.getScopeDefinition("custom:team-x"); + assert.notStrictEqual(def, undefined); 
+ assert.match(def.description, /ClawTeam shared scope/); + }); + + it("does not overwrite existing scope definitions", () => { + manager.addScopeDefinition("custom:team-x", { description: "My custom def" }); + + applyClawteamScopes(manager, ["custom:team-x"]); + + assert.strictEqual(manager.getScopeDefinition("custom:team-x").description, "My custom def"); + }); + + it("extends getAccessibleScopes for a normal agent", () => { + applyClawteamScopes(manager, ["custom:team-demo"]); + + const scopes = manager.getAccessibleScopes("agent-1"); + assert.ok(scopes.includes("custom:team-demo"), "should include team scope"); + }); + + it("preserves original agent scopes after extension", () => { + applyClawteamScopes(manager, ["custom:team-demo"]); + + const scopes = manager.getAccessibleScopes("main"); + assert.ok(scopes.includes("global"), "should still have global"); + assert.ok(scopes.includes("agent:main"), "should still have agent:main"); + assert.ok(scopes.includes("reflection:agent:main"), "should still have reflection scope"); + }); + + it("does not duplicate scopes already in the base list", () => { + // global is always in the base list + applyClawteamScopes(manager, ["global"]); + + const scopes = manager.getAccessibleScopes("main"); + const globalCount = scopes.filter(s => s === "global").length; + assert.strictEqual(globalCount, 1, "global should appear exactly once"); + }); + + it("supports multiple team scopes", () => { + applyClawteamScopes(manager, ["custom:team-a", "custom:team-b"]); + + const scopes = manager.getAccessibleScopes("agent-1"); + assert.ok(scopes.includes("custom:team-a")); + assert.ok(scopes.includes("custom:team-b")); + }); + + it("no-ops when given empty scopes array", () => { + const before = manager.getAccessibleScopes("main"); + applyClawteamScopes(manager, []); + const after = manager.getAccessibleScopes("main"); + assert.deepStrictEqual(before, after); + }); + }); + + // ── Baseline (no ClawTeam) 
─────────────────────────────────────────── + + describe("without applyClawteamScopes", () => { + it("agent does not have team scopes by default", () => { + const scopes = manager.getAccessibleScopes("main"); + assert.ok(!scopes.includes("custom:team-demo"), "should NOT include team scope"); + assert.deepStrictEqual(scopes, ["global", "agent:main", "reflection:agent:main"]); + }); + }); +}); diff --git a/test/cli-oauth-login.test.mjs b/test/cli-oauth-login.test.mjs new file mode 100644 index 00000000..1ae0e75e --- /dev/null +++ b/test/cli-oauth-login.test.mjs @@ -0,0 +1,577 @@ +import assert from "node:assert/strict"; +import { afterEach, beforeEach, describe, it } from "node:test"; +import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import path from "node:path"; +import http from "node:http"; +import { Command } from "commander"; +import jitiFactory from "jiti"; + +const jiti = jitiFactory(import.meta.url, { interopDefault: true }); +const { createMemoryCLI } = jiti("../cli.ts"); + +const ENV_KEYS = [ + "MEMORY_PRO_OAUTH_AUTHORIZE_URL", + "MEMORY_PRO_OAUTH_TOKEN_URL", + "MEMORY_PRO_OAUTH_REDIRECT_URI", + "MEMORY_PRO_OAUTH_CLIENT_ID", + "OPENCLAW_HOME", +]; + +function encodeSegment(value) { + return Buffer.from(JSON.stringify(value)).toString("base64url"); +} + +function makeJwt(accountId) { + return [ + encodeSegment({ alg: "none", typ: "JWT" }), + encodeSegment({ + exp: Math.floor((Date.now() + 3_600_000) / 1000), + "https://api.openai.com/auth": { chatgpt_account_id: accountId }, + }), + "signature", + ].join("."); +} + +function getBackupPath(oauthPath) { + const parsed = path.parse(oauthPath); + const fileName = parsed.ext + ? 
`${parsed.name}.llm-backup${parsed.ext}` + : `${parsed.base}.llm-backup.json`; + return path.join(parsed.dir, fileName); +} + +describe("memory-pro auth", () => { + let tempDir; + let server; + let originalEnv; + let originalCwd; + + beforeEach(() => { + tempDir = mkdtempSync(path.join(tmpdir(), "memory-cli-oauth-")); + originalEnv = Object.fromEntries(ENV_KEYS.map((key) => [key, process.env[key]])); + originalCwd = process.cwd(); + }); + + afterEach(async () => { + process.chdir(originalCwd); + for (const key of ENV_KEYS) { + if (originalEnv[key] === undefined) { + delete process.env[key]; + } else { + process.env[key] = originalEnv[key]; + } + } + if (server) { + await new Promise((resolve) => server.close(resolve)); + server = null; + } + rmSync(tempDir, { recursive: true, force: true }); + }); + + it("round-trips a dedicated llm api-key config through OAuth login/logout", async () => { + const authCode = "test-auth-code"; + const accountId = "acct_cli_123"; + const redirectPort = 18765; + let tokenRequests = 0; + + server = http.createServer(async (req, res) => { + if (req.method !== "POST" || req.url !== "/oauth/token") { + res.writeHead(404).end(); + return; + } + + let body = ""; + for await (const chunk of req) body += chunk; + const params = new URLSearchParams(body); + tokenRequests += 1; + + assert.equal(params.get("grant_type"), "authorization_code"); + assert.equal(params.get("code"), authCode); + + res.writeHead(200, { "Content-Type": "application/json" }); + res.end(JSON.stringify({ + access_token: makeJwt(accountId), + refresh_token: "refresh-cli-token", + expires_in: 3600, + })); + }); + await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve)); + const tokenPort = server.address().port; + + process.env.MEMORY_PRO_OAUTH_AUTHORIZE_URL = `http://127.0.0.1:${tokenPort}/oauth/authorize`; + process.env.MEMORY_PRO_OAUTH_TOKEN_URL = `http://127.0.0.1:${tokenPort}/oauth/token`; + process.env.MEMORY_PRO_OAUTH_REDIRECT_URI = 
`http://localhost:${redirectPort}/auth/callback`; + process.env.MEMORY_PRO_OAUTH_CLIENT_ID = "test-client-id"; + + const configPath = path.join(tempDir, "openclaw.json"); + const oauthPath = path.join(tempDir, ".memory-lancedb-pro", "oauth.json"); + const backupPath = getBackupPath(oauthPath); + const originalLlmConfig = { + auth: "api-key", + apiKey: "old-llm-key", + model: "gpt-4o-mini", + baseURL: "https://api.openai.com/v1", + timeoutMs: 45000, + }; + writeFileSync(configPath, JSON.stringify({ + plugins: { + entries: { + "memory-lancedb-pro": { + enabled: true, + config: { + embedding: { + provider: "openai-compatible", + apiKey: "embed-key", + }, + llm: originalLlmConfig, + }, + }, + }, + }, + }, null, 2)); + + let capturedAuthorizeUrl = ""; + const program = new Command(); + program.exitOverride(); + createMemoryCLI({ + store: {} , + retriever: {}, + scopeManager: {}, + migrator: {}, + pluginId: "memory-lancedb-pro", + pluginConfig: { + llm: { + model: "openai/gpt-5.4", + }, + }, + oauthTestHooks: { + authorizeUrl: async (url) => { + capturedAuthorizeUrl = url; + const parsed = new URL(url); + const state = parsed.searchParams.get("state"); + setTimeout(() => { + const callback = new URL(process.env.MEMORY_PRO_OAUTH_REDIRECT_URI); + callback.searchParams.set("code", authCode); + callback.searchParams.set("state", state || ""); + http.get(callback); + }, 25); + }, + }, + })({ program }); + + const logs = []; + const originalLog = console.log; + console.log = (...args) => logs.push(args.join(" ")); + try { + await program.parseAsync([ + "node", + "openclaw", + "memory-pro", + "auth", + "login", + "--config", + configPath, + "--provider", + "openai-codex", + "--oauth-path", + oauthPath, + "--model", + "openai/gpt-5.4", + "--no-browser", + ]); + } finally { + console.log = originalLog; + } + + assert.equal(tokenRequests, 1); + assert.ok(capturedAuthorizeUrl.includes("client_id=test-client-id")); + assert.ok(readFileSync(oauthPath, "utf8").includes(accountId)); + 
+ const updatedConfig = JSON.parse(readFileSync(configPath, "utf8")); + const pluginConfig = updatedConfig.plugins.entries["memory-lancedb-pro"].config; + assert.equal(pluginConfig.llm.auth, "oauth"); + assert.equal(pluginConfig.llm.oauthProvider, "openai-codex"); + assert.equal(pluginConfig.llm.oauthPath, oauthPath); + assert.equal(pluginConfig.llm.model, "gpt-5.4"); + assert.equal(pluginConfig.llm.timeoutMs, 45000); + assert.equal(Object.prototype.hasOwnProperty.call(pluginConfig.llm, "apiKey"), false); + assert.equal(Object.prototype.hasOwnProperty.call(pluginConfig.llm, "baseURL"), false); + + const backup = JSON.parse(readFileSync(backupPath, "utf8")); + assert.equal(backup.hadLlmConfig, true); + assert.deepEqual(backup.llm, originalLlmConfig); + + const output = logs.join("\n"); + assert.match(output, /Provider: OpenAI Codex \(openai-codex,/); + assert.match(output, /Authorization URL:/); + assert.match(output, /OAuth login completed/); + assert.match(output, /Updated memory-lancedb-pro config: llm.auth=oauth, llm.oauthProvider=openai-codex/); + + const logoutProgram = new Command(); + logoutProgram.exitOverride(); + createMemoryCLI({ + store: {}, + retriever: {}, + scopeManager: {}, + migrator: {}, + pluginId: "memory-lancedb-pro", + })({ program: logoutProgram }); + + const logoutLogs = []; + console.log = (...args) => logoutLogs.push(args.join(" ")); + try { + await logoutProgram.parseAsync([ + "node", + "openclaw", + "memory-pro", + "auth", + "logout", + "--config", + configPath, + ]); + } finally { + console.log = originalLog; + } + + assert.equal(existsSync(oauthPath), false); + assert.equal(existsSync(backupPath), false); + + const restoredConfig = JSON.parse(readFileSync(configPath, "utf8")); + const restoredPluginConfig = restoredConfig.plugins.entries["memory-lancedb-pro"].config; + assert.deepEqual(restoredPluginConfig.llm, originalLlmConfig); + + const logoutOutput = logoutLogs.join("\n"); + assert.match(logoutOutput, /Updated memory-lancedb-pro 
config: llm.auth=api-key/); + }); + + it("supports interactive provider selection when --provider is omitted", async () => { + const authCode = "test-auth-code"; + const accountId = "acct_cli_prompt_123"; + const redirectPort = 18766; + + server = http.createServer(async (req, res) => { + if (req.method !== "POST" || req.url !== "/oauth/token") { + res.writeHead(404).end(); + return; + } + + let body = ""; + for await (const chunk of req) body += chunk; + const params = new URLSearchParams(body); + + assert.equal(params.get("grant_type"), "authorization_code"); + assert.equal(params.get("code"), authCode); + + res.writeHead(200, { "Content-Type": "application/json" }); + res.end(JSON.stringify({ + access_token: makeJwt(accountId), + refresh_token: "refresh-cli-token", + expires_in: 3600, + })); + }); + await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve)); + const tokenPort = server.address().port; + + process.env.MEMORY_PRO_OAUTH_AUTHORIZE_URL = `http://127.0.0.1:${tokenPort}/oauth/authorize`; + process.env.MEMORY_PRO_OAUTH_TOKEN_URL = `http://127.0.0.1:${tokenPort}/oauth/token`; + process.env.MEMORY_PRO_OAUTH_REDIRECT_URI = `http://localhost:${redirectPort}/auth/callback`; + process.env.MEMORY_PRO_OAUTH_CLIENT_ID = "test-client-id"; + + const configPath = path.join(tempDir, "openclaw.json"); + const oauthPath = path.join(tempDir, ".memory-lancedb-pro", "oauth.json"); + writeFileSync(configPath, JSON.stringify({ + plugins: { + entries: { + "memory-lancedb-pro": { + enabled: true, + config: { + embedding: { + provider: "openai-compatible", + apiKey: "embed-key", + }, + }, + }, + }, + }, + }, null, 2)); + + const selectedProviders = []; + const program = new Command(); + program.exitOverride(); + createMemoryCLI({ + store: {} , + retriever: {}, + scopeManager: {}, + migrator: {}, + pluginId: "memory-lancedb-pro", + oauthTestHooks: { + chooseProvider: async (providers, currentProviderId) => { + selectedProviders.push(currentProviderId); + 
selectedProviders.push(...providers.map((provider) => provider.id)); + return "openai-codex"; + }, + authorizeUrl: async (url) => { + const parsed = new URL(url); + const state = parsed.searchParams.get("state"); + setTimeout(() => { + const callback = new URL(process.env.MEMORY_PRO_OAUTH_REDIRECT_URI); + callback.searchParams.set("code", authCode); + callback.searchParams.set("state", state || ""); + http.get(callback); + }, 25); + }, + }, + })({ program }); + + const logs = []; + const originalLog = console.log; + console.log = (...args) => logs.push(args.join(" ")); + try { + await program.parseAsync([ + "node", + "openclaw", + "memory-pro", + "auth", + "login", + "--config", + configPath, + "--oauth-path", + oauthPath, + "--model", + "openai/gpt-5.4", + "--no-browser", + ]); + } finally { + console.log = originalLog; + } + + assert.deepEqual(selectedProviders, ["openai-codex", "openai-codex"]); + + const updatedConfig = JSON.parse(readFileSync(configPath, "utf8")); + const pluginConfig = updatedConfig.plugins.entries["memory-lancedb-pro"].config; + assert.equal(pluginConfig.llm.oauthProvider, "openai-codex"); + + const output = logs.join("\n"); + assert.match(output, /Provider: OpenAI Codex \(openai-codex, prompt\)/); + }); + + it("defaults the OAuth file to the plugin-scoped path under OPENCLAW_HOME", async () => { + const authCode = "test-auth-code"; + const accountId = "acct_cli_default_path_123"; + const redirectPort = 18767; + + server = http.createServer(async (req, res) => { + if (req.method !== "POST" || req.url !== "/oauth/token") { + res.writeHead(404).end(); + return; + } + + let body = ""; + for await (const chunk of req) body += chunk; + const params = new URLSearchParams(body); + + assert.equal(params.get("grant_type"), "authorization_code"); + assert.equal(params.get("code"), authCode); + + res.writeHead(200, { "Content-Type": "application/json" }); + res.end(JSON.stringify({ + access_token: makeJwt(accountId), + refresh_token: 
"refresh-cli-token", + expires_in: 3600, + })); + }); + await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve)); + const tokenPort = server.address().port; + + process.env.MEMORY_PRO_OAUTH_AUTHORIZE_URL = `http://127.0.0.1:${tokenPort}/oauth/authorize`; + process.env.MEMORY_PRO_OAUTH_TOKEN_URL = `http://127.0.0.1:${tokenPort}/oauth/token`; + process.env.MEMORY_PRO_OAUTH_REDIRECT_URI = `http://localhost:${redirectPort}/auth/callback`; + process.env.MEMORY_PRO_OAUTH_CLIENT_ID = "test-client-id"; + process.env.OPENCLAW_HOME = path.join(tempDir, "openclaw-home"); + + const configPath = path.join(tempDir, "openclaw.json"); + const oauthPath = path.join(process.env.OPENCLAW_HOME, ".memory-lancedb-pro", "oauth.json"); + const backupPath = getBackupPath(oauthPath); + writeFileSync(configPath, JSON.stringify({ + plugins: { + entries: { + "memory-lancedb-pro": { + enabled: true, + config: { + embedding: { + provider: "openai-compatible", + apiKey: "embed-key", + }, + }, + }, + }, + }, + }, null, 2)); + + const program = new Command(); + program.exitOverride(); + createMemoryCLI({ + store: {}, + retriever: {}, + scopeManager: {}, + migrator: {}, + pluginId: "memory-lancedb-pro", + oauthTestHooks: { + authorizeUrl: async (url) => { + const parsed = new URL(url); + const state = parsed.searchParams.get("state"); + setTimeout(() => { + const callback = new URL(process.env.MEMORY_PRO_OAUTH_REDIRECT_URI); + callback.searchParams.set("code", authCode); + callback.searchParams.set("state", state || ""); + http.get(callback); + }, 25); + }, + }, + })({ program }); + + await program.parseAsync([ + "node", + "openclaw", + "memory-pro", + "auth", + "login", + "--config", + configPath, + "--provider", + "openai-codex", + "--model", + "openai/gpt-5.4", + "--no-browser", + ]); + + assert.equal(existsSync(oauthPath), true); + assert.equal(existsSync(backupPath), true); + + const updatedConfig = JSON.parse(readFileSync(configPath, "utf8")); + const pluginConfig = 
updatedConfig.plugins.entries["memory-lancedb-pro"].config; + assert.equal(pluginConfig.llm.oauthPath, oauthPath); + }); + + it("resolves stored relative oauthPath against the config location during logout", async () => { + const workspaceDir = path.join(tempDir, "workspace"); + const otherDir = path.join(tempDir, "other"); + mkdirSync(workspaceDir, { recursive: true }); + mkdirSync(otherDir, { recursive: true }); + + const configPath = path.join(workspaceDir, "openclaw.json"); + const storedOauthPath = ".memory-lancedb-pro/oauth.json"; + const actualOauthPath = path.join(workspaceDir, ".memory-lancedb-pro", "oauth.json"); + mkdirSync(path.dirname(actualOauthPath), { recursive: true }); + writeFileSync(actualOauthPath, JSON.stringify({ access_token: "token" }), "utf8"); + writeFileSync(configPath, JSON.stringify({ + plugins: { + entries: { + "memory-lancedb-pro": { + enabled: true, + config: { + llm: { + auth: "oauth", + oauthPath: storedOauthPath, + baseURL: "https://chatgpt-proxy.example/v1", + }, + }, + }, + }, + }, + }, null, 2)); + + process.chdir(otherDir); + + const program = new Command(); + program.exitOverride(); + createMemoryCLI({ + store: {}, + retriever: {}, + scopeManager: {}, + migrator: {}, + pluginId: "memory-lancedb-pro", + })({ program }); + + const logs = []; + const originalLog = console.log; + console.log = (...args) => logs.push(args.join(" ")); + try { + await program.parseAsync([ + "node", + "openclaw", + "memory-pro", + "auth", + "logout", + "--config", + configPath, + ]); + } finally { + console.log = originalLog; + } + + assert.equal(existsSync(actualOauthPath), false); + + const updatedConfig = JSON.parse(readFileSync(configPath, "utf8")); + const pluginConfig = updatedConfig.plugins.entries["memory-lancedb-pro"].config; + assert.equal(pluginConfig.llm.baseURL, "https://chatgpt-proxy.example/v1"); + assert.equal(Object.prototype.hasOwnProperty.call(pluginConfig.llm, "oauthPath"), false); + 
assert.equal(Object.prototype.hasOwnProperty.call(pluginConfig.llm, "oauthProvider"), false); + assert.equal(Object.prototype.hasOwnProperty.call(pluginConfig.llm, "auth"), false); + + const output = logs.join("\n"); + assert.match(output, new RegExp(`Deleted OAuth file: ${actualOauthPath.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")}`)); + }); + + it("removes llm config on logout when only OAuth-generated fields remain and no backup exists", async () => { + const workspaceDir = path.join(tempDir, "workspace"); + mkdirSync(workspaceDir, { recursive: true }); + + const configPath = path.join(workspaceDir, "openclaw.json"); + const oauthPath = path.join(workspaceDir, ".memory-lancedb-pro", "oauth.json"); + mkdirSync(path.dirname(oauthPath), { recursive: true }); + writeFileSync(oauthPath, JSON.stringify({ access_token: "token" }), "utf8"); + writeFileSync(configPath, JSON.stringify({ + plugins: { + entries: { + "memory-lancedb-pro": { + enabled: true, + config: { + llm: { + auth: "oauth", + oauthProvider: "openai-codex", + oauthPath, + model: "gpt-5.4", + }, + }, + }, + }, + }, + }, null, 2)); + + const program = new Command(); + program.exitOverride(); + createMemoryCLI({ + store: {}, + retriever: {}, + scopeManager: {}, + migrator: {}, + pluginId: "memory-lancedb-pro", + })({ program }); + + await program.parseAsync([ + "node", + "openclaw", + "memory-pro", + "auth", + "logout", + "--config", + configPath, + ]); + + const updatedConfig = JSON.parse(readFileSync(configPath, "utf8")); + const pluginConfig = updatedConfig.plugins.entries["memory-lancedb-pro"].config; + assert.equal(Object.prototype.hasOwnProperty.call(pluginConfig, "llm"), false); + }); +}); diff --git a/test/cross-process-lock.test.mjs b/test/cross-process-lock.test.mjs new file mode 100644 index 00000000..9370a954 --- /dev/null +++ b/test/cross-process-lock.test.mjs @@ -0,0 +1,119 @@ +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import { mkdtempSync, rmSync, 
existsSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import jitiFactory from "jiti"; + +const jiti = jitiFactory(import.meta.url, { interopDefault: true }); +const { MemoryStore } = jiti("../src/store.ts"); + +function makeStore() { + const dir = mkdtempSync(join(tmpdir(), "memory-lancedb-pro-lock-")); + const store = new MemoryStore({ dbPath: dir, vectorDim: 3 }); + return { store, dir }; +} + +function makeEntry(i) { + return { + text: `memory-${i}`, + vector: [0.1 * i, 0.2 * i, 0.3 * i], + category: "fact", + scope: "global", + importance: 0.5, + metadata: "{}", + }; +} + +describe("Cross-process file lock", () => { + it("creates .memory-write.lock file on first write", async () => { + const { store, dir } = makeStore(); + try { + await store.store(makeEntry(1)); + assert.ok(existsSync(join(dir, ".memory-write.lock")), "lock file should exist"); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); + + it("sequential writes succeed without conflict", async () => { + const { store, dir } = makeStore(); + try { + const e1 = await store.store(makeEntry(1)); + const e2 = await store.store(makeEntry(2)); + assert.ok(e1.id !== e2.id, "entries should have different IDs"); + + const all = await store.list(undefined, undefined, 20, 0); + assert.strictEqual(all.length, 2); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); + + it("concurrent writes do not lose data", async () => { + const { store, dir } = makeStore(); + const count = 4; + try { + // Fire 4 concurrent stores (realistic ClawTeam swarm size) + const results = await Promise.all( + Array.from({ length: count }, (_, i) => store.store(makeEntry(i + 1))), + ); + + assert.strictEqual(results.length, count, "all store calls should resolve"); + + const ids = new Set(results.map(r => r.id)); + assert.strictEqual(ids.size, count, "all entries should have unique IDs"); + + const all = await store.list(undefined, undefined, 100, 0); 
+ assert.strictEqual(all.length, count, "all entries should be retrievable"); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); + + it("concurrent updates do not corrupt data", async () => { + const { store, dir } = makeStore(); + try { + // Seed entries + const entries = await Promise.all( + Array.from({ length: 4 }, (_, i) => store.store(makeEntry(i + 1))), + ); + + // Concurrently update all of them + const updated = await Promise.all( + entries.map((e, i) => + store.update(e.id, { text: `updated-${i}`, importance: 0.9 }), + ), + ); + + assert.strictEqual(updated.filter(Boolean).length, 4, "all updates should succeed"); + + // Verify data integrity + for (let i = 0; i < 4; i++) { + const fetched = await store.getById(entries[i].id); + assert.ok(fetched, `entry ${i} should exist`); + assert.strictEqual(fetched.text, `updated-${i}`); + assert.strictEqual(fetched.importance, 0.9); + } + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); + + it("lock is released after each operation", async () => { + const { store, dir } = makeStore(); + try { + await store.store(makeEntry(1)); + // If lock were stuck, this second store would hang/fail + await store.store(makeEntry(2)); + await store.delete((await store.list(undefined, undefined, 1, 0))[0].id); + // Still works after delete + await store.store(makeEntry(3)); + + const all = await store.list(undefined, undefined, 20, 0); + assert.strictEqual(all.length, 2, "should have 2 entries after store+store+delete+store"); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); +}); diff --git a/test/embedder-error-hints.test.mjs b/test/embedder-error-hints.test.mjs index 0e7d153a..38db8ae1 100644 --- a/test/embedder-error-hints.test.mjs +++ b/test/embedder-error-hints.test.mjs @@ -4,7 +4,7 @@ import http from "node:http"; import jitiFactory from "jiti"; const jiti = jitiFactory(import.meta.url, { interopDefault: true }); -const { Embedder, formatEmbeddingProviderError 
} = jiti("../src/embedder.ts"); +const { Embedder, formatEmbeddingProviderError, getVectorDimensions } = jiti("../src/embedder.ts"); async function withJsonServer(status, body, fn) { const server = http.createServer((req, res) => { @@ -29,6 +29,66 @@ async function withJsonServer(status, body, fn) { } } +function createEmbeddingResponse(dimensions, value = 0.1) { + return { + data: [ + { + object: "embedding", + index: 0, + embedding: new Array(dimensions).fill(value), + }, + ], + }; +} + +async function withEmbeddingCaptureServer(handler, fn) { + const server = http.createServer(async (req, res) => { + if (req.url !== "/v1/embeddings" || req.method !== "POST") { + res.writeHead(404); + res.end("not found"); + return; + } + + const chunks = []; + for await (const chunk of req) chunks.push(chunk); + const body = Buffer.concat(chunks).toString("utf8"); + const payload = JSON.parse(body); + const response = await handler(payload, req); + res.writeHead(response.status ?? 200, { "content-type": "application/json" }); + res.end(JSON.stringify(response.body)); + }); + + await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve)); + const address = server.address(); + const port = typeof address === "object" && address ? address.port : 0; + const baseURL = `http://127.0.0.1:${port}/v1`; + + try { + await fn({ baseURL, port }); + } finally { + await new Promise((resolve) => server.close(resolve)); + } +} + +function installMockEmbeddingClient(embedder, onCreate) { + embedder.clients = [ + { + embeddings: { + create: async (payload) => onCreate(payload), + }, + }, + ]; +} + +/** Capture console.debug calls emitted synchronously during fn(). 
*/ +function captureDebug(fn) { + const messages = []; + const orig = console.debug; + console.debug = (...args) => messages.push(args.join(" ")); + try { fn(); } finally { console.debug = orig; } + return messages; +} + async function expectReject(promiseFactory, pattern) { try { await promiseFactory(); @@ -41,6 +101,133 @@ async function expectReject(promiseFactory, pattern) { } async function run() { + assert.equal(getVectorDimensions("voyage-4-lite"), 1024); + assert.equal(getVectorDimensions("voyage-3-large"), 1024); + + const voyageEmbedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "voyage-3-lite", + baseURL: "https://api.voyageai.com/v1", + dimensions: 1024, + }); + installMockEmbeddingClient(voyageEmbedder, async (payload) => { + assert.notEqual(payload.encoding_format, "float"); + assert.equal(payload.dimensions, undefined); + return createEmbeddingResponse(1024); + }); + await voyageEmbedder.embedPassage("hello"); + + const jinaEmbedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "jina-embeddings-v5-text-small", + baseURL: "https://api.jina.ai/v1", + dimensions: 1024, + taskPassage: "retrieval.passage", + normalized: true, + }); + installMockEmbeddingClient(jinaEmbedder, async (payload) => { + assert.equal(payload.task, "retrieval.passage"); + assert.equal(payload.normalized, true); + assert.equal(payload.dimensions, 1024); + return createEmbeddingResponse(1024); + }); + await jinaEmbedder.embedPassage("hello"); + + const genericEmbedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "custom-embed-model", + baseURL: "https://embeddings.example.invalid/v1", + dimensions: 384, + }); + installMockEmbeddingClient(genericEmbedder, async (payload) => { + assert.equal(payload.encoding_format, "float"); + assert.equal(payload.dimensions, 384); + return createEmbeddingResponse(384); + }); + await genericEmbedder.embedPassage("hello"); + + // voyage-4 
should be detected as voyage-compatible via model name prefix, + // even when baseURL is NOT api.voyageai.com (e.g. behind a proxy). + const voyageProxyEmbedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "voyage-4", + baseURL: "https://proxy.example.invalid/v1", + dimensions: 1024, + }); + installMockEmbeddingClient(voyageProxyEmbedder, async (payload) => { + assert.notEqual(payload.encoding_format, "float", "voyage-4 should not send encoding_format"); + assert.equal(payload.dimensions, undefined, "voyage-4 should not send dimensions"); + return createEmbeddingResponse(1024); + }); + await voyageProxyEmbedder.embedPassage("hello"); + + // Voyage: taskPassage "retrieval.passage" → input_type "document" + // taskQuery "retrieval.query" → input_type "query" + const voyageTaskEmbedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "voyage-3-lite", + baseURL: "https://api.voyageai.com/v1", + dimensions: 1024, + taskPassage: "retrieval.passage", + taskQuery: "retrieval.query", + }); + installMockEmbeddingClient(voyageTaskEmbedder, async (payload) => { + assert.equal(payload.input_type, "document", "voyage taskPassage should map to input_type=document"); + assert.equal(payload.task, undefined, "voyage should not send task field"); + return createEmbeddingResponse(1024); + }); + await voyageTaskEmbedder.embedPassage("hello"); + + installMockEmbeddingClient(voyageTaskEmbedder, async (payload) => { + assert.equal(payload.input_type, "query", "voyage taskQuery should map to input_type=query"); + return createEmbeddingResponse(1024); + }); + await voyageTaskEmbedder.embedQuery("hello"); + + // Voyage: configured dimensions should be sent as output_dimension, not dimensions. + // voyage-4-lite is a recommended Voyage model that supports output_dimension. 
+ const voyageDimEmbedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "voyage-4-lite", + baseURL: "https://api.voyageai.com/v1", + dimensions: 512, + }); + installMockEmbeddingClient(voyageDimEmbedder, async (payload) => { + assert.equal(payload.output_dimension, 512, "voyage should send output_dimension"); + assert.equal(payload.dimensions, undefined, "voyage should not send dimensions"); + return createEmbeddingResponse(512); + }); + await voyageDimEmbedder.embedPassage("hello"); + + // End-to-end HTTP payload verification for generic-openai-compatible profile. + // Unlike the mock tests above, this spins up a real HTTP server and verifies + // the actual request body sent by the OpenAI SDK. + await withEmbeddingCaptureServer( + (payload) => { + assert.equal(payload.encoding_format, "float", "generic profile should send encoding_format"); + assert.equal(payload.dimensions, 384, "generic profile should send dimensions"); + assert.equal(payload.task, undefined, "generic profile should not send task"); + assert.equal(payload.normalized, undefined, "generic profile should not send normalized"); + return { body: createEmbeddingResponse(384) }; + }, + async ({ baseURL }) => { + const embedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "custom-embed-model", + baseURL, + dimensions: 384, + }); + await embedder.embedPassage("hello world"); + }, + ); + await withJsonServer( 403, { error: { message: "Invalid API key", code: "invalid_api_key" } }, @@ -63,6 +250,50 @@ async function run() { }, ); + // Constructor warning: normalized set on OpenAI profile → debug warning fires + { + const msgs = captureDebug(() => new Embedder({ + provider: "openai-compatible", apiKey: "test-key", + model: "text-embedding-3-small", dimensions: 1536, normalized: true, + })); + assert.ok(msgs.some((m) => /normalized/i.test(m)), + `Expected warning about normalized, got: ${msgs.join(" | ")}`); + } + + // Constructor 
warning: taskQuery set on generic profile → debug warning fires + { + const msgs = captureDebug(() => new Embedder({ + provider: "openai-compatible", apiKey: "test-key", + model: "custom-embed-model", baseURL: "https://embeddings.example.invalid/v1", + dimensions: 384, taskQuery: "retrieval.query", + })); + assert.ok(msgs.some((m) => /taskQuery/i.test(m)), + `Expected warning about taskQuery, got: ${msgs.join(" | ")}`); + } + + // Constructor no false positive: normalized on Jina profile is valid → no warning + { + const msgs = captureDebug(() => new Embedder({ + provider: "openai-compatible", apiKey: "test-key", + model: "jina-embeddings-v5-text-small", baseURL: "https://api.jina.ai/v1", + dimensions: 1024, normalized: true, + })); + assert.ok(!msgs.some((m) => /normalized/i.test(m)), + `Unexpected warning for Jina profile: ${msgs.join(" | ")}`); + } + + // Jina proxy: jina-* model at a proxy URL still gets the Jina-specific auth hint + const jinaProxyAuth = formatEmbeddingProviderError( + Object.assign(new Error("401 Unauthorized"), { status: 401 }), + { + baseURL: "https://proxy.example.invalid/v1", + model: "jina-embeddings-v5-text-small", + }, + ); + assert.match(jinaProxyAuth, /authentication failed/i, jinaProxyAuth); + assert.match(jinaProxyAuth, /Jina key expired/i, jinaProxyAuth); + assert.match(jinaProxyAuth, /Ollama/i, jinaProxyAuth); + const jinaAuth = formatEmbeddingProviderError( Object.assign(new Error("403 Invalid API key"), { status: 403, @@ -100,6 +331,15 @@ async function run() { ); assert.match(formattedBatch, /^Failed to generate batch embeddings from /, formattedBatch); + const formattedVoyage = formatEmbeddingProviderError( + new Error("unsupported request field"), + { + baseURL: "https://api.voyageai.com/v1", + model: "voyage-3-lite", + }, + ); + assert.match(formattedVoyage, /^Failed to generate embedding from Voyage:/, formattedVoyage); + console.log("OK: embedder auth/network error hints verified"); } diff --git 
a/test/embedder-ollama-abort.test.mjs b/test/embedder-ollama-abort.test.mjs new file mode 100644 index 00000000..73a55f2d --- /dev/null +++ b/test/embedder-ollama-abort.test.mjs @@ -0,0 +1,99 @@ +import assert from "node:assert/strict"; +import http from "node:http"; +import { test } from "node:test"; + +import jitiFactory from "jiti"; + +const jiti = jitiFactory(import.meta.url, { interopDefault: true }); +const { Embedder } = jiti("../src/embedder.ts"); + +/** + * Test: Ollama native fetch correctly aborts a slow HTTP request. + * + * Root cause (Issue #361 / PR #383): + * OpenAI SDK's HTTP client does not reliably abort Ollama TCP connections + * when AbortController.abort() fires in Node.js, causing stalled sockets + * that hang until the gateway-level timeout. + * + * Fix: For Ollama endpoints (localhost:11434), use Node.js native fetch + * instead of the OpenAI SDK. Native fetch properly closes TCP on abort. + * + * This test verifies the fix by: + * 1. Mocking a slow Ollama server on 127.0.0.1:11434 (5s delay) + * 2. Calling embedPassage with an AbortSignal that fires after 2s + * 3. Asserting total time ≈ 2s (not 5s) — proving abort interrupted the request + * + * Note: The mock server is bound to 127.0.0.1:11434 (not a random port) so that + * isOllamaProvider() returns true and the native fetch path is exercised. 
+ */ +test("Ollama embedWithNativeFetch aborts slow request within expected time", async () => { + const SLOW_DELAY_MS = 5_000; + const ABORT_AFTER_MS = 2_000; + const DIMS = 1024; + + const server = http.createServer((req, res) => { + if (req.url === "/v1/embeddings" && req.method === "POST") { + const timer = setTimeout(() => { + if (res.writableEnded) return; // already aborted + res.writeHead(200, { "Content-Type": "application/json" }); + res.end(JSON.stringify({ + data: [{ embedding: Array.from({ length: DIMS }, () => 0.1), index: 0 }] + })); + }, SLOW_DELAY_MS); + req.on("aborted", () => clearTimeout(timer)); + return; + } + res.writeHead(404); + res.end("not found"); + }); + + // Bind to 127.0.0.1:11434 so isOllamaProvider() returns true → native fetch path + await new Promise((resolve) => server.listen(11434, "127.0.0.1", resolve)); + + try { + const embedder = new Embedder({ + provider: "openai-compatible", + apiKey: "test-key", + model: "mxbai-embed-large", + baseURL: "http://127.0.0.1:11434/v1", + dimensions: DIMS, + }); + + assert.ok( + embedder.isOllamaProvider(), + "isOllamaProvider() should return true for http://127.0.0.1:11434", + ); + + const start = Date.now(); + const controller = new AbortController(); + const abortTimer = setTimeout(() => controller.abort(), ABORT_AFTER_MS); + + let errorCaught; + try { + await embedder.embedPassage("abort test probe", controller.signal); + assert.fail("embedPassage should have thrown"); + } catch (e) { + errorCaught = e; + } + + clearTimeout(abortTimer); + const elapsed = Date.now() - start; + + assert.ok(errorCaught, "embedPassage should have thrown (abort or timeout)"); + const msg = errorCaught instanceof Error ? errorCaught.message : String(errorCaught); + assert.ok( + /timed out|abort|ollama/i.test(msg), + `Expected abort/timeout/Ollama error, got: ${msg}`, + ); + + // Elapsed time must be close to ABORT_AFTER_MS, NOT SLOW_DELAY_MS. + // If abort worked: elapsed ≈ 2000ms. 
+ // If abort failed: elapsed ≈ 5000ms (waited for slow response). + assert.ok( + elapsed < SLOW_DELAY_MS * 0.75, + `Expected abort ~${ABORT_AFTER_MS}ms, got ${elapsed}ms — abort did NOT interrupt slow request`, + ); + } finally { + await new Promise((resolve) => server.close(resolve)); + } +}); diff --git a/test/governance-metadata.test.mjs b/test/governance-metadata.test.mjs new file mode 100644 index 00000000..085dd9ce --- /dev/null +++ b/test/governance-metadata.test.mjs @@ -0,0 +1,72 @@ +import assert from "node:assert/strict"; +import { describe, it } from "node:test"; +import jitiFactory from "jiti"; + +const jiti = jitiFactory(import.meta.url, { interopDefault: true }); +const { + parseSmartMetadata, + buildSmartMetadata, +} = jiti("../src/smart-metadata.ts"); + +describe("governance metadata compatibility", () => { + it("fills governance defaults for legacy metadata", () => { + const meta = parseSmartMetadata(undefined, { + text: "legacy memory", + category: "fact", + importance: 0.7, + timestamp: 1710000000000, + }); + + assert.equal(meta.state, "confirmed"); + assert.equal(meta.source, "legacy"); + assert.equal(meta.memory_layer, "working"); + assert.equal(meta.injected_count, 0); + assert.equal(meta.bad_recall_count, 0); + assert.equal(meta.suppressed_until_turn, 0); + }); + + it("maps session-summary records to archived/reflection defaults", () => { + const meta = parseSmartMetadata( + JSON.stringify({ type: "session-summary", l0_abstract: "summary" }), + { + text: "summary", + category: "other", + }, + ); + + assert.equal(meta.source, "session-summary"); + assert.equal(meta.state, "archived"); + assert.equal(meta.memory_layer, "reflection"); + }); + + it("buildSmartMetadata preserves and updates governance fields", () => { + const original = { + text: "captured note", + category: "other", + timestamp: 1710000000000, + metadata: JSON.stringify({ + state: "pending", + source: "auto-capture", + memory_layer: "working", + injected_count: 2, + 
bad_recall_count: 1, + }), + }; + + const patched = buildSmartMetadata(original, { + state: "confirmed", + source: "manual", + memory_layer: "durable", + injected_count: 3, + bad_recall_count: 0, + last_confirmed_use_at: 1710000001234, + }); + + assert.equal(patched.state, "confirmed"); + assert.equal(patched.source, "manual"); + assert.equal(patched.memory_layer, "durable"); + assert.equal(patched.injected_count, 3); + assert.equal(patched.bad_recall_count, 0); + assert.equal(patched.last_confirmed_use_at, 1710000001234); + }); +}); diff --git a/test/helpers/openclaw-plugin-sdk-stub.mjs b/test/helpers/openclaw-plugin-sdk-stub.mjs index 22f66865..f32628e9 100644 --- a/test/helpers/openclaw-plugin-sdk-stub.mjs +++ b/test/helpers/openclaw-plugin-sdk-stub.mjs @@ -4,4 +4,3 @@ export function stringEnum(values) { enum: Array.isArray(values) ? values : [], }; } - diff --git a/test/intent-analyzer.test.mjs b/test/intent-analyzer.test.mjs new file mode 100644 index 00000000..8d25876f --- /dev/null +++ b/test/intent-analyzer.test.mjs @@ -0,0 +1,209 @@ +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import { analyzeIntent, applyCategoryBoost, formatAtDepth } from "../src/intent-analyzer.ts"; + +describe("analyzeIntent", () => { + it("detects preference intent (English)", () => { + const result = analyzeIntent("What is my preferred coding style?"); + assert.equal(result.label, "preference"); + assert.equal(result.confidence, "high"); + assert.equal(result.depth, "l0"); + assert.ok(result.categories.includes("preference")); + }); + + it("detects preference intent (Chinese)", () => { + const result = analyzeIntent("我的代码风格偏好是什么?"); + assert.equal(result.label, "preference"); + assert.equal(result.confidence, "high"); + }); + + it("detects decision intent", () => { + const result = analyzeIntent("Why did we choose PostgreSQL over MySQL?"); + assert.equal(result.label, "decision"); + assert.equal(result.confidence, "high"); + 
assert.equal(result.depth, "l1"); + assert.ok(result.categories.includes("decision")); + }); + + it("detects decision intent (Chinese)", () => { + const result = analyzeIntent("当时决定用哪个方案?"); + assert.equal(result.label, "decision"); + assert.equal(result.confidence, "high"); + }); + + it("detects entity intent", () => { + const result = analyzeIntent("Who is the project lead for auth service?"); + assert.equal(result.label, "entity"); + assert.equal(result.confidence, "high"); + assert.ok(result.categories.includes("entity")); + }); + + it("detects entity intent (Chinese)", () => { + const result = analyzeIntent("谁是这个项目的负责人?"); + assert.equal(result.label, "entity"); + assert.equal(result.confidence, "high"); + }); + + it("does NOT misclassify tool/component queries as entity", () => { + // These should match fact, not entity (Codex review finding #4) + const tool = analyzeIntent("How do I install the tool?"); + assert.notEqual(tool.label, "entity"); + const component = analyzeIntent("How does this component work?"); + assert.notEqual(component.label, "entity"); + }); + + it("detects event intent and routes to entity+decision categories", () => { + const result = analyzeIntent("What happened during last week's deploy?"); + assert.equal(result.label, "event"); + assert.equal(result.confidence, "high"); + assert.equal(result.depth, "full"); + // event is not a stored category — should route to entity + decision + assert.ok(result.categories.includes("entity")); + assert.ok(result.categories.includes("decision")); + assert.ok(!result.categories.includes("event")); + }); + + it("detects event intent (Chinese)", () => { + const result = analyzeIntent("最近发生了什么?"); + assert.equal(result.label, "event"); + assert.equal(result.confidence, "high"); + assert.ok(!result.categories.includes("event")); + }); + + it("detects fact intent", () => { + const result = analyzeIntent("How does the authentication API work?"); + assert.equal(result.label, "fact"); + 
assert.equal(result.confidence, "high"); + assert.equal(result.depth, "l1"); + }); + + it("detects fact intent (Chinese)", () => { + const result = analyzeIntent("这个接口怎么配置?"); + assert.equal(result.label, "fact"); + assert.equal(result.confidence, "high"); + }); + + it("returns broad signal for ambiguous queries", () => { + const result = analyzeIntent("write a function to sort arrays"); + assert.equal(result.label, "broad"); + assert.equal(result.confidence, "low"); + assert.deepEqual(result.categories, []); + assert.equal(result.depth, "l0"); + }); + + it("returns empty signal for empty input", () => { + const result = analyzeIntent(""); + assert.equal(result.label, "empty"); + assert.equal(result.confidence, "low"); + }); +}); + +describe("applyCategoryBoost", () => { + const mockResults = [ + { entry: { category: "fact" }, score: 0.8 }, + { entry: { category: "preference" }, score: 0.75 }, + { entry: { category: "entity" }, score: 0.7 }, + ]; + + it("boosts matching categories and re-sorts", () => { + const intent = { + categories: ["preference"], + depth: "l0", + confidence: "high", + label: "preference", + }; + const boosted = applyCategoryBoost(mockResults, intent); + // preference entry (0.75 * 1.15 = 0.8625) should now rank first + assert.equal(boosted[0].entry.category, "preference"); + assert.ok(boosted[0].score > 0.75); + }); + + it("returns results unchanged for low confidence", () => { + const intent = { + categories: [], + depth: "l0", + confidence: "low", + label: "broad", + }; + const result = applyCategoryBoost(mockResults, intent); + assert.equal(result[0].entry.category, "fact"); // original order preserved + }); + + it("caps boosted scores at 1.0", () => { + const highScoreResults = [ + { entry: { category: "preference" }, score: 0.95 }, + ]; + const intent = { + categories: ["preference"], + depth: "l0", + confidence: "high", + label: "preference", + }; + const boosted = applyCategoryBoost(highScoreResults, intent); + 
assert.ok(boosted[0].score <= 1.0);
+  });
+});
+
+describe("formatAtDepth", () => {
+  const entry = {
+    text: "User prefers TypeScript over JavaScript for all new projects. This was decided after the migration incident in Q3 where type errors caused a production outage.",
+    category: "preference",
+    scope: "global",
+  };
+
+  it("l0: returns compact one-line summary", () => {
+    const line = formatAtDepth(entry, "l0", 0.85, 0);
+    assert.ok(line.length < entry.text.length + 30); // compact: only small formatting overhead over the raw text
+    assert.ok(line.includes("[preference]"));
+    assert.ok(line.includes("85%"));
+    assert.ok(!line.includes("global")); // l0 omits scope
+  });
+
+  it("l1: returns medium detail with scope", () => {
+    const line = formatAtDepth(entry, "l1", 0.72, 1);
+    assert.ok(line.includes("[preference:global]"));
+    assert.ok(line.includes("72%"));
+  });
+
+  it("full: returns complete text", () => {
+    const line = formatAtDepth(entry, "full", 0.9, 0);
+    assert.ok(line.includes(entry.text));
+    assert.ok(line.includes("[preference:global]"));
+  });
+
+  it("includes BM25 and rerank source tags", () => {
+    const line = formatAtDepth(entry, "full", 0.8, 0, { bm25Hit: true, reranked: true });
+    assert.ok(line.includes("vector+BM25"));
+    assert.ok(line.includes("+reranked"));
+  });
+
+  it("handles short text without truncation", () => {
+    const short = { text: "Use tabs.", category: "preference", scope: "global" };
+    const l0 = formatAtDepth(short, "l0", 0.9, 0);
+    assert.ok(l0.includes("Use tabs."));
+  });
+
+  it("splits CJK sentences correctly at l0 depth", () => {
+    const cjk = {
+      text: "第一句结束。第二句开始,这里有更多内容需要处理。",
+      category: "fact",
+      scope: "global",
+    };
+    const l0 = formatAtDepth(cjk, "l0", 0.8, 0);
+    // Should stop at the first 。 and not include the second sentence
+    assert.ok(l0.includes("第一句结束。"));
+    assert.ok(!l0.includes("第二句开始"));
+  });
+
+  it("applies sanitize function when provided", () => {
+    const malicious = {
+      text: ' normal text',
+      category: "fact",
+      scope: "global",
+    };
+    const sanitize = (t) => t.replace(/<[^>]*>/g, "").trim();
+    const line = formatAtDepth(malicious, "full", 0.8, 0, { sanitize });
+    assert.ok(!line.includes("