feat: add autoRecallExcludeAgents config + idempotent guard fix#1
…-rerank fix: add TEI rerank provider support
…view-on-fork-prs ci: skip Claude Code Review on fork PRs
…ortexReach#221) fix(cli): flush pending writes after import by calling store.close()
…close() (CortexReach#221)" (CortexReach#232) revert: undo CortexReach#221 - store.close() based on incorrect root cause
feat: skip USER.md-exclusive facts in plugin memory
…nostics Improve LLM diagnostics and make timeouts configurable
When a large CJK text (a 14KB+ Chinese .md file) is processed by auto-recall, embedSingle() enters an infinite recursion loop because:
1. smartChunk() treats token limits as character limits, but CJK characters use 2-3x more tokens than ASCII characters
2. Chunks of 5740 chars (70% of the 8192-token limit) still exceed the model's token context for CJK text
3. smartChunk() returns 1 chunk identical to the input → embedSingle() recurses with the same text → infinite loop

This produced ~50,000 embedding errors in 12 minutes, blocking the entire Node.js event loop and making all agents unresponsive.

Fixes:
- Add a recursion depth limit (max 3) to embedSingle() with forced truncation as a fallback
- Detect single-chunk output (same size as input) and truncate instead of recursing
- Add CJK-aware chunk sizing in smartChunk() (divide the char limit by 2.5 when the CJK ratio > 30%)
- Truncate the auto-recall query to 1000 chars before embedding
- Add a 10s global timeout on embedPassage()/embedQuery()

Closes CortexReach#214
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
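The CJK-aware sizing fix can be sketched as a pure helper (names and the exact regex are illustrative, not the plugin's exports; the 30% threshold and 2.5 divisor follow the commit message):

```typescript
// Sketch: shrink the character budget derived from a token limit when the
// text is CJK-heavy, since CJK chars cost ~2-3x more tokens than ASCII.
const CJK_RE = /[\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]/g;

function cjkRatio(text: string): number {
  if (text.length === 0) return 0;
  const matches = text.match(CJK_RE);
  return (matches ? matches.length : 0) / text.length;
}

function effectiveCharLimit(text: string, baseCharLimit: number): number {
  // When more than 30% of characters are CJK, divide the limit by 2.5.
  return cjkRatio(text) > 0.3 ? Math.floor(baseCharLimit / 2.5) : baseCharLimit;
}
```

With this sizing, a mostly-Chinese input gets a limit of 400 chars instead of 1000, so chunks land well inside the model's token context.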
…e (PR CortexReach#215 follow-up)

This commit addresses the two blocking issues raised in PR CortexReach#215:
1. Timeout now uses AbortController for true request cancellation
   - The timer is properly cleaned up in .finally()
   - The AbortSignal is passed through to embedWithRetry
2. Recursion now guarantees monotonic convergence
   - Introduced STRICT_REDUCTION_FACTOR = 0.5
   - Each recursion level must reduce the input by 50%
   - Works regardless of model context size

Modified by an AI assistant (not human code) based on PR CortexReach#215. Thanks to the original author and maintainers.

Co-authored-by: Hi-Jiajun <Hi-Jiajun@users.noreply.github.com>
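The convergence guarantee can be illustrated with a small sketch (the constant names follow the commit message; the trace helper is an assumption for demonstration): halving the input at every level bounds recursion depth regardless of the model's context size.

```typescript
// Sketch: each recursion level must shrink the input to at most 50% of the
// previous length, so the sequence of input sizes is strictly decreasing.
const STRICT_REDUCTION_FACTOR = 0.5;
const MAX_EMBED_DEPTH = 3;

function nextInputLength(currentLength: number): number {
  return Math.max(1, Math.floor(currentLength * STRICT_REDUCTION_FACTOR));
}

// Trace of input sizes per recursion level, capped at MAX_EMBED_DEPTH.
function embedInputLengths(startLength: number): number[] {
  const lengths = [startLength];
  for (let depth = 1; depth <= MAX_EMBED_DEPTH; depth++) {
    lengths.push(nextInputLength(lengths[depth - 1]));
  }
  return lengths;
}
```

For a 1000-char input the trace is 1000 → 500 → 250 → 125, after which forced truncation applies, so the "same text recurses forever" failure mode cannot recur.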
… section
- Highlight "AI Memory Assistant" as the core value proposition with a before/after demo
- Add an Ecosystem section featuring the setup script (CortexReach/toolbox) and the Claude Code/OpenClaw skill
- Move the comparison table into collapsible details
- Fix RRF → Hybrid Fusion naming to match the actual implementation
- Add reflection to storage categories (aligns with store.ts/tools.ts)
- Clarify 6-category semantic labels vs storage categories in the schema docs
- Add openclaw plugins install as the primary manual install path
- Fix minor English expressions and polish the Chinese translation
- Reviewed by CC (Claude Code) + Codex across 3 rounds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eedback (CortexReach#216)

Voyage AI (and Voyage-style proxies) expose an OpenAI-compatible embeddings endpoint but reject standard OpenAI request fields and use different field names for task hints and output dimensions. Without handling these differences, all embedding calls to Voyage endpoints failed with "400 Bad Request: Unknown request body key: encoding_format", and task/dimension configuration was silently dropped.

Changes:
- Introduce EmbeddingCapabilities with taskField/taskValueMap/dimensionsField replacing boolean flags; buildPayload is now fully data-driven with no provider conditionals
- voyage-compatible profile: maps task values (retrieval.query→query, retrieval.passage→document) and sends output_dimension instead of dimensions; suppresses encoding_format
- Remove the unused _profile field from Embedder (capabilities are already cached via _capabilities)
- Add debug warnings in the Embedder constructor when normalized or taskQuery/taskPassage are configured but the provider profile does not support them, so misconfiguration surfaces immediately
- Fix the Jina-specific auth hint to fire for jina-* models behind a proxy by checking the profile instead of the provider label; remove dead code
- Add voyage-3/3-lite/code-3/3-large to EMBEDDING_DIMENSIONS
- Add a Voyage row to the README embedding providers table
- Add tests: Voyage input_type translation, output_dimension field, constructor debug warnings, Jina proxy auth hint; extract a captureDebug helper to reduce warning-test boilerplate

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
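A minimal sketch of the data-driven payload builder, assuming the field names from the commit message (the interface shape and the `supportsEncodingFormat` flag are illustrative, not the plugin's actual types):

```typescript
// Sketch: each provider profile declares how task hints and dimensions are
// expressed, so buildPayload has no provider-specific conditionals.
interface EmbeddingCapabilities {
  taskField?: string;                    // e.g. "input_type" for Voyage
  taskValueMap?: Record<string, string>; // e.g. retrieval.query -> query
  dimensionsField?: string;              // e.g. "output_dimension"
  supportsEncodingFormat?: boolean;      // false suppresses encoding_format
}

function buildPayload(
  caps: EmbeddingCapabilities,
  input: string,
  task?: string,
  dimensions?: number,
): Record<string, unknown> {
  const payload: Record<string, unknown> = { input };
  if (caps.supportsEncodingFormat) payload.encoding_format = "float";
  if (task && caps.taskField) {
    payload[caps.taskField] = caps.taskValueMap?.[task] ?? task;
  }
  if (dimensions && caps.dimensionsField) {
    payload[caps.dimensionsField] = dimensions;
  }
  return payload;
}

// A voyage-compatible profile as described above.
const voyageProfile: EmbeddingCapabilities = {
  taskField: "input_type",
  taskValueMap: { "retrieval.query": "query", "retrieval.passage": "document" },
  dimensionsField: "output_dimension",
  supportsEncodingFormat: false,
};
```

Adding another provider then means adding a profile object, not another branch in the builder.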
…on (CortexReach#226)

Core changes:
- ACL key normalization: trim agentAccess keys so that whitespace padding cannot silently break the ACL
- Explicit deny-all semantics: scopeFilter=[] rejects all reads and writes, distinct from undefined (bypass)
- Unified hook agentId resolution: both before_agent_start and before_prompt_build use resolveHookAgentId
- importConfig atomicity: on validation failure, warnings do not leak and the config is fully rolled back
- Reserved bypass-ID restriction: system/undefined are for internal use only and are rejected in config and sessionKey resolution
- Safe reflection loading: bypass callers use an optional scopeFilter to avoid filter bypass

New tests:
- test/reflection-bypass-hook.test.mjs - hook bypass and sessionKey resolution
- test/scope-access-undefined.test.mjs - bypass-ID rejection and scope access
- test/smart-extractor-scope-filter.test.mjs - SmartExtractor scopeFilter semantics
- test/store-empty-scope-filter.test.mjs - empty-array deny-all semantics

Scope of impact: 6 source files + 4 test files, ~970 lines net change
…rtexReach#217)
* feat: add provider-aware OAuth login for memory-pro
* fix: address oauth review feedback
* fix: preserve llm baseURL and correct oauth callback/path handling
* fix(cli): restore pre-oauth llm config on logout
* fix(cli): tighten oauth path and logout restore

Co-authored-by: Heng Xia <pope@Hengs-Mac-mini.local>
- Remove unused SAFE_CHAR_LIMITS, getSafeCharLimit, DEFAULT_SAFE_CHAR_LIMIT
- Add a comment explaining the batch timeout asymmetry (embedBatchQuery/embedBatchPassage are not wrapped)
- Note: withTimeout already has .finally() cleanup, so no change is needed there
…ortexReach#238)
- Test single-chunk detection (force-reduce when a chunk is >= 90% of the original)
- Test depth-limit termination (depth >= MAX_EMBED_DEPTH throws)
- Test CJK-aware chunk sizing (>30% CJK -> smaller chunks)
- Test the strict reduction factor (50% per recursion level)
- Test that batch embedding works correctly
- Preserve and surface chunkError instead of hiding it behind the original error
- Remove the 1000-char hard floor in smartChunk for small-context models (now 200)
- Add a regression test for small-context model chunking (all-MiniLM-L6-v2)
- Add a regression test for chunkError preservation
- Wire cjk-recursion-regression.test.mjs into the main test suite (CI)
1. Replace the fragile readFileSync source-string test with behavior tests that mock the SmartExtractor dedup pipeline (fixes review items 1 & 2)
2. Strip English articles (the/a/an) in normalizePreferenceToken to prevent "the Big Mac" vs "Big Mac" producing different tokens (item 4)
3. Remove redundant Chinese regex branches (喜欢喝|喜欢用|喜欢买, "likes drinking/using/buying") that can never match because 喜欢 ("likes") always matches first (item 5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dedup fix: prevent same-brand different-item preferences from being deduped
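The article-stripping normalization from item 2 above can be sketched as a pure function (a simplified stand-in for the plugin's normalizePreferenceToken, which presumably does more):

```typescript
// Sketch: lowercase, split on whitespace, and drop English articles so
// "the Big Mac" and "Big Mac" normalize to the same token.
function normalizePreferenceToken(raw: string): string {
  const ARTICLES = new Set(["the", "a", "an"]);
  return raw
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 0 && !ARTICLES.has(w))
    .join(" ");
}
```

Dedup then compares normalized tokens, so article-only variants collapse to one preference entry.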
…prompt_build

Auto-recall and reflection-invariants injection now use the before_prompt_build hook instead of the deprecated before_agent_start, with explicit priority values to control execution order (auto-recall=10, invariants=12, derived=15). This change eliminates the "Legacy before_agent_start" warning in OpenClaw 2026.3+ and follows the official architecture guidance that before_prompt_build is the preferred hook for prompt-mutation work.

Key changes:
- index.ts: Migrate 2 before_agent_start hooks to before_prompt_build with priority ordering; add explicit type annotations to all hook handlers
- test/recall-text-cleanup: Update the harness registerHook to capture handlers; update assertions from before_agent_start to before_prompt_build
- test/reflection-bypass-hook: Same harness fix; expect 2 before_prompt_build hooks (invariants + derived) sorted by priority instead of 1 before_agent_start + 1 before_prompt_build
- README.md: Document the hook-adaptation approach, the api.on() vs api.registerHook() registry distinction, migration steps, and OpenClaw version requirements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tion testing

1. Fix auto-recall gating regression (P3): before_prompt_build's event.prompt contains the full assembled prompt including system instructions, causing shouldSkipRetrieval's short-message skip to never trigger. Added a message_received hook to cache the raw user message and use it for gating.
2. Fix auto-recall timeout race condition (P1): Promise.race allowed both the successful injection path and the timeout handler to fire. Added clearTimeout after recall completes to prevent misleading timeout warnings.
3. Lower the auto-capture dedup threshold (P2): Reduced from 0.95 to 0.90 to catch more semantic duplicates (e.g. 我喜欢喝美式咖啡 "I like drinking Americano" vs 我最喜欢的咖啡是美式 "my favorite coffee is Americano") that were slipping through.

Tested on the QA-Probe bot with OpenClaw latest (2026-03-23) against a 27-item deep test suite; all fixes verified with live Discord interaction plus log analysis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
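The timeout-race fix in item 2 can be sketched like this (names are assumptions, not the plugin's code; the key point is that the timer handle outlives Promise.race and is cleared once recall settles):

```typescript
// Sketch: keep the setTimeout handle and clear it after Promise.race
// settles, so a completed recall can no longer trigger a stale timeout.
let timedOut = false; // stands in for the misleading timeout warning

async function recallWithTimeout<T>(work: Promise<T>, ms: number): Promise<T | undefined> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<undefined>((resolve) => {
    timer = setTimeout(() => {
      timedOut = true; // the real plugin would log a timeout warning here
      resolve(undefined);
    }, ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer); // the fix: cancel the timer so it cannot fire late
  }
}
```

Without the finally/clearTimeout, a recall that finishes just under the deadline still leaves the timer armed, and the warning fires even though injection succeeded.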
- Minimum version bumped to 2026.3.22 (before_prompt_build hook support) - Added note about automatic config migration via openclaw doctor --fix Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s for OpenClaw 2026.3+
- Migrate auto-recall and reflection hooks from the deprecated before_agent_start to before_prompt_build
- Fix the auto-recall gating regression with a message_received cache
- Fix the Promise.race timeout race condition
- Lower the auto-capture dedup threshold 0.95→0.90
- Update OpenClaw version requirements in the README
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix CI version-sync check failure (package.json and openclaw.plugin.json must have matching versions). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Default raised from 3s to 5s for remote embedding APIs behind proxies.
Configurable in openclaw.json:
"plugins": {
  "entries": {
    "memory-lancedb-pro": {
      "config": {
        "autoRecallTimeoutMs": 8000
      }
    }
  }
}
Closes CortexReach#314
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e link Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fined agentId (CortexReach#288, CortexReach#231)

Root cause: isAccessible(scope, undefined) used bypass mode (any valid scope returns true), but getScopeFilter(undefined) returned only explicitly defined scopes (["global"]), excluding implicit agent scopes like "agent:main". This caused memory_update to reject accessible memories while recall/forget worked.

Changes:
- scopes.ts: getScopeFilter now returns undefined (full bypass) when agentId is missing, matching isAccessible's existing bypass behavior
- tools.ts: memory_update uses runtimeContext.scopeManager instead of context.scopeManager, consistent with recall/forget
- test: updated scope-access-undefined assertions to match the new bypass semantics

Note: beta.10's resolveRuntimeAgentId already ensures agentId defaults to "main" in normal operation. This fix is defensive: it prevents the same bug from recurring if any future code path passes an undefined agentId.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
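The aligned bypass semantics can be shown in a minimal sketch (simplified stand-ins for the scopes.ts functions; the real ones take a scope manager and configured scopes):

```typescript
// Sketch: a missing agentId now means "full bypass" in BOTH functions,
// instead of bypass in isAccessible but a narrow filter in getScopeFilter.
function getScopeFilter(agentId: string | undefined): string[] | undefined {
  if (agentId === undefined) return undefined; // full bypass, matches isAccessible
  // An agent always sees the global scope plus its own implicit scope.
  return ["global", `agent:${agentId}`];
}

function isAccessible(scope: string, agentId: string | undefined): boolean {
  const filter = getScopeFilter(agentId);
  return filter === undefined || filter.includes(scope); // undefined => bypass
}
```

Before the fix, the two call sites disagreed: recall (via isAccessible) saw "agent:main" memories while memory_update (via the filter) did not.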
… columns (CortexReach#326)

Previously doInitialize() only logged a warning when detecting missing columns but never actually added them, causing stats/list/search/update/delete to fail with "No field named scope" on databases upgraded from legacy memory-lancedb.

Now uses table.schema() to detect missing columns and table.addColumns() to add them with sensible defaults (scope='global', timestamp=0.0, metadata='{}'). Handles the concurrent-initialization race via "already exists" error detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
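The detection half of this migration is easy to model as a pure function (the defaults follow the commit message; actually adding columns would go through LanceDB's table.addColumns(), which this sketch deliberately omits):

```typescript
// Sketch: given the current schema's field names, compute which legacy
// columns are missing, paired with the default each should be filled with.
const REQUIRED_DEFAULTS: Record<string, string | number> = {
  scope: "global",
  timestamp: 0.0,
  metadata: "{}",
};

function missingColumns(existingFields: string[]): Array<[string, string | number]> {
  return Object.entries(REQUIRED_DEFAULTS).filter(
    ([name]) => !existingFields.includes(name),
  );
}
```

A legacy table with only id/text/vector reports all three columns missing; an already-migrated table reports none, making the operation idempotent.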
Add a recallMode config (full/summary/adaptive/off) for granular control over auto-recall injection depth. Adaptive mode uses zero-LLM-cost pattern matching to analyze query intent and route to the appropriate memory categories with depth-aware formatting.

- New: src/intent-analyzer.ts — analyzeIntent(), applyCategoryBoost()
- New: test/intent-analyzer.test.mjs — 23 tests (all passing)
- Modified: index.ts — integrate intent analysis into before_prompt_build
- Modified: openclaw.plugin.json — add recallMode schema + UI field

Rewritten for v1.1.0-beta.10 (replaces PR CortexReach#313, which was based on beta.9).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: yjjheizhu <yjjheizhu@users.noreply.github.com>
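Zero-LLM-cost intent routing of the kind described above can be sketched with ordered regex patterns (the categories and patterns here are illustrative, not the plugin's actual tables in src/intent-analyzer.ts):

```typescript
// Sketch: first matching pattern wins; no match falls through to "general".
type Intent = "preference" | "fact" | "task" | "general";

const INTENT_PATTERNS: Array<[Intent, RegExp]> = [
  ["preference", /\b(like|prefer|favorite|hate)\b/i],
  ["task", /\b(todo|remind|schedule|deadline)\b/i],
  ["fact", /\b(who|what|when|where|which)\b/i],
];

function analyzeIntent(query: string): Intent {
  for (const [intent, pattern] of INTENT_PATTERNS) {
    if (pattern.test(query)) return intent;
  }
  return "general";
}
```

Because this is pure pattern matching, routing adds no latency or token cost to the recall path; the result would then steer category boosting and injection depth.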
…etadata before extraction
feat: add recallMode with adaptive intent routing (beta.10)
Adds a `MemoryCompactor` that periodically consolidates semantically
similar old memories into single refined entries, inspired by the
progressive summarization pattern (MemOS). Over time, related memory
fragments are merged rather than accumulated, reducing retrieval noise
and keeping the LanceDB index lean.
Key additions:
- `src/memory-compactor.ts`: pure, dependency-free compaction module
with cosine-similarity clustering, greedy seed expansion, and
rule-based merge (dedup lines, max importance, plurality category)
- `store.ts`: new `fetchForCompaction()` method that fetches old entries
with vectors (intentionally omitted from `list()` for performance)
- `index.ts`: `memory_compact` management tool (requires
`enableManagementTools: true`) + optional auto-compaction at
`gateway_start` with configurable cooldown
- `openclaw.plugin.json`: `memoryCompaction` config schema + uiHints
- `test/memory-compactor.test.mjs`: 23 tests, 100% pass
Config example:
memoryCompaction:
  enabled: true              # auto-run at gateway_start
  minAgeDays: 7              # only touch memories ≥ 7 days old
  similarityThreshold: 0.88
  cooldownHours: 24
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: memory compaction — progressive summarization for stored memories
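The greedy seed expansion mentioned in the key additions can be sketched as follows (shapes are assumptions; the real MemoryCompactor additionally merges text, importance, and category per cluster):

```typescript
// Sketch: each unclustered entry seeds a cluster and absorbs all remaining
// entries whose cosine similarity to the seed meets the threshold.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function clusterBySimilarity(vectors: number[][], threshold: number): number[][] {
  const clusters: number[][] = [];
  const used = new Set<number>();
  for (let seed = 0; seed < vectors.length; seed++) {
    if (used.has(seed)) continue;
    const cluster = [seed];
    used.add(seed);
    for (let j = seed + 1; j < vectors.length; j++) {
      if (!used.has(j) && cosine(vectors[seed], vectors[j]) >= threshold) {
        cluster.push(j);
        used.add(j);
      }
    }
    clusters.push(cluster);
  }
  return clusters;
}
```

Each multi-member cluster would then be merged into a single refined entry, which is what keeps the LanceDB index lean over time.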
Combines the following rebased changes onto the latest cortexreach/master:
- feat: add session compression and adaptive extraction throttling
- fix: disable extraction throttle and session compression in the smart-extractor-branches test
- fix: address Codex review for session compression
- fix: address PR CortexReach#318 review — align opt-in defaults, CJK-aware scoring, rate-limiter docs

Session compression scores conversation texts before auto-capture and drops low-signal content; extraction throttling adds a sliding-window rate limiter and a low-value conversation skip gate. Both features default to opt-in (disabled) to avoid surprising existing users.

Resolves merge conflicts with memory-compaction (CortexReach#343) and recall-mode-v2 (CortexReach#342) that landed on master since the original branch.
…sion feat: session compression + adaptive extraction throttling
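A sliding-window rate limiter of the kind the throttling feature describes can be sketched in a few lines (class and method names are assumptions for illustration):

```typescript
// Sketch: each extraction records a timestamp; a new extraction is allowed
// only while fewer than `limit` timestamps fall inside the trailing window.
class SlidingWindowLimiter {
  private stamps: number[] = [];
  constructor(private limit: number, private windowMs: number) {}

  tryAcquire(now: number): boolean {
    // Drop timestamps that have left the window, then check capacity.
    this.stamps = this.stamps.filter((t) => now - t < this.windowMs);
    if (this.stamps.length >= this.limit) return false;
    this.stamps.push(now);
    return true;
  }
}
```

Passing `now` explicitly keeps the limiter deterministic and testable; production code would call it with Date.now().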
* feat: add observable retrieval traces, stats collector, and batch dedup

Observable Retrieval (Feature 3):
- Add TraceCollector (src/retrieval-trace.ts) that tracks entry IDs through each pipeline stage, computing drops, score ranges, and timing
- Add RetrievalStatsCollector (src/retrieval-stats.ts) for aggregate query metrics: latency percentiles, zero-result rate, top drop stages
- Instrument all retrieval stages in MemoryRetriever.retrieve() with an optional trace (zero overhead when disabled via optional chaining)
- Add retrieveWithTrace() for always-on debug tracing
- Add a memory_debug tool (requires enableManagementTools) returning the full per-stage pipeline trace with drop info
- Extend the memory_stats tool to include retrieval quality metrics

Batch Dedup (Feature 2):
- Add batchDedup() (src/batch-dedup.ts) for cosine-similarity dedup within extraction batches before expensive LLM dedup calls
- Add ExtractionCostStats tracking (batchDeduped, durationMs, llmCalls)

Tests: 26 new tests (16 trace + 10 batch-dedup), all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address Codex review for observable retrieval
1. [High] Fix hybrid trace timing: replace sequential vector_search/bm25_search stages with a single parallel_search stage that correctly represents concurrent execution
2. [High] Fix negative drop display: search stages with input=0 now show "found N" instead of "dropped -N"
3. [Medium] Fix rerankUsed overcount: only emit a rerank trace stage when rerank is actually enabled (config.rerank !== "none"), not on every query

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: correct trace mode for BM25 tag queries + batchDedup TODO
- RetrievalTrace.mode now includes a "bm25" variant
- retrieve() and retrieveWithTrace() record the mode as "bm25" when tagTokens trigger bm25OnlyRetrieval (was incorrectly "vector")
- Add a TODO for wiring batchDedup() into the extraction pipeline

Addresses review feedback from rwmjhb on PR CortexReach#319.

* fix: address PR CortexReach#319 review feedback (rwmjhb)
1. Wire batchDedup into the extraction pipeline (smart-extractor.ts)
   - Import and call batchDedup() between LLM extraction and per-candidate dedup, embedding candidate abstracts upfront and filtering near-dupes before expensive LLM dedup calls
   - Graceful fallback: if embedding or dedup fails, all candidates proceed
   - Batch-deduped candidates are counted in stats.skipped
2. Fix the memory_debug scope parameter bug (tools.ts)
   - resolveScopeFilter() only accepts 2 params (scopeManager, agentId), not 3 — `scope` was being passed as a third arg and silently ignored
   - Match the pattern used by all other tools: resolve the default scope filter first, then override with [scope] if the scope param is provided and accessible

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
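The "found N" vs "dropped -N" review fix above amounts to a small piece of trace bookkeeping, sketched here with assumed shapes (the real TraceCollector also records score ranges and timing):

```typescript
// Sketch: each pipeline stage records input/output counts; a search stage
// that starts from zero input reports what it found rather than a
// nonsensical negative drop.
interface StageTrace {
  stage: string;
  input: number;
  output: number;
  label: string;
}

function recordStage(stage: string, input: number, output: number): StageTrace {
  const label =
    input === 0 ? `found ${output}` : `dropped ${Math.max(0, input - output)}`;
  return { stage, input, output, label };
}
```

So a parallel_search stage starting from an empty candidate set labels itself "found 12", while a rerank stage narrowing 20 candidates to 5 labels itself "dropped 15".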
- Add an _initialized singleton flag to prevent re-initialization when register() is called multiple times during gateway boot
- Add per-entry debug logging for governance filter decisions (id, reason, score, text snippet) for observability
- Export _resetInitialized() for test-harness reset
- Fixes the initialization block being repeated N times on startup
- Fixes governance filter decisions not being observable in logs
…ard + openclaw.plugin.json schema
Summary
Implements PR CortexReach#365 missing pieces:
parsePluginConfig: Added autoRecallExcludeAgents and recallMode parsing
recallWork: Added an excluded-agent check - returns undefined early if agentId is in autoRecallExcludeAgents
memoryLanceDBProPlugin: Added a _resetInitialized() method for testing the idempotent guard
test/pr365-auto-recall-exclude.test.mjs (8 tests, 7 pass)

Changes
- index.ts: add autoRecallExcludeAgents?: string[] to the PluginConfig interface
- index.ts: parse autoRecallExcludeAgents and recallMode in parsePluginConfig
- index.ts: early return in recallWork for excluded agents
- index.ts: expose _resetInitialized() on the plugin object

Test Results
7/8 tests pass. T5 is a pre-existing gap in the recallMode=summary implementation.
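The excluded-agent gate this PR adds can be sketched as follows (the config field name comes from the PR summary; the function shape is simplified, since the real recallWork does the full retrieval pipeline):

```typescript
// Sketch: recall returns undefined early when the resolved agentId is
// listed in autoRecallExcludeAgents, so excluded agents skip auto-recall.
interface PluginConfig {
  autoRecallExcludeAgents?: string[];
}

function recallWork(
  config: PluginConfig,
  agentId: string,
  doRecall: () => string,
): string | undefined {
  if (config.autoRecallExcludeAgents?.includes(agentId)) return undefined;
  return doRecall();
}
```

The optional-chaining check means an absent or empty list excludes nobody, which preserves existing behavior for users who never set the new option.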