feat: add autoRecallExcludeAgents config + idempotent guard fix #1

Open
jlin53882 wants to merge 108 commits into master from fix/init-reentrancy-governance-logging

Conversation

@jlin53882
Owner

Summary

Implements the missing pieces from PR CortexReach#365:

  1. parsePluginConfig: Added autoRecallExcludeAgents and recallMode parsing
  2. recallWork: Added an excluded-agent check that returns undefined early if agentId is in autoRecallExcludeAgents
  3. memoryLanceDBProPlugin: Added _resetInitialized() method for testing idempotent guard
  4. E2E Tests: Added test/pr365-auto-recall-exclude.test.mjs (8 tests, 7 pass)

Changes

  • index.ts:
    • Added autoRecallExcludeAgents?: string[] to PluginConfig interface
    • Added parsing of autoRecallExcludeAgents and recallMode in parsePluginConfig
    • Added early-return guard in recallWork for excluded agents
    • Added _resetInitialized() on plugin object
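A minimal sketch of the excluded-agent guard described above. The config and field names (autoRecallExcludeAgents, recallMode, agentId) come from the PR summary; the interface shape and helper name are assumptions, not the actual index.ts code.

```typescript
// Hypothetical sketch of the PR's early-return guard; surrounding
// plumbing (hook wiring, recall pipeline) is omitted.
interface PluginConfig {
  autoRecall?: boolean;
  autoRecallExcludeAgents?: string[];
  recallMode?: "full" | "summary" | "adaptive" | "off";
}

function shouldSkipRecall(config: PluginConfig, agentId: string): boolean {
  // Excluded agents never receive auto-recall injection; recallWork
  // would return undefined early when this is true.
  return config.autoRecallExcludeAgents?.includes(agentId) ?? false;
}
```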

Test Results

✔ T1: normal agent receives auto-recall injection
✔ T2: excluded agent receives no auto-recall injection  
✔ T3: non-excluded agent receives injection even when excludeAgents is set
✔ T4: recallMode=off skips all auto-recall injection
✖ T5: recallMode=summary returns count-only format (pre-existing gap)
✔ T6: repeated register() does not duplicate hooks
✔ T7: autoRecall=false skips all injection
✔ T8: excluded agent logs the skip reason via info logger

7/8 tests pass. T5 is a pre-existing gap in the recallMode=summary implementation.

Heng Xia and others added 30 commits March 14, 2026 23:32
…-rerank

fix: add TEI rerank provider support
…view-on-fork-prs

ci: skip Claude Code Review on fork PRs
…ortexReach#221)

fix(cli): flush pending writes after import by calling store.close()
…close() (CortexReach#221)" (CortexReach#232)

revert: undo CortexReach#221 - store.close() based on incorrect root cause
feat: skip USER.md-exclusive facts in plugin memory
…nostics

Improve LLM diagnostics and make timeouts configurable
When a large CJK text (14KB+ Chinese .md file) is processed by
auto-recall, embedSingle() enters an infinite recursion loop because:

1. smartChunk() treats token limits as character limits, but CJK
   characters use 2-3x more tokens than ASCII characters
2. Chunks of 5740 chars (70% of 8192 token limit) still exceed
   the model's token context for CJK text
3. smartChunk() returns 1 chunk identical to input → embedSingle()
   recurses with the same text → infinite loop

This produced ~50,000 embedding errors in 12 minutes, blocking
the entire Node.js event loop and making all agents unresponsive.

Fixes:
- Add recursion depth limit (max 3) to embedSingle() with forced
  truncation as fallback
- Detect single-chunk output (same size as input) and truncate
  instead of recursing
- Add CJK-aware chunk sizing in smartChunk() (divide char limit
  by 2.5 when CJK ratio > 30%)
- Truncate auto-recall query to 1000 chars before embedding
- Add 10s global timeout on embedPassage()/embedQuery()

Closes CortexReach#214

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
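The CJK-aware chunk sizing this commit describes can be sketched as below. The 2.5 divisor and 30% ratio come from the commit message; the function name and Unicode ranges are illustrative assumptions, not the actual smartChunk() code.

```typescript
// Illustrative sketch: shrink the character budget when CJK dominates,
// since CJK characters consume roughly 2-3x more tokens than ASCII.
function cjkAwareCharLimit(text: string, baseLimit: number): number {
  // Rough CJK detection: Han ideographs, kana, and hangul blocks.
  const cjk =
    (text.match(/[\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]/g) ?? []).length;
  const ratio = text.length === 0 ? 0 : cjk / text.length;
  // Divide the char limit by 2.5 when the CJK ratio exceeds 30%.
  return ratio > 0.3 ? Math.floor(baseLimit / 2.5) : baseLimit;
}
```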
…e (PR CortexReach#215 follow-up)

This commit addresses the two blocking issues raised in PR CortexReach#215:

1. Timeout now uses AbortController for TRUE request cancellation
   - Timer is properly cleaned up in .finally()
   - AbortSignal is passed through to embedWithRetry

2. Recursion now guarantees monotonic convergence
   - Introduced STRICT_REDUCTION_FACTOR = 0.5
   - Each recursion level must reduce input by 50%
   - Works regardless of model context size

Modified by AI assistant (not human code) based on PR CortexReach#215.
Thanks to the original author and maintainers.

Co-authored-by: Hi-Jiajun <Hi-Jiajun@users.noreply.github.com>
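The AbortController-based timeout with .finally() cleanup described in point 1 can be sketched like this. The helper name and signature are assumptions; the commit's actual code passes the signal through embedWithRetry.

```typescript
// Minimal sketch of timeout-with-true-cancellation: the timer aborts
// the in-flight request, and .finally() guarantees the timer is
// cleared on both success and failure, so no stray timeout fires.
async function withAbortTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  return run(controller.signal).finally(() => clearTimeout(timer));
}
```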
… section

- Highlight "AI Memory Assistant" as the core value proposition with before/after demo
- Add Ecosystem section featuring setup script (CortexReach/toolbox) and Claude Code/OpenClaw skill
- Move comparison table into collapsible details
- Fix RRF → Hybrid Fusion naming to match actual implementation
- Add reflection to storage categories (aligns with store.ts/tools.ts)
- Clarify 6-category semantic labels vs storage categories in schema docs
- Add openclaw plugins install as primary manual install path
- Fix minor English expressions and Chinese translation polish
- Reviewed by CC (Claude Code) + Codex across 3 rounds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eedback (CortexReach#216)

Voyage AI (and Voyage-style proxies) expose an OpenAI-compatible
embeddings endpoint but reject standard OpenAI request fields and use
different field names for task hints and output dimensions.

Without handling these differences, all embedding calls to Voyage
endpoints failed with 400 Bad Request: Unknown request body key:
encoding_format, and task/dimension configuration was silently dropped.

Changes:
- Introduce EmbeddingCapabilities with taskField/taskValueMap/
  dimensionsField replacing boolean flags; buildPayload is now fully
  data-driven with no provider conditionals
- voyage-compatible profile: maps task values (retrieval.query→query,
  retrieval.passage→document) and sends output_dimension instead of
  dimensions; suppresses encoding_format
- Remove unused _profile field from Embedder (capabilities already
  cached via _capabilities)
- Add debug warnings in Embedder constructor when normalized or
  taskQuery/taskPassage are configured but the provider profile does
  not support them, so misconfiguration surfaces immediately
- Fix Jina-specific auth hint to fire for jina-* models behind a proxy
  by checking profile instead of provider label; remove dead code
- Add voyage-3/3-lite/code-3/3-large to EMBEDDING_DIMENSIONS
- Add Voyage row to README embedding providers table
- Add tests: Voyage input_type translation, output_dimension field,
  constructor debug warnings, Jina proxy auth hint; extract
  captureDebug helper to reduce warning test boilerplate

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…on (CortexReach#226)

Core changes:
- ACL key normalization: trim agentAccess keys so whitespace padding cannot silently disable the ACL
- Explicit deny-all semantics: scopeFilter=[] rejects all reads and writes, distinct from undefined (bypass)
- Unified hook agentId resolution: both before_agent_start and before_prompt_build use resolveHookAgentId
- importConfig atomicity: on validation failure no warnings leak and the config is fully rolled back
- Bypass ID restrictions preserved: system/undefined are internal-only, rejected in config and sessionKey resolution
- Safe reflection loading: bypass callers use an optional scopeFilter to avoid filter bypass

New tests:
- test/reflection-bypass-hook.test.mjs - hook bypass and sessionKey resolution
- test/scope-access-undefined.test.mjs - bypass ID rejection and scope access
- test/smart-extractor-scope-filter.test.mjs - SmartExtractor scopeFilter semantics
- test/store-empty-scope-filter.test.mjs - empty-array deny-all semantics

Scope of impact: 6 source files + 4 test files, roughly 970 net lines changed
…rtexReach#217)

* feat: add provider-aware OAuth login for memory-pro

* fix: address oauth review feedback

* fix: preserve llm baseURL and correct oauth callback/path handling

* fix(cli): restore pre-oauth llm config on logout

* fix(cli): tighten oauth path and logout restore

---------

Co-authored-by: Heng Xia <pope@Hengs-Mac-mini.local>
- Remove unused SAFE_CHAR_LIMITS, getSafeCharLimit, DEFAULT_SAFE_CHAR_LIMIT
- Add comment explaining batch timeout asymmetry (embedBatchQuery/embedBatchPassage not wrapped)
- Note: withTimeout already has .finally() cleanup, no change needed
…ortexReach#238)

- Test single-chunk detection (force-reduce when chunk >= 90% of original)
- Test depth limit termination (depth >= MAX_EMBED_DEPTH throws)
- Test CJK-aware chunk sizing (>30% CJK -> smaller chunks)
- Test strict reduction factor (50% per recursion level)
- Test batch embedding works correctly
- Preserve and surface chunkError instead of hiding behind original error
- Remove 1000 char hard floor in smartChunk for small-context models (now 200)
- Add regression test for small-context model chunking (all-MiniLM-L6-v2)
- Add regression test for chunkError preservation
- Wire cjk-recursion-regression.test.mjs into main test suite (CI)
AliceLJY and others added 27 commits March 23, 2026 16:12
1. Replace fragile readFileSync source-string test with behavior tests
   that mock SmartExtractor dedup pipeline (fixes review items 1 & 2)
2. Strip English articles (the/a/an) in normalizePreferenceToken to
   prevent "the Big Mac" vs "Big Mac" producing different tokens (item 4)
3. Remove redundant Chinese regex branches (喜欢喝|喜欢用|喜欢买) that
   can never match because 喜欢 always matches first (item 5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dedup

fix: prevent same-brand different-item preferences from being deduped
…prompt_build

Auto-recall and reflection invariants injection now use the before_prompt_build
hook instead of the deprecated before_agent_start, with explicit priority values
to control execution order (auto-recall=10, invariants=12, derived=15).

This change eliminates the "Legacy before_agent_start" warning in OpenClaw
2026.3+ and follows the official architecture guidance that before_prompt_build
is the preferred hook for prompt mutation work.

Key changes:
- index.ts: Migrate 2 before_agent_start hooks to before_prompt_build with
  priority ordering; add explicit type annotations to all hook handlers
- test/recall-text-cleanup: Update harness registerHook to capture handlers;
  update assertions from before_agent_start to before_prompt_build
- test/reflection-bypass-hook: Same harness fix; expect 2 before_prompt_build
  hooks (invariants + derived) sorted by priority instead of 1 before_agent_start + 1 before_prompt_build
- README.md: Document hook adaptation approach, api.on() vs api.registerHook()
  registry distinction, migration steps, and OpenClaw version requirements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tion testing

1. Fix auto-recall gating regression (P3): before_prompt_build's event.prompt
   contains the full assembled prompt including system instructions, causing
   shouldSkipRetrieval's short-message skip to never trigger. Added a
   message_received hook to cache the raw user message and use it for gating.

2. Fix auto-recall timeout race condition (P1): Promise.race allowed both
   the successful injection path and the timeout handler to fire. Added
   clearTimeout after recall completes to prevent misleading timeout warnings.

3. Lower auto-capture dedup threshold (P2): Reduced from 0.95 to 0.90 to
   catch more semantic duplicates (e.g. "我喜欢喝美式咖啡" vs "我最喜欢的
   咖啡是美式", i.e. "I like drinking Americano" vs "my favorite coffee
   is Americano") that were slipping through.

Tested on QA-Probe bot with OpenClaw latest (2026-03-23), 27-item deep test
suite, all fixes verified with live Discord interaction + log analysis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Minimum version bumped to 2026.3.22 (before_prompt_build hook support)
- Added note about automatic config migration via openclaw doctor --fix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s for OpenClaw 2026.3+

- Migrate auto-recall and reflection hooks from deprecated before_agent_start to before_prompt_build
- Fix auto-recall gating regression with message_received cache
- Fix Promise.race timeout race condition
- Lower auto-capture dedup threshold 0.95→0.90
- Update OpenClaw version requirements in README
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix CI version-sync check failure (package.json and openclaw.plugin.json
must have matching versions).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Default raised from 3s to 5s for remote embedding APIs behind proxies.
Configurable in openclaw.json:

  "plugins": {
    "entries": {
      "memory-lancedb-pro": {
        "config": {
          "autoRecallTimeoutMs": 8000
        }
      }
    }
  }

Closes CortexReach#314

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e link

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fined agentId (CortexReach#288, CortexReach#231)

Root cause: isAccessible(scope, undefined) used bypass mode (any valid scope
returns true), but getScopeFilter(undefined) returned only explicitly defined
scopes (["global"]), excluding implicit agent scopes like "agent:main". This
caused memory_update to reject accessible memories while recall/forget worked.

Changes:
- scopes.ts: getScopeFilter now returns undefined (full bypass) when agentId
  is missing, matching isAccessible's existing bypass behavior
- tools.ts: memory_update uses runtimeContext.scopeManager instead of
  context.scopeManager, consistent with recall/forget
- test: updated scope-access-undefined assertions to match new bypass semantics

Note: beta.10's resolveRuntimeAgentId already ensures agentId defaults to
"main" in normal operation. This fix is defensive — prevents the same bug
from recurring if any future code path passes undefined agentId.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
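The bypass semantics this commit aligns can be sketched as follows. The ScopeFilter shape (undefined = full bypass, [] = deny-all) follows the commit messages above; the function body is an assumption, not the actual scopes.ts code.

```typescript
// Sketch of the fixed getScopeFilter semantics: a missing agentId now
// means full bypass (undefined), matching isAccessible(), instead of
// returning only the explicitly defined scopes.
type ScopeFilter = string[] | undefined; // undefined = bypass, [] = deny-all

function getScopeFilter(
  agentId: string | undefined,
  definedScopes: string[],
): ScopeFilter {
  if (agentId === undefined) return undefined; // defensive bypass path
  // Normal path: explicit scopes plus the implicit per-agent scope.
  return [...definedScopes, `agent:${agentId}`];
}
```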
… columns (CortexReach#326)

Previously doInitialize() only logged a warning when detecting missing columns
but never actually added them, causing stats/list/search/update/delete to fail
with "No field named scope" on databases upgraded from legacy memory-lancedb.

Now uses table.schema() to detect missing columns and table.addColumns() to
add them with sensible defaults (scope='global', timestamp=0.0, metadata='{}').
Handles concurrent initialization race via "already exists" error detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add recallMode config (full/summary/adaptive/off) for granular control
over auto-recall injection depth. Adaptive mode uses zero-LLM-cost
pattern matching to analyze query intent and route to appropriate
memory categories with depth-aware formatting.

- New: src/intent-analyzer.ts — analyzeIntent(), applyCategoryBoost()
- New: test/intent-analyzer.test.mjs — 23 tests (all passing)
- Modified: index.ts — integrate intent analysis into before_prompt_build
- Modified: openclaw.plugin.json — add recallMode schema + UI field

Rewritten for v1.1.0-beta.10 (replaces PR CortexReach#313 which was based on beta.9).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: yjjheizhu <yjjheizhu@users.noreply.github.com>
feat: add recallMode with adaptive intent routing (beta.10)
Adds a `MemoryCompactor` that periodically consolidates semantically
similar old memories into single refined entries, inspired by the
progressive summarization pattern (MemOS). Over time, related memory
fragments are merged rather than accumulated, reducing retrieval noise
and keeping the LanceDB index lean.

Key additions:
- `src/memory-compactor.ts`: pure, dependency-free compaction module
  with cosine-similarity clustering, greedy seed expansion, and
  rule-based merge (dedup lines, max importance, plurality category)
- `store.ts`: new `fetchForCompaction()` method that fetches old entries
  with vectors (intentionally omitted from `list()` for performance)
- `index.ts`: `memory_compact` management tool (requires
  `enableManagementTools: true`) + optional auto-compaction at
  `gateway_start` with configurable cooldown
- `openclaw.plugin.json`: `memoryCompaction` config schema + uiHints
- `test/memory-compactor.test.mjs`: 23 tests, 100% pass

Config example:
  memoryCompaction:
    enabled: true        # auto-run at gateway_start
    minAgeDays: 7        # only touch memories ≥ 7 days old
    similarityThreshold: 0.88
    cooldownHours: 24

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
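The cosine-similarity clustering with greedy seed expansion described above can be sketched as below. The 0.88 threshold comes from the config example; everything else (names, shapes) is an illustrative assumption, not the actual memory-compactor.ts code.

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Greedy seed expansion: each unused vector becomes a seed and absorbs
// every later unused vector whose similarity meets the threshold.
// Singleton clusters would be left untouched by compaction.
function greedyClusters(vecs: number[][], threshold = 0.88): number[][] {
  const used = new Set<number>();
  const clusters: number[][] = [];
  for (let seed = 0; seed < vecs.length; seed++) {
    if (used.has(seed)) continue;
    const cluster = [seed];
    used.add(seed);
    for (let j = seed + 1; j < vecs.length; j++) {
      if (!used.has(j) && cosine(vecs[seed], vecs[j]) >= threshold) {
        cluster.push(j);
        used.add(j);
      }
    }
    clusters.push(cluster);
  }
  return clusters;
}
```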
feat: memory compaction — progressive summarization for stored memories
Combines the following rebased changes onto latest cortexreach/master:
- feat: add session compression and adaptive extraction throttling
- fix: disable extraction throttle and session compression in smart-extractor-branches test
- fix: address Codex review for session compression
- fix: address PR CortexReach#318 review — align opt-in defaults, CJK-aware scoring, rate limiter docs

Session compression scores conversation texts before auto-capture and
drops low-signal content; extraction throttling adds a sliding-window
rate limiter and a low-value conversation skip gate.

Both features default to opt-in (disabled) to avoid surprising existing
users. Resolves merge conflicts with memory-compaction (CortexReach#343) and
recall-mode-v2 (CortexReach#342) that landed on master since the original branch.
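The sliding-window rate limiter mentioned for extraction throttling might look like the sketch below. The class name, window size, and limit are hypothetical illustrations; the PR only says a sliding-window limiter exists, not its parameters.

```typescript
// Hypothetical sliding-window rate limiter: allow at most `limit`
// extractions per rolling `windowMs` window.
class SlidingWindowLimiter {
  private readonly stamps: number[] = [];

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
  ) {}

  tryAcquire(now: number = Date.now()): boolean {
    // Evict timestamps that fell out of the window, then check capacity.
    while (this.stamps.length > 0 && now - this.stamps[0] >= this.windowMs) {
      this.stamps.shift();
    }
    if (this.stamps.length >= this.limit) return false; // throttled
    this.stamps.push(now);
    return true;
  }
}
```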
…sion

feat: session compression + adaptive extraction throttling
* feat: add observable retrieval traces, stats collector, and batch dedup

Observable Retrieval (Feature 3):
- Add TraceCollector (src/retrieval-trace.ts) that tracks entry IDs
  through each pipeline stage, computing drops, score ranges, and timing
- Add RetrievalStatsCollector (src/retrieval-stats.ts) for aggregate
  query metrics: latency percentiles, zero-result rate, top drop stages
- Instrument all retrieval stages in MemoryRetriever.retrieve() with
  optional trace (zero overhead when disabled via optional chaining)
- Add retrieveWithTrace() for always-on debug tracing
- Add memory_debug tool (requires enableManagementTools) returning full
  per-stage pipeline trace with drop info
- Extend memory_stats tool to include retrieval quality metrics

Batch Dedup (Feature 2):
- Add batchDedup() (src/batch-dedup.ts) for cosine similarity dedup
  within extraction batches before expensive LLM dedup calls
- Add ExtractionCostStats tracking (batchDeduped, durationMs, llmCalls)

Tests: 26 new tests (16 trace + 10 batch-dedup), all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address Codex review for observable retrieval

1. [High] Fix hybrid trace timing: replace sequential vector_search/bm25_search
   stages with single parallel_search stage that correctly represents concurrent execution
2. [High] Fix negative drop display: search stages with input=0 now show
   "found N" instead of "dropped -N"
3. [Medium] Fix rerankUsed overcount: only emit rerank trace stage when rerank
   is actually enabled (config.rerank !== "none"), not on every query

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: correct trace mode for BM25 tag queries + batchDedup TODO

- RetrievalTrace.mode now includes "bm25" variant
- retrieve() and retrieveWithTrace() record mode as "bm25" when
  tagTokens trigger bm25OnlyRetrieval (was incorrectly "vector")
- Add TODO for wiring batchDedup() into extraction pipeline

Addresses review feedback from rwmjhb on PR CortexReach#319.

* fix: address PR CortexReach#319 review feedback (rwmjhb)

1. Wire batchDedup into extraction pipeline (smart-extractor.ts)
   - Import and call batchDedup() between LLM extraction and per-candidate
     dedup, embedding candidate abstracts upfront and filtering near-dupes
     before expensive LLM dedup calls
   - Graceful fallback: if embedding or dedup fails, all candidates proceed
   - Batch-deduped candidates counted in stats.skipped

2. Fix memory_debug scope parameter bug (tools.ts)
   - resolveScopeFilter() only accepts 2 params (scopeManager, agentId),
     not 3 — was passing `scope` as a third arg which was silently ignored
   - Match the pattern used by all other tools: resolve default scope filter
     first, then override with [scope] if scope param is provided and accessible

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add _initialized singleton flag to prevent re-initialization
  when register() is called multiple times during gateway boot
- Add per-entry debug logging for governance filter decisions
  (id, reason, score, text snippet) for observability
- Export _resetInitialized() for test harness reset
- Fixes initialization block repeated N times on startup
- Fixes governance filter decisions not observable in logs
@jlin53882 force-pushed the fix/init-reentrancy-governance-logging branch from 0f510a0 to a1d2b09 on March 28, 2026 at 14:58
@jlin53882 force-pushed the fix/init-reentrancy-governance-logging branch from a1d2b09 to 783db51 on March 28, 2026 at 15:13