F023: Memory Admission Control — A-MAC-inspired fact scoring gate #126
A-MAC-inspired 5-factor admission scoring for `learn_fact()`:
- Phase 1: Grounding (anti-hallucination), Novelty, Type Prior, Recency
- Phase 2: LLM Utility assessment
- Phase 3: Calibration feedback loop via F017 usage tracking
- Phase 4: Contradiction-aware admission (integrates with F022)

Addresses the memory-poisoning gap — there is currently zero defense against hallucinated facts entering long-term storage.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 40f3b8d302
- `admission_enabled` is False (global toggle)
- `input.confidence == 1.0` (explicit override by LLM)
- `input.source` is `"system"`, `"migration"`, or `"initiation"` (non-conversational)
- `input.category == "rule"` (rules are always explicit declarations)
Remove rule-category bypass from admission scoring
Bypassing admission whenever `input.category == "rule"` creates a reliable path around the new quality gate: category assignment is LLM-driven, and therefore fallible and manipulable. Any hallucinated or low-novelty fact mislabeled as `rule` would skip grounding and novelty checks and be persisted, which directly undermines the feature's stated goal of preventing bad memories at write time.
⚡ Emerson — Initial review

Strong spec. The A-MAC adaptation is thoughtful — you took the architecture and adjusted it for single-user context instead of blindly replicating a benchmark-optimized system. A few observations:

What works well

Concerns to address

- P1: Grounding check + inference-type facts
- P1: Tool output in grounding window (your Open Question #3)
- P2: Override abuse potential (your Open Question #2)
- P2: Recency factor `source_timestamp`
- P3: F011 connection

On the F022 → F023 pipeline

The pipeline vision makes sense: F023 gates admission → F022 links via graph → F011 activates skills from recall. Each layer builds on cleaner data from the layer below. Phase 4's contradiction-aware admission is the right integration point with F022 — don't try to do it earlier.

Verdict

Approve with revisions. The two P1s (tool output in grounding window as a Phase 1 requirement, and inference-type fact tracking) should be addressed before merge. The P2s can remain open questions that get resolved during implementation.

Solid work. The shadow mode approach means even if the weights are wrong initially, we'll have data to correct them.
tfatykhov left a comment
🔍 CodeReviewer — Spec Review: F023 Memory Admission Control
Summary
Well-researched spec with strong theoretical backing (A-MAC paper). The 5-factor scoring architecture is sound, the phase rollout plan is excellent, and the complementary relationship with F017 (admission vs retrieval filtering) is a good architectural choice.
However, there are 2 P1 critical issues that would make the system non-functional as specified, 4 P2 issues that need resolution before implementation, and 2 P3 suggestions.
P1 — Critical Issues
P1-1: confidence=1.0 bypass makes admission a no-op by default
The spec states (Bypass conditions section):
> `input.confidence == 1.0` (explicit override by LLM)

But `FactInput.confidence` defaults to 1.0 in `nous/heart/schemas.py`:

```python
confidence: float = Field(default=1.0, ge=0.0, le=1.0)
```

And the `learn_fact` tool also defaults to `confidence=1.0`:

```python
async def learn_fact(content, ..., confidence: float = 1.0, ...):
```

This means every fact learned via the standard tool call bypasses admission scoring by default. The LLM almost never explicitly sets confidence — it just calls `learn_fact(content=..., category=...)`. The entire admission system would be a no-op.
Fix options:
1. Change the bypass to require an explicit `override=True` flag (not tied to confidence)
2. Change the default confidence to something lower (e.g., 0.8) — but this has ripple effects
3. Only bypass when confidence is explicitly passed AND equals 1.0 (requires tracking whether it was explicitly set vs defaulted)

Option 1 is cleanest — add an `admission_override: bool = False` parameter to `learn_fact` and `FactInput`.
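A minimal sketch of Option 1, assuming the bypass conditions quoted earlier minus the two risky ones. `FactInput` is reduced to a dataclass stand-in here (the real class is a Pydantic model), and any field beyond those named in the spec is illustrative:

```python
from dataclasses import dataclass


@dataclass
class FactInput:
    """Stand-in for the real Pydantic FactInput schema."""
    content: str
    category: str = "fact"
    confidence: float = 1.0          # the problematic default from P1-1
    source: str = "conversation"
    admission_override: bool = False  # proposed explicit bypass flag (Option 1)


def should_bypass_admission(fact: FactInput, admission_enabled: bool = True) -> bool:
    """Bypass only on explicit, deliberate signals — never on defaults."""
    if not admission_enabled:
        return True                   # global toggle off
    if fact.admission_override:
        return True                   # explicit opt-out; cannot happen by accident
    if fact.source in {"system", "migration", "initiation"}:
        return True                   # non-conversational sources
    # Note: no confidence == 1.0 bypass (P1-1) and no category == "rule"
    # bypass (P2-3) — those facts flow through scoring.
    return False
```

With this shape, the default tool call (`confidence=1.0`, `category="rule"`) still gets scored, which is the point of the fix.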
P1-2: FactManager has no access to conversation messages for grounding check
FactManager.__init__ takes (db, embeddings, agent_id). It has no reference to WorkingMemoryManager, ConversationState, or any source of recent conversation turns.
The grounding check needs the last N conversation messages, but the spec says only "Uses the last N messages from working memory / episode transcript" without specifying how FactManager gets them.
The _learn() call chain: learn_fact() (tools.py) → heart.learn() → FactManager.learn() → FactManager._learn(). None of these currently pass conversation context.
Fix — the spec must specify the plumbing:
- Option A: Add a `recent_messages: list[str] | None = None` parameter to `learn()`/`_learn()`, and have `tools.py` fetch them from `ConversationState` before calling
- Option B: Give `FactManager` a reference to `WorkingMemoryManager` at construction time
- Option C: Have `_compute_admission_score()` query `ConversationState` from the database directly using `self.db`

Option C is simplest (no API changes to callers) but couples FactManager to conversation storage.
P2 — Should Fix
P2-1: ROUGE-L grounding blind spot for tool-sourced facts
The spec acknowledges this in Open Questions #3 and #6 but offers no solution. This is not an edge case — tool output is a primary source of facts in Nous. web_fetch, web_search, run_python results all feed into learn_fact.
If the grounding window only checks user/assistant messages (not tool call results), a large fraction of legitimate facts will get low grounding scores. The spec needs to specify:
- Which message types are included (user, assistant, tool results?)
- How tool output is accessed (is it in the conversation message array as `tool_result` messages?)
- Whether this changes the grounding window size considerations
This should be resolved in the spec, not deferred to Open Questions.
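One resolution is to include `tool_result` messages in the grounding window. A sketch of a ROUGE-L-style recall score (token-level longest common subsequence against each window message); the message dict format is an assumption, not taken from the spec:

```python
def lcs_len(a: list[str], b: list[str]) -> int:
    """Token-level longest common subsequence length (the 'L' in ROUGE-L)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]


def grounding_score(fact: str, window: list[dict],
                    include_roles=("user", "assistant", "tool_result")) -> float:
    """Best ROUGE-L recall of the fact against any message in the window.

    Including 'tool_result' is the point of P2-1: tool output is a primary
    fact source and must count as grounding evidence."""
    fact_toks = fact.lower().split()
    if not fact_toks:
        return 0.0
    best = 0.0
    for msg in window:
        if msg.get("role") not in include_roles:
            continue
        msg_toks = msg.get("content", "").lower().split()
        best = max(best, lcs_len(fact_toks, msg_toks) / len(fact_toks))
    return best
```

A fact drawn verbatim from a `web_fetch` result scores 1.0 here, whereas a user-only window would wrongly score it near 0.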
P2-2: FactDetail schema mismatch
The spec says _rejected_detail() returns a FactDetail with admitted=False, but FactDetail has no admitted field. Current fields: id, agent_id, content, category, subject, confidence, source, learned_at, last_confirmed, confirmation_count, superseded_by, contradiction_of, active, tags, created_at, contradiction_warning.
The spec claims "No schema changes for Phase 1" but this requires adding at minimum admitted: bool and possibly admission_score: float and rejection_reason: str to FactDetail. These are Pydantic schema changes (not DB schema), but they need to be specified.
P2-3: category="rule" bypass is risky
The spec bypasses admission for category="rule" because "rules are always explicit declarations." But the LLM assigns categories — a hallucinated fact could be categorized as "rule" and bypass all checks. This creates a second bypass loophole alongside the confidence=1.0 issue.
The Type Prior table already gives rules 0.90, which is generous. Let rules go through scoring with their high prior rather than bypassing entirely. If a rule is hallucinated, grounding will catch it.
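A sketch of that recommendation: keep `rule` in the prior table at its 0.90 value and let it flow through scoring. Only the 0.90 rule prior comes from the spec — the other categories, their values, and the weights below are illustrative assumptions:

```python
TYPE_PRIORS = {
    "rule": 0.90,        # from the spec's Type Prior table; no bypass
    "preference": 0.80,  # illustrative value
    "identity": 0.75,    # illustrative value
    "fact": 0.60,        # illustrative value
    "inference": 0.40,   # illustrative value
}


def type_prior(category: str, default: float = 0.5) -> float:
    return TYPE_PRIORS.get(category, default)


def admission_with_prior(grounding: float, category: str,
                         w_ground: float = 0.6, w_prior: float = 0.4) -> float:
    """Two-factor slice of the score, to show the failure mode the bypass hides:
    a hallucinated 'rule' with zero grounding still scores low overall."""
    return w_ground * grounding + w_prior * type_prior(category)
```

An ungrounded "rule" lands around 0.36 under these weights — low enough for grounding to catch it, which is exactly the behavior the bypass would forfeit.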
P2-4: Recency factor adds minimal signal in Phase 1
With a 24h half-life, most facts learned during active conversation (age ~0) score ~1.0. Even 6-hour-old context scores 0.84. The factor barely differentiates in practice because the vast majority of learn_fact calls happen during live conversation.
The only meaningful case is F016 pre-prune extraction, where facts come from tool output that could be minutes to hours old. Consider:
- Whether 20% weight for a factor that is almost always ~1.0 is dead weight
- Whether 20% weight for a factor that is almost always ~1.0 is dead weight
- Whether `source_timestamp` is even reliably available (the spec assumes it, but `FactInput` has no timestamp field — it has `source_episode_id`, which could be used to derive age, but that is not addressed)
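For reference, the 24h half-life recency factor is a one-liner, which makes the saturation concern concrete — live-conversation facts all land near 1.0 (a sketch, not the spec's implementation):

```python
def recency_score(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Exponential decay: 1.0 at age 0, 0.5 at one half-life (24h per the spec)."""
    return 0.5 ** (age_hours / half_life_hours)


# recency_score(0)  -> 1.0
# recency_score(6)  -> ~0.84 (the spec's 6-hour figure)
# recency_score(24) -> 0.5
```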
P3 — Nice to Have
P3-1: Type Prior table has dead entry for "rule"
Since rules bypass admission entirely (per the bypass conditions), the TYPE_PRIORS["rule"] = 0.90 entry is dead code. Either remove the bypass (recommended, see P2-3) or remove the table entry. Having both is confusing.
P3-2: INDEX.md ordering
The PR inserts F023 between F022 and F021, resulting in the order: F022, F023, F021. Numerical order would be F021, F022, F023. This is a pre-existing issue (F021 was already out of order) but the PR perpetuates it.
What Works Well
- Problem statement — Clearly articulated with concrete examples. The "prevent vs filter" principle is sound.
- Research backing — A-MAC mapping to Nous gaps is thorough and honest.
- Phase rollout — Shadow → soft → full mode is excellent risk management.
- Continuous novelty — Extending binary dedup to continuous scoring is a meaningful improvement.
- Rejection UX — Feeding rejection info back to the LLM is well-designed.
- Scope control — "What We are NOT Building" section shows disciplined scoping.
- Metrics — Comprehensive observability plan.
- F017 complementarity — Using F017 usage data for Phase 3 calibration is an elegant feedback loop.
Verdict
DO NOT implement until P1 issues are resolved. The confidence=1.0 bypass (P1-1) makes the entire system a no-op, and the missing conversation access plumbing (P1-2) means the highest-weighted factor (grounding) cannot be implemented as specified.
After P1 fixes, this is a strong spec that should produce a valuable feature. The P2 issues are design refinements that can be addressed in the spec update.
Issue count: 2 P1 | 4 P2 | 2 P3
Summary
Adds spec for F023 — Memory Admission Control, an A-MAC-inspired 5-factor scoring gate that evaluates every candidate fact before it enters long-term storage.
Problem
Currently `learn_fact()` stores everything the LLM says to store. There is zero defense against hallucinated facts entering long-term storage.

5-Factor Scoring
4 Phases
Shadow-mode rollout: the gate scores facts without rejecting them for the first week.