Skip to content

feat(hindsight): add richer session-scoped retain metadata#6290

Open
Abnertheforeman wants to merge 23 commits intoNousResearch:mainfrom
Abnertheforeman:feat/hindsight-retain-metadata
Open

feat(hindsight): add richer session-scoped retain metadata#6290
Abnertheforeman wants to merge 23 commits intoNousResearch:mainfrom
Abnertheforeman:feat/hindsight-retain-metadata

Conversation

@Abnertheforeman
Copy link
Copy Markdown

@Abnertheforeman Abnertheforeman commented Apr 8, 2026

Summary

  • add configurable retain tags/source/transcript labels for native Hindsight memory
  • thread gateway session metadata into retain payloads when available
  • preserve Hermes' existing session-scoped accumulating document_id topology while enriching retain metadata

Why

Hermes' native Hindsight integration was retaining session conversation content with very little provenance. This patch keeps the native provider session-scoped, but makes each retain/update carry enough structure for better recall, filtering, and debugging:

  • session/platform/chat/thread/user metadata now rides along with retained content
  • default retain tags merge cleanly with per-call tool tags
  • optional source metadata is attached to retained content
  • auto-retained conversation transcripts now honor configurable user/assistant labels

Config added

  • retain_tags
  • retain_source
  • retain_user_prefix
  • retain_assistant_prefix

Config behavior clarified

  • retain_async remains supported on the batch retain path
  • retain_context still labels retained conversation content
  • retain_user_prefix / retain_assistant_prefix shape the auto-retained transcript content

Test plan

  • python3 -m pytest tests/plugins/memory/test_hindsight_provider.py tests/agent/test_memory_user_id.py tests/gateway/test_session_env.py -q
  • python3 -m py_compile run_agent.py gateway/run.py plugins/memory/hindsight/__init__.py tests/plugins/memory/test_hindsight_provider.py tests/agent/test_memory_user_id.py tests/gateway/test_session_env.py

@essendigitalgroup-cyber
Copy link
Copy Markdown

Thanks for picking this up, great work closing the gap with the vectorize-io PRs #695 (Claude Code) and #937 (OpenClaw).

One enhancement worth considering for a follow-up: this PR applies _retain_tags from config to every retain, but doesn't expose per-call tags from the hindsight_retain tool args. For multi-client setups where a single Hermes instance handles several clients in one shared bank, it's useful to let the agent attach context-specific tags at call time, not just config-level defaults.

Concrete use case: I run Hermes as a consulting co-founder across multiple clients in a shared Hindsight bank. The config default makes sense for stable tags like source:hermes, but when the agent is actively working on a specific client (say, retaining a decision about client X), it should be able to pass tags=[\"client:x\", \"type:decision\"] in the tool call so the memory lands pre-tagged rather than waiting for a nightly auto-tag cron to catch it.

Sketch of the change in handle_tool_call() for the hindsight_retain branch:

```python

Merge config-level default tags with any per-call tags from tool args

_tool_tags = list(self._retain_tags or [])
_arg_tags = args.get("tags") or []
if isinstance(_arg_tags, list):
for _t in _arg_tags:
if _t and _t not in _tool_tags:
_tool_tags.append(_t)

kwargs = self._build_retain_kwargs(...) # your helper
if _tool_tags:
kwargs["tags"] = _tool_tags
_run_sync(client.aretain(**kwargs))
```

Also would need `tags` added to `RETAIN_SCHEMA` so the agent sees it as an optional parameter.

Happy to open a follow-up PR against this once it merges, if that's easier than folding it in here. Either way, thanks for getting this in, we're running a locally-patched version of your diff in production and it's working well.

@Abnertheforeman
Copy link
Copy Markdown
Author

@essendigitalgroup-cyber Good catch. I folded this into the branch while resolving the merge conflicts.

hindsight_retain now exposes inline tool-call tags, and those tags are merged with config-level retain_tags before calling aretain(), with dedupe so overlapping tags do not get duplicated.

I also cleaned up the config naming so retain_tags is the canonical config key, while tool-call tags stays the per-retain override/extension point. That kept the config side explicit and the tool side ergonomic. Thanks for calling it out — it was a small change, but the right one.

@Abnertheforeman
Copy link
Copy Markdown
Author

Abnertheforeman commented Apr 9, 2026

@nicoloboschi Thanks for the additional Hindsight feature work in 25757d6 — it helped a lot bringing Hermes closer to parity.

@Aldoustheorchestrator and I have been happy to collaborate with you on the Hermes and OpenClaw fixes. We are fully intending to lean on Hindsight hard, and we want the richest retain and recall path possible — strong retain shaping, good tag support, and high-fidelity recall controls. Appreciate the work you've been putting in here.

Comment thread plugins/memory/hindsight/__init__.py Outdated
Copy link
Copy Markdown
Contributor

@nicoloboschi nicoloboschi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retain_async it's supported. and with 0.4.22 it must work. the integration now forces the client to install 0.4.22

@Abnertheforeman
Copy link
Copy Markdown
Author

@nicoloboschi I removed retain_async unintentionally while tightening the branch up. It’s back now in the latest push.

@Abnertheforeman
Copy link
Copy Markdown
Author

@nicoloboschi I added a similar stable document_id + predictable turn counts in this PR, just like in vectorize-io/hindsight#953. Just letting you know because I noticed #6654

@nicoloboschi
Copy link
Copy Markdown
Contributor

Thanks for the work here — the metadata enrichment (platform/user/chat/thread identifiers threaded into retain payloads) is a clear win and something I'd like to see land.

But the document-id restructure is a non-starter:

  1. Duplicate def initialize(). Lines 554 and 582 both define initialize at class level. Python keeps only the second, which means the hindsight-client version check + auto-upgrade logic (added in feat(hindsight): feature parity, setup wizard, and config improvements #6428) is dead code. Users on old clients silently hit the retain_async serialization bug with no auto-fix.

  2. Per-turn document IDs defeat the point of document_id. document_id exists so related content stays grouped in one document that gets updated over time. Hindsight's extraction, entity linking, knowledge graph, and observation consolidation all work better on coherent multi-turn content than on isolated turns. A 100-turn conversation becoming 100 tiny documents is worse for recall quality and costs more to store/index.

  3. Sliding window documents are a double-store bandaid. The chunk docs try to recover conversation context that per-turn IDs throw away, but now the same content lives in both the turn doc and multiple overlapping window docs. This isn't a tradeoff — it's strictly worse than keeping one document per session.

Could you reduce the scope of this PR to only the metadata enrichment? Specifically:

  • Thread user_name / chat_id / chat_name / chat_type / thread_id from the gateway through to initialize() kwargs ✅
  • Add the corresponding HERMES_SESSION_* env vars ✅
  • Populate the metadata dict on aretain calls ✅
  • Add retain_source, retain_user_prefix, retain_assistant_prefix config options ✅
  • Rename tagsretain_tags, add per-call tags to the hindsight_retain tool schema ✅
  • Keep the existing document_id topology (session-scoped, accumulating)

The resume-overwrite bug you're working around is being tracked separately in #6602 / #6654 / #6672 — let's keep those changes independent.

@Abnertheforeman
Copy link
Copy Markdown
Author

Abnertheforeman commented Apr 10, 2026

Thanks for the work here — the metadata enrichment (platform/user/chat/thread identifiers threaded into retain payloads) is a clear win and something I'd like to see land.

But the document-id restructure is a non-starter:

  1. Duplicate def initialize(). Lines 554 and 582 both define initialize at class level. Python keeps only the second, which means the hindsight-client version check + auto-upgrade logic (added in feat(hindsight): feature parity, setup wizard, and config improvements #6428) is dead code. Users on old clients silently hit the retain_async serialization bug with no auto-fix.
  2. Per-turn document IDs defeat the point of document_id. document_id exists so related content stays grouped in one document that gets updated over time. Hindsight's extraction, entity linking, knowledge graph, and observation consolidation all work better on coherent multi-turn content than on isolated turns. A 100-turn conversation becoming 100 tiny documents is worse for recall quality and costs more to store/index.
  3. Sliding window documents are a double-store bandaid. The chunk docs try to recover conversation context that per-turn IDs throw away, but now the same content lives in both the turn doc and multiple overlapping window docs. This isn't a tradeoff — it's strictly worse than keeping one document per session.

Could you reduce the scope of this PR to only the metadata enrichment? Specifically:

  • Thread user_name / chat_id / chat_name / chat_type / thread_id from the gateway through to initialize() kwargs ✅
  • Add the corresponding HERMES_SESSION_* env vars ✅
  • Populate the metadata dict on aretain calls ✅
  • Add retain_source, retain_user_prefix, retain_assistant_prefix config options ✅
  • Rename tagsretain_tags, add per-call tags to the hindsight_retain tool schema ✅
  • Keep the existing document_id topology (session-scoped, accumulating)

The resume-overwrite bug you're working around is being tracked separately in #6602 / #6654 / #6672 — let's keep those changes independent.

I agree that the scope grew a little, but both products are moving fast enough and things are being solved that I might've missed after submitting this PR and using this branch locally.

I checked Hermes PR #6654 yesterday, but hadn't seen the others. They are 300+ PRs/issues after my initial opening and I haven't kept watch. 6654's fix is (at the time of me looking) pragmatic and effective: stop using raw session_id as the document_id, stamp a per-process suffix, and add lineage tags like session: and parent:. It avoids overwrite on resume and fork collisions without needing extra persistence. The tradeoff is that it treats each process lifetime as a new document family, so you lose a clean single canonical session document and start accumulating shards. When the process/container recycles and we pick a conversation back up, I'd like to see the same lineage carry through. To me, it makes sense to keep document chunks somewhat aligned so that in a future state of Hindsight (or any other downstream tooling), it's easy to re-form the conversation as a whole from the chunks for possible summarization etc.

The Hindsight PR #953 feels cleaner architecturally. Instead of inventing a new identifier scheme, it derives turn numbering from the persisted conversation history — the actual source of truth. That means resumes continue at turn 7 if the conversation is really at turn 7, and document IDs stay semantically stable instead of process-stamped. I believe there's room at the table for both a stable document_id with stable incremental turn identification + identifying tags.

Ship only rendered_content (with prefix) in turn message dicts, not
both content and rendered_content.  Hindsight indexes both fields,
so sending the raw content alongside rendered_content doubled the
indexed data for no retrieval benefit.
…ndered_content

Replace the two-field pattern (content + rendered_content) with a single
content field that includes the prefix inline.  Removes speaker_label
and rendered_content from turn message dicts entirely.

This is simpler for Hindsight to index (one field, not two carrying
the same text) and avoids inventing non-standard fields that need
review justification.
@Abnertheforeman
Copy link
Copy Markdown
Author

Abnertheforeman commented Apr 14, 2026

After the closing of Hindsight 953 and assuming that a single long-lived and appended document per session is the long-term vision, I scratched my previous comment.

None of the document lifecycle is in this PR anymore however. Strictly tag/metadata focused for retain purposes only. Though it might benefit from the Anthropic JSON pattern adjustment if that's the go-north for Hindsight retain structure long-term.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants