feat(hindsight): add richer session-scoped retain metadata #6290
Abnertheforeman wants to merge 23 commits into NousResearch:main
|
Thanks for picking this up, great work closing the gap with the vectorize-io PRs #695 (Claude Code) and #937 (OpenClaw).

One enhancement worth considering for a follow-up: this PR applies

Concrete use case: I run Hermes as a consulting co-founder across multiple clients in a shared Hindsight bank. The config default makes sense for stable tags like

Sketch of the change in

```python
# Merge config-level default tags with any per-call tags from tool args
_tool_tags = list(self._retain_tags or [])
kwargs = self._build_retain_kwargs(...)  # your helper
```

Also would need `tags` added to `RETAIN_SCHEMA` so the agent sees it as an optional parameter.

Happy to open a follow-up PR against this once it merges, if that's easier than folding it in here. Either way, thanks for getting this in, we're running a locally-patched version of your diff in production and it's working well. |
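For illustration only, the merge described above could look roughly like the sketch below. The helper name `merge_retain_tags` and the example tag values are hypothetical and not taken from the PR; the only assumption carried over from the comment is that config-level tags act as defaults and per-call tags are added on top.

```python
from typing import Iterable, List, Optional


def merge_retain_tags(
    config_tags: Optional[Iterable[str]],
    call_tags: Optional[Iterable[str]],
) -> List[str]:
    """Combine config-level default tags with per-call tags.

    Hypothetical helper: duplicates and blank entries are dropped,
    and first-seen order is preserved so config defaults come first.
    """
    merged: List[str] = []
    seen = set()
    for tag in list(config_tags or []) + list(call_tags or []):
        tag = tag.strip()
        if tag and tag not in seen:
            seen.add(tag)
            merged.append(tag)
    return merged


# Example: stable defaults from config plus a per-client tag from tool args.
print(merge_retain_tags(["hermes", "consulting"], ["client-a", "hermes"]))
```

Keeping the config tags first means a per-call tag can add to the defaults but never silently replaces them, which seems to match the shared-bank use case described here.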
|
@essendigitalgroup-cyber Good catch. I folded this into the branch while resolving the merge conflicts.
I also cleaned up the config naming so |
|
@nicoloboschi Thanks for the additional Hindsight feature work in 25757d6 — it helped a lot bringing Hermes closer to parity. @Aldoustheorchestrator and I have been happy to collaborate with you on the Hermes and OpenClaw fixes. We are fully intending to lean on Hindsight hard, and we want the richest retain and recall path possible — strong retain shaping, good tag support, and high-fidelity recall controls. Appreciate the work you've been putting in here. |
nicoloboschi
left a comment
`retain_async` is supported, and with 0.4.22 it must work. The integration now forces the client to install 0.4.22.
|
@nicoloboschi I removed |
|
@nicoloboschi I added a similar stable document_id + predictable turn counts in this PR, just like in vectorize-io/hindsight#953. Just letting you know because I noticed #6654 |
|
Thanks for the work here — the metadata enrichment (platform/user/chat/thread identifiers threaded into retain payloads) is a clear win and something I'd like to see land. But the document-id restructure is a non-starter:
Could you reduce the scope of this PR to only the metadata enrichment? Specifically:
The resume-overwrite bug you're working around is being tracked separately in #6602 / #6654 / #6672 — let's keep those changes independent. |
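A minimal sketch of what threading those identifiers into a retain payload might look like, kept separate from any document-id logic. The function name `build_retain_metadata` and the key names are hypothetical; the PR's actual field names may differ.

```python
from typing import Dict, Optional


def build_retain_metadata(
    platform: Optional[str] = None,
    user_id: Optional[str] = None,
    chat_id: Optional[str] = None,
    thread_id: Optional[str] = None,
) -> Dict[str, str]:
    """Collect provenance identifiers into a flat string-to-string dict.

    Hypothetical helper: identifiers that are missing or empty are left
    out entirely, so the payload never carries placeholder values.
    """
    candidates = {
        "platform": platform,
        "user_id": user_id,
        "chat_id": chat_id,
        "thread_id": thread_id,
    }
    return {
        key: str(value)
        for key, value in candidates.items()
        if value is not None and str(value) != ""
    }


# Example: a gateway session with no thread identifier.
print(build_retain_metadata(platform="telegram", user_id=42, chat_id="c-1"))
```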
|
Ship only rendered_content (with prefix) in turn message dicts, not both content and rendered_content. Hindsight indexes both fields, so sending the raw content alongside rendered_content doubled the indexed data for no retrieval benefit.
…ndered_content Replace the two-field pattern (content + rendered_content) with a single content field that includes the prefix inline. Removes speaker_label and rendered_content from turn message dicts entirely. This is simpler for Hindsight to index (one field, not two carrying the same text) and avoids inventing non-standard fields that need review justification.
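The single-field pattern from this commit message can be sketched as follows. This is an illustration under assumptions: the `role`/`content` dict shape, the `"Prefix: text"` format, and the default prefix values are guesses, not the PR's actual code.

```python
from typing import Dict


def build_turn_message(
    role: str,
    text: str,
    user_prefix: str = "User",
    assistant_prefix: str = "Assistant",
) -> Dict[str, str]:
    """Build one turn message with the speaker prefix folded into `content`.

    Hypothetical helper: only a single text field is emitted, so the same
    text is never sent twice under two different keys.
    """
    prefix = user_prefix if role == "user" else assistant_prefix
    return {"role": role, "content": f"{prefix}: {text}"}


# Example: one user turn and one assistant turn.
print(build_turn_message("user", "What did we decide last week?"))
print(build_turn_message("assistant", "We agreed to ship the metadata change."))
```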
|
After the closing of Hindsight 953, and assuming that a single long-lived, appended document per session is the long-term vision, I scratched my previous comment. None of the document lifecycle changes are in this PR anymore, however; it is strictly tag/metadata focused, for retain purposes only. It might still benefit from the Anthropic JSON pattern adjustment if that is where Hindsight's retain structure is headed long-term. |
Summary
`document_id` topology while enriching retain metadata
Why
Hermes' native Hindsight integration was retaining session conversation content with very little provenance. This patch keeps the native provider session-scoped, but makes each retain/update carry enough structure for better recall, filtering, and debugging:
Config added
- `retain_tags`
- `retain_source`
- `retain_user_prefix`
- `retain_assistant_prefix`
Config behavior clarified
- `retain_async` remains supported on the batch retain path
- `retain_context` still labels retained conversation content
- `retain_user_prefix` / `retain_assistant_prefix` shape the auto-retained transcript content
Test plan
- `python3 -m pytest tests/plugins/memory/test_hindsight_provider.py tests/agent/test_memory_user_id.py tests/gateway/test_session_env.py -q`
- `python3 -m py_compile run_agent.py gateway/run.py plugins/memory/hindsight/__init__.py tests/plugins/memory/test_hindsight_provider.py tests/agent/test_memory_user_id.py tests/gateway/test_session_env.py`
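As a rough illustration of how the config keys listed under "Config added" and "Config behavior clarified" might fit together, the sketch below resolves a user config against defaults. The key names come from the description above; every default value, the `resolve_retain_config` helper, and the rejection of unknown keys are assumptions made for the example, not the plugin's real behavior.

```python
from typing import Any, Dict

# Assumed defaults for illustration only; the real plugin defaults may differ.
RETAIN_DEFAULTS: Dict[str, Any] = {
    "retain_tags": [],
    "retain_source": "hermes",
    "retain_user_prefix": "User",
    "retain_assistant_prefix": "Assistant",
    "retain_async": True,
    "retain_context": "conversation",
}


def resolve_retain_config(user_config: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay user-supplied retain settings on the assumed defaults.

    Hypothetical helper: unknown keys raise early so a typo such as
    `retain_tag` is not silently ignored.
    """
    unknown = set(user_config) - set(RETAIN_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown retain config keys: {sorted(unknown)}")
    return {**RETAIN_DEFAULTS, **user_config}


# Example: override only the tags and the assistant prefix.
config = resolve_retain_config(
    {"retain_tags": ["consulting"], "retain_assistant_prefix": "Hermes"}
)
print(config)
```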