Summary
When a session starts, hook_session_start.py calls search_memories (GET /api/v1/memories/search) solely to retrieve pending_messages.
However, this endpoint triggers the full retrieval pipeline (Milvus vector search + ElasticSearch + rerank), even though the query is an empty string and the search results themselves are never used — resulting in unnecessary performance overhead.
The root cause is that the pending_messages retrieval logic is coupled to the retrieve_mem path. The fetch_mem path (GET /api/v1/memories) does not return pending_messages, and no lightweight dedicated endpoint currently exists.
Problem
The current call flow in hook_session_start.py:
1. fetch_recent_memories() → GET /api/v1/memories: fetches recent episodic memories (direct MongoDB query, lightweight)
2. search_memories("", method="hybrid") → GET /api/v1/memories/search: used solely to retrieve pending_messages
Step 2 triggers:
- Milvus vector search
- ElasticSearch keyword search
- Rerank (triggered unconditionally in hybrid mode)
The memories results returned by the search are never used in hook_session_start.py — this is pure waste.
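A minimal sketch of the wasteful flow described above; the function bodies are stand-ins (assumptions), not the real hook or API implementations:

```python
def fetch_recent_memories(user_id):
    # GET /api/v1/memories: direct MongoDB query, cheap
    return [{"memory": "recent episode"}]

def search_memories(query, method):
    # GET /api/v1/memories/search: runs Milvus vector search, ElasticSearch
    # keyword search, and an unconditional rerank in hybrid mode, even when
    # query == "".
    expensive_hits = ["reranked hit"]  # never read by the hook
    return {"memories": expensive_hits,
            "pending_messages": ["unprocessed message"]}

recent = fetch_recent_memories(user_id="u1")
result = search_memories("", method="hybrid")
pending = result["pending_messages"]  # the only field the hook actually uses
# result["memories"] is discarded: the full pipeline ran for nothing
```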
Root Cause
The pending_messages retrieval logic lives inside memory_manager.retrieve_mem() and is only triggered via the search endpoint. The fetch_mem() path does not include this logic, and the FetchMemResponse DTO does not have a pending_messages field.
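The coupling can be sketched as follows; the shapes and bodies are assumptions for illustration, not the real memory_manager code:

```python
def _get_pending_messages(user_id):
    # Stand-in for the real pending-message lookup
    return ["pending msg for " + user_id]

def retrieve_mem(query, user_id):
    # Full pipeline: Milvus + ElasticSearch + rerank (elided here) ...
    memories = ["reranked hit"]
    # ... and only this path attaches pending_messages (RetrieveMemResponse)
    return {"memories": memories,
            "pending_messages": _get_pending_messages(user_id)}

def fetch_mem(user_id):
    # Direct fetch; FetchMemResponse has no pending_messages field
    return {"memories": ["recent episode"]}
```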
TODO
Expose a dedicated pending messages endpoint (P1)
- Add GET /api/v1/messages/pending
- Parameters: user_id, group_id, limit
- Directly invoke _get_pending_messages() or MemoryRequestLogService.get_pending_messages(), without triggering any vector or keyword search
- Return format should be consistent with the existing pending_messages field
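The handler for the proposed endpoint could look like the sketch below. The service here is an in-memory stand-in; only the service name comes from this report, everything else is an assumption:

```python
class MemoryRequestLogService:
    """Stand-in for the real service; implements only the query we need."""
    def __init__(self, log):
        self._log = log

    def get_pending_messages(self, user_id, group_id=None, limit=20):
        msgs = [m for m in self._log
                if m["user_id"] == user_id
                and (group_id is None or m["group_id"] == group_id)]
        return msgs[:limit]

def get_pending_messages_endpoint(service, user_id, group_id=None, limit=20):
    # One direct log query: no Milvus, no ElasticSearch, no rerank.
    msgs = service.get_pending_messages(user_id, group_id, limit)
    # Same shape as the existing pending_messages field in RetrieveMemResponse.
    return {"pending_messages": msgs}
```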
Update hook_session_start.py (P1)
- Replace search_memories("", method="hybrid") with a call to the new GET /api/v1/messages/pending endpoint
Reference
src/agentic_layer/memory_manager.py — retrieve_mem() vs fetch_mem()
src/api_specs/dtos/memory.py — FetchMemResponse (no pending_messages) vs RetrieveMemResponse (has pending_messages)
~/.claude/skills/evermemos/scripts/hook_session_start.py