fix(sessions): guard syncFromHermes vs SessionDeleter race + import all sources on startup #354
Open
29206394 wants to merge 1 commit into EKKOLearnAI:main from
…ll sources on startup (EKKOLearnAI#352)

Two related bugs in the session pipeline introduced in v0.5.3.

Bug 1: SessionDeleter race condition (issue EKKOLearnAI#352)

chat-run-socket.syncFromHermes asynchronously reads a Hermes session and writes its assistant/tool messages into the local mirror. It then enqueues the Hermes session for deletion via gc_pending_session_deletes. SessionDeleter drains that queue on a 5-minute timer (and on profile switch + immediately on start), with no coordination against in-flight reads. A drain that fires while a sync is mid-flight deletes the Hermes session out from under the reader, and the assistant/tool messages never make it into the local DB.

Visible symptom: switching away and back to a session shows assistant replies vanishing; message_count decrements over time (6 -> 5 -> 4 ...).

Fix: introduce an in-memory `inFlightHermesSessionIds` set. syncFromHermes marks the id BEFORE awaiting the read, releases it after the read completes (success, failure, and the empty-detail early return are all handled). drain() filters out any candidate that is currently in the set, so periodic / profile-switch ticks defer deletion until the read is done.

Important ordering: enqueueEphemeralDelete is called BEFORE clearing the guard, so drain never observes the pending row without the guard active.

Bug 2: Startup sync silently drops CLI sessions

syncProfileSessions hardcoded `'api_server'` as the source filter on listHermesSessionSummaries, so first-startup sync only imported chat sessions originated by the WebUI itself. CLI conversations (source='cli'), Telegram, Discord, Slack, etc. were never synced into the local DB and stayed invisible in the sidebar forever (sync only runs when the local DB is empty). As a side effect, long CLI conversations that had been compressed into a parent_session_id chain also weren't aggregated by sync, since they weren't synced at all.

Fix: pass `undefined` to listSessionSummaries so it returns roots from ALL sources. That function already filters out tool / compress_* sessions and aggregates parent_session_id chains, so no other change is needed.

Verified locally: clean re-sync against a Hermes state.db with mixed api_server + cli sessions now imports 11 chains across 4 profiles (previously: 0 from this code path), correctly aggregating multi-step chains (thread_session_count up to 5) into single sidebar entries.
albert748 added a commit to albert748/hermes-web-ui that referenced this pull request May 1, 2026

…eleter race guard

## Root cause

The session-sync service introduced in PR EKKOLearnAI#294 has four defects:

1. Hardcoded source='api_server' filter: only api_server sessions imported
2. One-shot sync gate (count > 0): permanently skipped after first run
3. Hardcoded source label in createSession: source info destroyed
4. addMessage() failed on NULL content from tool-call/system messages

Additionally, SessionDeleter race with syncFromHermes (EKKOLearnAI#352).

## Changes

### session-sync.ts: incremental dedup

- Remove count > 0 gate: sync runs on every startup
- Dedup by hermes_id column: query existing ids, skip already-imported
- Pass hermes_id to createSession for future dedup
- Import ALL sources (pass undefined instead of 'api_server')

### schemas.ts + session-store.ts: hermes_id tracking

- Add hermes_id TEXT column (auto-migrated via ensureTable)
- CREATE UNIQUE INDEX for fast dedup lookups
- createSession() accepts optional hermes_id parameter

### SessionDeleter race fix (credit: EKKOLearnAI#354 by moxian)

- Export inFlightHermesSessionIds Set from session-deleter.ts
- chat-run-socket.ts: register/release guard around async reads
- session-deleter.ts: filter out in-flight sessions in drain()

### Sidebar collapse toggle

- Add collapse button for compact icon-rail mode (persisted)

Fixes EKKOLearnAI#322, Fixes EKKOLearnAI#321, Fixes EKKOLearnAI#352
Contributor
Hey @29206394, thanks for the SessionDeleter fix. I've incorporated the `inFlightHermesSessionIds` guard into #373 along with the source filter fix and incremental sync (`hermes_id` dedup column). The incremental sync is the key piece: without it, the source filter fix alone doesn't help existing users who already have a populated DB (the `count > 0` gate blocks re-sync forever). Hope we can consolidate; my PR has all three fixes in one commit. cc @54laowang
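For readers following along, the incremental dedup idea mentioned in this comment can be sketched roughly as below. This is only an illustration under stated assumptions: `LocalSession` and `selectNewHermesIds` are hypothetical names, and the real code in #373 queries the `hermes_id` column in the local DB rather than working on in-memory arrays.

```typescript
// Hypothetical shape of a local session row; only hermes_id matters here.
interface LocalSession {
  id: string;
  hermes_id: string | null; // null for rows created before the column existed
}

// Return the Hermes session ids that have not been imported yet.
// Running this on every startup replaces the one-shot `count > 0` gate.
function selectNewHermesIds(hermesIds: string[], local: LocalSession[]): string[] {
  const known = new Set<string>();
  for (const session of local) {
    if (session.hermes_id !== null) known.add(session.hermes_id);
  }
  return hermesIds.filter((id) => !known.has(id));
}

// A populated DB no longer blocks the import of sessions it has not seen.
const existing: LocalSession[] = [
  { id: "1", hermes_id: "h-a" },
  { id: "2", hermes_id: null },
];
const toImport = selectNewHermesIds(["h-a", "h-b", "h-c"], existing);
// toImport is ["h-b", "h-c"]
```

Because the check is keyed on the Hermes id rather than on whether the table is empty, a second startup with the same data imports nothing, which is the behaviour the unique index is meant to enforce.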
This was referenced May 1, 2026
Closes #352.
Summary
Two related bugs in the v0.5.3 session pipeline:
Bug 1: SessionDeleter race condition (#352)

`chat-run-socket.syncFromHermes` asynchronously reads a Hermes session and writes its assistant/tool messages into the local mirror, then enqueues the Hermes session id into `gc_pending_session_deletes`. `SessionDeleter` drains that queue on a 5-minute timer, on profile switch, and immediately on start, with no coordination against in-flight reads. A drain that fires while a sync is mid-flight deletes the Hermes session out from under the reader, and assistant/tool messages never make it into the local DB.

Visible symptom: switching away and back to a session shows assistant replies vanishing; `message_count` decrements over time (6 → 5 → 4 …).

Fix: introduce an in-memory `inFlightHermesSessionIds: Set<string>` exported from `session-deleter.ts`. `syncFromHermes` adds the id BEFORE awaiting the read and removes it after the read completes, handling all three exit paths (success, failure, and the empty-detail early return). On failure (`.catch()`), the guard is released and nothing is enqueued (the next successful sync will pick it up).

`drain()` filters out any candidate currently in the set, so periodic / profile-switch ticks defer deletion until the read is done.

Bug 2: Startup sync silently drops CLI sessions

`syncProfileSessions` hardcoded `'api_server'` as the source filter on `listHermesSessionSummaries`, so first-startup sync only imported chat sessions originated by the WebUI itself. CLI conversations (`source='cli'`), Telegram, Discord, Slack, etc. were never synced into the local DB and stayed invisible in the sidebar forever (sync only runs when the local DB is empty, so they couldn't be recovered without dropping the DB).

As a side effect, long CLI conversations that had been compressed into a `parent_session_id` chain weren't aggregated either, since they weren't synced at all.

Fix: pass `undefined` to `listSessionSummaries` so it returns roots from ALL sources. That function already filters out `tool` / `compress_*` sessions and aggregates `parent_session_id` chains, so no other change is needed.

Files changed

- `packages/server/src/services/hermes/session-deleter.ts`: export `inFlightHermesSessionIds`, filter eligible rows in `drain()`
- `packages/server/src/services/hermes/chat-run-socket.ts`: register/release guard around `syncFromHermes`
- `packages/server/src/services/hermes/session-sync.ts`: import all sources, not just `api_server`

3 files, +65 / −6 lines.
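To illustrate the Bug 2 change, here is a minimal sketch of an optional source filter, assuming summaries are plain objects. The `SessionSummary` shape and this in-memory `listSessionSummaries` are hypothetical stand-ins, not the project's function. The sketch also assumes `tool` and `compress_*` are values of `source`, and it omits chain aggregation.

```typescript
// Hypothetical summary row; the real function reads from the Hermes state.db.
interface SessionSummary {
  id: string;
  source: string;
  parent_session_id: string | null;
}

// Return root sessions, optionally restricted to one source.
// `source === undefined` means "all sources", which is what the fix relies on.
function listSessionSummaries(rows: SessionSummary[], source?: string): SessionSummary[] {
  return rows.filter(
    (s) =>
      s.parent_session_id === null && // roots only
      s.source !== "tool" && // assumption: tool sessions are tagged via `source`
      !s.source.startsWith("compress_") && // assumption: same for compress_* sessions
      (source === undefined || s.source === source),
  );
}

const rows: SessionSummary[] = [
  { id: "a", source: "api_server", parent_session_id: null },
  { id: "b", source: "cli", parent_session_id: null },
  { id: "c", source: "cli", parent_session_id: "b" }, // child in a compressed chain
  { id: "d", source: "telegram", parent_session_id: null },
  { id: "e", source: "tool", parent_session_id: null },
];

// Old behaviour: only WebUI-originated sessions are imported.
const before = listSessionSummaries(rows, "api_server").map((s) => s.id); // ["a"]
// New behaviour: roots from every source are imported.
const after = listSessionSummaries(rows, undefined).map((s) => s.id); // ["a", "b", "d"]
```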
Test plan
- Clean re-sync against a Hermes `state.db` containing mixed `api_server` + `cli` sessions across 4 profiles → imported 11 chains (previously 0 from this code path), correctly aggregating multi-step chains (max `thread_session_count` = 5) into single sidebar entries.
- `npm run build` passes (vue-tsc + tsc + esbuild).
- `[session-sync] sync complete: synced=11, errors=0` in logs.

Notes

- `syncFromHermes` only registers in this process, so a Set is sufficient and avoids schema migration.
- The ordering (enqueue → then release) is load-bearing: reversing it reintroduces a small window where drain could miss the guard. This is documented in the code with a `// IMPORTANT:` comment referencing the issue.
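The guard and its ordering can be sketched as follows. This is a simplified illustration, not the code in this PR: the Maps and Sets stand in for Hermes, the local mirror, and `gc_pending_session_deletes`, the `ReadDetail` type is hypothetical, and the empty-detail path is assumed to release the guard without enqueueing.

```typescript
// Hypothetical in-memory stand-ins; the real code talks to Hermes and a local DB.
const inFlightHermesSessionIds = new Set<string>();
const pendingDeletes = new Set<string>(); // stands in for gc_pending_session_deletes
const hermesSessions = new Map<string, string[]>(); // session id -> messages
const localMirror = new Map<string, string[]>();

type ReadDetail = (id: string) => Promise<string[] | undefined>;

// Delete every queued session that no sync is currently reading.
function drain(): string[] {
  const deleted: string[] = [];
  for (const id of Array.from(pendingDeletes)) {
    if (inFlightHermesSessionIds.has(id)) continue; // defer to a later tick
    hermesSessions.delete(id);
    pendingDeletes.delete(id);
    deleted.push(id);
  }
  return deleted;
}

async function syncFromHermes(id: string, readDetail: ReadDetail): Promise<void> {
  inFlightHermesSessionIds.add(id); // mark BEFORE awaiting the read
  try {
    const messages = await readDetail(id);
    // Assumption: an empty detail releases the guard without enqueueing.
    if (messages === undefined || messages.length === 0) return;
    localMirror.set(id, messages);
    pendingDeletes.add(id); // enqueue while the guard is still held
  } catch {
    // Read failed: do not enqueue; a later successful sync can retry.
  } finally {
    inFlightHermesSessionIds.delete(id); // release last, on every exit path
  }
}
```

Putting the release in `finally` means the enqueue always happens before the guard is cleared, so `drain()` never sees the pending id without the guard in place.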