Add thoughts timeline: capture, store, search & broadcast Claude thinking blocks#1083
thedotmack wants to merge 18 commits into main
Implements ThinkingBlock interface and extractThinkingBlocks() function that reads Claude Code JSONL transcript files and extracts thinking content blocks from assistant messages. Includes 11 unit tests covering edge cases (empty files, malformed JSON, missing timestamps, falsy thinking content). Validated against real transcript files (64 thinking blocks found across 5 files). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
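A minimal sketch of the extraction step described in this commit, working on JSONL text rather than a file path so it stays self-contained. The entry layout (`type: 'assistant'`, `message.content[]`, `timestamp`), the `ThinkingBlock` fields, and the name `extractThinkingBlocksFromText` are assumptions for illustration; the PR's actual `extractThinkingBlocks()` may differ.

```typescript
// Hypothetical shape; the PR's real ThinkingBlock interface may differ.
interface ThinkingBlock {
  thinking: string;
  timestamp: string | null;
  messageIndex: number; // assumed: position among assistant messages
}

// Parses JSONL text (one JSON object per line) and collects thinking blocks
// from assistant messages. Malformed lines are skipped rather than thrown.
function extractThinkingBlocksFromText(jsonl: string): ThinkingBlock[] {
  const blocks: ThinkingBlock[] = [];
  let messageIndex = 0;

  for (const line of jsonl.split('\n')) {
    if (line.trim() === '') continue; // empty files and blank lines yield nothing

    let entry: any;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // malformed JSON: skip this line
    }

    // Assumed layout: { type: 'assistant', timestamp, message: { content: [...] } }
    if (entry?.type !== 'assistant') continue;

    const content = entry.message?.content;
    if (Array.isArray(content)) {
      for (const part of content) {
        // Falsy thinking content (empty string, null) is ignored.
        if (part?.type === 'thinking' && typeof part.thinking === 'string' && part.thinking) {
          blocks.push({
            thinking: part.thinking,
            timestamp: typeof entry.timestamp === 'string' ? entry.timestamp : null,
            messageIndex,
          });
        }
      }
    }
    messageIndex++;
  }

  return blocks;
}
```

A file-based wrapper would read the transcript and pass its contents to this function; keeping the parsing pure makes the edge cases listed above (malformed JSON, missing timestamps, falsy content) easy to unit test.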
Adds migration for the `thoughts` table (stores extracted thinking blocks from Claude Code session transcripts) with full-text search via FTS5 and sync triggers. Migration added to both legacy migrations.ts (v8) and active MigrationRunner (v21). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
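For readers unfamiliar with the FTS5 sync-trigger pattern, the migration could look roughly like the SQL below, held in a TypeScript constant. This is a sketch under assumptions: only `thoughts`, `thoughts_fts`, `thinking_text`, `thinking_summary` and `message_index` appear in this PR; the remaining columns, the trigger names, and the constant name are hypothetical.

```typescript
// Sketch only: columns other than thinking_text, thinking_summary and
// message_index are assumptions, as are the trigger names.
const CREATE_THOUGHTS_SQL = `
CREATE TABLE IF NOT EXISTS thoughts (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  memory_session_id TEXT NOT NULL,
  project TEXT,
  thinking_text TEXT NOT NULL,
  thinking_summary TEXT,
  message_index INTEGER,
  created_at TEXT NOT NULL
);

-- External-content FTS5 index: holds the search index, reads row text from thoughts.
CREATE VIRTUAL TABLE IF NOT EXISTS thoughts_fts USING fts5(
  thinking_text,
  content='thoughts',
  content_rowid='id'
);

-- Keep the index in sync with the base table.
CREATE TRIGGER IF NOT EXISTS thoughts_ai AFTER INSERT ON thoughts BEGIN
  INSERT INTO thoughts_fts(rowid, thinking_text) VALUES (new.id, new.thinking_text);
END;

CREATE TRIGGER IF NOT EXISTS thoughts_ad AFTER DELETE ON thoughts BEGIN
  INSERT INTO thoughts_fts(thoughts_fts, rowid, thinking_text)
  VALUES ('delete', old.id, old.thinking_text);
END;

CREATE TRIGGER IF NOT EXISTS thoughts_au AFTER UPDATE ON thoughts BEGIN
  INSERT INTO thoughts_fts(thoughts_fts, rowid, thinking_text)
  VALUES ('delete', old.id, old.thinking_text);
  INSERT INTO thoughts_fts(rowid, thinking_text) VALUES (new.id, new.thinking_text);
END;
`;
```

With an external-content table, deletes and updates must issue the special `'delete'` insert with the old values, which is why the update trigger contains two statements.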
Added ThoughtInput/Thought types and four SessionStore methods for thinking block persistence: storeThoughts, getThoughts, getThoughtsByIds, and searchThoughts (FTS5). Includes 23 passing tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Creates handleThoughtsExtraction() that extracts thinking blocks from transcripts and stores them via the worker API POST /api/sessions/thoughts. Includes comprehensive test suite with 5 tests covering edge cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire thoughts extraction into the CLI hook dispatch chain:
- Create thoughts-extract EventHandler wrapping handleThoughtsExtraction
- Register as event type in handlers/index.ts (between summarize and session-complete)
- Add POST /api/sessions/thoughts endpoint in SessionRoutes for storing thinking blocks
- Endpoint resolves memorySessionId/project from contentSessionId when not provided
- Add 7 tests covering missing input, success, worker unavailable, and error cases
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire the thoughts-extract handler into the Stop hook chain in hooks.json, positioned after summarize and before session-complete, to capture thinking blocks when a Claude Code session ends. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create dedicated ThoughtsRoutes extending BaseRouteHandler with three endpoints: POST /api/thoughts for storing thinking blocks, GET /api/thoughts for project-scoped retrieval with time filtering, and GET /api/thoughts/search for FTS5 full-text search. Register in worker service background init after database initialization. Includes 18 integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…beddings Adds single-thought Chroma sync following the syncUserPrompt pattern. Creates documents with thought_id prefix and thinking_text as searchable content. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ht vector embeddings Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… search integration After storing thoughts, the POST /api/thoughts endpoint now retrieves full thought records and fires async chromaSync.syncThoughts(). Chroma failures are caught and logged as warnings without blocking storage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
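The non-blocking behaviour described in this commit can be sketched as a small fire-and-forget helper. The `ThoughtSync` and `WarnLogger` interfaces and the helper name are hypothetical stand-ins, not the worker's real `chromaSync` or logger types.

```typescript
type Thought = { id: number; thinking_text: string };

// Hypothetical stand-ins for the worker's Chroma sync and logger.
interface ThoughtSync {
  syncThoughts(thoughts: Thought[]): Promise<void>;
}
interface WarnLogger {
  warn(message: string, details?: unknown): void;
}

// Starts the sync without awaiting it, so the HTTP response is not delayed.
// Any failure (rejection or synchronous throw) is downgraded to a warning
// and never reaches the caller.
function syncThoughtsInBackground(
  sync: ThoughtSync,
  logger: WarnLogger,
  thoughts: Thought[],
): void {
  void Promise.resolve()
    .then(() => sync.syncThoughts(thoughts))
    .catch((error: unknown) => {
      logger.warn('Chroma sync failed; thoughts remain stored in SQLite', {
        error: error instanceof Error ? error.message : String(error),
      });
    });
}
```

Wrapping the call in `Promise.resolve().then(...)` means a synchronous throw inside `syncThoughts` is handled the same way as a rejected promise, so storage is never affected by vector-index problems.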
… search completeness Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eploy full Chroma integration The createThoughtsTable migration existed only in MigrationRunner (version 21) but SessionStore—which the worker service actually uses—had no call to it. Additionally, version 21 was already used by addOnUpdateCascadeToForeignKeys in SessionStore, causing MigrationRunner's version check to skip table creation entirely. Fix: Added createThoughtsTable() to SessionStore with migration version 22 and table-existence check for robustness. Updated MigrationRunner to also use version 22. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
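The failure mode in this commit is easier to see with a toy model of the migration check. The `SchemaState` type and both function names below are hypothetical; they only illustrate why a reused version number skips table creation and why the added table-existence check helps.

```typescript
// In-memory stand-in for the state a migration runner consults.
interface SchemaState {
  appliedVersions: Set<number>;
  tables: Set<string>;
}

// Version-only check: a version number already recorded by a different
// migration makes this one look applied.
function isPendingByVersion(state: SchemaState, version: number): boolean {
  return !state.appliedVersions.has(version);
}

// Version check plus table-existence check: the migration still runs when
// the version is recorded but the table is missing.
function isPendingRobust(state: SchemaState, version: number, table: string): boolean {
  return !state.appliedVersions.has(version) || !state.tables.has(table);
}

// Version 21 was already recorded by another migration; thoughts does not exist yet.
const state: SchemaState = {
  appliedVersions: new Set([20, 21]),
  tables: new Set(['sdk_sessions']),
};

const skippedWithReusedVersion = !isPendingByVersion(state, 21); // the bug: creation is skipped
const runsWithNewVersion = isPendingByVersion(state, 22);         // the fix: unused version
const runsWithTableCheck = isPendingRobust(state, 21, 'thoughts'); // the extra safety net
```

All three constants evaluate to `true`: the reused version hides the missing table, while either a fresh version number or the existence check causes the table to be created.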
… all strategies Adds thought document support to search types (ThoughtSearchResult, ChromaDocType), ResultFormatter (formatting, combining, counting), ChromaSearchStrategy (where filter, categorization, hydration), and all supporting strategies. MCP search automatically includes thoughts via Worker HTTP API proxy. 119 bun tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…notifications Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…T endpoint After storing thoughts and syncing to Chroma, the POST /api/thoughts endpoint now broadcasts a thought_stored SSE event for each stored thought, enabling real-time UI updates when new thinking blocks arrive. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
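As an illustration of the broadcast step, the sketch below builds a `thought_stored` event with a 200-character preview (the preview length shown in the sequence diagram) and serializes it as a server-sent-events frame. The payload fields and function names are assumptions; the actual `broadcastThoughtStored()` signature is not shown in this PR page.

```typescript
interface StoredThought {
  id: number;
  project: string;
  thinking_text: string;
  created_at: string;
}

// Hypothetical event shape; only the 'thought_stored' name and the
// 200-character preview come from the PR.
interface ThoughtStoredEvent {
  type: 'thought_stored';
  thought: { id: number; project: string; preview: string; created_at: string };
}

const PREVIEW_LENGTH = 200;

// Builds the event, truncating the thinking text to a short preview so that
// large thinking blocks are not pushed to every connected client.
function toThoughtStoredEvent(thought: StoredThought): ThoughtStoredEvent {
  return {
    type: 'thought_stored',
    thought: {
      id: thought.id,
      project: thought.project,
      preview: thought.thinking_text.slice(0, PREVIEW_LENGTH),
      created_at: thought.created_at,
    },
  };
}

// Serializes an event in the text/event-stream wire format:
// a single "data:" line followed by a blank line.
function toSseFrame(event: ThoughtStoredEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}
```

Because `JSON.stringify` escapes newlines inside strings, the payload always fits on one `data:` line, which keeps the frame valid even when the thinking text itself spans multiple lines.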
…egration Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…endpoint The production stop hook calls /api/sessions/thoughts (SessionRoutes), not /api/thoughts (ThoughtsRoutes). SSE broadcasting and Chroma vector sync were only wired into ThoughtsRoutes but not SessionRoutes, creating a gap where thoughts stored via the actual production path would not trigger real-time UI updates or vector indexing. Adds broadcastThoughtStored and chromaSync.syncThoughts calls to SessionRoutes.handleStoreThoughts, matching the pattern in ThoughtsRoutes. Includes 5 new integration tests verifying the fix. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Conflicts:
#   plugin/scripts/mcp-server.cjs
#   plugin/scripts/worker-service.cjs
PR Review: Thoughts Timeline Feature

Great work on this comprehensive feature addition! The implementation is well-structured with excellent test coverage. Below are my findings across code quality, architecture, security, and potential improvements.

✅ Strengths

1. Excellent Test Coverage
2. Clean Architecture
3. Robust Database Design
4. Error Handling

🔍 Issues & Recommendations

Critical: Privacy Tag Stripping Missing

Issue: Thinking blocks bypass the privacy tag stripping system. While observations/summaries use

Location:

Impact: Users marking content as

Recommendation:

    // In src/hooks/handlers/thoughts.ts
    import { stripMemoryTagsFromJson } from '../../utils/tag-stripping.js';

    const thoughts: ThoughtInput[] = blocks.map(block => ({
      thinking_text: stripMemoryTagsFromJson(block.thinking), // Strip privacy tags
      thinking_summary: null,
      message_index: block.messageIndex,
    }));

Alternative: Document that thinking blocks are captured before privacy filtering as they represent Claude's internal reasoning (pre-response). This would require updating the privacy documentation to clarify the scope.

High: Error Handling Inconsistency

Issue: The transcript parser in

Current Code:

    } catch {
      logger.debug('THINKING', 'Skipping malformed transcript line', { lineIndex, transcriptPath });
    }

Recommendation: Add error details to debug logging:

    } catch (error) {
      logger.debug('THINKING', 'Skipping malformed transcript line', {
        lineIndex,
        transcriptPath,
        error: error instanceof Error ? error.message : String(error)
      });
    }

Medium: Race Condition in SessionRoutes.handleStoreThoughts

Issue: The endpoint calls

Location:

Current Flow:

    const sessionDbId = store.createSDKSession(contentSessionId, '', '');
    const dbSession = store.getSessionById(sessionDbId);
    const memorySessionId = providedMemorySessionId || dbSession?.memory_session_id;

Recommendation: Add validation or document that

    const sessionDbId = store.createSDKSession(contentSessionId, '', '');
    const dbSession = store.getSessionById(sessionDbId);
    if (!dbSession) {
      logger.error('SESSION', 'Session not found after creation', { sessionDbId, contentSessionId });
      return this.internalError(res, 'Session initialization failed');
    }

Low: Migration Version Mismatch

Minor Inconsistency: The migration is labeled as version 22 in

Location:

Observation: This appears to be migration #22 in the sequence. Just verify this aligns with

Low: Potential Performance Consideration

Observation: The

Current:

Code Quality: Minor Improvements

📊 Test Coverage Assessment

Excellent coverage across:

Missing coverage (nice-to-haves):

🔒 Security Assessment

Positive:

Attention Needed:

🎯 Recommendations Summary

Must Fix Before Merge

Should Fix Before Merge

Consider for Follow-up PR

✨ Overall Assessment

This is a high-quality implementation with excellent architecture, comprehensive tests, and proper separation of concerns. The FTS5 integration is particularly well done. The primary concern is ensuring privacy controls are consistently applied across all data capture points.

Recommendation: Approve with request for privacy tag handling clarification/implementation.

Great work! 🚀

Greptile Overview

Greptile Summary

This PR implements a complete pipeline for capturing, storing, and searching Claude's internal thinking blocks from session transcripts. The implementation follows established patterns from the codebase for observations and summaries.

Key Changes:

Implementation Quality: The code demonstrates strong consistency with existing patterns - migration structure matches

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

sequenceDiagram
participant Hook as Stop Hook
participant Extract as thoughts-extract Handler
participant Parser as extractThinkingBlocks
participant Transcript as JSONL Transcript
participant Worker as Worker API
participant Store as SessionStore
participant FTS as SQLite FTS5
participant Chroma as ChromaSync
participant SSE as SSE Broadcaster
Hook->>Extract: Execute (Phase 1.5)
Extract->>Parser: extractThinkingBlocks(transcriptPath)
Parser->>Transcript: Read JSONL line-by-line
Transcript-->>Parser: Raw transcript lines
Parser->>Parser: Parse JSON, filter type='thinking'
Parser-->>Extract: ThinkingBlock[] with text & messageIndex
Extract->>Worker: POST /api/sessions/thoughts
Note over Extract,Worker: Body: contentSessionId, thoughts[]
Worker->>Store: createSDKSession(contentSessionId)
Store-->>Worker: Resolve memorySessionId & project
Worker->>Store: storeThoughts(memorySessionId, thoughts)
Store->>FTS: INSERT triggers sync thoughts_fts
Store-->>Worker: thought IDs[]
Worker->>Store: getThoughtsByIds(ids)
Store-->>Worker: Full Thought records
par Async Operations
Worker->>Chroma: syncThoughts(thoughts)
Chroma->>Chroma: Create ChromaDocument[]
Note over Chroma: doc_type='thought', metadata
Chroma-->>Worker: Success (fire-and-forget)
and
Worker->>SSE: broadcastThoughtStored(thought)
SSE->>SSE: Broadcast 'thought_stored' event
Note over SSE: Preview: first 200 chars
end
Worker-->>Extract: { ids: number[] }
Extract-->>Hook: HookResult (continue=true)
Last reviewed commit: 67c7b63

Chriscross475 left a comment
LGTM - Comprehensive thoughts timeline implementation with:
✅ Passing CI (Greptile + claude-review)
✅ Database migration (migration 21 with FTS5)
✅ Complete API layer (storage, retrieval, search)
✅ Vector integration (ChromaSync for semantic search)
✅ Real-time SSE (thought_stored broadcasts)
✅ Test coverage (9 new test files covering migration, storage, routes, SSE, Chroma)
Well-architected feature with clean separation of concerns. No TODO/FIXME markers found.

Summary
- /api/thoughts, /api/sessions/thoughts)
- thought_stored SSE events in real-time to connected viewer clients

Changes

Database (migration 21):
- thoughts table with FTS5 virtual table and sync triggers

Storage layer (SessionStore):
- storeThoughts(), getThoughts(), getThoughtsByIds(), searchThoughts()

Hook integration:
- extractThinkingBlocks) reads JSONL line-by-line

API:
- POST/GET /api/thoughts - store and retrieve
- GET /api/thoughts/search - FTS5 search
- POST /api/sessions/thoughts - session-scoped storage with auto-resolution

Vector search:
- ChromaSync.syncThought()/syncThoughts() with backfill
- ThoughtSearchResult type

SSE:
- broadcastThoughtStored() on SessionEventBroadcaster

Test plan

🤖 Generated with Claude Code