chleosl/claude-code-relevant-chat-search
# Claude Code Relevant Session Search

A pipeline for searching and analyzing previous Claude Code sessions to find relevant conversations using Haiku subagent delegation.

## Overview

When working on long-term projects with Claude Code, valuable context gets scattered across multiple sessions. This tool enables semantic search across your Claude Code history, leveraging Haiku as a lightweight analyzer to find and summarize relevant past conversations.

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                         Main Model                              │
│                    (Sonnet/Opus)                                │
└─────────────────────┬───────────────────────────────────────────┘
                      │ Task tool invocation
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│              relevant-search Subagent                           │
│                                                                 │
│  1. Collect session file list (Bash: find)                      │
│  2. For each session:                                           │
│     └─→ Bash: python session_analyzer.py                        │
│         └─→ jq: extract .message.content[].text/thinking        │
│         └─→ Haiku API: analyze relevance                        │
│         └─→ stdout: session report                              │
│  3. Accumulate results                                          │
│  4. Stop-point check (every N sessions)                         │
│  5. Return consolidated report to main model                    │
└─────────────────────────────────────────────────────────────────┘
```

## Why This Design?

**Problem:** Subagents cannot spawn other subagents in Claude Code.

**Solution:** Delegate per-session analysis to an external Python script that calls the Haiku API independently. Each script execution is a separate process with isolated context, preventing token accumulation across sessions.
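From the subagent's side, the isolation boundary amounts to one subprocess per session whose only shared state is the single JSON line it prints. A minimal sketch of that loop, assuming the analyzer path from the Installation section; `build_analyzer_command`, `parse_report`, and `analyze_sessions` are illustrative helper names, not part of this repo:

```python
import json
import subprocess

# Path assumed from the Installation section
ANALYZER = "~/.claude/scripts/session_analyzer.py"

def build_analyzer_command(session_path: str, query: str) -> list[str]:
    """One argv per session: a fresh process, so no token state leaks between runs."""
    return ["python", ANALYZER, "--session", session_path, "--query", query]

def parse_report(stdout: str) -> dict:
    """The analyzer prints exactly one JSON object on stdout."""
    return json.loads(stdout.strip())

def analyze_sessions(session_paths: list[str], query: str) -> list[dict]:
    """Run the analyzer once per session and accumulate the per-session reports."""
    reports = []
    for path in session_paths:
        proc = subprocess.run(
            build_analyzer_command(path, query),
            capture_output=True, text=True
        )
        if proc.returncode == 0 and proc.stdout.strip():
            reports.append(parse_report(proc.stdout))
    return reports
```

Because each session is a separate `subprocess.run`, a crash or oversized transcript in one session cannot poison the analysis of the next.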

## Components

### 1. Subagent Definition

`~/.claude/agents/relevant-search.md`:

````markdown
---
name: relevant-search
description: Search previous Claude Code sessions for relevant conversations. Use when the user mentions "previous session", "we discussed before", "find where we talked about", etc.
tools: Bash, Read, Glob, Grep
---

# Relevant Session Search Agent

## Role
Search and analyze previous Claude Code sessions to find conversations relevant to the user's query.

## Process

1. Receive search query from main model
2. Determine search scope:
   - Directory: current project or all projects
   - Date range: default last 30 days
   - Max sessions: default 20
3. Collect target session files:
   ```bash
   find ~/.claude/projects/[PROJECT_DIR] -name "*.jsonl" -mtime -30 -type f | sort -r
   ```
4. For each session, invoke the analyzer:
   ```bash
   python ~/.claude/scripts/session_analyzer.py \
     --session "[SESSION_PATH]" \
     --query "[SEARCH_QUERY]"
   ```
5. Accumulate results in memory
6. Stop-point check (every 5 sessions):
   - If sufficient relevant content found → prepare final report
   - If insufficient → continue iteration
   - If scope exhausted → suggest expanding scope or conclude
7. Return consolidated report to main model

## Output Format

For each relevant session found:
- Session ID and timestamp
- Session context (what the session was about overall)
- Relevant content summary
- Original quotes

## Scope Control

The user can specify:
- `--dir [PROJECT_PATH]`: specific project directory
- `--days [N]`: date range in days
- `--max [N]`: maximum sessions to analyze
- `--exhaustive`: search all available sessions
````
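The stop-point rule in steps 5–6 can be sketched as a pure function over the accumulated reports. This is a hypothetical formalization of the prose above, not code from the repo; the `checkpoint` and `enough` thresholds are illustrative defaults:

```python
def should_stop(reports: list[dict], scope_exhausted: bool,
                checkpoint: int = 5, enough: int = 3) -> str:
    """Decide at each checkpoint whether to report, continue, or expand scope.

    Returns "report", "continue", or "expand".
    """
    if scope_exhausted:
        # Out of sessions: report whatever was found, or suggest widening scope
        return "report" if any(r.get("relevant") for r in reports) else "expand"
    if len(reports) % checkpoint != 0 or not reports:
        return "continue"  # only decide at checkpoint boundaries
    relevant = sum(1 for r in reports if r.get("relevant"))
    return "report" if relevant >= enough else "continue"
```

Checking only every `checkpoint` sessions keeps the subagent from re-evaluating the whole result set after every single analyzer call.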

### 2. Session Analyzer Script

`~/.claude/scripts/session_analyzer.py`:

```python
#!/usr/bin/env python3
"""
Analyze a single Claude Code session for relevance to a search query.
Calls the Haiku API for semantic analysis.
"""

import argparse
import json
import subprocess
import os
from anthropic import Anthropic

def extract_session_content(session_path: str) -> str:
    """Extract text and thinking content from session JSONL using jq."""
    jq_filter = '''
    .message.content[]? |
    select(.type == "text" or .type == "thinking") |
    .text // .thinking
    '''

    result = subprocess.run(
        ['jq', '-r', jq_filter, session_path],
        capture_output=True,
        text=True
    )
    if result.returncode != 0:
        # jq failed (missing file, malformed JSONL); treat as an empty session
        return ""
    return result.stdout

def analyze_with_haiku(content: str, query: str) -> dict:
    """Send content to Haiku for relevance analysis."""

    client = Anthropic()

    # Truncate the transcript to fit Haiku's context window
    prompt = f"""Analyze this Claude Code session transcript and determine if it contains content relevant to the following search query.

<search_query>
{query}
</search_query>

<session_transcript>
{content[:150000]}
</session_transcript>

If relevant content exists, respond in this JSON format:
{{
    "relevant": true,
    "session_context": "Brief description of what this session was about overall",
    "relevant_summary": "Summary of content related to the search query",
    "quotes": ["Direct quote 1", "Direct quote 2"]
}}

If no relevant content exists:
{{
    "relevant": false,
    "session_context": "Brief description of what this session was about overall"
}}

Respond with JSON only, no other text."""

    response = client.messages.create(
        model="claude-haiku-4-5",  # alias; pin a dated snapshot for reproducibility
        max_tokens=2000,
        messages=[{"role": "user", "content": prompt}]
    )

    try:
        return json.loads(response.content[0].text)
    except json.JSONDecodeError:
        return {"relevant": False, "error": "Failed to parse response"}

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--session', required=True, help='Path to session JSONL file')
    parser.add_argument('--query', required=True, help='Search query')
    args = parser.parse_args()

    session_id = os.path.basename(args.session).replace('.jsonl', '')

    # Extract content
    content = extract_session_content(args.session)

    if not content.strip():
        print(json.dumps({"session_id": session_id, "relevant": False, "error": "Empty session"}))
        return

    # Analyze with Haiku
    result = analyze_with_haiku(content, args.query)
    result["session_id"] = session_id

    # Attach the file's mtime so callers can sort reports chronologically
    result["timestamp"] = os.stat(args.session).st_mtime

    print(json.dumps(result))

if __name__ == "__main__":
    main()
```

### 3. Installation

```bash
# Create directories
mkdir -p ~/.claude/agents
mkdir -p ~/.claude/scripts

# Copy agent definition
cp relevant-search.md ~/.claude/agents/

# Copy and make the script executable
cp session_analyzer.py ~/.claude/scripts/
chmod +x ~/.claude/scripts/session_analyzer.py

# Install dependencies
pip install anthropic

# Ensure jq is installed
# Ubuntu/Debian: sudo apt install jq
# macOS: brew install jq

# Set up authentication (choose one)
export ANTHROPIC_API_KEY="your-api-key"
# OR use an OAuth token from a Claude Code subscription:
# claude setup-token
# export CLAUDE_CODE_OAUTH_TOKEN="your-oauth-token"
```

## Usage

### Automatic Invocation

The subagent is automatically invoked when you mention past sessions:

```
You: "Find where we discussed the coordinate transformation methods"
Claude: [Invokes relevant-search subagent]
        [Returns consolidated report of matching sessions]
```

### Explicit Invocation

```
You: "Use the relevant-search agent to find discussions about Firebase authentication from the last 60 days"
```

### With Scope Control

```
You: "Search my trimfh project sessions from November for layout editor discussions"
```

## Session JSONL Structure

Claude Code stores sessions in `~/.claude/projects/[encoded-path]/[session-id].jsonl`.

Each line is a JSON object with:

```jsonc
{
  "type": "user" | "assistant",
  "sessionId": "uuid",
  "timestamp": "ISO-8601",
  "message": {
    "content": [
      {"type": "text", "text": "..."},
      {"type": "thinking", "thinking": "..."},  // CoT
      {"type": "tool_use", "name": "...", "input": {...}}
    ]
  }
}
```

The analyzer extracts the `text` and `thinking` fields for semantic search.
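The jq filter in the analyzer has a straightforward pure-Python equivalent, useful where jq is unavailable. A sketch, not the repo's code; `extract_line` is a hypothetical helper operating on a single JSONL record:

```python
import json

def extract_line(jsonl_line: str) -> list[str]:
    """Python equivalent of the jq filter: pull text/thinking blocks out of
    one session JSONL record, skipping tool_use blocks and malformed lines."""
    try:
        record = json.loads(jsonl_line)
    except json.JSONDecodeError:
        return []
    blocks = (record.get("message") or {}).get("content") or []
    if not isinstance(blocks, list):
        return []  # some records may store content as a plain string
    return [
        b.get("text") or b.get("thinking") or ""
        for b in blocks
        if isinstance(b, dict) and b.get("type") in ("text", "thinking")
    ]
```

The `b.get("text") or b.get("thinking")` fallback mirrors jq's `.text // .thinking` alternative operator.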

## Limitations

- **No nested subagents**: Claude Code subagents cannot spawn other subagents. This pipeline works around the restriction by using Bash to call external scripts.
- **Context window**: Each session's extracted content must fit in Haiku's context window (~200K tokens). Very long sessions are truncated.
- **API costs**: Each session analysis requires one Haiku API call.
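The 150,000-character cutoff in the analyzer relates to the context limit via the rough ~4 characters-per-token heuristic (150K chars ≈ 37K tokens, well inside a ~200K-token window). A hypothetical sketch of that budget check; the helper names are not from the repo:

```python
def fits_haiku_context(content: str, max_chars: int = 150_000) -> bool:
    """Rough check via the ~4 chars/token rule of thumb: 150K characters
    is roughly 37K tokens, comfortably inside a ~200K-token window."""
    return len(content) <= max_chars

def truncate_for_context(content: str, max_chars: int = 150_000) -> str:
    """Keep the head of the transcript, matching the analyzer's content[:150000]."""
    return content[:max_chars]
```

Truncating the head keeps early session context but silently drops the tail of very long sessions, which is the trade-off the limitation above describes.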

## Future Improvements

- Pre-indexing with embeddings for faster initial filtering
- Caching analyzed sessions to avoid re-processing
- Support for searching across multiple projects simultaneously
- Integration with `/resume` for one-click session continuation

## License

MIT
