Find code you forgot, by describing what it did.
Dejavu is a semantic code search tool that lets you find code across your projects using natural language descriptions. Instead of remembering filenames, function names, or exact keywords, just describe what the code did:
dejavu "that drag and drop kanban board"
dejavu "CSV parser that grouped by date" --lang python
dejavu "animated sidebar component" --when "last summer"
- Index your code directories -- Dejavu uses tree-sitter AST parsing to extract functions, classes, and methods from 20+ languages
- Embed each code chunk using local vector embeddings via Ollama (no data leaves your machine)
- Search with natural language -- your query is embedded and matched against your code using vector similarity
Everything runs locally. Your code never leaves your machine.
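To make the indexing step concrete, here is a minimal sketch of chunk extraction. It uses Python's built-in ast module as a stand-in for tree-sitter, so it only handles Python source. The CodeChunk type and extract_chunks function are hypothetical names for illustration, not Dejavu's API.

```python
import ast
from dataclasses import dataclass


@dataclass
class CodeChunk:
    """Hypothetical record for one extracted definition."""
    name: str
    kind: str          # "Function" or "Class"
    start_line: int
    end_line: int
    docstring: str


def extract_chunks(source: str) -> list[CodeChunk]:
    """Collect every function and class definition in a Python source string."""
    chunks = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "Function"
        elif isinstance(node, ast.ClassDef):
            kind = "Class"
        else:
            continue
        chunks.append(CodeChunk(
            name=node.name,
            kind=kind,
            start_line=node.lineno,
            end_line=node.end_lineno,
            docstring=ast.get_docstring(node) or "",
        ))
    # ast.walk does not guarantee source order, so sort by position.
    return sorted(chunks, key=lambda c: c.start_line)


SAMPLE = '''
def parse_csv(path):
    """Parse a CSV file and group rows by date."""
    return []

class Board:
    def move(self, card):
        pass
'''

for chunk in extract_chunks(SAMPLE):
    print(chunk.kind, chunk.name, chunk.start_line, chunk.end_line)
```

Each chunk keeps its name, line range, and docstring, which is the kind of metadata the search results display.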
pip install dejavu-code

Requirements:
- Python 3.10+
- Ollama running locally
Pull the embedding model:
ollama pull nomic-embed-code

For large codebases, install the sqlite-vec extension for hardware-accelerated KNN search:
pip install "dejavu-code[vec]"

Without it, Dejavu falls back to numpy-based cosine similarity (works fine for most codebases).
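The cosine-similarity fallback amounts to a brute-force nearest-neighbour scan over the stored vectors. A minimal sketch of that idea, written with the standard library rather than numpy so it runs without extra dependencies; cosine_similarity and top_k are illustrative names, not Dejavu functions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 10) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real embedding vectors have far more dimensions.
index = {
    "parse_csv": [0.9, 0.1, 0.0],
    "render_sidebar": [0.0, 0.2, 0.9],
    "group_by_date": [0.7, 0.6, 0.1],
}
print(top_k([1.0, 0.2, 0.0], index, k=2))
```

A full scan costs one similarity computation per stored chunk, which is why a dedicated KNN index starts to matter as the number of chunks grows.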
# 1. Initialize config
dejavu init
# 2. Edit ~/.dejavu/config.toml to set your code directories
# (defaults: ~/code, ~/projects, ~/dev, ~/src, ~/repos, ~/work)
# 3. Index your code
dejavu index
# 4. Search!
dejavu "that function that parsed CSV files and grouped them by date"

# Index a specific directory
dejavu index ~/projects/my-app

# Filter by language, time, or path
dejavu "auth middleware" --lang python
dejavu "React component with tabs" --when "last summer"
dejavu "deployment script" --path work

dejavu "auth middleware" --json
Returns structured JSON with all result metadata -- useful for piping into other tools or agent workflows.

dejavu "CSV parser" --explain
Shows how each result was scored:
#1 parse_csv (Function) — 87%
/home/user/projects/etl/parsers.py
python | 2025-08-14 | lines 42-78
scores: vector=82.3% keyword_boost=+4.5% combined=87%
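The score line above suggests the combined score is the vector similarity plus a keyword boost. The exact formula is not documented here, so the following is only a sketch under an assumed rule: the boost equals the configured weight times the fraction of query terms found in the chunk's name, signature, or docstring, and the total is capped at 100%. Both function names are hypothetical.

```python
import re


def keyword_boost(query: str, text: str, weight: float = 0.15) -> float:
    """Assumed rule: weight times the fraction of query terms that appear in text."""
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    if not terms:
        return 0.0
    haystack = set(re.findall(r"[a-z0-9]+", text.lower()))
    return weight * len(terms & haystack) / len(terms)


def combined_score(vector_score: float, boost: float) -> float:
    """Add the boost to the vector similarity, capped at 1.0."""
    return min(1.0, vector_score + boost)


boost = keyword_boost("CSV parser", "parse_csv(path) Parse a CSV file")
print(f"boost={boost:.3f} combined={combined_score(0.823, boost):.3f}")
```

Under this rule an exact-word match can lift a result by at most the configured weight, so the vector similarity still dominates the ranking.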
dejavu status

Dejavu includes an MCP server that gives Claude direct access to your code search index. This is the primary way to use Dejavu -- Claude can find code you've written before without you needing to remember where it lives.
Run this from your terminal:
claude mcp add dejavu -- dejavu-mcp

That's it. Claude Code will now have access to the dejavu_search, dejavu_reindex, dejavu_status, and dejavu_forget tools.
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
  "mcpServers": {
    "dejavu": {
      "command": "dejavu-mcp"
    }
  }
}

Once connected, you can ask Claude things like:
- "Search my code for that CSV parser I wrote last year"
- "Find the React component that had the animated sidebar"
- "Look for any auth middleware I wrote in Python"
- "Reindex my projects directory"
| Tool | Description |
|---|---|
| dejavu_search | Search indexed code by natural language description. Supports language filters, temporal hints, and path filters. |
| dejavu_reindex | Index or re-index code directories. Incremental -- only processes modified files. |
| dejavu_status | Show index statistics: repo count, chunk count, languages, and configured paths. |
| dejavu_forget | Remove a repository/directory from the index. Source files are never modified. |
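For readers curious what an MCP client sends when it invokes one of these tools, here is a rough sketch of the request. MCP tool invocations are JSON-RPC 2.0 messages with the method tools/call. The argument names used below ("query", "language") are guesses for illustration; the real parameter names are defined by the server's tool schema.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# The argument names "query" and "language" are hypothetical.
print(build_tool_call(1, "dejavu_search", {"query": "CSV parser", "language": "python"}))
```

In practice Claude builds these messages itself; the sketch only shows the shape of the exchange.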
Config lives at ~/.dejavu/config.toml. Created by dejavu init.
[paths]
roots = ["~/code", "~/projects"]
[index]
db_path = "~/.dejavu/index.db"
max_file_size_kb = 500
[embedding]
provider = "ollama"
model = "nomic-embed-code"
fallback_model = "nomic-embed-text"
batch_size = 32
[embedding.ollama]
base_url = "http://localhost:11434"
[search]
default_limit = 10
keyword_boost = 0.15

| Variable | Description |
|---|---|
| DEJAVU_DB | Override database path |
| OLLAMA_HOST | Override Ollama URL |
Tree-sitter AST parsing (extracts functions, classes, methods):
Python, JavaScript, TypeScript, TSX, Rust, Go, Ruby, Java, Kotlin, C, C++, PHP, Bash, Swift
Sliding-window fallback (indexes file contents in chunks):
SQL, HTML, CSS, SCSS, Svelte, Vue, TOML, YAML, JSON, Protobuf, Lua, Julia, Scala, Zig, Elixir, and more.
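The sliding-window fallback can be pictured as cutting a file into overlapping runs of lines. A minimal sketch follows; the window and overlap sizes are arbitrary assumptions, and sliding_window_chunks is an illustrative name rather than Dejavu's implementation.

```python
def sliding_window_chunks(text: str, window: int = 40, overlap: int = 10) -> list[tuple[int, int, str]]:
    """Split text into overlapping line windows; returns (start_line, end_line, chunk), 1-indexed."""
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    lines = text.splitlines()
    step = window - overlap
    chunks = []
    for start in range(0, len(lines), step):
        piece = lines[start:start + window]
        chunks.append((start + 1, start + len(piece), "\n".join(piece)))
        if start + window >= len(lines):
            break  # the last window already reached the end of the file
    return chunks


sample = "\n".join(f"line {n}" for n in range(1, 11))
for start, end, _ in sliding_window_chunks(sample, window=4, overlap=1):
    print(start, end)
```

The overlap means a statement that straddles a window boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.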
                  ┌──────────────┐
                  │ Claude Code  │
                  │ (MCP client) │
                  └──────┬───────┘
                         │
                  ┌──────▼───────┐
             ┌────┤  server.py   ├──────┐
             │    │ (MCP tools)  │      │
             │    └──────────────┘      │
        ┌────▼─────┐             ┌──────▼──────┐
        │  cli.py  │             │  search.py  │
        │ (Click)  │             │ (pipeline)  │
        └────┬─────┘             └──────┬──────┘
             │                          │
  ┌──────────▼──────────┐      ┌────────▼────────┐
  │     indexer.py      │      │   embedder.py   │
  │   (orchestrator)    │      │ (Ollama client) │
  └───┬─────────────┬───┘   │  └────────┬────────┘
      │             │       │           │
┌─────▼──────┐ ┌────▼─────┐ │  ┌────────▼─────────┐
│discovery.py│ │extractor │ │  │  Ollama (local)  │
│(find repos)│ │(tree-sit)│ │  │ nomic-embed-code │
└────────────┘ └──────────┘ │  └──────────────────┘
                            │
                     ┌──────▼───────┐
                     │    db.py     │
                     │  (SQLite +   │
                     │ sqlite-vec)  │
                     └──────────────┘
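The embedder in the diagram talks to Ollama over its local HTTP API. As a rough sketch of what such a client could look like, the code below prepares a batch request for Ollama's /api/embed endpoint using only the standard library. Which endpoint and options Dejavu actually uses is not stated here, so build_embed_request and embed are hypothetical helpers.

```python
import json
import urllib.request


def build_embed_request(texts: list[str], model: str = "nomic-embed-code",
                        base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Prepare a POST to Ollama's /api/embed endpoint for a batch of texts."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts: list[str], **kwargs) -> list[list[float]]:
    """Send the request and return one vector per input text (needs a running Ollama)."""
    with urllib.request.urlopen(build_embed_request(texts, **kwargs), timeout=60) as response:
        return json.load(response)["embeddings"]


request = build_embed_request(["def parse_csv(path): ..."])
print(request.get_method(), request.full_url)
```

Separating request construction from the network call makes the client testable without a running Ollama instance.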
Search pipeline:
- Parse query -- extract language hints ("in python"), temporal hints ("last summer"), path filters
- Clean & embed -- strip hints from query text, generate vector embedding via Ollama
- Vector search -- KNN lookup in sqlite-vec (or numpy fallback) with filters applied
- Keyword boost -- bonus score for results whose name/signature/docstring match query terms
- Rank & deduplicate -- sort by combined score, remove overlapping chunks from same file
Indexing pipeline:
- Discover -- walk configured root paths, find repos by project markers (.git, package.json, etc.)
- Filter -- skip binary files, node_modules, .gitignore'd paths, files over 500KB
- Extract -- tree-sitter AST parsing pulls out functions, classes, methods with names and docstrings
- Embed -- batch-generate vector embeddings via Ollama's local API
- Store -- write chunks and embeddings to SQLite (incremental: only re-processes modified files)
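The Discover and Filter steps can be approximated with a directory walk. In this sketch the marker and skip lists are assumptions beyond the names mentioned above (.git, package.json, node_modules), .gitignore handling and binary detection are omitted, and both function names are hypothetical.

```python
import os
from pathlib import Path

# Assumed marker and skip lists; only .git, package.json, and node_modules come from the text above.
PROJECT_MARKERS = {".git", "package.json", "pyproject.toml", "Cargo.toml", "go.mod"}
SKIP_DIRS = {"node_modules", ".git", "__pycache__", ".venv"}
MAX_FILE_SIZE = 500 * 1024  # mirrors max_file_size_kb = 500


def discover_repos(root: Path) -> list[Path]:
    """Return directories under root that contain a project marker, without descending into them."""
    repos = []
    for dirpath, dirnames, filenames in os.walk(root):
        if PROJECT_MARKERS & (set(dirnames) | set(filenames)):
            repos.append(Path(dirpath))
            dirnames.clear()  # treat nested projects as part of this repo
        else:
            dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
    return sorted(repos)


def indexable_files(repo: Path) -> list[Path]:
    """List files in a repo, skipping ignored directories and oversized files."""
    found = []
    for dirpath, dirnames, filenames in os.walk(repo):
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_file() and path.stat().st_size <= MAX_FILE_SIZE:
                found.append(path)
    return sorted(found)
```

Calling discover_repos on a configured root and then indexable_files on each result yields the candidate files for the Extract step. Pruning dirnames in place is what stops os.walk from descending into skipped directories.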
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.