[BUG] Search Loads ALL Embeddings Into Memory - OOM Risk #140

@EnthusiasticTech

Project

vgrep

Description

The search_similar() function in src/core/db.rs loads all embeddings matching the path prefix into memory before calculating similarity scores. For large codebases, this can cause Out-Of-Memory (OOM) crashes.

Each embedding is ~1.5KB (384 dimensions × 4 bytes), so:

  • 10,000 chunks = ~15 MB
  • 100,000 chunks = ~150 MB
  • 1,000,000 chunks = ~1.5 GB (common in large monorepos)

The entire result set is loaded into a Vec<SearchResult> before any filtering or limiting occurs.
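For context, the per-row scoring step is cheap (O(d) multiplications over a 384-dimension vector); the cost that dominates is materializing every row. Below is a minimal sketch of what a `cosine_similarity` helper like the one called in the evidence presumably computes; this implementation is illustrative, not copied from vgrep:

```rust
/// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
/// Illustrative sketch; not the actual vgrep implementation.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0 // avoid division by zero for degenerate vectors
    } else {
        dot / (norm_a * norm_b)
    }
}

fn main() {
    // Identical vectors score 1.0; orthogonal vectors score 0.0.
    println!("{}", cosine_similarity(&[1.0, 0.0], &[1.0, 0.0]));
    println!("{}", cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]));
}
```

The scoring itself needs only the embedding and the row id; nothing about it requires holding the full `SearchResult` (content, path) for every candidate in memory.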

Affected Files

  • src/core/db.rs (lines 155-199)

Evidence

    pub fn search_similar(
        &self,
        query_embedding: &[f32],
        path_prefix: &Path,
        limit: usize,
    ) -> Result<Vec<SearchResult>> {
        let path_prefix_str = path_prefix.to_string_lossy();
        let like_pattern = format!("{}%", path_prefix_str);

        let mut stmt = self.conn.prepare(
            r"SELECT c.id, c.file_id, f.path, c.content, c.start_line, c.end_line, c.embedding
              FROM chunks c
              JOIN files f ON c.file_id = f.id
              WHERE f.path LIKE ?",  // No LIMIT here!
        )?;

        let mut results: Vec<SearchResult> = stmt
            .query_map([&like_pattern], |row| {
                let embedding_blob: Vec<u8> = row.get(6)?;  // Each ~1.5KB
                let embedding = bytes_to_embedding(&embedding_blob);
                let similarity = cosine_similarity(query_embedding, &embedding);

                Ok(SearchResult {
                    chunk_id: row.get(0)?,
                    file_id: row.get(1)?,
                    path: PathBuf::from(row.get::<_, String>(2)?),
                    content: row.get(3)?,  // Full content also loaded
                    start_line: row.get(4)?,
                    end_line: row.get(5)?,
                    similarity,
                })
            })?
            .filter_map(Result::ok)  // ALL results collected here
            .collect();  // <-- OOM can happen here!

        // Sort and truncate AFTER loading everything
        results.sort_by(...);
        results.truncate(limit * 3);
        Ok(results)
    }
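One way to bound memory is to keep only the current top-K candidates in a fixed-size min-heap while streaming rows, instead of collecting every `SearchResult`. The sketch below uses assumed names (`Scored`, `top_k` are illustrative, not from the repo); wiring it into the `query_map` iterator and fetching full rows for the surviving ids would be a follow-up step:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// A scored hit kept in the heap: just the score and a row id,
/// a few bytes per entry instead of a full ~2.2 KB SearchResult.
#[derive(PartialEq)]
struct Scored {
    similarity: f32,
    chunk_id: i64,
}

impl Eq for Scored {}

impl PartialOrd for Scored {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Scored {
    // Reversed comparison turns BinaryHeap (a max-heap) into a min-heap,
    // so the root is always the weakest of the current top K.
    fn cmp(&self, other: &Self) -> Ordering {
        other
            .similarity
            .partial_cmp(&self.similarity)
            .unwrap_or(Ordering::Equal)
    }
}

/// Keep only the K best-scoring rows while streaming; peak memory is O(K).
fn top_k(scores: impl Iterator<Item = (i64, f32)>, k: usize) -> Vec<(i64, f32)> {
    let mut heap = BinaryHeap::with_capacity(k + 1);
    for (chunk_id, similarity) in scores {
        heap.push(Scored { similarity, chunk_id });
        if heap.len() > k {
            heap.pop(); // evict the current weakest entry
        }
    }
    let mut out: Vec<(i64, f32)> = heap
        .into_iter()
        .map(|s| (s.chunk_id, s.similarity))
        .collect();
    // Final ordering: best similarity first.
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(Ordering::Equal));
    out
}

fn main() {
    let scores = vec![(1, 0.2), (2, 0.9), (3, 0.5), (4, 0.7), (5, 0.1)];
    println!("{:?}", top_k(scores.into_iter(), 2));
}
```

With this shape, the scoring pass holds O(K) small entries instead of O(N) full rows; the content and path for the K winners can then be fetched in a second, id-filtered query (e.g. `WHERE c.id IN (...)`).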

Memory Calculation:

For each chunk loaded:

  • embedding: 384 × 4 = 1,536 bytes
  • content: ~500 bytes average (chunk_size / 2)
  • path: ~100 bytes average
  • SearchResult struct overhead: ~80 bytes

Total per chunk: ~2.2 KB

Chunks       Memory Required
10,000       22 MB
50,000       110 MB
100,000      220 MB
500,000      1.1 GB
1,000,000    2.2 GB
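The table follows directly from the ~2.2 KB per-chunk estimate; a quick arithmetic check (decimal MB, so the figures round slightly differently from the table above):

```rust
fn main() {
    // Per-chunk estimate from the breakdown above, in bytes:
    // embedding 1,536 + content ~500 + path ~100 + struct overhead ~80.
    let per_chunk: u64 = 1536 + 500 + 100 + 80; // 2,216 bytes, i.e. ~2.2 KB
    for chunks in [10_000u64, 50_000, 100_000, 500_000, 1_000_000] {
        println!("{:>9} chunks -> ~{} MB", chunks, chunks * per_chunk / 1_000_000);
    }
}
```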

Error Message

Debug Logs

System Information

Bounty Version: 0.1.0
OS: Ubuntu 24.04 LTS
CPU: AMD EPYC-Genoa Processor (8 cores)
RAM: 15 GB

Screenshots

No response

Steps to Reproduce

# 1. Index a large codebase (e.g., Linux kernel, Chromium)
cd /path/to/large/codebase
vgrep index

# 2. Check the chunk count
sqlite3 ~/.vgrep/projects/*.db "SELECT COUNT(*) FROM chunks;"
# Output: 500000+

# 3. Run a search (will try to load all 500K+ embeddings)
vgrep "function"

# 4. Watch memory usage spike, potentially OOM

Expected Behavior

  1. Search should work efficiently regardless of index size
  2. Memory usage should be bounded and predictable
  3. Only top-K results should be fully loaded into memory

Actual Behavior

  1. ALL matching chunks are loaded into memory
  2. Memory grows linearly with index size
  3. Large codebases cause OOM crashes
  4. No streaming or pagination

Additional Context

No response

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working), valid (Valid issue), vgrep
