Project
vgrep
Description
The should_index() function in src/core/indexer.rs line 310 includes an empty string "" in the list of indexable file extensions. This causes files without extensions (including compiled binaries, executables, and other non-text files) to be indexed, leading to errors or corrupted embeddings.
Error Message
Error: Failed to read file
Caused by: stream did not contain valid UTF-8
Or silently produces garbage embeddings for binary content.
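The UTF-8 error above is what Rust's standard library returns when a file containing non-UTF-8 bytes is read as a string. The sketch below assumes the indexer reads files with something like `std::fs::read_to_string`. The report does not show the actual read path, so `read_as_text` is a hypothetical stand-in.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the indexer's file read (assumption: the real
/// code reads the whole file as a String before embedding it).
fn read_as_text(path: &Path) -> io::Result<String> {
    // For content that is not valid UTF-8, this returns an error of kind
    // io::ErrorKind::InvalidData ("stream did not contain valid UTF-8").
    fs::read_to_string(path)
}
```

Calling `read_as_text` on a compiled executable such as `my_binary` would be expected to fail this way, because executables normally contain bytes that are not valid UTF-8.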
Debug Logs
System Information
- Bounty Version: 0.1.0
- OS: Ubuntu 24.04 LTS
- Rust: 1.75+
Screenshots
No response
Steps to Reproduce
- Create a project with compiled binaries:
mkdir -p /tmp/test_project && cd /tmp/test_project
echo 'fn main() { println!("hello"); }' > main.rs
rustc main.rs -o my_binary
- Run indexer:
vgrep index
- Observe that my_binary (the compiled executable) is picked up for indexing
Expected Behavior
Files without extensions should NOT be indexed by default, except for specific known filenames (Makefile, Dockerfile, etc.) which are already handled in the filename check.
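A minimal sketch of the expected behavior, not the project's actual code. The extension and filename lists are hypothetical placeholders, because the real lists in src/core/indexer.rs are not shown in this report. The only point is that `""` is not in the extension list, so a path with no extension is rejected unless its filename is explicitly allowed.

```rust
use std::path::Path;

// Hypothetical lists for illustration only; the real ones live in vgrep.
// Note: no "" entry among the extensions.
const INDEXABLE_EXTENSIONS: &[&str] = &["rs", "py", "js", "ts", "go", "md", "toml"];
const INDEXABLE_FILENAMES: &[&str] = &["Makefile", "Dockerfile"];

fn should_index(path: &Path) -> bool {
    // Known extension-less files are allowed by exact filename.
    if let Some(name) = path.file_name().and_then(|n| n.to_str()) {
        if INDEXABLE_FILENAMES.contains(&name) {
            return true;
        }
    }
    // Everything else must carry a recognized extension.
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) => INDEXABLE_EXTENSIONS.contains(&ext.to_lowercase().as_str()),
        None => false, // no extension: skip (e.g. compiled binaries)
    }
}
```

With this shape, `should_index(Path::new("my_binary"))` returns `false`, while `Makefile` and `main.rs` are still accepted.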
Actual Behavior
All files without extensions are considered indexable, causing:
- UTF-8 decode errors for binary files
- Wasted processing time attempting to read binaries
- Potentially corrupted embeddings if binary content is partially UTF-8 valid
- Index bloat from non-code files
Additional Context
The same bug exists in:
src/core/indexer.rs:686 (ServerIndexer)
src/watcher.rs:244 (FileWatcher)
All three locations need to be fixed consistently.