Project
vgrep
Description
The Config::hash_path() function uses only the first 8 bytes (64 bits) of a SHA256 hash to generate unique database filenames for different projects. While 64 bits provides ~18 quintillion possibilities, the birthday paradox means collisions become increasingly likely as the number of projects grows. Two different project paths with a hash collision would share the same database, causing data corruption.
Error Message
No error - silent data corruption when collision occurs.
Debug Logs
System Information
- Bounty Version: 0.1.0
- OS: Ubuntu 24.04 LTS
- Rust: 1.75+
Screenshots
No response
Steps to Reproduce
- Find two paths that produce the same 8-byte SHA256 prefix (requires brute force or luck)
- Index project A at path X
- Index project B at path Y (where hash(X)[..8] == hash(Y)[..8])
- Project B's data overwrites Project A's data
- Search in Project A returns results from Project B
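Finding a real 64-bit collision is impractical on a normal machine, but the same failure can be shown on a deliberately narrowed key. The sketch below is an illustration only: `truncated_key` and `find_collision` are hypothetical helpers, and `std`'s `DefaultHasher` stands in for SHA256 so the example needs no external crates. It is not vgrep's actual `hash_path()` implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a truncated path hash: keeps only the low
/// `bits` bits of the hash. `bits` must be below 64.
fn truncated_key(path: &str, bits: u32) -> u64 {
    let mut hasher = DefaultHasher::new();
    path.hash(&mut hasher);
    hasher.finish() & ((1u64 << bits) - 1)
}

/// Generates distinct project paths until two of them map to the same
/// truncated key, then returns that pair.
fn find_collision(bits: u32) -> Option<(String, String)> {
    let mut seen: HashMap<u64, String> = HashMap::new();
    // 2^bits + 1 candidates guarantee a collision by the pigeonhole principle;
    // in practice one shows up after roughly 2^(bits / 2) paths.
    for i in 0..=(1u64 << bits) {
        let path = format!("/home/user/project{i}");
        let key = truncated_key(&path, bits);
        if let Some(previous) = seen.insert(key, path.clone()) {
            return Some((previous, path));
        }
    }
    None
}
```

Calling `find_collision(16)` returns two different paths that would be assigned the same database file under a 16-bit key. The 64-bit case behaves the same way, only with a far larger search.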
Expected Behavior
- Each project should have a guaranteed unique database file
- Hash collisions should be detected or impossible
- If collision occurs, warn user or use different naming scheme
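One possible way to detect collisions is to record which project path owns each database key and refuse to reuse a key for a different path. This is a minimal sketch of that idea, not a proposal for vgrep's actual code. `DbRegistry`, `ClaimError`, and `claim` are hypothetical names, and the in-memory `HashMap` stands in for whatever persistent metadata (for example a path stored inside the database file) a real fix would use.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical error raised when two different projects map to one key.
#[derive(Debug, PartialEq)]
enum ClaimError {
    Collision {
        key: String,
        existing: PathBuf,
        requested: PathBuf,
    },
}

/// Hypothetical registry remembering which project path owns each key.
#[derive(Default)]
struct DbRegistry {
    owners: HashMap<String, PathBuf>,
}

impl DbRegistry {
    /// Claims `key` for `project`. Re-claiming by the same project succeeds;
    /// a different project asking for the same key is reported as a collision.
    fn claim(&mut self, key: &str, project: &Path) -> Result<(), ClaimError> {
        if let Some(existing) = self.owners.get(key) {
            if existing.as_path() != project {
                return Err(ClaimError::Collision {
                    key: key.to_string(),
                    existing: existing.clone(),
                    requested: project.to_path_buf(),
                });
            }
            return Ok(());
        }
        self.owners.insert(key.to_string(), project.to_path_buf());
        Ok(())
    }
}
```

On a `Collision` error the caller could warn the user or fall back to a longer key, which covers the "detected" and "warn user" expectations above without changing the naming scheme for existing projects.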
Actual Behavior
- A 64-bit prefix gives a collision probability of roughly n^2 / 2^65 across n projects (birthday bound), which reaches about 39% at n = 2^32 (~4.3 billion paths)
- Collisions cause silent database sharing/corruption
- No detection mechanism exists
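The birthday-bound estimate can be checked numerically. The sketch below assumes the truncated hash values are uniformly distributed, which is a reasonable model for a SHA256 prefix. `collision_probability` is a hypothetical helper written for this illustration.

```rust
/// Approximate probability that at least two of `n` uniformly random
/// `bits`-bit values collide: 1 - exp(-n(n-1) / 2^(bits+1)).
fn collision_probability(n: f64, bits: i32) -> f64 {
    let space = 2f64.powi(bits);
    let exponent = n * (n - 1.0) / (2.0 * space);
    // exp_m1 computes e^x - 1 accurately for small x, so tiny
    // probabilities are not lost to rounding.
    -(-exponent).exp_m1()
}
```

Under this model, `collision_probability(1e6, 64)` is on the order of 10^-8, while `collision_probability(2f64.powi(32), 64)` is about 0.39. Accidental collisions are therefore very unlikely for realistic project counts, but they are not impossible, and nothing currently detects one if it happens.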
Additional Context
No response