Feature Request: A dictionary that can store 1 billion entries in under 100 GB. #959

Open
ghost opened this issue Oct 12, 2024 · 1 comment

Comments

@ghost

ghost commented Oct 12, 2024

  • A dictionary that can store 1 billion entries in under 100 GB

Assuming approximately 100 bytes per line of text, what is the total storage requirement for 1 billion lines of text?

$$10^9 \times 100 \approx 10^{11}$$ bytes

Converted to GB (binary units, 1 GB = 1024³ bytes), it is approximately:

$$\frac{10^{11}}{1024 \times 1024 \times 1024} \approx 93.13$$ GB
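
The same arithmetic can be checked with a short Python sketch. The function names here are made up for illustration, and the 100 bytes per line figure is the assumption stated above, not a measured value.

```python
def total_storage_bytes(num_lines: int, bytes_per_line: int) -> int:
    """Raw storage needed for num_lines lines of bytes_per_line bytes each."""
    return num_lines * bytes_per_line


def bytes_to_binary_gb(num_bytes: int) -> float:
    """Convert bytes to binary gigabytes (1 GB = 1024**3 bytes)."""
    return num_bytes / (1024 ** 3)


# Assumed inputs from the estimate above: 1 billion lines, ~100 bytes each.
total = total_storage_bytes(10**9, 100)
print(total)                                 # 100000000000
print(round(bytes_to_binary_gb(total), 2))   # 93.13
```

In decimal units (1 GB = 10^9 bytes) the same total is exactly 100 GB, so the "under 100 GB" target only holds with the binary conversion or with slightly shorter lines.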

@Nickersoft
Member

This hasn't officially been tested, but the size could probably be estimated very easily: take the size in bytes of a dictionary with 1 entry, multiply it by 1B, and subtract the few extra bytes of file header information that would otherwise be counted more than once. ODict is a very efficient format, so it could most likely store a dictionary of this size without trouble. However, I'm not sure how fast the lookup times would be (that would need testing).
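
A rough sketch of that back-of-the-envelope estimate in Python follows. The byte counts are hypothetical placeholders, not measurements of a real ODict file, and the function name is invented for illustration. It also assumes size grows linearly with entry count, which may not hold if the format compresses or indexes its contents.

```python
def estimate_dictionary_bytes(single_entry_file_bytes: int,
                              header_bytes: int,
                              num_entries: int) -> int:
    """Extrapolate file size from a one-entry file, counting the header once.

    Assumes every entry costs the same as the first one (linear growth).
    """
    per_entry_bytes = single_entry_file_bytes - header_bytes
    return header_bytes + per_entry_bytes * num_entries


# Hypothetical numbers: a 120-byte one-entry file with a 20-byte header.
estimate = estimate_dictionary_bytes(single_entry_file_bytes=120,
                                     header_bytes=20,
                                     num_entries=10**9)
print(estimate)                 # 100000000020
print(estimate / 1024**3)       # size in binary GB
```

To use it for real, the two byte counts would be replaced with values measured from an actual one-entry dictionary file.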
