✨ [DRAKEN] TO DO #1673

Closed
1 of 9 tasks
joocer opened this issue May 21, 2024 · 0 comments

joocer commented May 21, 2024

  • Write the InList and Range searches
  • Design the API; results should be returned in the form: { "{filename}": [rows] }
  • Write AccumulatorMemtable, which accumulates rows for each key instead of tracking only the last entry
  • Add in some thresholds
    • with more than 100k keys, don't use the BloomFilter (this is a 10% FP rate); searches should not use the BloomFilter if it's zero bytes
    • above a row threshold (1000 rows?), stop collecting rows and hint to the engine that the search is not specific enough to be useful
  • Write the Python parts of the API (the FastAPI and storage access bits)
  • Write the Client part, that will be used by Opteryx and Mabel (Opteryx should also be able to build an index)
  • Exemplar use-case will be asset searching (about 50k unique values in a 3m row dataset)
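
To make the list above more concrete, here is a rough Python sketch of how the accumulating memtable, the InList and Range searches, the row threshold and the `{ "{filename}": [rows] }` result shape might fit together. Nothing here is Draken's actual code: the method names (`add`, `in_list`, `in_range`), the helper `should_use_bloom`, the `(results, truncated)` return pair, inclusive range bounds and the constants are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

MAX_ROWS = 1000           # assumed value; the issue only suggests "1000 rows?"
MAX_BLOOM_KEYS = 100_000  # assumed constant for the "more than 100k keys" rule


def should_use_bloom(key_count, bloom_bytes):
    """Hypothetical helper: skip the filter when it is empty or has too many keys."""
    return len(bloom_bytes) > 0 and key_count <= MAX_BLOOM_KEYS


class AccumulatorMemtable:
    """Keeps every (filename, row) pair seen for a key, not just the last one."""

    def __init__(self):
        self._entries = defaultdict(list)  # key -> [(filename, row), ...]

    def add(self, key, filename, row):
        self._entries[key].append((filename, row))

    def _collect(self, keys, max_rows):
        # Build {filename: [rows]} and stop once max_rows have been collected.
        results = {}
        collected = 0
        for key in keys:
            for filename, row in self._entries.get(key, ()):
                if collected >= max_rows:
                    # Second value is the "not specific enough" hint.
                    return results, True
                results.setdefault(filename, []).append(row)
                collected += 1
        return results, False

    def in_list(self, values, max_rows=MAX_ROWS):
        # dict.fromkeys de-duplicates the values while keeping their order.
        return self._collect(dict.fromkeys(values), max_rows)

    def in_range(self, low, high, max_rows=MAX_ROWS):
        # Assumes inclusive bounds on both ends; sorts keys on every call
        # for simplicity, which a real index would avoid.
        keys = sorted(self._entries)
        start = bisect_left(keys, low)
        stop = bisect_right(keys, high)
        return self._collect(keys[start:stop], max_rows)
```

A caller such as Opteryx could treat a `True` hint as a signal to fall back to a full scan instead of trusting a partial row list, though how the engine reacts to the hint is not decided in this issue.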
joocer changed the title from "✨ [DRAKEN] TO" to "✨ [DRAKEN] TO DO" on May 22, 2024
joocer closed this as not planned on Aug 3, 2024