@AestheticAkhmad
Contributor

@AestheticAkhmad AestheticAkhmad commented Jul 31, 2025

This is a draft implementation of the MCOL-5758 feature.

  1. Bloom filter implementation:

    • Main methods
    • Multiply-hashing
  2. Query transformation:

    • Update small side processing - TupleHashJoinStep
    • Update small side processing - TupleJoiner
    • Update large side processing in BPP
  3. Statistics and Analysis: HyperLogLog to determine NDV

  4. Testing:

    • Prepare data for testing
    • Preliminary tests (in progress)
    • Final tests
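
For readers unfamiliar with the technique in items 1 to 3, the following is a minimal, self-contained sketch of a Bloom-filter pre-join. It is not the code in this PR: the class name `BloomFilterSketch`, the function `preFilterLargeSide`, the hash constants, and the use of plain 64-bit keys are all assumptions made for illustration. The filter is sized from an estimated number of distinct values (in the PR that estimate would come from HyperLogLog; here it is simply passed in), built from the small-side keys, and then used to discard large-side rows that cannot match.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only; not the class added by this PR.
class BloomFilterSketch
{
 public:
  // Size the filter for an expected NDV (n) and target false-positive rate (p)
  // with the standard formulas m = -n*ln(p)/(ln 2)^2 and k = (m/n)*ln 2.
  BloomFilterSketch(std::size_t expectedNdv, double falsePositiveRate)
  {
    assert(expectedNdv > 0 && falsePositiveRate > 0.0 && falsePositiveRate < 1.0);
    const double ln2 = std::log(2.0);
    const double bits =
        -static_cast<double>(expectedNdv) * std::log(falsePositiveRate) / (ln2 * ln2);
    numBits_ = static_cast<std::size_t>(std::ceil(bits));
    numHashes_ = static_cast<unsigned>(std::round(bits / static_cast<double>(expectedNdv) * ln2));
    if (numHashes_ == 0)
      numHashes_ = 1;
    words_.assign((numBits_ + 63) / 64, 0);
  }

  void insert(std::uint64_t key)
  {
    const std::uint64_t h1 = mix(key, 0x9E3779B97F4A7C15ULL);
    const std::uint64_t h2 = mix(key, 0xC2B2AE3D27D4EB4FULL) | 1;  // odd step
    for (unsigned i = 0; i < numHashes_; ++i)
    {
      // Double hashing: derive the i-th bit position from two base hashes.
      const std::size_t bit = static_cast<std::size_t>((h1 + i * h2) % numBits_);
      words_[bit / 64] |= (1ULL << (bit % 64));
    }
  }

  // Returns false only if the key was definitely never inserted.
  bool mayContain(std::uint64_t key) const
  {
    const std::uint64_t h1 = mix(key, 0x9E3779B97F4A7C15ULL);
    const std::uint64_t h2 = mix(key, 0xC2B2AE3D27D4EB4FULL) | 1;
    for (unsigned i = 0; i < numHashes_; ++i)
    {
      const std::size_t bit = static_cast<std::size_t>((h1 + i * h2) % numBits_);
      if ((words_[bit / 64] & (1ULL << (bit % 64))) == 0)
        return false;
    }
    return true;
  }

  std::size_t numBits() const { return numBits_; }
  unsigned numHashes() const { return numHashes_; }

 private:
  // Multiplicative hash: multiply by an odd constant, then fold the high bits down.
  static std::uint64_t mix(std::uint64_t x, std::uint64_t multiplier)
  {
    x *= multiplier;
    return x ^ (x >> 32);
  }

  std::size_t numBits_ = 0;
  unsigned numHashes_ = 0;
  std::vector<std::uint64_t> words_;
};

// Build the filter from the small side, then keep only the large-side keys
// that might have a join partner. Survivors still need the real hash join.
std::vector<std::uint64_t> preFilterLargeSide(const std::vector<std::uint64_t>& smallSideKeys,
                                              const std::vector<std::uint64_t>& largeSideKeys,
                                              std::size_t estimatedNdv, double falsePositiveRate)
{
  BloomFilterSketch filter(estimatedNdv, falsePositiveRate);
  for (std::uint64_t key : smallSideKeys)
    filter.insert(key);

  std::vector<std::uint64_t> survivors;
  for (std::uint64_t key : largeSideKeys)
    if (filter.mayContain(key))
      survivors.push_back(key);
  return survivors;
}
```

A Bloom filter never produces false negatives, so every large-side row with a real match survives the pre-filter; false positives only cost some extra work in the join that follows.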

@AestheticAkhmad AestheticAkhmad marked this pull request as ready for review August 6, 2025 13:29
@AestheticAkhmad AestheticAkhmad force-pushed the MCOL-5758-BLOOM-FILTER-PRE-JOIN-GSOC-2025-REDESIGNED branch from 18b0b0e to 4895f26 Compare August 13, 2025 20:33
@mariadb-LeonidFedorov
Collaborator

@AestheticAkhmad how is it going?
The Boost.Bloom library has been released:
https://www.boost.org/news/entry/new_library_boost_bloom/
WDYT, is it applicable to the task?

@AestheticAkhmad
Contributor Author

@AestheticAkhmad how is it going? The Boost.Bloom library has been released: https://www.boost.org/news/entry/new_library_boost_bloom/ WDYT, is it applicable to the task?

Hello, all is going well, thanks. I will check it out.

3 participants