Feature Request: Vectorized API #23

I have tested rbloom, and it is really fast. However, it would benefit from a vectorized insert and query API. For example, it could accept a NumPy or PyArrow array of keys and return an array of results.
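As a rough sketch of the shape such an API could take, the helpers below batch inserts and queries over any iterable of keys. The names `add_many` and `contains_many` are hypothetical and are not part of rbloom. They still loop in Python, so they only illustrate the interface, not the speedup that a native vectorized implementation would aim for. A plain `set` stands in for the filter so the example runs without rbloom installed.

```python
from typing import Any, Container, Iterable, List


def add_many(bloom: Any, keys: Iterable[Any]) -> None:
    """Insert every key. Hypothetical helper, not part of rbloom."""
    for key in keys:
        bloom.add(key)


def contains_many(bloom: Container[Any], keys: Iterable[Any]) -> List[bool]:
    """Return one membership flag per key, in input order. Hypothetical helper."""
    return [key in bloom for key in keys]


# Stand-in for a Bloom filter: any object with .add() and `in` support
# (assumed to include rbloom.Bloom) can be passed to the helpers.
bf = set()
add_many(bf, ["a", "b", "c"])
flags = contains_many(bf, ["a", "x", "c"])
print(flags)  # [True, False, True]
```

A native version would presumably take the array in one call and return a boolean array, avoiding the per-key Python call overhead that the benchmark below measures.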

```python
import time
import uuid

from rbloom import Bloom

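print("generating data")
N = 1_000_000
# Keys that will be inserted into the filter
data = [uuid.uuid4() for _ in range(N)]
# Fresh keys that are never inserted, used to measure false positives
testdata = [uuid.uuid4() for _ in range(N)]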

print("Number of keys", len(data))

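# Size the filter for len(data) keys at a target false positive rate of 0.001%
bf = Bloom(len(data), 0.00001)
for d in data:
    bf.add(d)

# Sanity check: a Bloom filter must never report a false negative
for d in data:
    assert d in bf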

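# Time one membership query per test key; each hit counts as a false positive
count = 0
start = time.perf_counter()  # monotonic clock, better suited to benchmarking
for x in testdata:
    count += x in bf
end = time.perf_counter()
querytime = end - start
fpp = count / N * 100.0  # false positive rate as a percentage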
print(
    "false positive rate",
    "{:.5f}".format(fpp),
    "%",
    ", memory per key",
    "{:.1f}".format(bf.size_in_bits / N),
    "bits",
    ", millions of queries per second: ",
    "{:.2f}".format(N / querytime / 1000000),
    ", total memory",
    "{:.2f}".format(bf.size_in_bits / 8 / 1024.0 / 1024.0),
    "MiB",
)
```

Output

```
generating data
Number of keys 1000000
false positive rate 0.00100 % , memory per key 24.0 bits , millions of queries per second:  8.89 , total memory 2.86 MiB
```
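As a cross-check on the reported memory figures, the textbook sizing formula for an optimally configured Bloom filter gives about -ln(p) / (ln 2)^2 bits per key. The sketch below applies it to the parameters used above. Whether rbloom sizes its bit array with exactly this formula and rounding is an assumption here, but the result lines up with the 24.0 bits per key and 2.86 MiB in the output.

```python
import math

p = 0.00001    # target false positive rate passed to Bloom(...)
n = 1_000_000  # number of keys

# Standard optimal Bloom filter sizing: bits per key = -ln(p) / (ln 2)^2
bits_per_key = -math.log(p) / (math.log(2) ** 2)
total_mib = bits_per_key * n / 8 / 1024 / 1024

print(f"{bits_per_key:.2f} bits per key, {total_mib:.2f} MiB")
# 23.96 bits per key, 2.86 MiB
```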

My test environment is an Apple M4 Pro.
