BloomFilter: deduplicate in automatic batches so that sending a large amount of data in a single request no longer causes an error
Boris-code committed Dec 11, 2023
1 parent 6eb09ef commit 6e8a021
Showing 1 changed file with 12 additions and 1 deletion.
13 changes: 12 additions & 1 deletion feapder/dedup/bitarray.py
@@ -127,7 +127,18 @@ def set(self, offsets, values):
         @param values: a list or a single value
         @return: list / single value
         """
-        return self.redis_db.setbit(self.name, offsets, values)
+        # Split offsets into batches of at most batch_size entries
+        results = []
+        batch_size = 170000
+        for i in range(0, len(offsets), batch_size):
+            results.extend(
+                self.redis_db.setbit(
+                    self.name,
+                    offsets[i : i + batch_size],
+                    values[i : i + batch_size] if isinstance(values, list) else values,
+                )
+            )
+        return results

     def get(self, offsets):
         return self.redis_db.getbit(self.name, offsets)
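The batching pattern in this change can be tried in isolation. The following is a minimal sketch, not the feapder implementation: `FakeBitStore` and `set_in_batches` are hypothetical names, and the stand-in `setbit` is assumed to accept a list of offsets and return the previous bit for each one. The sketch also wraps a single non-list offset before slicing, since calling `len()` on a bare integer raises `TypeError`.

```python
class FakeBitStore:
    """Hypothetical in-memory stand-in for the Redis wrapper used in the diff."""

    def __init__(self):
        self.bits = {}
        self.calls = 0  # number of setbit round trips made so far

    def setbit(self, name, offsets, values):
        """Set each bit and return the previous values, one per offset."""
        self.calls += 1
        if not isinstance(values, list):
            values = [values] * len(offsets)  # broadcast a single value
        previous = []
        for offset, value in zip(offsets, values):
            previous.append(self.bits.get((name, offset), 0))
            self.bits[(name, offset)] = value
        return previous


def set_in_batches(store, name, offsets, values, batch_size=170000):
    """Call store.setbit in slices of at most batch_size offsets."""
    single = not isinstance(offsets, (list, tuple))
    if single:
        offsets = [offsets]  # assumption: a lone offset is allowed
    results = []
    for i in range(0, len(offsets), batch_size):
        # Slice values in step with offsets only when values is a list.
        chunk = values[i : i + batch_size] if isinstance(values, list) else values
        results.extend(store.setbit(name, offsets[i : i + batch_size], chunk))
    return results[0] if single else results


if __name__ == "__main__":
    store = FakeBitStore()
    # Ten offsets with a batch size of four need three round trips.
    previous = set_in_batches(store, "bf", list(range(10)), 1, batch_size=4)
    print(previous, store.calls)
```

Keeping the batch size as a parameter makes the limit easy to tune against whatever request size the server tolerates; the default of 170000 simply mirrors the value in the commit.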
