Open
Labels
enhancement: Improve an existing feature or component
Description
Affected Component
ailego
Current Behavior
The current compute_one_to_many_* implementations in zvec::ailego::DistanceBatch process batches sequentially.
- Inner-product computation is vectorized (AVX2) but single-threaded.
- Throughput is limited by a single core when BatchSize is large.
Desired Improvement
For large batch sizes (e.g. hundreds or thousands of vectors), the outer loop over BatchSize / dp_batch becomes embarrassingly parallel and can benefit significantly from multi-core CPUs.
Introduce optional OpenMP parallelization when the batch size exceeds a configurable threshold.
Example strategy:
- Keep current behavior for small batches (to avoid OpenMP overhead).
- Use #pragma omp parallel for over the batch dimension for large batches.
- Guard with #ifdef _OPENMP to preserve portability.
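The strategy above could be sketched roughly as follows. This is a minimal illustration, not the actual zvec::ailego::DistanceBatch code: the function name compute_one_to_many_ip, its signature, the scalar inner_product stand-in for the AVX2 kernel, and the default threshold of 256 are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical default; the issue asks for this to be configurable.
constexpr std::size_t kParallelBatchThreshold = 256;

// Scalar stand-in for the existing vectorized (AVX2) inner-product kernel.
inline float inner_product(const float* a, const float* b, std::size_t dim) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < dim; ++i) sum += a[i] * b[i];
  return sum;
}

// Hypothetical one-to-many routine: out[i] = <query, batch[i]>.
inline void compute_one_to_many_ip(
    const float* query, const std::vector<const float*>& batch,
    std::size_t dim, float* out,
    std::size_t threshold = kParallelBatchThreshold) {
  assert(query != nullptr && out != nullptr);
  (void)threshold;  // unused when OpenMP is disabled
  // Signed loop index keeps the loop valid for older OpenMP implementations.
  const std::int64_t n = static_cast<std::int64_t>(batch.size());
#ifdef _OPENMP
  // The if clause keeps small batches on the calling thread, so they do not
  // pay for a parallel region. Without OpenMP this pragma is compiled out.
  #pragma omp parallel for schedule(static) if (batch.size() >= threshold)
#endif
  for (std::int64_t i = 0; i < n; ++i) {
    // Each iteration writes a distinct out[i], so no synchronization is needed.
    out[i] = inner_product(query, batch[i], dim);
  }
}
```

Because each inner product is still computed entirely by one thread, the per-vector results should match the sequential path. Building without OpenMP support (for example, without -fopenmp on GCC or Clang) leaves a plain sequential loop.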
Impact
- Improved throughput for large-scale distance computations.
- No behavior change for small batch sizes.
- Backward compatible when OpenMP is not enabled.
Status
Backlog