utils: faster `add_multilinears` #55

tcoratger · 2025-09-23T10:58:33Z

Benchmarks compared to before. On small dimensions, the very small degradation is noise: I tested several runs and the results differ somewhat each time. On large dimensions, this is consistently a big win.

add_multilinears_fn/optimized/10
                        time:   [50.002 µs 50.549 µs 51.090 µs]
                        thrpt:  [20.043 Melem/s 20.258 Melem/s 20.479 Melem/s]
                 change:
                        time:   [+1.2641% +3.4338% +5.7785%] (p = 0.00 < 0.05)
                        thrpt:  [−5.4629% −3.3198% −1.2483%]
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
add_multilinears_fn/optimized/14
                        time:   [76.380 µs 77.374 µs 78.297 µs]
                        thrpt:  [209.25 Melem/s 211.75 Melem/s 214.51 Melem/s]
                 change:
                        time:   [+1.7252% +3.9230% +6.2144%] (p = 0.00 < 0.05)
                        thrpt:  [−5.8508% −3.7749% −1.6960%]
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
add_multilinears_fn/optimized/18
                        time:   [112.78 µs 117.96 µs 125.47 µs]
                        thrpt:  [2.0893 Gelem/s 2.2223 Gelem/s 2.3244 Gelem/s]
                 change:
                        time:   [−18.608% −16.871% −14.746%] (p = 0.00 < 0.05)
                        thrpt:  [+17.297% +20.294% +22.862%]
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  4 (4.00%) high severe
add_multilinears_fn/optimized/20
                        time:   [315.30 µs 320.84 µs 329.73 µs]
                        thrpt:  [3.1801 Gelem/s 3.2682 Gelem/s 3.3257 Gelem/s]
                 change:
                        time:   [−52.253% −50.004% −47.859%] (p = 0.00 < 0.05)
                        thrpt:  [+91.787% +100.02% +109.44%]
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

add_multilinears_sparse/optimized/10
                        time:   [88.945 µs 90.308 µs 91.541 µs]
                        thrpt:  [715.92 Melem/s 725.70 Melem/s 736.81 Melem/s]
                 change:
                        time:   [−2.9823% +0.0009% +2.8453%] (p = 1.00 > 0.05)
                        thrpt:  [−2.7666% −0.0009% +3.0740%]
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  7 (7.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe
add_multilinears_sparse/optimized/50
                        time:   [88.014 µs 88.741 µs 89.512 µs]
                        thrpt:  [732.15 Melem/s 738.51 Melem/s 744.61 Melem/s]
                 change:
                        time:   [−9.0176% −7.0711% −5.2000%] (p = 0.00 < 0.05)
                        thrpt:  [+5.4853% +7.6092% +9.9113%]
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe
add_multilinears_sparse/optimized/90
                        time:   [86.818 µs 89.123 µs 92.157 µs]
                        thrpt:  [711.13 Melem/s 735.34 Melem/s 754.87 Melem/s]
                 change:
                        time:   [−6.2371% −4.5692% −2.7610%] (p = 0.00 < 0.05)
                        thrpt:  [+2.8394% +4.7879% +6.6520%]
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  3 (3.00%) high severe
add_multilinears_sparse/optimized/100
                        time:   [85.672 µs 86.673 µs 87.685 µs]
                        thrpt:  [747.40 Melem/s 756.13 Melem/s 764.96 Melem/s]
                 change:
                        time:   [−3.8784% −1.3785% +0.8765%] (p = 0.27 > 0.05)
                        thrpt:  [−0.8689% +1.3978% +4.0349%]
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

When doing this

let batched_column_mixed = add_multilinears(
        &column_up(&batched_column),
        &scale_poly(&column_down(&batched_column), alpha),
    );

this allocates two vectors:

- one as the result of column_up(&batched_column),
- one as the result of add_multilinears.

To avoid the double allocation, you could do the addition in place:

let mut col_up = column_up(&batched_column);
// Accumulate into col_up directly; the function returns nothing.
add_multilinears_inplace(
        &mut col_up,
        &scale_poly(&column_down(&batched_column), alpha),
    );
let batched_column_mixed = col_up;

with

pub fn add_multilinears_inplace<F: Field>(dst: &mut [F], src: &[F]) {
    assert_eq!(dst.len(), src.len());

    dst.par_iter_mut()
        .zip(src.par_iter())
        .filter(|(_, b)| !b.is_zero())
        .for_each(|(a, b)| *a += *b);
}

This should be much more efficient.
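To make the allocation argument concrete, here is a minimal sequential sketch of the same in-place pattern using only the standard library. The name `add_inplace_sketch` and the bound `Copy + AddAssign + PartialEq + Default` are assumptions for illustration: they stand in for the `Field` bound, with `T::default()` playing the role of zero, and a plain loop replaces the parallel iterator.

```rust
use std::ops::AddAssign;

/// Hypothetical sequential stand-in for the in-place addition above.
/// `T::default()` is assumed to be the additive zero (true for integers).
pub fn add_inplace_sketch<T>(dst: &mut [T], src: &[T])
where
    T: Copy + AddAssign + PartialEq + Default,
{
    assert_eq!(dst.len(), src.len());
    let zero = T::default();
    for (a, b) in dst.iter_mut().zip(src.iter()) {
        // Adding zero leaves `*a` unchanged, so zero entries of `src` are skipped.
        if *b != zero {
            *a += *b;
        }
    }
}
```

The caller keeps ownership of `dst`, so the buffer produced by the first operand is reused as the result and no second vector is allocated for the sum.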

TomWambsgans · 2025-09-23T12:20:29Z

I am not sure about the:

           if a.is_zero() {
                *b
            } else if b.is_zero() {
                *a
            }  else {...}

This assumes the polynomial is sparse, and in some cases we know for sure that it is not. So I would suggest removing the case distinction (assuming not sparse), or alternatively having 2 separate functions?

Also, add_multilinears_inplace seems like a good idea to me.
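The two-function alternative mentioned above could look roughly like the following sketch. The names `add_dense_sketch` and `add_sparse_sketch` are hypothetical, the code is sequential, and a generic `T` with `T::default()` as zero is assumed in place of the `Field` bound.

```rust
use std::ops::Add;

/// Dense variant (hypothetical name): no zero checks, one addition per entry.
pub fn add_dense_sketch<T>(a: &[T], b: &[T]) -> Vec<T>
where
    T: Copy + Add<Output = T>,
{
    assert_eq!(a.len(), b.len());
    a.iter().zip(b.iter()).map(|(x, y)| *x + *y).collect()
}

/// Sparse-aware variant (hypothetical name): skips the addition when one
/// operand is zero, at the cost of two comparisons per entry.
pub fn add_sparse_sketch<T>(a: &[T], b: &[T]) -> Vec<T>
where
    T: Copy + Add<Output = T> + PartialEq + Default,
{
    assert_eq!(a.len(), b.len());
    let zero = T::default();
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| {
            if *x == zero {
                *y
            } else if *y == zero {
                *x
            } else {
                *x + *y
            }
        })
        .collect()
}
```

Both functions return the same values. The branches only pay off when many entries are zero and the addition costs more than the comparisons, which is why callers that know their input is dense would pick the first variant.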

tcoratger · 2025-09-23T14:47:37Z

pub fn add_multilinears_inplace<F: Field>(dst: &mut [F], src: &[F]) {
    assert_eq!(dst.len(), src.len());

    dst.par_iter_mut()
        .zip(src.par_iter())
        .filter(|(_, b)| !b.is_zero())
        .for_each(|(a, b)| *a += *b);
}

I just replaced it with the in-place version, which removes all of these problems.

tcoratger added 2 commits September 23, 2025 12:52

utils: faster add_multilinears

8e9ecf9

cleanup

c9da978

utils: add add_multilinears_inplace

257bca4

Merge branch 'main' into add_multilinears

9c3a435

TomWambsgans merged commit 53afb35 into main Sep 23, 2025
3 checks passed

tcoratger deleted the add_multilinears branch September 23, 2025 15:00
