-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider using Fast Fourier Transform for polynomial multiplication. #13
Comments
I'll have a look if this can be done using |
That would be great, but I doubt that it's possible: We need to do FFT in a finite field and most implementations seem to be written for floating point numbers only. |
libqfft doesn't seem to expose a pure C API. Creating a wrapper for a C++-only library could be tricky and messy. Might be easier to just Rewrite It In Rust™. |
Let me know when when the time comes to implement this on the GPU :) |
…and for the instance in Edit: Actually, |
I read up on it a bit. Here's my understanding so far (most of this is just from various Wikipedia articles): Fourier TransformLet 𝔽 be a field and 𝕍 a vector space over it, N a natural number and α ∈ 𝔽 a primitive N-th root of unity, i.e. αk = 1 for k = N, but not for 0 < k < N. The Fourier transform 𝕍N → 𝕍N is the linear map defined by the matrix W, where Wi,j = αij. Its inverse is given by W-1i,j = α-ij / N. (Note that I generalize matrix multiplication a bit here: AB can be defined in exactly the same way if the entries of A are in 𝔽, and the entries in B are in 𝕍.) We can interpret each a ∈ 𝕍N as the coefficients of a polynomial So the Fourier transform of a is the vector of values of fa at the N-th roots of unity. And the inverse Fourier transform of a are the coefficients of the polynomial whose values at the N-th roots of unity are given by a: In that sense, W is evaluation and W-1 is interpolation. Fast Fourier TransformLet W' be the Fourier transform for 2N instead of N, with some 2N-th root of unity β such that β2 = α. Let c ∈ 𝕍2N, and let a, b ∈ 𝕍N be the even- resp. odd-numbered entries of c (starting at 0). Then So the Fourier transform in 𝕍2N can be computed in linear time from two Fourier transforms in 𝕍N! That means if N is a power of 2, the Fourier transform can be computed in O(N log(N)). (But the same trick also works for factors other than 2, of course.) The pairing crate knows for which powers N of 2 there are N-th roots of unity, and BLS12-381 was designed to have 2s-th roots of unity for large s. Fast MultiplicationLet fa and fb be polynomials with degrees deg fa + deg fb < N, i.e. such that there is a c ∈ 𝕍N with fc = fa fb. Naively, computing c, i.e. computing the product of the two polynomials, takes O(N2) time. But W c is just the pointwise product of W a and W b (because these represent values of the polynomials) and computable in O(N). So (at least for large N) transforming, multiplying pointwise, and transforming back is faster than naive multiplication, namely O(N log(N))! Our ImplementationThat is already useful: Maybe we should have two kinds of structures for our polynomials (i.e. 𝔽 = 𝕍 = I have no idea whether it's worth it, i.e. whether that would speed up our arithmetic for N > 5 or only for N > 100… we'll need to either try it out or at least count how many additions and multiplications the transform would actually require. And there are also other fast Fourier transform algorithms that might do better in practice than the one I described. Interpolation in GeneralI don't understand yet whether and how this can be applied to interpolation in general: For threshold decryption and signatures we need to interpolate given values in other places than the N-th roots of unity. E.g. libfqfft seems to also only allow fast interpolation if the sample points are in a specific domain: roots of unity or arithmetic or geometric progressions. |
Key Share ComputationWe should also use f(αk) for the k-th key share, since these values can all be computed in a single Fourier transform in O(N log(N)) time, whereas computing them separately takes O(N2). However, N would always have to be a suitable number, for which a root of unity and a good FFT algorithm is known: probably a power of two. So it would be the smallest power of two greater than the number of nodes. If we require that N is strictly greater than the number of nodes, we could also make the master key f(1) instead of f(0), so it's part of the Fourier transform, too. |
The bellman crate's FFT implementation follows exactly the algorithm described above. It's a bit hard to read, though: It swaps the elements in such a way that the bits of an element's index get flipped, so that in each step of the iteration, W a and W b are consecutive contiguous slices. |
I wrote a crude FFT implementation in the It might still turn out to be worth it, of course:
Anyway, I'm unassigning this for now; it's not a low-hanging fruit, at least. |
For proper benchmarking, it would be required to
|
We're currently doing polynomial multiplication and interpolation the naive way (also interpolation, and for a single value), which is O(n²). With Fast Fourier Transform (FFT), multiplication (but probably not interpolation, in general?) should be possible in O(n log(n)). Let's try to estimate the actual count of arithmetic operations first, to make sure it's worth doing this in practice, and for all network sizes.
The text was updated successfully, but these errors were encountered: