Arb mulhigh #1802
Yeah, I wouldn't do this before having a generic C fallback for flint_mpn_mulhigh / flint_mpn_mulhigh_normalised and fixing their performance issues, so that these functions can be relied upon without conditional directives creeping into higher-level code. It will certainly not work to just take the shorter of the input lengths as the precision in arf_mul_rnd_sloppy. In general, the inputs can be shorter or longer than the precision, or a mixture, so a bit of logic is needed for this. It will even be optimal to zero-pad operands in some cases. Note that it is also necessary to normalise the arf_t output by removing trailing zero limbs.
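To make the length logic and the normalisation step concrete, here is a minimal sketch in plain C. It is not FLINT code: limb_t, top_limbs and strip_low_zero_limbs are hypothetical names, and it assumes the usual mpn convention of storing the least significant limb first, so that "trailing zero limbs" are the zero limbs at the low end of the mantissa.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a machine limb */

/* Return a pointer to exactly n limbs holding the most significant part of
   the mantissa x[0..xn), least significant limb first.
   - xn >= n: truncate by pointing at the top n limbs (no copy).
   - xn <  n: zero-pad at the low end, using buf (room for n limbs). */
static const limb_t *top_limbs(limb_t *buf, const limb_t *x, size_t xn, size_t n)
{
    if (xn >= n)
        return x + (xn - n);

    size_t pad = n - xn;
    for (size_t i = 0; i < pad; i++)
        buf[i] = 0;
    for (size_t i = 0; i < xn; i++)
        buf[pad + i] = x[i];
    return buf;
}

/* Normalise a result by dropping zero limbs at the low end.
   Advances *x past the dropped limbs and returns the new length. */
static size_t strip_low_zero_limbs(const limb_t **x, size_t xn)
{
    while (xn > 0 && (*x)[0] == 0)
    {
        (*x)++;
        xn--;
    }
    return xn;
}
```

A caller would pick n from the target precision, run both operands through top_limbs, multiply, and then pass the product through strip_low_zero_limbs before storing it. Whether padding or truncating is the faster choice for a given pair of lengths is exactly the tuning question raised above, and this sketch does not try to answer it.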
I think this work would be aided by having a general Such a method would be a bit complex, but it would offload complexity from the
In general, isn't it true that an
mulhi? I agree arbs would like this, but Newton's method likes mulmid more.
which requires space for all
Yep, that should improve the division and square root code quite a bit. mulhigh is already doing good things for
Depends on which Newton's method you are talking about. I believe that precomputed inverses and reciprocal square roots via Newton's method really favor mulhi.
mulhi is a special case of mulmid, and some of your mulhis for inversion don't need all of the high bits. It is even in the TODO.md:
You can see this in flint/src/gr_poly/inv_series_newton.c Line 57 in 80e9b24
m coefficients of the mullow output are never used.
(Of course high/low Newton division are the same but reversed.)
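As an illustration of the point about unused low coefficients, here is a small self-contained sketch of Newton inversion of a power series over Z/pZ that only ever computes middle products. It is not the code in inv_series_newton.c: the modulus P, the schoolbook mulmid helper and the fixed 64-coefficient scratch buffers are assumptions made for the example. With g the inverse of f mod x^m, the product f*g is 1 mod x^m, so only coefficients m to 2m-1 of f*g are needed for the next step.

```c
#include <assert.h>
#include <stdint.h>

#define P 998244353ULL  /* example prime modulus (assumption) */

static uint64_t pow_mod(uint64_t b, uint64_t e)
{
    uint64_t r = 1;
    b %= P;
    while (e)
    {
        if (e & 1) r = r * b % P;
        b = b * b % P;
        e >>= 1;
    }
    return r;
}

/* Schoolbook middle product: res[i - lo] = coefficient i of a*b
   for lo <= i < hi, with a of length an and b of length bn. */
static void mulmid(uint64_t *res, const uint64_t *a, int an,
                   const uint64_t *b, int bn, int lo, int hi)
{
    for (int i = lo; i < hi; i++)
    {
        uint64_t s = 0;
        for (int j = 0; j < an; j++)
        {
            int k = i - j;
            if (k >= 0 && k < bn)
                s = (s + a[j] * b[k]) % P;
        }
        res[i - lo] = s;
    }
}

/* g = 1/f mod x^n, assuming f[0] != 0 mod P and 1 <= n <= 64. */
static void inv_series_newton(uint64_t *g, const uint64_t *f, int n)
{
    uint64_t t[64], u[64];

    g[0] = pow_mod(f[0], P - 2);
    for (int m = 1; m < n; )
    {
        int m2 = (2 * m < n) ? 2 * m : n;
        /* f*g = 1 + x^m * e: compute only e, i.e. coefficients m..m2-1 */
        mulmid(t, f, m2, g, m, m, m2);
        /* new coefficients of g are -(g * e) mod x^(m2 - m) */
        mulmid(u, g, m, t, m2 - m, 0, m2 - m);
        for (int i = m; i < m2; i++)
            g[i] = (P - u[i - m]) % P;
        m = m2;
    }
}
```

The second product in each step is a plain low product, while the first one is a genuine middle product; with a fast mulmid the first m output coefficients are never computed at all, rather than computed and discarded as with mullow.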
Beyond 20M bits, I am seeing a ~30% improvement in arb_inv with this mulmid over what is currently in flint. (Still have to tune it in the medium range.)
That sounds excellent. Of course, the current
I don't think we should have different precisions on different systems (currently flint_mpn_mulhigh is only available on some x86-64 systems), but this is my very preliminary draft of how I want arb_mul to look.

Left to fix:
- flint_mpn_mulhigh available on all systems.
- _arb_mul_special
- arf_mul_rnd_sloppy rounds its inputs in order to be able to perform an n-by-n high multiplication. So one has to account for the rounding done before the multiplication as well as the fact that the multiplication is inexact.
- arf_mul_rnd_sloppy is used) -- should we round the result or not?
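For readers wondering where the inexactness of an n-by-n high multiplication comes from, here is a rough portable sketch. It is not flint_mpn_mulhigh and makes no claim about that function's error bound: mulhigh_approx and mul_full are hypothetical names, limbs are 32 bits so that a 64-bit accumulator suffices, and n is assumed to be at most 64. The approximate version skips every partial product a[i]*b[j] with i + j < n - 1, so under these assumptions its result never exceeds the exact high half and falls short of it by at most n units in the last limb.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Exact schoolbook product: res[0..2n) = a[0..n) * b[0..n),
   32-bit limbs, least significant limb first. */
static void mul_full(uint32_t *res, const uint32_t *a, const uint32_t *b, int n)
{
    memset(res, 0, 2 * (size_t) n * sizeof(uint32_t));
    for (int i = 0; i < n; i++)
    {
        uint64_t carry = 0;
        for (int j = 0; j < n; j++)
        {
            uint64_t t = (uint64_t) a[i] * b[j] + res[i + j] + carry;
            res[i + j] = (uint32_t) t;
            carry = t >> 32;
        }
        res[i + n] = (uint32_t) carry;
    }
}

/* Approximate high product: res[0..n) approximates limbs n..2n-1 of a*b.
   Only partial products with i + j >= n - 1 are accumulated. */
static void mulhigh_approx(uint32_t *res, const uint32_t *a, const uint32_t *b, int n)
{
    uint32_t r[65];  /* r[k] holds product limb n - 1 + k; needs n <= 64 */

    memset(r, 0, sizeof(r));
    for (int i = 0; i < n; i++)
    {
        uint64_t carry = 0;
        for (int j = n - 1 - i; j < n; j++)
        {
            int k = i + j - (n - 1);
            uint64_t t = (uint64_t) a[i] * b[j] + r[k] + carry;
            r[k] = (uint32_t) t;
            carry = t >> 32;
        }
        r[i + 1] = (uint32_t) carry;  /* untouched by earlier rows */
    }
    /* drop the guard limb r[0] and keep the top n limbs */
    memcpy(res, r + 1, (size_t) n * sizeof(uint32_t));
}
```

Any caller that wants a rigorous enclosure has to add this multiplication error to the error already introduced by rounding the inputs to n limbs, which is the bookkeeping the list above refers to.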