Add modular splitting #1542

David-Berghaus · 2023-10-20T16:10:33Z

Added modular splitting as described in Brent, Zimmermann: Modern Computer Arithmetic, Section 4.4.3.

Remarks:
- I ended up not using set_shallow because the swap approach has the convenience that all out-of-bounds coefficients are simply zero entries, which the dot product routines ignore.
- I started working on gr_poly_evaluate_other_modular but stopped when I noticed that gr_dot_other still uses the naive method. We would therefore get no performance advantage over gr_poly_evaluate_other_rectangular, and potentially worse numerical stability.
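For readers unfamiliar with the scheme, here is a minimal sketch of modular splitting on plain `double` coefficients. It is not the code from this PR and does not use the `gr` API: the names `eval_horner`, `dot_strided`, `eval_modular` and the fixed bound `MAX_BLOCKS` are made up for illustration. The idea is to write P(x) = sum over j < m of x^j * P_j(x^m), where P_j collects the coefficients whose index is congruent to j mod m, so every P_j is a dot product against the same vector of powers of y = x^m.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 64  /* illustrative bound on the number of powers of y */

/* Reference: Horner evaluation of a[0] + a[1] x + ... + a[len-1] x^(len-1). */
double eval_horner(const double *a, size_t len, double x)
{
    double r = 0.0;
    size_t i;
    for (i = len; i > 0; i--)
        r = r * x + a[i - 1];
    return r;
}

/* Dot product of ypow[0..n-1] with a[0], a[stride], a[2*stride], ... */
static double dot_strided(const double *a, size_t stride,
                          const double *ypow, size_t n)
{
    double s = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        s += a[i * stride] * ypow[i];
    return s;
}

/* Modular splitting with m residue classes (m >= 1). */
double eval_modular(const double *a, size_t len, double x, size_t m)
{
    double ypow[MAX_BLOCKS];
    double y = 1.0, r = 0.0;
    size_t i, j, k;

    assert(m >= 1);
    if (len == 0)
        return 0.0;

    k = (len + m - 1) / m;          /* longest residue class has k entries */
    assert(k <= MAX_BLOCKS);

    for (i = 0; i < m; i++)         /* y = x^m */
        y *= x;

    ypow[0] = 1.0;                  /* ypow[i] = y^i */
    for (i = 1; i < k; i++)
        ypow[i] = ypow[i - 1] * y;

    /* Horner in x over the residue classes j = m-1, ..., 0. */
    for (j = m; j > 0; j--)
    {
        size_t res = j - 1;
        /* exact number of coefficients with index = res (mod m) */
        size_t n = (res < len) ? (len - res + m - 1) / m : 0;
        double pj = (n > 0) ? dot_strided(a + res, m, ypow, n) : 0.0;
        r = r * x + pj;
    }
    return r;
}
```

For the coefficients 1, 2, 3, 4, 5 and x = 2, both routines give 129 for any m >= 1. The sketch reads the coefficients with a stride directly, which is the access pattern the algorithm naturally wants.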

fredrik-johansson · 2023-10-21T14:31:37Z

This will be a nice algorithm to have!

The code needs a little bit of work. You can't safely swap entries from poly because this is a read-only argument: it may be read simultaneously from another thread, for example. So tmp should actually be allocated shallowly and set_shallow should be used. See gr_mat/nonsingular_solve_tril.c for an example. For the dot product where you're currently zero-padding, you should be able to calculate the actual length.
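To illustrate the two points above without relying on FLINT internals, here is a small self-contained sketch. The type `coeff_t` is a hypothetical stand-in for a ring element that owns heap memory, and `residue_length` and `gather_shallow` are illustrative names, not functions from the library. The gather copies only the struct (pointer and size), never writes to the `const` input, and returns the exact number of entries so no zero-padding is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical coefficient type that owns heap memory via `limbs`. */
typedef struct
{
    long *limbs;
    size_t n;
} coeff_t;

/* Number of indices i < len with i = j (mod m), for 0 <= j < m. */
size_t residue_length(size_t len, size_t j, size_t m)
{
    assert(m >= 1);
    return (j < len) ? (len - j + m - 1) / m : 0;
}

/* Shallow gather: tmp[i] aliases poly[j + i*m].
   poly is read-only, so this is safe when other threads read it too.
   The entries of tmp must not be cleared or freed by the caller. */
size_t gather_shallow(coeff_t *tmp, const coeff_t *poly,
                      size_t len, size_t j, size_t m)
{
    size_t i, n = residue_length(len, j, m);
    for (i = 0; i < n; i++)
        tmp[i] = poly[j + i * m];   /* copies the pointer, not the limbs */
    return n;
}
```

A swap-based gather would temporarily leave `poly` in a modified state, which is what makes it unsafe for a read-only argument. With the shallow copy, only the array `tmp` itself has to be allocated and released.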

Indeed, this algorithm could be implemented more cleanly if gr_dot supported strided access, which is unfortunately not the case.

> I started working on gr_poly_evaluate_other_modular but stopped since I noticed that gr_dot_other still uses the naive method. We therefore cannot get a performance advantage over gr_poly_evaluate_other_rectangular but potentially worse numerical stability.

Yes, this will make sense to add when we have an overloadable gr_dot_other / gr_other_dot.

David-Berghaus · 2023-10-21T14:58:39Z

Oh yes, I was stupid and didn't think of potential multi-threading!

I replaced swap with set_shallow now.

fredrik-johansson · 2023-10-21T16:25:43Z

Just some minor style issues to fix: use /* */ comments, and put { on its own line.

fredrik-johansson · 2023-10-21T16:28:19Z

BTW, gr_fmpz_poly_evaluate_modular is documented but does not seem to be part of the PR.

David-Berghaus · 2023-10-23T10:38:38Z

Done and fixed!

fredrik-johansson · 2023-10-24T17:38:03Z

Nice!

fredrik-johansson · 2023-10-24T17:41:51Z

I pushed a small optimization: the variable tmp_gr is not needed if you compute one more entry in the x vector.

David-Berghaus added 3 commits October 20, 2023 15:34

First code dump

66c3ecd

Cleaned up code

bf3066f

Added docs

a7a3490

Replaced swap with set_shallow

c30843f

David-Berghaus added 2 commits October 23, 2023 12:34

Fix formatting

1d2c4aa

Removed undefined documentation

818b930

fredrik-johansson merged commit 52cf466 into flintlib:trunk Oct 24, 2023
16 checks passed
