Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add modular splitting #1542

Merged

Conversation

David-Berghaus
Copy link
Contributor

@David-Berghaus David-Berghaus commented Oct 20, 2023

Added modular splitting as described in Brent, Zimmermann: Modern Computer Arithmetic, Section 4.4.3.

Remarks:
-I ended up not using set_shallow because the swap-approach has the convenience that all coefficients that are out of bounds are simply zero entries which are ignored by the dot product routines.
-I started working on gr_poly_evaluate_other_modular but stopped since I noticed that gr_dot_other still uses the naive method. We therefore cannot get a performance advantage over gr_poly_evaluate_other_rectangular but potentially worse numerical stability.

@fredrik-johansson
Copy link
Collaborator

This will be a nice algorithm to have!

The code needs a little bit of work. You can't safely swap entries from poly because this is a read-only argument: it may be read simultaneously from another thread, for example. So tmp should actually be allocated shallowly and set_shallow should be used. See gr_mat/nonsingular_solve_tril.c for an example. For the dot product where you're currently zero-padding, you should be able to calculate the actual length.

Indeed, this algorithm could be implemented more cleanly if gr_dot supported strided access which is unfortunately not the case.

-I started working on gr_poly_evaluate_other_modular but stopped since I noticed that gr_dot_other still uses the naive method. We therefore cannot get a performance advantage over gr_poly_evaluate_other_rectangular but potentially worse numerical stability.

Yes, this will make sense to add when we have an overloadable gr_dot_other / gr_other_dot.

@David-Berghaus
Copy link
Contributor Author

Oh yes, I was stupid and didn't think of potential multi-threading!

I replaced swap with set_shallow now.

@fredrik-johansson
Copy link
Collaborator

Just some minor style issues to fix: use /* */ comments, { on new lines.

@fredrik-johansson
Copy link
Collaborator

BTW, gr_fmpz_poly_evaluate_modular is documented but does not seem to be part of the PR.

@David-Berghaus
Copy link
Contributor Author

Done and fixed!

@fredrik-johansson
Copy link
Collaborator

Nice!

@fredrik-johansson fredrik-johansson merged commit 52cf466 into flintlib:trunk Oct 24, 2023
16 checks passed
@fredrik-johansson
Copy link
Collaborator

I pushed a small optimization: the variable tmp_gr is not needed if you compute one more entry in the x vector.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants