Feature/factor fix gnu tests with num prime #6110
Impressive. How many lines does it remove?
@sylvestre It removed ~1400 lines of code. As you requested, I also looked into performance. PR branch with num_prime crate: At first glance it might look promising, but there is actually a factor of 1000 difference (µs vs. ms) :-/ That's sad.
Are you sure you built in release mode? Thanks.
OK, I found the issue with the performance comparison. I adapted the performance test such that it now compares the full factorization of
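For readers who want to reproduce this kind of comparison, here is a minimal std-only timing sketch. It is not the benchmark used in this PR: `best_of`, `is_debug_build`, and the trial-division workload `count_prime_factors` are hypothetical names introduced for illustration. It also shows one way to detect an unoptimized build, since `cfg!(debug_assertions)` is true for a default `cargo build` without `--release`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// True for builds with debug assertions enabled (the default without `--release`).
fn is_debug_build() -> bool {
    cfg!(debug_assertions)
}

/// Runs `f` over all `inputs`, `runs` times, and returns the fastest pass.
fn best_of<F: Fn(u64) -> u64>(f: F, inputs: &[u64], runs: usize) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        for &n in inputs {
            // black_box keeps the optimizer from discarding the work.
            black_box(f(black_box(n)));
        }
        best = best.min(start.elapsed());
    }
    best
}

/// Stand-in workload (assumption): counts prime factors, with multiplicity,
/// by trial division.
fn count_prime_factors(mut n: u64) -> u64 {
    let mut count = 0;
    let mut d = 2u64;
    while d <= n / d {
        while n % d == 0 {
            count += 1;
            n /= d;
        }
        d += if d == 2 { 1 } else { 2 };
    }
    if n > 1 {
        count += 1;
    }
    count
}
```

Calling `best_of(count_prime_factors, &inputs, 10)` once per candidate implementation, on the same inputs, and checking `is_debug_build()` first, avoids comparing an optimized build against an unoptimized one.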
@sylvestre what can I do about this error in the cargo-deny step? I've never had this issue before.
Ignore the warning in deny.toml.
Maybe update the benchmark accordingly?
I submitted the changes I made for the direct comparison here: You can see that I also tested several other third-party crates:
But none of them had better performance results than
Cool! For performance, it would be interesting to see if we can use
Just a humble user of uutils' factor here, but I would not recommend removing all of this code. @nbraud put years of work into this and it looks like the
Considering that the performance of uutils' factor is already up to 15× slower than GNU factor and over 20× slower than other programs (see the recent benchmarks in #1456), if the performance of this
@tdulcet You raise some good points. The work by @nbraud is awesome, as shown by the fact that most of the libraries out there are slower than their implementation. However, I don't think anybody currently maintaining the project has the expertise or time to really improve this further within this repository. Therefore, I think we should actually rely on another library that stands more on its own. The way I see it, there are several options:
I really hope that someone in the Rust ecosystem comes along who can write factor algorithms on par with GNU, but I'm pretty sure it's not going to be us.
This is why I mentioned
Option 1 would be great if such a Rust library exists, but based on @cre4ture's benchmarks above, it does not seem like this is the case currently. I did benchmark a library in #1456 (comment) that is 3.5× faster than GNU factor for 64-bit numbers, 29× faster for 96-bit numbers and 334× faster for 127-bit numbers, but it is written in C++. I am not sure how difficult it might be to port to Rust. Option 2 may be the best option, as it would allow uutils to retain full control of the performance. If it were possible to mend fences with @nbraud, they would obviously be the most qualified contributor to develop this new library. Something like CodSpeed could be used to measure the performance in the CI.
It is definitely worth a try, but it looks like this function uses the same algorithms and thus is still missing several of those used by GNU factor.
The existing code actually only supports
Note that GNU factor has three code paths: one for integers of up to 63 bits, one for integers of up to 127 bits, and one for arbitrary-precision integers using the GNU MP library. I suspect uutils' factor may need to do the same for performance, similar to what you were proposing with
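The three-way split described above can be sketched as a dispatch on input size. This is only an illustration under stated assumptions: `CodePath` and `classify` are hypothetical names, the 63-bit and 127-bit thresholds are taken from the comment above rather than from the GNU source, and the arbitrary-precision case just keeps the digit string instead of calling a bignum library.

```rust
/// Which code path a decimal input would take (hypothetical names).
#[derive(Debug, PartialEq)]
enum CodePath {
    /// Fits in 63 bits: single machine word arithmetic.
    SingleWord(u64),
    /// Fits in 127 bits: double-word arithmetic.
    DoubleWord(u128),
    /// Anything larger: would be handed to an arbitrary-precision backend.
    Arbitrary(String),
}

/// Picks a code path for a decimal string; returns None for non-numeric input.
fn classify(input: &str) -> Option<CodePath> {
    let digits = input.trim();
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    match digits.parse::<u128>() {
        Ok(n) if n < (1u128 << 63) => Some(CodePath::SingleWord(n as u64)),
        Ok(n) if n < (1u128 << 127) => Some(CodePath::DoubleWord(n)),
        // Either >= 2^127 or too large for u128 altogether.
        _ => Some(CodePath::Arbitrary(digits.to_string())),
    }
}
```

Each variant can then be routed to an implementation tuned for that width, so small inputs never pay for big-integer arithmetic.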
I'm honestly still surprised that no good Rust library has emerged yet. It feels like an obvious thing to tackle.
I added
In general, I need to state that I'm far from being an expert in prime factorization, so maybe I'm doing something wrong in my comparison. Everybody is welcome to review it or make suggestions. My current branch for the performance comparison is here: https://github.com/cre4ture/coreutils/tree/test/factor_compare_different_implementations .
I'm with @tertsdiepraam. I think the coreutils crate is not an ideal place to maintain an implementation of a factorization algorithm. I assume there are many use cases out there that could benefit from a smart implementation, so it's clearly worth having it in its own crate. Of course, I respect the authors of the uutils implementation. I could never have done it like this.
The good thing is that the interface to all the libraries I tested is small and always similar, so exchanging the library that implements the algorithm is low effort. Even if we drop the uutils implementation for now, we could easily re-introduce it in the future if someone improves it.
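To illustrate why a small, uniform interface makes the backend easy to swap, here is a minimal sketch. The `Factorizer` trait, the `TrialDivision` backend, and `format_factors` are hypothetical names made up for this example; they are not the uutils code or the API of any of the crates tested, and trial division is used only because it is easy to verify.

```rust
use std::collections::BTreeMap;

/// Hypothetical backend interface: prime factors mapped to their exponents.
trait Factorizer {
    fn factorize(&self, n: u64) -> BTreeMap<u64, u32>;
}

/// Reference backend (assumption): plain trial division, slow but simple.
struct TrialDivision;

impl Factorizer for TrialDivision {
    fn factorize(&self, mut n: u64) -> BTreeMap<u64, u32> {
        let mut factors = BTreeMap::new();
        let mut d = 2u64;
        // `d <= n / d` is `d * d <= n` without the risk of overflow.
        while d <= n / d {
            while n % d == 0 {
                *factors.entry(d).or_insert(0) += 1;
                n /= d;
            }
            d += if d == 2 { 1 } else { 2 };
        }
        if n > 1 {
            *factors.entry(n).or_insert(0) += 1;
        }
        factors
    }
}

/// Formats output in the style of `factor`, e.g. "60: 2 2 3 5".
/// Only this function would need to know which backend is plugged in.
fn format_factors(backend: &dyn Factorizer, n: u64) -> String {
    let mut out = format!("{n}:");
    for (prime, exponent) in backend.factorize(n) {
        for _ in 0..exponent {
            out.push_str(&format!(" {prime}"));
        }
    }
    out
}
```

Swapping in another library would then mean writing one more `impl Factorizer for ...` block, leaving the callers untouched.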
Nope, we love this kind of work, thanks :)
This is terrific!
@sylvestre How do I ignore the issue with the duplicate "hashbrown" crate? I think to fix it we would need to upgrade both sides, as both are using an outdated version of hashbrown. The latest is 0.14.3.
@cre4ture You can add the
@sylvestre Can we merge this PR, or is there anything still open?
Yeah, let's take it.
And CI fully green!
Well done!
I'm honestly surprised: I hadn't found a not-horribly-slow and actively maintained lib for this, back when I started working on
I had planned to implement some methods based on square forms, elliptic curves (EECM and HECM) and possibly the quadratic sieve once I got around to implementing arbitrary-sized inputs... but instead discovered that the I/O path was the main performance issue at the time. I still fixed the most glaring issues there, but the lack of deep interest in this particular bit of I/O plumbing, and a hostile maintainer, killed all the momentum and interest I had going.
PS: Thanks for the kind words, though. ❤️
I could potentially be interested in doing just that, either within an existing lib or extracting my last commit from
This addresses #6109, which is about the failing GNU test tests/factor/factor.pl.
Changes:
num_prime (Apache-2.0 license). This allows for input numbers > u128.