Update sparse-dot-topn to v1 #77
Thanks for sharing this, I completely missed this new release!
Thanks for already sorting this, I might have missed it otherwise. Indeed, the code expects it to be left sorted. I see that the tests fail but they also use quite old Python versions which have minimal/no support anymore. Could that be the issue?
Ah yes, I hadn't realized. We don't support 3.7 because the binding library requires 3.8+. We could condition the minimum version of sparse-dot-topn on the Python version and add a thin wrapper around the old API to make it compatible with the new one.
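A rough sketch of what such a thin wrapper could look like is below. It is not the code from this PR: `use_legacy_api` and `make_topn_adapter` are hypothetical helper names, and the keyword names (`ntop` and `lower_bound` for the old API, `top_n` and `threshold` for the new-style call) are assumptions that should be checked against the installed versions. The legacy function is passed in as an argument so the sketch runs without either version of the library installed.

```python
import sys


def use_legacy_api(version_info=None):
    """Return True when the interpreter is older than 3.8 (hypothetical helper)."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) < (3, 8)


def make_topn_adapter(legacy_fn):
    """Wrap an old-style top-n function behind new-style argument names.

    Assumption: the legacy function accepts `ntop` and `lower_bound` keywords.
    """

    def sp_matmul_topn_compat(A, B, top_n, threshold=0.0):
        # Translate the new-style names to the assumed legacy keywords.
        return legacy_fn(A, B, ntop=top_n, lower_bound=threshold)

    return sp_matmul_topn_compat


# Stand-in for the legacy function, used only to show how arguments are forwarded.
def fake_legacy(A, B, ntop, lower_bound=0):
    return {"A": A, "B": B, "ntop": ntop, "lower_bound": lower_bound}


compat = make_topn_adapter(fake_legacy)
forwarded = compat("A", "B", top_n=5, threshold=0.1)
```

On 3.7 the calling code would pick the adapter, and on 3.8+ it would call the new API directly, so the rest of the code base only ever sees one call signature.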
Yes, my bad, I changed the filename last minute. I'll push a fix in a bit.
6a99cdb to ae1b93b
Thanks for the changes, it seems that the pipeline has problems importing the function you created.
ae1b93b to 1ff33b4
Sorry, my laptop doesn't support 3.7 so I perhaps relied too much on the CI/CD for this to work smoothly. I figured out the issue, apparently
It seems that there are some tests failing. Not sure why that is happening though.
Hi @MaartenGr, sorry for the stall on this, I'm hoping to pick this back up soon. |
Hi @MaartenGr,
We recently refactored sparse-dot-topn significantly and released v1.
The most significant improvements are:
The changes are significant enough that we released a new major version which deprecates `awesome_cossim_topn`.
I also noticed that you encountered a bug when top-n is 1. I added a test case for this and the issue no longer exists.
The new implementation does not sort the scores but rather returns the matrix in the order as if you didn't select the top-n, i.e. `sp_matmul(A, B) == sp_matmul_topn(A, B, B.shape[1])`.
It wasn't directly clear to me if you (implicitly) depend on the result being sorted so I left sorting on (it has no performance penalty).
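To make the ordering difference concrete, here is a small pure-Python sketch. It is an illustration only, not the library's implementation: `matmul` and `topn_per_row` are hypothetical helpers working on dense lists, and each row is returned as a list of `(column, value)` pairs. Without sorting, the kept entries stay in column order, so asking for as many entries as there are columns gives back the plain product; with sorting, each row is ordered by descending score.

```python
def matmul(A, B):
    """Dense matrix product of two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def topn_per_row(C, top_n, sort=False):
    """Keep the top_n largest nonzero entries of each row as (column, value) pairs."""
    result = []
    for row in C:
        entries = [(j, v) for j, v in enumerate(row) if v != 0]
        # Pick the top_n largest values; the stable sort keeps the lower column on ties.
        kept = sorted(entries, key=lambda e: -e[1])[:top_n]
        if not sort:
            # Restore column order, as a plain sparse product would store the row.
            kept.sort(key=lambda e: e[0])
        result.append(kept)
    return result


A = [[1, 0, 1], [0, 1, 0]]
B = [[2, 1, 0], [0, 3, 1], [0, 0, 5]]
C = matmul(A, B)

full = topn_per_row(C, len(C[0]))            # every nonzero entry, column order
top2_unsorted = topn_per_row(C, 2)           # top 2 per row, column order
top2_sorted = topn_per_row(C, 2, sort=True)  # top 2 per row, descending score
```

Code that reads the first stored entry of each row as the best match only works with the sorted variant, which is why sorting is left on here.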