I found test_sampling.cu, but it only tests FP32. I tried using FP16, and it does not work.
It's easy to add support for fp16:
In
flashinfer/python/csrc/sampling.cu
Lines 39 to 40 in d81af97
Lines 33 to 40 in d81af97
flashinfer/python/csrc/pytorch_extension_utils.h
Line 26 in d81af97
But as you mentioned, fp16 might fail in some extreme cases, because after rounding to fp16 the probabilities might no longer sum to 1.
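To make the rounding concern concrete, here is a small standalone sketch in plain Python (standard library only). It is not FlashInfer code: `to_fp16` and `naive_inverse_cdf` are hypothetical helpers written for this illustration, and the sampler is a generic inverse-CDF loop, not the kernel in sampling.cu. It rounds a uniform distribution to half precision, sums the rounded values exactly, and shows that a uniform draw above the remaining mass selects no index.

```python
import struct
from fractions import Fraction


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def exact_mass(probs) -> Fraction:
    """Sum the values exactly, so the only error left is the fp16 rounding."""
    return sum((Fraction(p) for p in probs), Fraction(0))


def naive_inverse_cdf(probs, u: float):
    """Generic inverse-CDF sampling (illustration only, not FlashInfer's kernel).

    Returns the first index whose cumulative mass exceeds u, or None if the
    total mass never reaches u.
    """
    acc = Fraction(0)
    for i, p in enumerate(probs):
        acc += Fraction(p)
        if Fraction(u) < acc:
            return i
    return None


# Three equally likely tokens: 1/3 is not representable in fp16.
probs16 = [to_fp16(1.0 / 3) for _ in range(3)]
mass_small = exact_mass(probs16)

# A larger, hypothetical vocabulary size; each entry falls in the fp16
# subnormal range, so the relative rounding error is larger.
vocab = 32000
mass_large = exact_mass([to_fp16(1.0 / vocab)] * vocab)

print("3 tokens, total mass:", float(mass_small))
print("32000 tokens, total mass:", float(mass_large))
print("u=0.5    ->", naive_inverse_cdf(probs16, 0.5))
print("u=0.9999 ->", naive_inverse_cdf(probs16, 0.9999))
```

In this sketch the draw `u = 0.9999` lies above the total fp16 mass, so the loop falls through without choosing a token. A real kernel needs some policy for that case (for example renormalizing, or falling back to the last valid index); which policy FlashInfer uses is not stated in this thread.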