Hello,

I would like to try out word-level knowledge distillation (https://arxiv.org/abs/1606.07947), and for this I need the probabilities of all tokens (or at least the top-k ones) at each decoding step. I see that it is already possible to print the probability of the generated token with the `with_token_level` flag, but it is not clear to me how to modify the code to get the probabilities of the top-k tokens at each step.
Any help is appreciated,
Thanks,
Z
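
To make the request concrete, here is a minimal sketch of the output I have in mind. It assumes the per-step unnormalized scores (logits) over the vocabulary can be pulled out of the decoder as plain lists; the function name `topk_token_probs` and the toy inputs are hypothetical and not part of any library.

```python
import math


def topk_token_probs(step_logits, k):
    """Return, for each decoding step, the k most probable (token_id, prob) pairs.

    step_logits is a list with one entry per decoding step, each entry being a
    list of unnormalized scores over the vocabulary. This input format is an
    assumption; how to obtain it depends on the toolkit.
    """
    results = []
    for logits in step_logits:
        # Numerically stable softmax: subtract the max before exponentiating.
        max_logit = max(logits)
        exps = [math.exp(x - max_logit) for x in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Rank token ids by descending probability and keep the first k.
        top_ids = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        results.append([(i, probs[i]) for i in top_ids])
    return results


# Toy example: 2 decoding steps over a 4-token vocabulary.
toy_logits = [[2.0, 1.0, 0.1, -1.0], [0.0, 0.0, 3.0, 1.0]]
topk = topk_token_probs(toy_logits, k=2)
```

Each entry of `topk` holds the top-k token ids and their probabilities for one step. The kept probabilities do not sum to 1, so they would probably need renormalizing before being used as soft targets for the student model.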