[Feature] How to get the token score (logprob) of the greedy decoder? #2652

Open · Wondersui opened this issue Oct 24, 2024 · 1 comment

Motivation

I want to get the token scores (logprobs) of the output tokens when using greedy search decoding.

Related resources

If you want to get the score (logprob) of the output tokens, you can currently only use top-k sampling. If you set k=1, the reported score of every output token is 1, and if k is greater than 1, top-k sampling produces different outputs on each run, which is not what I expect. Is there any way to get the scores of the output tokens with greedy decoding?
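
For reference, a minimal sketch of what I mean (this assumes the logprobs field on GenerationConfig and on the returned response; the exact fields may differ between versions):

from lmdeploy import pipeline, GenerationConfig

pipe = pipeline('/nvme/shared/vicuna-7b-v1.5/')

# With top_k=1 the sampling distribution is renormalized over a single
# candidate token, so the reported probability of every output token is 1.
output = pipe('hello', gen_config=GenerationConfig(top_k=1, logprobs=1))
print(output.logprobs)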

Additional context

none

irexyc (Collaborator) commented Oct 25, 2024

Without considering efficiency, you can run inference twice and compute the scores manually with the existing APIs.

import torch
from lmdeploy import pipeline, GenerationConfig

pipe = pipeline('/nvme/shared/vicuna-7b-v1.5/', log_level='INFO')
messages = 'hello'  # or openai format

# First pass: greedy decoding (top_k=1 always picks the argmax token).
output = pipe(messages, gen_config=GenerationConfig(top_k=1))

# Rebuild the exact token sequence the model saw: the chat-template-decorated
# prompt followed by the generated tokens.
decorated = pipe.chat_template.messages2prompt(messages)
prompt_tokens = pipe.tokenizer.encode(decorated)
all_tokens = prompt_tokens + output.token_ids

# Second pass: one forward pass over the full sequence to recover the logits.
all_logits = pipe.get_logits(all_tokens)
all_scores = torch.softmax(all_logits, dim=-1)

# The logits at position i predict token i + 1, so the scores of the generated
# tokens start at the last prompt position and stop one before the end.
gen_scores = all_scores[0, len(prompt_tokens) - 1:-1]
for i, tk in enumerate(output.token_ids):
    print(i, tk, gen_scores[i, tk])
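
Note that softmax gives probabilities; if you want log-probabilities, apply torch.log_softmax to the same logits instead, e.g. continuing from the snippet above:

# Continuing from the snippet above: logprobs instead of probabilities.
gen_logprobs = torch.log_softmax(all_logits, dim=-1)[0, len(prompt_tokens) - 1:-1]
for i, tk in enumerate(output.token_ids):
    print(i, tk, gen_logprobs[i, tk].item())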
