TensorRT - avg_logprob #22

Open · OValery16 opened this issue Feb 20, 2024 · 4 comments
Labels: enhancement (New feature or request)

Comments

@OValery16

Thanks for your really impressive work.

I was wondering how to extract the token probability with the TensorRT backend (a bit like what you did in this example with CTranslate2):

import whisper_s2t

model = whisper_s2t.load_model(model_identifier="large-v2", backend='CTranslate2')

files = ['data/KINCAID46/audio/1.wav']
lang_codes = ['en']
tasks = ['transcribe']
initial_prompts = [None]

out = model.transcribe_with_vad(files,
                                lang_codes=lang_codes,
                                tasks=tasks,
                                initial_prompts=initial_prompts,
                                batch_size=32)

print(out[0][0])
"""
[Console Output]

{'text': "Let's bring in Phil Mackie who is there at the palace. We're looking at Teresa and Philip May. Philip, can you see how he's being transferred from the helicopters? It looks like, as you said, the beast. It's got its headlights on because the sun is beginning to set now, certainly sinking behind some clouds. It's about a quarter of a mile away down the Grand Drive",
 'avg_logprob': -0.25426941679184695,
 'no_speech_prob': 8.147954940795898e-05,
 'start_time': 0.0,
 'end_time': 24.8}
"""
@OValery16 (Author)

I'm asking about this because having access to these scores would let us implement a language-detection method, for example along the lines of the sketch below.
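
For context, the step I have in mind is roughly Whisper-style language detection: run one decoding step after <|startoftranscript|> and softmax the logits restricted to the language tokens. A minimal sketch, assuming those logits can be retrieved from the backend; the lang_token_ids mapping and the function itself are hypothetical, not part of whisper_s2t:

import torch

def detect_language(sot_logits, lang_token_ids):
    # sot_logits:     [vocab_size] decoder logits for the single step taken
    #                 right after the <|startoftranscript|> token.
    # lang_token_ids: mapping of language code -> language token id,
    #                 e.g. {"en": 50259, "fr": 50265}  (ids are illustrative).
    ids = torch.tensor(list(lang_token_ids.values()))
    # Softmax over the language tokens only, ignoring the rest of the vocabulary.
    probs = torch.softmax(sot_logits[ids].float(), dim=-1)
    best = int(torch.argmax(probs))
    return list(lang_token_ids.keys())[best], float(probs[best])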

@shashikg (Owner)

@OValery16 NVIDIA/TensorRT-LLM#1127

@yuekaizhang

(quotes @OValery16's original post above)

You may check https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/runtime/generation.py#L2267. You may also need to register the logits as one of the output tensors; once they are available, avg_logprob can be computed as in the sketch below.
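
Once the per-step logits for the generated sequence are exposed, avg_logprob is just the mean log-probability of the chosen tokens. A minimal sketch under that assumption (the function and tensor names are hypothetical, not TensorRT-LLM API):

import torch

def avg_logprob(step_logits, generated_ids):
    # step_logits:   [seq_len, vocab_size] decoder logits, one row per decoding
    #                step (e.g. gathered after registering the logits as an
    #                output tensor of the engine).
    # generated_ids: [seq_len] int64 ids of the tokens chosen at each step.
    log_probs = torch.log_softmax(step_logits.float(), dim=-1)            # [seq_len, vocab]
    token_lp = log_probs.gather(1, generated_ids.view(-1, 1)).squeeze(1)  # [seq_len]
    return float(token_lp.mean())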

shashikg added the enhancement (New feature or request) label on Mar 1, 2024
@OValery16 (Author)

Thanks for the tips. I don't see how to do it without modifying the TensorRT-LLM library, though. Do you know how?
