Return result with best log prob when all temperature fallbacks failed #356

hoonlight · 2023-07-17T07:43:17Z

hoonlight · 2023-07-19T11:43:13Z

The improvements suggested in openai/whisper#1377 (comment) have been applied.

I'm still testing, but there seems to be a noticeable reduction in hallucinations and an overall improvement in transcription quality.
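For readers who want the idea in code form, here is a minimal sketch of the selection rule described by the PR title and the "Filter out results with compression_ratio" commit. It is not the actual `faster_whisper/transcribe.py` code: the `Attempt` container, the `decode` callable, the function name, and the default thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Attempt:
    """Hypothetical container for one decoding attempt at a given temperature."""

    text: str
    avg_logprob: float
    compression_ratio: float
    temperature: float


def decode_with_fallback(
    decode: Callable[[float], Attempt],  # hypothetical: runs one decoding pass
    temperatures: Sequence[float],
    compression_ratio_threshold: Optional[float] = 2.4,  # assumed default
    log_prob_threshold: Optional[float] = -1.0,  # assumed default
) -> Attempt:
    if not temperatures:
        raise ValueError("at least one temperature is required")

    all_attempts: List[Attempt] = []
    below_cr_threshold: List[Attempt] = []

    for temperature in temperatures:
        attempt = decode(temperature)
        all_attempts.append(attempt)
        needs_fallback = False

        if compression_ratio_threshold is not None:
            if attempt.compression_ratio > compression_ratio_threshold:
                needs_fallback = True  # output looks too repetitive
            else:
                below_cr_threshold.append(attempt)

        if log_prob_threshold is not None and attempt.avg_logprob < log_prob_threshold:
            needs_fallback = True  # average log probability is too low

        if not needs_fallback:
            return attempt  # first attempt that passes every check wins

    # Every temperature failed: instead of returning the last attempt,
    # return the one with the highest average log probability, preferring
    # attempts that at least passed the compression ratio check.
    return max(below_cr_threshold or all_attempts, key=lambda a: a.avg_logprob)
```

With three failing attempts whose `avg_logprob` values are -1.2, -1.1 (but with a compression ratio above the threshold), and -1.6, this sketch returns the first one, whereas returning the last attempt would have kept the -1.6 result.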

guillaumekln

Since this PR only changes the final result of the temperature fallback, which would be very different from openai/whisper anyway, we could merge it before it is accepted in openai/whisper.

Here are some comments:

faster_whisper/transcribe.py

hoonlight · 2023-07-19T15:33:07Z

Thank you, I have changed the code according to your advice.

faster_whisper/transcribe.py

hoonlight · 2023-07-19T16:43:30Z

Please review again. Also, I think the PR title could be more intuitive; please rename it to whatever you think is appropriate.

hoonlight · 2023-07-20T07:41:44Z

There seems to be another benefit to this change.

Before this change, quite a few sentences with avg_logprob < -1 were actually fine.
After this change, along with the overall improvement in transcription quality, fewer sentences with avg_logprob < -1 are output, and those that are output are much more likely to be hallucinations.

So I was able to filter out the segments with avg_logprob < -1.5 and further reduce the hallucinations.
For users who want to reduce hallucinations at the cost of missing some sentences, this might be helpful.

Of course, I haven't tested every case, so this won't always guarantee better results.
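The filtering step mentioned above could look roughly like the following. This is only a sketch: the `Segment` dataclass is a hypothetical stand-in for whatever segment objects the transcription returns (only `text` and `avg_logprob` are used), and the -1.5 cutoff is the value from this comment, not a recommended default.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Segment:
    """Hypothetical stand-in for a transcription segment."""

    text: str
    avg_logprob: float


def drop_low_confidence(
    segments: Iterable[Segment],
    min_avg_logprob: float = -1.5,
) -> Iterator[Segment]:
    """Yield only the segments whose avg_logprob is at or above the cutoff."""
    for segment in segments:
        if segment.avg_logprob >= min_avg_logprob:
            yield segment
```

Raising the cutoff removes more likely hallucinations but also drops more genuine sentences, so it is worth tuning on your own audio.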

guillaumekln

Looks good to me. Thanks!

* Fix broken prompt_reset_on_temperature Fixing: SYSTRAN/faster-whisper#603 Broken because `generate_with_fallback()` doesn't return final temperature. Regression since PR356 -> SYSTRAN/faster-whisper#356
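To illustrate why the caller needs the final temperature, here is a hedged sketch of a prompt-reset rule. It assumes the reset is a simple comparison between the temperature that produced the accepted result and `prompt_reset_on_temperature`; the function name, its parameters, and the 0.5 default are assumptions for illustration, not the library's actual code.

```python
from typing import Sequence


def next_prompt_reset(
    all_tokens: Sequence[int],
    prompt_reset_since: int,
    temperature: float,  # temperature of the result that was finally returned
    condition_on_previous_text: bool = True,
    prompt_reset_on_temperature: float = 0.5,  # assumed default
) -> int:
    """Return the index from which previous tokens may be reused as prompt."""
    if not condition_on_previous_text or temperature > prompt_reset_on_temperature:
        # Discard the accumulated context: the next window starts without
        # the tokens produced so far.
        return len(all_tokens)
    return prompt_reset_since
```

If the fallback routine returns only the decoding result, the caller has no reliable `temperature` to pass here, which is consistent with the regression described in the commit message.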

hoonlight and others added 4 commits July 17, 2023 16:42

Resolve Inference Selection Bug

c860004

Refactor for better readability

0e58b3f

Merge branch 'guillaumekln:master' into return-best-result

c9e07ab

Filter out results with compression_ratio

ca28641

guillaumekln reviewed Jul 19, 2023

Refactor to avoid variable repetition

30e0efe

hoonlight requested a review from guillaumekln July 19, 2023 15:34

guillaumekln reviewed Jul 19, 2023

Fix incorrect index and perform minor refactoring

19c867a

hoonlight requested a review from guillaumekln July 19, 2023 16:43

Remove final_temperature variable

a578a18

guillaumekln approved these changes Jul 20, 2023

guillaumekln changed the title ~~Resolve Inference Selection Bug~~ Return result with best log prob when all temperature fallbacks failed Jul 20, 2023

guillaumekln merged commit e786e26 into SYSTRAN:master Jul 20, 2023
3 checks passed

hoonlight deleted the return-best-result branch July 20, 2023 14:23

Purfview mentioned this pull request Dec 5, 2023

Fix broken prompt_reset_on_temperature #604

Merged

TheoBoyer mentioned this pull request Dec 18, 2023

Resolve Inference Selection Bug Affecting Transcription Quality openai/whisper#1377

Open
