Return result with best log prob when all temperature fallbacks failed #356

Merged: 7 commits, Jul 20, 2023

hoonlight commented Jul 17, 2023

Related to openai/whisper#1377

hoonlight commented Jul 19, 2023

The improvements suggested in openai/whisper#1377 (comment) have been applied.

I'm still testing, but there seems to be a noticeable reduction in hallucinations and an overall improvement in transcription quality.
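
For readers unfamiliar with the fallback loop, the behavior named in the PR title can be sketched roughly as follows. This is a minimal illustration and not the code in faster_whisper/transcribe.py: `DecodeResult`, `pick_result_with_fallback`, and `decode_once` are hypothetical names, and the temperature schedule and thresholds are illustrative values assumed for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class DecodeResult:
    # Hypothetical container for one decoding attempt.
    text: str
    avg_logprob: float
    compression_ratio: float


def pick_result_with_fallback(
    decode_once: Callable[[float], DecodeResult],
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    log_prob_threshold: float = -1.0,
    compression_ratio_threshold: float = 2.4,
) -> DecodeResult:
    """Try each temperature in turn; if all fail, return the best attempt."""
    attempts: List[DecodeResult] = []
    for temperature in temperatures:
        result = decode_once(temperature)
        attempts.append(result)
        too_repetitive = result.compression_ratio > compression_ratio_threshold
        too_unlikely = result.avg_logprob < log_prob_threshold
        if not (too_repetitive or too_unlikely):
            # The first attempt that passes both checks is accepted.
            return result
    # Every temperature failed: pick the attempt with the highest average
    # log probability instead of whichever attempt happened to run last.
    return max(attempts, key=lambda r: r.avg_logprob)
```

The design choice is only in the last line: when no attempt passes the checks, the least-bad attempt by `avg_logprob` is returned.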

@guillaumekln guillaumekln left a comment

Since this PR only changes the final result of the temperature fallback, which would be very different from openai/whisper anyway, we could merge it before it is accepted in openai/whisper.

Here are some comments:

(3 review comments on faster_whisper/transcribe.py, outdated and resolved)
@hoonlight

Thank you, I have changed the code according to your advice.

(2 review comments on faster_whisper/transcribe.py, outdated and resolved)

hoonlight commented Jul 19, 2023

Please review again. Also, I think the PR title could be more intuitive; please rename it to whatever you think is appropriate.

hoonlight commented Jul 20, 2023

There seems to be another benefit to this change.

Before this change, quite a few sentences with avg_logprob < -1 were actually acceptable transcriptions.
After this change, along with an overall improvement in transcription quality, fewer sentences with avg_logprob < -1 are output, and those that are output are much more likely to be hallucinations.

So I was able to filter out the segments with avg_logprob < -1.5 and reduce hallucinations further.
This might be helpful for users who want to reduce hallucinations at the cost of missing some sentences.

Of course, I haven't tested every case, so results may vary and this won't always guarantee better output.
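
The filtering described above could look something like the following sketch. The `Segment` class here is a hypothetical stand-in that carries only the fields the example needs, and `drop_low_confidence` is an illustrative helper, not part of faster-whisper; the -1.5 cutoff is the value mentioned in the comment and may need tuning for other audio.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Segment:
    # Hypothetical stand-in for a transcription segment.
    text: str
    avg_logprob: float


def drop_low_confidence(
    segments: Iterable[Segment], min_avg_logprob: float = -1.5
) -> Iterator[Segment]:
    """Yield only segments whose avg_logprob is at or above the cutoff."""
    for segment in segments:
        if segment.avg_logprob >= min_avg_logprob:
            yield segment
```

Any object exposing an `avg_logprob` attribute should work with the same helper, which trades a few dropped sentences for fewer hallucinated ones.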

@guillaumekln guillaumekln left a comment

Looks good to me. Thanks!

guillaumekln changed the title from "Resolve Inference Selection Bug" to "Return result with best log prob when all temperature fallbacks failed" on Jul 20, 2023
guillaumekln merged commit e786e26 into SYSTRAN:master on Jul 20, 2023
3 checks passed
hoonlight deleted the return-best-result branch on July 20, 2023 14:23
nguyendc-systran pushed a commit that referenced this pull request Dec 13, 2023
* Fix broken prompt_reset_on_temperature

Fixing: #603

Broken because `generate_with_fallback()` doesn't return final temperature.

Regression since PR356 -> #356
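
To illustrate why the missing return value matters, here is a minimal sketch of one way a fallback helper could report the temperature of the attempt it selected, so the caller can compare it against `prompt_reset_on_temperature`. The names `fallback_with_temperature` and `should_reset_prompt`, and the 0.5 default, are assumptions made for this example; this is not the actual fix applied in the referenced commit.

```python
from typing import Callable, List, Sequence, Tuple


def fallback_with_temperature(
    decode_once: Callable[[float], float],
    temperatures: Sequence[float],
    log_prob_threshold: float = -1.0,
) -> Tuple[float, float]:
    """Return (avg_logprob, temperature) for the attempt that was selected."""
    attempts: List[Tuple[float, float]] = []
    for temperature in temperatures:
        avg_logprob = decode_once(temperature)
        attempts.append((avg_logprob, temperature))
        if avg_logprob >= log_prob_threshold:
            return avg_logprob, temperature
    # All attempts failed: report the temperature of the attempt actually
    # returned, which is not necessarily the last temperature tried.
    return max(attempts, key=lambda attempt: attempt[0])


def should_reset_prompt(
    temperature: float, prompt_reset_on_temperature: float = 0.5
) -> bool:
    """Decide whether the previous-text prompt should be discarded."""
    return temperature > prompt_reset_on_temperature
```

Without the temperature in the return value, the caller has no reliable way to evaluate the second function, which matches the breakage described in the commit message.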