Confidence-score per word #18

Open
raffiniert opened this issue Sep 14, 2024 · 2 comments

Comments

@raffiniert

Very nice library, thank you so much for all the efforts!

For my project, I need a confidence score not only for the final result (e.g. a complete sentence), but on a per-word basis. Currently, the confidence score per word is always returned as 0.

Is there any way you could implement this in an upcoming release? I would buy you coffee/beer in large amounts for it :-D

@jamsch
Owner

jamsch commented Sep 14, 2024

Hi @raffiniert, you can get the confidence scores per word segment through event.results[x].segments (there's a short sketch of reading these after the platform notes below), however there are a few things to note:

On Android, your mileage may vary depending on the recognition service you're using. This seems to be the only public GitHub repo that implements this functionality for RecognizerIntent.EXTRA_REQUEST_WORD_CONFIDENCE -- https://github.com/search?q=EXTRA_REQUEST_WORD_CONFIDENCE&type=code

Also of note for Android:

  • Segment confidences seem to only come back as -1 (RecognitionPart.CONFIDENCE_LEVEL_UNKNOWN), at least on my Samsung Galaxy S23 Ultra -- this will likely have to be fixed upstream on Google's end.
  • This feature is only available on SDK 34+ (Android 14+).
  • I've only verified that segments work with the com.google.android.as recognition service (using on-device speech recognition). I couldn't verify it working with com.google.android.tts or com.samsung.android.bixby.agent.
  • Segments are only available on the final result.
  • The segments are split up by word.
  • Segments are only available for the first transcript.
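
For reference, this is roughly how you'd target that on-device recognizer when starting recognition on Android 14+. It's only a sketch: the option names (requiresOnDeviceRecognition, androidRecognitionServicePackage) are my reading of the current docs, so double-check the README before relying on them.

```ts
import { ExpoSpeechRecognitionModule } from "expo-speech-recognition";

// Sketch: prefer the on-device recognizer on Android 14+, since that's the only
// service where segments were verified. Option names are assumptions based on
// the current docs -- verify against the README before use.
ExpoSpeechRecognitionModule.start({
  lang: "en-US",
  interimResults: true,
  requiresOnDeviceRecognition: true,
  androidRecognitionServicePackage: "com.google.android.as",
});
```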

For iOS:

  • The confidence score per segment will be 0 on partial results
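
To illustrate reading the per-word confidences on either platform, here's a minimal sketch using the library's result event. The exact field names on each segment (e.g. segment, confidence) are assumptions on my part, so check the library's TypeScript types for the real shape.

```ts
import { useSpeechRecognitionEvent } from "expo-speech-recognition";

function WordConfidenceLogger() {
  useSpeechRecognitionEvent("result", (event) => {
    // Segments are only populated on the final result (and only for the first transcript)
    if (!event.isFinal) {
      return;
    }
    for (const segment of event.results[0]?.segments ?? []) {
      // Assumed field names: `segment` (the word) and `confidence`
      // (0..1 on iOS; may be -1 / CONFIDENCE_LEVEL_UNKNOWN on some Android devices)
      console.log(segment.segment, segment.confidence);
    }
  });

  return null;
}
```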

So to wrap it up: not really working on Android, but it does work on iOS. Android will need upstream fixes for this feature to work. In the meantime, if you need this to be consistent across devices, there are two options you could explore:

  • You can record the recognized audio by calling ExpoSpeechRecognitionModule.start() with recordingOptions: { persist: true } and then send the recorded audio to a transcription service of your choice once speech recognition has ended (see the sketch after this list).
  • Otherwise, if app size isn't a concern, you may want to consider bundling your app with an on-device speech recognition model like whisper.rn.
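
For the first option, a rough sketch of persisting the recording is below. The start() call with recordingOptions: { persist: true } is as described above; where and how the file URI is reported back (I'm using an "audioend" event with a uri field here) is an assumption, so check the docs for the exact event and field names.

```ts
import {
  ExpoSpeechRecognitionModule,
  useSpeechRecognitionEvent,
} from "expo-speech-recognition";

// Start recognition and keep the recorded audio on disk for later re-transcription
function startWithPersistedRecording() {
  ExpoSpeechRecognitionModule.start({
    lang: "en-US",
    recordingOptions: {
      persist: true,
    },
  });
}

// Inside a component: (assumption) the persisted file's URI is reported on an
// audio event once recording ends -- verify the event/field names in the docs.
function RecordedAudioHandler() {
  useSpeechRecognitionEvent("audioend", (event) => {
    console.log("Recorded audio at:", event.uri);
    // ...upload this file to a transcription service that returns per-word confidences
  });
  return null;
}
```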

@raffiniert
Author

Man, you're awesome. I don't have time to deep-dive right now, but I just wanted to thank you for your reply straight away.

Hope the Android side gets fixed upstream soon; I'll look into the iOS solution ASAP. Thanks again!
