TTS responses with word alignment #894
-
Hey there, I wanted to ask whether there's a way to get word alignment information for the audio generated by the TTS API. For example, when providing "This is a test", we'd like to know at which point in the audio "This" is spoken (start / end). Ideally this would be available at the individual character level. I haven't found that info anywhere in the docs so far. Maybe you can help me out there?
-
Thanks for asking your question about Deepgram! If you didn't already include it in your post, please be sure to add as much detail as possible so we can assist you efficiently, such as:
-
Hi @rose-m, we don't currently offer word alignment info on our TTS output audio. If word timings are essential, you can transcribe the audio with our pre-recorded STT, which does provide word-level timestamps. However, that would incur an additional transcription cost, which may not be acceptable for your use case.
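For anyone who wants to try the workaround described above, here is a rough sketch of the round trip: synthesize the text, send the resulting audio to pre-recorded STT, and read the word timestamps from the response. Several things here are assumptions to check against the current API reference rather than confirmed details: the `/v1/speak` and `/v1/listen` URLs, the `Token` authorization header, the `aura-asteria-en` model name, the `audio/mpeg` content type, and the `results.channels[0].alternatives[0].words` response shape. The helper names are made up for illustration. Since character-level timings are not provided, `approx_char_timings` just spreads each word's duration evenly over its characters, which is only a crude estimate and not real alignment.

```python
import json
import urllib.request

# Assumed base URL; verify against the current API reference.
API_BASE = "https://api.deepgram.com/v1"


def _post(url, api_key, body, content_type):
    """POST raw bytes and return the raw response body."""
    req = urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Token {api_key}",  # assumed auth scheme
            "Content-Type": content_type,
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def synthesize(text, api_key, model="aura-asteria-en"):
    """Request TTS audio bytes (endpoint and model name are assumptions)."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return _post(f"{API_BASE}/speak?model={model}", api_key, payload, "application/json")


def transcribe(audio, api_key, content_type="audio/mpeg"):
    """Send audio bytes to pre-recorded STT and return the parsed JSON."""
    return json.loads(_post(f"{API_BASE}/listen", api_key, audio, content_type))


def word_timings(stt_response):
    """Extract (word, start, end) tuples, assuming the response shape named above."""
    words = stt_response["results"]["channels"][0]["alternatives"][0]["words"]
    return [(w["word"], w["start"], w["end"]) for w in words]


def approx_char_timings(word, start, end):
    """Spread a word's duration evenly over its characters (rough estimate only)."""
    if not word:
        return []
    step = (end - start) / len(word)
    return [(ch, start + i * step, start + (i + 1) * step) for i, ch in enumerate(word)]


if __name__ == "__main__":
    import os

    key = os.environ["DEEPGRAM_API_KEY"]  # hypothetical environment variable name
    audio = synthesize("This is a test", key)
    for word, start, end in word_timings(transcribe(audio, key)):
        print(f"{word}: {start:.2f}s to {end:.2f}s")
```

Keep in mind that the transcript may not match the input text exactly (casing, punctuation, or a misrecognized word), so you may need to align the returned words against your original text before relying on the timings.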