Which model did you use, and can you explain how you did this? I want to do it for Japanese: none of the Japanese wav2vec2 models I found work well, and the English one actually works best for me. It would be helpful if you shared how you used the multilingual one.
I don't know much about ML, but I was able to use the following tutorial to do forced alignment on multilingual transcriptions. The only requirement is to romanize the transcript first, which I did with the uroman package: https://pytorch.org/audio/stable/tutorials/forced_alignment_for_multilingual_data_tutorial.html
The tutorial uses a Wav2Vec2 model under the hood, and with it I successfully aligned multiple languages. There's an extra step involved in mapping the aligned words back to the original (non-romanized) words, but that's pretty much it.
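That mapping step can be sketched in a few lines. This is just an illustration, not code from the tutorial: it assumes you romanized the transcript word by word, so the aligner's output spans come back in the same order and count as the original words, and the names (`map_alignments`, the span tuples) are made up for the example.

```python
# Hedged sketch: map word-level alignment spans from the romanized
# transcript back to the original (non-romanized) words.
# Assumption: romanization was done word by word, so there is exactly
# one aligned romanized token per original word, in the same order.

def map_alignments(original_words, aligned_spans):
    """original_words: list of words in the original script.
    aligned_spans: list of (romanized_word, start_sec, end_sec)
    tuples, as produced by the forced aligner, same order."""
    if len(original_words) != len(aligned_spans):
        raise ValueError("word count mismatch; romanize word by word")
    # Pair each original word with the time span of its romanized form.
    return [
        (orig, start, end)
        for orig, (_rom, start, end) in zip(original_words, aligned_spans)
    ]

# Hypothetical example with Japanese words and made-up timestamps.
words = ["こんにちは", "世界"]
spans = [("konnichiha", 0.00, 0.62), ("sekai", 0.70, 1.10)]
print(map_alignments(words, spans))
```

If romanization can split or merge words (uroman sometimes does for some scripts), you'd need a slightly smarter version that groups romanized tokens per original word before zipping, but for word-by-word romanization the index pairing above is all there is to it.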
Thoughts?