The VAD model plays a crucial role in the WhisperX pipeline and can significantly affect speech recognition performance and inference time. It is therefore important to extend the application to accept alternative VAD methods, which need not come from the pyannote-audio toolkit (as the default VAD model does). Silero VAD is an ideal candidate for an alternative VAD option: it achieves excellent results on speech detection tasks while running only on CPUs. In addition, it is listed as a high-priority TODO item in the WhisperX repository.
This feature includes:
Implementation of Silero VAD as an alternative VAD option.
Extension of WhisperX to accept VAD alternatives that do not necessarily come from the pyannote-audio toolkit.
Fix to the imports in whisperx\__init__.py.
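To illustrate the second item, here is a minimal sketch of how a toolkit-independent VAD option could be structured. It is not the code from the pull request and not the WhisperX API: `VadBackend`, `SpeechSegment`, `EnergyVad`, `VAD_BACKENDS` and `load_vad` are all hypothetical names, and `EnergyVad` is only a toy stand-in for a real backend such as Silero VAD or the pyannote-audio model.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Sequence, Type


@dataclass
class SpeechSegment:
    """A detected speech region, in seconds."""
    start: float
    end: float


class VadBackend(ABC):
    """Hypothetical common interface that every VAD option would implement."""

    @abstractmethod
    def detect(self, samples: Sequence[float], sample_rate: int) -> List[SpeechSegment]:
        """Return the speech segments found in a mono waveform."""


class EnergyVad(VadBackend):
    """Toy backend: a frame counts as speech if its mean |amplitude| exceeds a threshold."""

    def __init__(self, threshold: float = 0.1, frame_size: int = 4):
        self.threshold = threshold
        self.frame_size = frame_size

    def detect(self, samples, sample_rate):
        segments = []
        start = None  # sample index where the current speech run began
        n = len(samples)
        for i in range(0, n, self.frame_size):
            frame = samples[i:i + self.frame_size]
            active = sum(abs(s) for s in frame) / len(frame) > self.threshold
            if active and start is None:
                start = i
            elif not active and start is not None:
                segments.append(SpeechSegment(start / sample_rate, i / sample_rate))
                start = None
        if start is not None:  # speech ran until the end of the audio
            segments.append(SpeechSegment(start / sample_rate, n / sample_rate))
        return segments


# Registry of selectable backends; a real one would add entries for
# the pyannote-audio model and for Silero VAD (assumed, not shown here).
VAD_BACKENDS: Dict[str, Type[VadBackend]] = {"energy": EnergyVad}


def load_vad(name: str, **options) -> VadBackend:
    """Instantiate a backend by name so the pipeline never imports one directly."""
    if name not in VAD_BACKENDS:
        raise ValueError(f"unknown VAD backend: {name!r}")
    return VAD_BACKENDS[name](**options)
```

With this shape, the rest of the pipeline would depend only on `detect()` returning segments in seconds, so swapping the VAD method becomes a matter of choosing a different registry key.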
The implementation, a description of the tests, and future work can be found in pull request #888.