Word timing tweaks (#616)
AIXerum committed Dec 13, 2023
1 parent c869be9 commit 8a58422
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions faster_whisper/transcribe.py
@@ -908,6 +908,13 @@ def find_alignment(
         words, word_tokens = tokenizer.split_to_word_tokens(
             text_tokens + [tokenizer.eot]
         )
+        if len(word_tokens) <= 1:
+            # return on eot only
+            # >>> np.pad([], (1, 0))
+            # array([0.])
+            # This results in crashes when we lookup jump_times with float, like
+            # IndexError: arrays used as indices must be of integer (or boolean) type
+            return []
         word_boundaries = np.pad(np.cumsum([len(t) for t in word_tokens[:-1]]), (1, 0))
         if len(word_boundaries) <= 1:
             return []
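
The comment in the added lines refers to a NumPy dtype detail: padding an empty sequence yields a float array, and float arrays cannot be used as indices. The following is a minimal standalone sketch of that behavior, not code from the repository. The `boundaries` helper, the token ids, and the stand-in `jump_times` array are assumptions made for illustration only.

```python
import numpy as np


def boundaries(word_tokens):
    # Same expression as in the diff, wrapped in a hypothetical helper.
    return np.pad(np.cumsum([len(t) for t in word_tokens[:-1]]), (1, 0))


# Token ids below are made up for illustration.
eot_only = boundaries([[9999]])                   # only the EOT "word" is present
two_words = boundaries([[11, 12], [13], [9999]])  # two words followed by EOT

# With no words before EOT, np.cumsum([]) defaults to float64,
# so the padded result is a float array.
print(eot_only.dtype, eot_only)
# With real words, the cumulative lengths keep an integer dtype.
print(two_words.dtype, two_words)

# Stand-in for the real jump_times array (values are arbitrary).
jump_times = np.linspace(0.0, 1.0, 5)
try:
    jump_times[eot_only]
    float_index_failed = False
except IndexError:
    # NumPy rejects float arrays used as indices.
    float_index_failed = True

print(float_index_failed)
```

Returning early when `word_tokens` holds at most the end-of-text entry avoids building the float-typed boundary array at all.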
