A quick-and-dirty bot (more of a proof of concept) that transcribes Telegram voice messages using OpenAI's Whisper models, running CPU inference thanks to whisper.cpp.
TODO:
- refactor
- possibly get rid of the ffmpeg dependency?
- fix error handling & error message generation
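For context on the ffmpeg item above: Telegram voice notes arrive as OGG/Opus, while whisper.cpp expects 16 kHz mono 16-bit PCM WAV, so some conversion step is needed. A minimal sketch of constructing that ffmpeg invocation from Go (`buildConvertCmd` and the file names are illustrative, not the bot's actual code):

```go
package main

import (
	"fmt"
	"os/exec"
)

// buildConvertCmd builds an ffmpeg command that converts a Telegram
// voice note (OGG/Opus) into the 16 kHz mono 16-bit PCM WAV format
// that whisper.cpp expects. The paths are hypothetical examples.
func buildConvertCmd(in, out string) *exec.Cmd {
	return exec.Command("ffmpeg",
		"-i", in,
		"-ar", "16000", // resample to 16 kHz
		"-ac", "1", // downmix to mono
		"-c:a", "pcm_s16le", // 16-bit PCM
		out,
	)
}

func main() {
	// Print the assembled argument list; call cmd.Run() to actually convert.
	cmd := buildConvertCmd("voice.oga", "voice.wav")
	fmt.Println(cmd.Args)
}
```

Dropping the ffmpeg Go library in favour of shelling out like this (or requiring pre-converted input) is one way the TODO could be resolved.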
You can check out the example Dockerfile, but the TL;DR is:
git clone [email protected]:chinese-soup/cbot-telegram-whisper.git && \
cd cbot-telegram-whisper && \
git clone https://github.com/ggerganov/whisper.cpp.git && \
cd whisper.cpp/bindings/go && \
make whisper && \
cd ../../.. && \
go get
C_INCLUDE_PATH=/app/whisper.cpp/ LIBRARY_PATH=/app/whisper.cpp/ go build -o whisperbot
bash whisper.cpp/models/download-ggml-model.sh tiny.en
Check out whisper.cpp's README for more info.
export TELEGRAM_APITOKEN=<your bot token here>
export MODELPATH=whisper.cpp/models/ggml-tiny.en.bin
$ which ffmpeg
/usr/bin/ffmpeg
If it isn't, install a binary or compile it yourself, and make sure it is in your PATH.
./whisperbot