Piper inference in Rust (without ONNX Runtime) #504

robertknight · 2024-05-18T12:28:29Z

robertknight
May 18, 2024

Thank you for the work on this project. It is great to have a high-quality open-source TTS system.

I have been working on a new pure-Rust runtime for ONNX models. Last week I thought it would be an interesting project to get it running Piper voice models, and I now have a working demo. Performance is a little slower than ONNX Runtime (roughly 1.3x on my Intel i5, though this will vary with hardware), but still comfortably realtime even on low-powered devices (Raspberry Pi etc.). The binary size is ~2MB (stripped), or 1.6MB if you include only the operators that are actually used.
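For readers who want to check the "realtime" claim on their own hardware, one common measure is the real-time factor: synthesis time divided by the duration of the audio produced. The sketch below is illustrative only and is not taken from the rten demo. The function names are hypothetical, and the sample rate has to come from the voice's own config (the 22050 Hz used in the note below is just an example value).

```rust
use std::time::Duration;

/// Duration of a mono audio clip, given its sample count and sample rate.
/// (Hypothetical helper, not part of rten or Piper.)
fn audio_duration(num_samples: usize, sample_rate_hz: u32) -> Duration {
    Duration::from_secs_f64(num_samples as f64 / sample_rate_hz as f64)
}

/// Real-time factor: time spent synthesizing divided by the length of the
/// audio produced. Values below 1.0 mean the audio is generated faster
/// than it plays back.
fn real_time_factor(synthesis_time: Duration, audio: Duration) -> f64 {
    synthesis_time.as_secs_f64() / audio.as_secs_f64()
}
```

For example, producing 44100 samples at 22050 Hz (2 seconds of audio) in 500 ms gives a real-time factor of 0.25. In practice the synthesis time would be measured with `std::time::Instant` around the model run.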

The demo can be run as follows:

# Convert Piper voice model (the output is a format similar to onnxruntime's `.ort`,
# optimized for efficient loading)
pip install rten-convert
rten-convert voice-model.onnx

# Build and run. At the end of the "cargo run" command you can
# add an argument that is a string of phonemes to convert.
git clone https://github.com/robertknight/rten
cd rten
cargo run -p rten-examples --release --bin piper voice-model.rten voice-model.onnx.json

# Play audio
ffplay output.wav

What is currently missing is the text-to-phoneme conversion step. The demo just uses some phoneme sequences pre-generated with piper_phonemize (or supplied on the command line). If good non-espeak models become available in future, that will make it easier to get a fully working TTS system.
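To illustrate the step that sits between a phoneme string and the model input, here is a minimal sketch of mapping phonemes to IDs. It is not the code used in the rten demo. It assumes the voice's `.onnx.json` config supplies a map from each phoneme to a list of integer IDs, and that Piper's convention is to start with `^`, end with `$`, and insert the pad symbol `_` after the start marker and after every phoneme. The function name and the exact pad placement are assumptions and should be checked against piper_phonemize.

```rust
use std::collections::HashMap;

// Marker symbols assumed from Piper's phoneme ID convention.
const PAD: char = '_';
const BOS: char = '^';
const EOS: char = '$';

/// Convert a phoneme string into model input IDs (hypothetical helper).
/// Each phoneme is treated as one Unicode code point. Phonemes that are
/// missing from `id_map` are skipped.
fn phonemes_to_ids(phonemes: &str, id_map: &HashMap<char, Vec<i64>>) -> Vec<i64> {
    let mut ids = Vec::new();

    // Append the IDs for one symbol. Returns false if the symbol is unknown.
    let mut push = |symbol: char| -> bool {
        match id_map.get(&symbol) {
            Some(symbol_ids) => {
                ids.extend_from_slice(symbol_ids);
                true
            }
            None => false,
        }
    };

    // Start marker, followed by padding (assumed placement).
    push(BOS);
    push(PAD);

    // Each known phoneme is followed by a pad symbol.
    for phoneme in phonemes.chars() {
        if push(phoneme) {
            push(PAD);
        }
    }

    // End marker.
    push(EOS);
    ids
}
```

With a toy map of `_` to 0, `^` to 1, `$` to 2, `a` to 3 and `b` to 4, the input `"ab"` yields `[1, 0, 3, 0, 4, 0, 2]`. A real voice would load the map from its JSON config rather than building it by hand.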

synesthesiam · 2024-05-18T17:02:15Z

synesthesiam
May 18, 2024
Maintainer

Thanks for sharing, this is very interesting! For the next version of Piper, I'm rewriting the text-to-phoneme engine as a standalone C++ library with a C API. Being able to run Piper fully within WebAssembly or at least without the ONNX runtime would be awesome.

0 replies

robertknight · 2024-05-18T20:42:05Z

robertknight
May 18, 2024
Author

> Being able to run Piper fully within WebAssembly or at least without the ONNX runtime would be awesome.

This does indeed support WebAssembly, provided the runtime has SIMD support (all modern browsers and Node do). For example, using wasmtime (note that this build is single-threaded):

RUSTFLAGS="-C target-feature=+simd128" cargo build -p rten-examples --target wasm32-wasi -r --bin piper
wasmtime --dir . target/wasm32-wasi/release/piper.wasm voice.rten voice.onnx.json

A browser/Node compatible build would use the wasm32-unknown-unknown target.

3 replies

guest271314 Aug 11, 2024

@robertknight Can you release pre-built wasm and rten files? I'm running out of disk space trying to build rten.

robertknight Aug 14, 2024
Author

How much disk space is the build using? I did a fresh build just now and the target directory used "only" 133M.

Publishing a build of this demo is not that useful because it is still just an initial demo, rather than a complete library that can easily be dropped into a JS project.

guest271314 Aug 14, 2024

Starting out with only ~850 MB of disk space, I don't think I'll be able to build rten.

I'm talking about the WASM build of piper.
