# Piper: Fast Local Text-to-Speech Engine for 30+ Languages

> Lightweight neural TTS system optimized for Raspberry Pi and edge devices with offline support and dozens of voice models.

## Quick Use

```bash
# Install via pip:
pip install piper-tts

# Download a voice model:
wget https://github.com/rhasspy/piper/releases/download/v0.0.2/voice-en-us-lessac-medium.tar.gz
tar xzf voice-en-us-lessac-medium.tar.gz

# Generate speech (point --model at the .onnx file extracted above):
echo "Hello, this is Piper." | piper --model en_US-lessac-medium.onnx --output_file hello.wav

# Or run with Docker (-i keeps stdin open so input.txt reaches the container):
docker run -i -v "$(pwd)":/data rhasspy/piper --model /data/model.onnx < input.txt > output.wav
```

## Introduction

Piper is a fast, local text-to-speech system designed to run on low-power hardware like the Raspberry Pi. It uses VITS-based neural network models exported to ONNX format, enabling high-quality speech synthesis in over 30 languages without requiring cloud APIs or GPU acceleration.

## What Piper Does

- Converts text to natural-sounding speech using neural network voice models
- Runs entirely offline with no external API calls or internet connectivity required
- Supports over 30 languages with multiple voice options per language
- Provides both a command-line tool and a C library for integration into other applications
- Generates audio fast enough for real-time use on single-board computers

## Architecture Overview

Piper uses VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) models that have been exported to ONNX format. The inference runtime uses onnxruntime for cross-platform CPU execution. Text preprocessing, including phonemization, is handled by espeak-ng or language-specific tokenizers. The C++ core library can be called from Python, the command line, or embedded directly into applications. Models are compact, typically 50-100 MB per voice.
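
Since the command-line tool reads text from stdin, one simple way to drive Piper from Python is to wrap the CLI in a subprocess. The sketch below is a minimal illustration, not Piper's own Python API: the helper names `build_piper_command` and `synthesize` are hypothetical, and the only flags used are the `--model` and `--output_file` flags shown in Quick Use.

```python
import shlex
import shutil
import subprocess
from pathlib import Path


def build_piper_command(model_path, output_path, executable="piper"):
    """Return the argument list for one Piper CLI call (flags taken from Quick Use)."""
    return [executable, "--model", str(model_path), "--output_file", str(output_path)]


def synthesize(text, model_path, output_path, executable="piper"):
    """Pipe text to the Piper CLI on stdin and return the path of the written WAV file."""
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable!r} was not found on PATH")
    command = build_piper_command(model_path, output_path, executable)
    # check=True raises CalledProcessError if Piper exits with a non-zero status.
    subprocess.run(command, input=text.encode("utf-8"), check=True)
    return Path(output_path)


if __name__ == "__main__":
    # Print the command instead of running it, so this demo works without Piper installed.
    print(shlex.join(build_piper_command("en_US-lessac-medium.onnx", "hello.wav")))
```

With Piper installed and a voice model downloaded, `synthesize("Hello, this is Piper.", "en_US-lessac-medium.onnx", "hello.wav")` would run the same pipeline as the shell one-liner. Starting a new process for each utterance reloads the model every time, so a long-running service would likely want a different integration path.
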
## Self-Hosting & Configuration

- Install the Python package via pip or use pre-built binaries from GitHub releases
- Download voice models from the Piper releases page or Hugging Face
- Integrate into Home Assistant for local voice assistant capabilities
- Use the C shared library (libpiper) for embedding into C/C++ or other language applications
- Configure speech rate, volume, and phoneme overrides via command-line flags

## Key Features

- Runs on Raspberry Pi 4 and similar ARM devices at real-time speed
- No GPU or cloud API required for inference
- Compact ONNX models that are easy to distribute and deploy
- Extensive language coverage with community-contributed voice models
- Simple command-line interface that reads from stdin and writes WAV to stdout

## Comparison with Similar Tools

- **Coqui TTS**: Research-oriented with more model architectures; Piper prioritizes deployment simplicity and edge performance
- **Kokoro**: Lightweight 82M parameter model; Piper offers broader language coverage with per-language models
- **espeak-ng**: Rule-based synthesis with robotic quality; Piper produces natural neural speech
- **OpenAI TTS API**: Cloud-based with high quality; Piper runs locally with no API costs and no network round-trip latency

## FAQ

**Q: What hardware does Piper require?**
A: Piper runs on the CPU alone, so no GPU is needed. A Raspberry Pi 4 can generate speech in real time.

**Q: Can I train custom voice models?**
A: Yes. Piper provides training scripts based on the VITS architecture. You need a dataset of audio recordings with transcriptions.

**Q: How does Piper integrate with Home Assistant?**
A: Piper is the default local TTS engine for the Home Assistant voice assistant pipeline. It can be installed as a Home Assistant add-on.

**Q: What audio format does Piper output?**
A: Piper outputs WAV or raw PCM audio. You can pipe the output to ffmpeg or sox for format conversion.
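
Raw PCM carries no header, so whatever consumes it has to be told the sample rate, channel count, and sample width. The sketch below wraps raw PCM in a WAV container with Python's standard `wave` module and works out the real-time factor (synthesis time divided by audio duration, where a value below 1.0 means faster than real time). The default of 16-bit mono at 22050 Hz is an assumption for illustration; check the value against the voice model you actually use. The function names are hypothetical.

```python
import io
import wave


def pcm_to_wav(pcm, sample_rate=22050, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container.

    The defaults (16-bit mono at 22050 Hz) are assumptions, not values read from Piper.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def wav_duration_seconds(wav_bytes):
    """Return the playback length of an in-memory WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def real_time_factor(synthesis_seconds, audio_seconds):
    """Synthesis time divided by audio length; below 1.0 is faster than real time."""
    return synthesis_seconds / audio_seconds


if __name__ == "__main__":
    silence = b"\x00\x00" * 22050  # one second of 16-bit mono silence
    wav_bytes = pcm_to_wav(silence)
    print(wav_duration_seconds(wav_bytes))
```

For example, if a sentence takes 0.5 seconds to synthesize and plays back for 2 seconds, `real_time_factor(0.5, 2.0)` gives 0.25, meaning the audio is generated four times faster than it plays.
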
## Sources

- https://github.com/rhasspy/piper
- https://rhasspy.github.io/piper-samples

---

Source: https://tokrepo.com/en/workflows/piper-fast-local-text-speech-engine-30-languages-e62067f0
Author: AI Open Source