# whisper.cpp — Local Speech-to-Text in Pure C/C++

> High-performance port of OpenAI Whisper in C/C++. No Python, no GPU required. Runs on CPU, Apple Silicon, CUDA, and even a Raspberry Pi. Real-time transcription.

## Quick Use

```bash
# Clone and build
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build
cmake --build build --config Release

# Download a model
bash models/download-ggml-model.sh base.en

# Transcribe audio
./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav
```

**macOS (Homebrew):**

```bash
brew install whisper-cpp
whisper-cpp -m /path/to/model.bin -f audio.wav
```

**Real-time microphone transcription:**

```bash
./build/bin/whisper-stream -m models/ggml-base.en.bin
# Speak into your microphone — see text appear in real time
```

## Intro

whisper.cpp is a high-performance C/C++ port of OpenAI's Whisper speech recognition model by Georgi Gerganov (creator of llama.cpp). It runs entirely locally with zero dependencies — no Python, no PyTorch, no internet connection needed.

The key advantage is efficient CPU inference. Apple Silicon gets a 4-8x speedup via Core ML and Metal, NVIDIA GPUs are supported via CUDA, and even a Raspberry Pi can transcribe audio. Real-time streaming transcription works on modern laptops.

With 37,000+ stars and the same developer behind llama.cpp, whisper.cpp is the go-to solution for local, private speech recognition.
## Details

### Model Sizes

| Model | Disk | RAM | Speed (CPU) | Quality |
|-------|------|-----|-------------|---------|
| **tiny** | 75 MB | ~390 MB | ~32x real-time | Good for drafts |
| **base** | 142 MB | ~500 MB | ~16x real-time | Solid accuracy |
| **small** | 466 MB | ~1 GB | ~6x real-time | Very good |
| **medium** | 1.5 GB | ~2.6 GB | ~2x real-time | Excellent |
| **large-v3** | 3.1 GB | ~4.8 GB | ~1x real-time | Best quality |

### Hardware Acceleration

| Platform | Backend | Speedup |
|----------|---------|---------|
| Apple Silicon | Core ML + Metal | 4-8x |
| NVIDIA GPU | CUDA | 5-10x |
| Intel CPU | OpenVINO | 2-3x |
| Any CPU | AVX2/NEON | Baseline |
| Raspberry Pi 4 | ARM NEON | Usable (tiny model) |

### Language Bindings

whisper.cpp has community bindings for virtually every language:

- **Python**: `pywhispercpp` — `pip install pywhispercpp`
- **Node.js**: `whisper-node` — `npm install whisper-node`
- **Go**: `go-whisper`
- **Rust**: `whisper-rs`
- **C#/.NET**: `Whisper.net`
- **Java**: `whisper-jni`
- **Ruby**: `ruby-whisper`

### Features

- **Streaming** — real-time transcription from the microphone
- **Timestamps** — word-level and segment-level timing
- **Translation** — transcribe and translate to English simultaneously
- **VAD** — voice activity detection to skip silence
- **Speaker diarization** — basic speaker identification
- **Output formats** — TXT, SRT, VTT, JSON, CSV

## Frequently Asked Questions

**Q: How does accuracy compare to the original Whisper?**
A: Identical. whisper.cpp uses the same model weights, so output quality is the same — only the inference engine differs.

**Q: Can it run on a phone?**
A: Yes. There are iOS and Android ports. The tiny and base models work well on modern phones.

**Q: Does it support real-time transcription?**
A: Yes. `whisper-stream` does real-time microphone transcription and works well with the base model on modern laptops.

**Q: What audio formats are supported?**
A: WAV (16-bit, 16 kHz) natively.
Use ffmpeg to convert other formats: `ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav`

## Works With

- Any audio file (WAV natively; MP3 and others via ffmpeg)
- macOS, Linux, Windows, Raspberry Pi, iOS, Android
- CPU (any modern processor), Apple Silicon, NVIDIA CUDA
- Python, Node.js, Go, Rust, C#, Java, Ruby bindings

## Source & Thanks

- **GitHub**: [ggerganov/whisper.cpp](https://github.com/ggerganov/whisper.cpp) — 37,000+ stars, MIT License
- By Georgi Gerganov (also creator of llama.cpp)

---

Source: https://tokrepo.com/en/workflows/e1fd7c46-bbda-4956-8649-9c3ed579ff25
Author: Script Depot