Code · Apr 2, 2026 · 2 min read

whisper.cpp — Local Speech-to-Text in Pure C/C++

High-performance port of OpenAI Whisper in C/C++. No Python, no GPU required. Runs on CPU, Apple Silicon, CUDA, and even Raspberry Pi. Real-time transcription.

Script Depot · Community
Quick Use


# Clone and build
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build
cmake --build build --config Release

# Download a model
bash models/download-ggml-model.sh base.en

# Transcribe audio
./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav
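
The bundled sample is already in a format whisper-cli reads. For your own recordings, the whisper.cpp README suggests converting to 16 kHz, mono, 16-bit PCM WAV with ffmpeg first. A minimal sketch of that step, assuming ffmpeg is installed; the function name `to_wav16k` and the `FFMPEG` override variable are illustrative and not part of whisper.cpp:

```shell
#!/usr/bin/env bash
# Convert any ffmpeg-readable file to 16 kHz, mono, 16-bit PCM WAV.
# FFMPEG is a hypothetical override (useful for dry runs); it defaults to "ffmpeg".
to_wav16k() {
  local in="$1" out="$2"
  "${FFMPEG:-ffmpeg}" -hide_banner -loglevel error -y \
    -i "$in" -ar 16000 -ac 1 -c:a pcm_s16le "$out"
}

# Usage (not run here):
#   to_wav16k interview.mp3 interview.wav
#   ./build/bin/whisper-cli -m models/ggml-base.en.bin -f interview.wav
```

Setting `FFMPEG=echo` before calling the function prints the arguments instead of running the conversion, which is a quick way to check the command.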

macOS (Homebrew):

brew install whisper-cpp
# Recent versions of the formula install the binary as whisper-cli (older ones used whisper-cpp)
whisper-cli -m /path/to/model.bin -f audio.wav

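Real-time microphone transcription (requires SDL2 and a build configured with -DWHISPER_SDL2=ON):

cmake -B build -DWHISPER_SDL2=ON
cmake --build build --config Release
./build/bin/whisper-stream -m models/ggml-base.en.bin
# Speak into your microphone and the text appears as you talk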
Intro

whisper.cpp is a high-performance C/C++ port of OpenAI's Whisper speech recognition model by Georgi Gerganov (creator of llama.cpp). It runs entirely locally with no external runtime dependencies: no Python, no PyTorch, and, once a model has been downloaded, no internet connection.

The key advantage: it runs efficiently on CPU. Apple Silicon gets 4-8x speedup via Core ML and Metal. NVIDIA GPUs work via CUDA. Even a Raspberry Pi can transcribe audio. Real-time streaming transcription works on modern laptops.

With 37,000+ stars and the same developer behind llama.cpp, whisper.cpp is the go-to solution for local, private speech recognition.

Model Sizes

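| Model    | Disk   | RAM     | Speed (CPU)    | Quality         |
|----------|--------|---------|----------------|-----------------|
| tiny     | 75 MB  | ~390 MB | ~32x real-time | Good for drafts |
| base     | 142 MB | ~500 MB | ~16x real-time | Solid accuracy  |
| small    | 466 MB | ~1 GB   | ~6x real-time  | Very good       |
| medium   | 1.5 GB | ~2.6 GB | ~2x real-time  | Excellent       |
| large-v3 | 3.1 GB | ~4.8 GB | ~1x real-time  | Best quality    |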
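
The figures in this table can be turned into a rough sizing rule. The sketch below picks the largest model whose RAM estimate fits a given budget and estimates transcription time from the speed factor. The function names `pick_model` and `estimate_seconds` are hypothetical, the thresholds are copied from the table (treating 1 GB as 1000 MB), and the speed factors are approximate, so treat the results as ballpark numbers only:

```shell
#!/usr/bin/env bash
# Pick the largest model whose approximate RAM need (MB) fits the budget.
# Thresholds come from the Model Sizes table; they are estimates, not guarantees.
pick_model() {
  local free_mb="$1"
  if   [ "$free_mb" -ge 4800 ]; then echo "large-v3"
  elif [ "$free_mb" -ge 2600 ]; then echo "medium"
  elif [ "$free_mb" -ge 1000 ]; then echo "small"
  elif [ "$free_mb" -ge 500 ];  then echo "base"
  elif [ "$free_mb" -ge 390 ];  then echo "tiny"
  else echo "none"; return 1
  fi
}

# Estimated wall-clock seconds = audio seconds / speed factor, rounded up.
estimate_seconds() {
  local audio_s="$1" factor="$2"
  echo $(( (audio_s + factor - 1) / factor ))
}

# Usage (not run here):
#   pick_model 600            # RAM budget in MB
#   estimate_seconds 3600 16  # one hour of audio with the base model
```

By this estimate, an hour of audio at the table's ~16x figure for base takes roughly 225 seconds of CPU transcription.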

Hardware Acceleration

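| Platform       | Backend         | Speedup             |
|----------------|-----------------|---------------------|
| Apple Silicon  | Core ML + Metal | 4-8x                |
| NVIDIA GPU     | CUDA            | 5-10x               |
| Intel CPU      | OpenVINO        | 2-3x                |
| Any CPU        | AVX2/NEON       | Baseline            |
| Raspberry Pi 4 | ARM NEON        | Usable (tiny model) |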
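
Each backend is selected at configure time with a CMake option. The sketch below maps a backend name to the option used in the whisper.cpp README at the time of writing; the helper `backend_flags` is hypothetical, and option names have changed between releases, so check the README for your checkout. Core ML also needs a converted encoder model, which this sketch does not cover.

```shell
#!/usr/bin/env bash
# Map a backend name to its CMake option (names as in the README at the
# time of writing; older releases used different option names).
backend_flags() {
  case "$1" in
    cuda)     echo "-DGGML_CUDA=1" ;;
    coreml)   echo "-DWHISPER_COREML=1" ;;
    openvino) echo "-DWHISPER_OPENVINO=1" ;;
    cpu)      echo "" ;;  # plain CPU build needs no extra option
    *)        echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Usage (not run here):
#   cmake -B build $(backend_flags cuda)
#   cmake --build build --config Release
```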

Language Bindings

whisper.cpp has community bindings for virtually every language:

  • Python: pywhispercpp (pip install pywhispercpp)
  • Node.js: whisper-node (npm install whisper-node)
  • Go: go-whisper
  • Rust: whisper-rs
  • C#/.NET: Whisper.net
  • Java: whisper-jni
  • Ruby: ruby-whisper

Features

  • Streaming — real-time transcription from microphone
  • Timestamps — word-level and segment-level timing
  • Translation — transcribe and translate to English simultaneously
  • VAD — voice activity detection to skip silence
  • Speaker diarization — basic speaker identification
  • Output formats — TXT, SRT, VTT, JSON, CSV
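
Several of these features are switched on with whisper-cli flags. As a minimal sketch, the helper below assembles a command line for a chosen output format, optionally adding translation to English. The functions `output_flag` and `transcribe_cmd` are hypothetical; the flags themselves (`-otxt`, `-osrt`, `-ovtt`, `-oj`, `-ocsv`, `-tr`) are the ones whisper-cli lists in its help output, which is worth checking against your build with `whisper-cli --help`. Translation needs a multilingual model, not an `.en` one.

```shell
#!/usr/bin/env bash
# Map an output format name to the corresponding whisper-cli flag.
output_flag() {
  case "$1" in
    txt)  echo "-otxt" ;;
    srt)  echo "-osrt" ;;
    vtt)  echo "-ovtt" ;;
    json) echo "-oj" ;;
    csv)  echo "-ocsv" ;;
    *)    echo "unsupported format: $1" >&2; return 1 ;;
  esac
}

# Print (not run) a whisper-cli command for the given model, audio file,
# and format. The fourth argument "yes" adds -tr (translate to English).
# The printed string is for display; paths with spaces would need quoting.
transcribe_cmd() {
  local model="$1" audio="$2" fmt="$3" translate="${4:-no}"
  local flag cmd
  flag="$(output_flag "$fmt")" || return 1
  cmd="./build/bin/whisper-cli -m $model -f $audio $flag"
  if [ "$translate" = "yes" ]; then
    cmd="$cmd -tr"
  fi
  echo "$cmd"
}

# Usage (not run here):
#   transcribe_cmd models/ggml-base.bin talk.wav srt yes
```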

Source & Thanks

  • GitHub: ggerganov/whisper.cpp — 37,000+ stars, MIT License
  • By Georgi Gerganov (also creator of llama.cpp)
