Code · Apr 2, 2026 · 2 min read

whisper.cpp — Local Speech-to-Text in Pure C/C++

High-performance port of OpenAI Whisper in C/C++. No Python, no GPU required. Runs on CPU, Apple Silicon, CUDA, and even Raspberry Pi. Real-time transcription.

Introduction

whisper.cpp is a high-performance C/C++ port of OpenAI's Whisper speech recognition model by Georgi Gerganov (creator of llama.cpp). It runs entirely locally with no external runtime dependencies: no Python, no PyTorch, and no internet connection at transcription time.

The key advantage is that it runs efficiently on CPU. Apple Silicon gets a 4-8x speedup via Core ML and Metal, and NVIDIA GPUs are supported via CUDA. Even a Raspberry Pi can transcribe audio with the tiny model, and real-time streaming transcription works on modern laptops.

With 37,000+ GitHub stars and the developer of llama.cpp behind it, whisper.cpp is a go-to solution for local, private speech recognition.

Model Sizes

Model      Disk     RAM        Speed (CPU)       Quality
tiny       75 MB    ~390 MB    ~32x real-time    Good for drafts
base       142 MB   ~500 MB    ~16x real-time    Solid accuracy
small      466 MB   ~1 GB      ~6x real-time     Very good
medium     1.5 GB   ~2.6 GB    ~2x real-time     Excellent
large-v3   3.1 GB   ~4.8 GB    ~1x real-time     Best quality
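
To make the trade-off in the table concrete, here is a minimal Python sketch that picks the largest model fitting a RAM budget and estimates CPU transcription time. The figures are copied from the table above (with ~1 GB taken as 1000 MB). The names `MODELS`, `pick_model`, and `estimated_seconds` are illustrative and not part of whisper.cpp, and real throughput will vary with hardware.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    ram_mb: int              # approximate RAM use, from the table
    realtime_factor: float   # seconds of audio processed per second on CPU


# Approximate figures copied from the Model Sizes table (assumed, not measured here).
MODELS = [
    Model("tiny", 390, 32.0),
    Model("base", 500, 16.0),
    Model("small", 1000, 6.0),
    Model("medium", 2600, 2.0),
    Model("large-v3", 4800, 1.0),
]


def pick_model(ram_budget_mb: int) -> Optional[Model]:
    """Return the largest model whose RAM estimate fits the budget, or None."""
    fitting = [m for m in MODELS if m.ram_mb <= ram_budget_mb]
    return max(fitting, key=lambda m: m.ram_mb) if fitting else None


def estimated_seconds(model: Model, audio_seconds: float) -> float:
    """Estimate CPU transcription time from the model's real-time factor."""
    return audio_seconds / model.realtime_factor


if __name__ == "__main__":
    model = pick_model(2048)                  # 2 GB budget
    print(model.name)                         # small
    print(estimated_seconds(model, 3600))     # 600.0
```

Under these assumptions, a one-hour recording on a machine with 2 GB to spare would use the small model and take roughly ten minutes on CPU.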

Hardware Acceleration

Platform          Backend            Speedup
Apple Silicon     Core ML + Metal    4-8x
NVIDIA GPU        CUDA               5-10x
Intel CPU         OpenVINO           2-3x
Any CPU           AVX2/NEON          Baseline
Raspberry Pi 4    ARM NEON           Usable (tiny model)
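
The speedup column can be read as a divisor on baseline CPU time. The sketch below assumes the ranges in the table apply uniformly, which is a simplification since actual gains depend on model size and hardware. `SPEEDUPS` and `accelerated_range` are hypothetical names used only for this illustration.

```python
# Speedup ranges (low, high) relative to the plain-CPU baseline,
# copied from the Hardware Acceleration table.
SPEEDUPS = {
    "Core ML + Metal": (4, 8),
    "CUDA": (5, 10),
    "OpenVINO": (2, 3),
}


def accelerated_range(cpu_seconds, backend):
    """Return (fastest, slowest) estimated seconds for the given backend."""
    low, high = SPEEDUPS[backend]
    return cpu_seconds / high, cpu_seconds / low


if __name__ == "__main__":
    fastest, slowest = accelerated_range(600, "CUDA")
    print(f"CUDA: {fastest:.0f}-{slowest:.0f} s")  # CUDA: 60-120 s
```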

Language Bindings

whisper.cpp has community bindings for many popular languages:

  • Python: pywhispercpp (pip install pywhispercpp)
  • Node.js: whisper-node (npm install whisper-node)
  • Go: go-whisper
  • Rust: whisper-rs
  • C#/.NET: Whisper.net
  • Java: whisper-jni
  • Ruby: ruby-whisper

Features

  • Streaming — real-time transcription from microphone
  • Timestamps — word-level and segment-level timing
  • Translation — transcribe and translate to English simultaneously
  • VAD — voice activity detection to skip silence
  • Speaker diarization — basic speaker identification
  • Output formats — TXT, SRT, VTT, JSON, CSV
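
As an illustration of the SRT output format listed above, the following sketch renders timed segments as SubRip text. It is not whisper.cpp's own writer. The `(start_ms, end_ms, text)` tuple layout and the function names are assumptions made for this example.

```python
def srt_timestamp(ms):
    """Format integer milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments):
    """Render (start_ms, end_ms, text) tuples as numbered SRT cues."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical segments, standing in for real transcription output.
    print(to_srt([(0, 2500, " Hello. "), (2500, 4000, "Bye.")]))
```

WebVTT timestamps are similar but use a period instead of a comma before the milliseconds, so the same helper adapts with a one-character change.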

Source and acknowledgments

  • GitHub: ggerganov/whisper.cpp — 37,000+ stars, MIT License
  • By Georgi Gerganov (also creator of llama.cpp)
