# whisper.cpp — Local Speech-to-Text in Pure C/C++

> High-performance port of OpenAI Whisper in C/C++. No Python, no GPU required. Runs on CPU, Apple Silicon, CUDA, and even a Raspberry Pi. Real-time transcription.

## Quick Use

```bash
# Clone and build
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build
cmake --build build --config Release

# Download a model
bash models/download-ggml-model.sh base.en

# Transcribe audio
./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav
```

**macOS (Homebrew):**

```bash
brew install whisper-cpp
whisper-cpp -m /path/to/model.bin -f audio.wav
```

**Real-time microphone transcription:**

```bash
./build/bin/whisper-stream -m models/ggml-base.en.bin
# Speak into your microphone — see text appear in real time
```

## Intro

whisper.cpp is a high-performance C/C++ port of OpenAI's Whisper speech recognition model by Georgi Gerganov (creator of llama.cpp). It runs entirely locally with zero dependencies — no Python, no PyTorch, no internet connection needed.

The key advantage is efficient CPU inference. Apple Silicon gets a 4-8x speedup via Core ML and Metal, NVIDIA GPUs are supported via CUDA, and even a Raspberry Pi can transcribe audio. Real-time streaming transcription works on modern laptops.

With 37,000+ stars and the same developer behind llama.cpp, whisper.cpp is the go-to solution for local, private speech recognition.
## Details

### Model Sizes

| Model | Disk | RAM | Speed (CPU) | Quality |
|-------|------|-----|-------------|---------|
| **tiny** | 75 MB | ~390 MB | ~32x real-time | Good for drafts |
| **base** | 142 MB | ~500 MB | ~16x real-time | Solid accuracy |
| **small** | 466 MB | ~1 GB | ~6x real-time | Very good |
| **medium** | 1.5 GB | ~2.6 GB | ~2x real-time | Excellent |
| **large-v3** | 3.1 GB | ~4.8 GB | ~1x real-time | Best quality |

### Hardware Acceleration

| Platform | Backend | Speedup |
|----------|---------|---------|
| Apple Silicon | Core ML + Metal | 4-8x |
| NVIDIA GPU | CUDA | 5-10x |
| Intel CPU | OpenVINO | 2-3x |
| Any CPU | AVX2/NEON | Baseline |
| Raspberry Pi 4 | ARM NEON | Usable (tiny model) |

### Language Bindings

whisper.cpp has community bindings for virtually every language:

- **Python**: `pywhispercpp` — `pip install pywhispercpp`
- **Node.js**: `whisper-node` — `npm install whisper-node`
- **Go**: `go-whisper`
- **Rust**: `whisper-rs`
- **C#/.NET**: `Whisper.net`
- **Java**: `whisper-jni`
- **Ruby**: `ruby-whisper`

### Features

- **Streaming** — real-time transcription from the microphone
- **Timestamps** — word-level and segment-level timing
- **Translation** — transcribe and translate to English simultaneously
- **VAD** — voice activity detection to skip silence
- **Speaker diarization** — basic speaker identification
- **Output formats** — TXT, SRT, VTT, JSON, CSV

## Frequently Asked Questions

**Q: How does accuracy compare to the original Whisper?**
A: Identical. whisper.cpp uses the same model weights, so output quality is the same — only the inference engine differs.

**Q: Can it run on a phone?**
A: Yes. There are iOS and Android ports. The tiny and base models work well on modern phones.

**Q: Does it support real-time transcription?**
A: Yes. `whisper-stream` does real-time microphone transcription and works well with the base model on modern laptops.

**Q: What audio formats are supported?**
A: WAV (16-bit, 16 kHz) natively.
Use ffmpeg to convert other formats: `ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav`

## Works With

- Any audio file (WAV natively; MP3 and others via ffmpeg)
- macOS, Linux, Windows, Raspberry Pi, iOS, Android
- CPU (any modern processor), Apple Silicon, NVIDIA CUDA
- Python, Node.js, Go, Rust, C#, Java, Ruby bindings

## Source & Thanks

- **GitHub**: [ggerganov/whisper.cpp](https://github.com/ggerganov/whisper.cpp) — 37,000+ stars, MIT License
- By Georgi Gerganov (also creator of llama.cpp)

---

Source: https://tokrepo.com/en/workflows/e1fd7c46-bbda-4956-8649-9c3ed579ff25
Author: Script Depot