whisper.cpp — Local Speech-to-Text in Pure C/C++
High-performance port of OpenAI Whisper in C/C++. No Python, no GPU required. Runs on CPU, Apple Silicon, CUDA, and even Raspberry Pi. Real-time transcription.
What it is
whisper.cpp is a high-performance C/C++ port of OpenAI's Whisper speech recognition model by Georgi Gerganov (creator of llama.cpp). It runs entirely locally with zero dependencies: no Python, no PyTorch, no internet connection needed.
The key advantage: it runs efficiently on CPU. Apple Silicon gets 4-8x speedup via Core ML and Metal. NVIDIA GPUs work via CUDA. Even a Raspberry Pi can transcribe audio. Real-time streaming transcription works on modern laptops.
How it saves time or tokens
whisper.cpp provides speech-to-text without cloud API costs or latency. Traditional Whisper requires Python, PyTorch, and ideally a GPU. whisper.cpp runs on any hardware with a single binary. For privacy-sensitive applications, all processing stays on-device. The tiny model (75 MB) transcribes at 32x real-time on CPU, making it practical for batch processing of audio archives.
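To make the batch-processing arithmetic concrete, here is a small sketch. The speed factors are the approximate CPU figures from the model table on this page; treat them as ballpark numbers, not guarantees:

```python
# Rough estimate of wall-clock compute time for transcribing an audio
# archive, using approximate "x real-time" CPU speed factors.
# Actual throughput depends heavily on your hardware.

SPEED_FACTOR = {  # model name -> approx. multiple of real-time on CPU
    "tiny": 32, "base": 16, "small": 6, "medium": 2, "large": 1,
}

def transcription_hours(audio_hours: float, model: str) -> float:
    """Estimated hours of compute to transcribe `audio_hours` of audio."""
    return audio_hours / SPEED_FACTOR[model]

# 100 hours of recordings with the tiny model: ~3.1 hours of CPU time
print(round(transcription_hours(100, "tiny"), 1))
```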
How to use
- Clone, build, and download a model:
```sh
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build
cmake --build build --config Release
bash models/download-ggml-model.sh base.en
```
- Transcribe an audio file:
```sh
./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav
```
- Real-time microphone transcription:
```sh
./build/bin/whisper-stream -m models/ggml-base.en.bin
# Speak into your microphone -- text appears in real time
```
Example
Model size comparison for different use cases:
| Model | Disk | RAM | Speed (CPU) | Quality |
|--------|---------|---------|----------------|------------------|
| tiny | 75 MB | ~390 MB | ~32x real-time | Good for drafts |
| base | 142 MB | ~500 MB | ~16x real-time | Solid accuracy |
| small | 466 MB | ~1 GB | ~6x real-time | Good quality |
| medium | 1.5 GB | ~2.6 GB | ~2x real-time | High quality |
| large | 2.9 GB | ~4.7 GB | ~1x real-time | Best quality |
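One practical use of the table is picking the largest model that fits a memory budget. A minimal sketch using the approximate RAM figures above (these are estimates, not hard limits):

```python
# Pick the largest whisper.cpp model whose approximate RAM footprint
# fits within a given budget. Figures mirror the table above.

RAM_MB = {"tiny": 390, "base": 500, "small": 1000, "medium": 2600, "large": 4700}
ORDER = ["tiny", "base", "small", "medium", "large"]

def pick_model(budget_mb: int) -> str:
    """Largest model fitting within budget_mb; falls back to tiny."""
    best = "tiny"
    for name in ORDER:
        if RAM_MB[name] <= budget_mb:
            best = name
    return best

print(pick_model(2000))  # a 2 GB budget selects "small"
```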
```sh
# Output formats
./build/bin/whisper-cli -m models/ggml-base.en.bin -f audio.wav -otxt  # Plain text
./build/bin/whisper-cli -m models/ggml-base.en.bin -f audio.wav -osrt  # SRT subtitles
./build/bin/whisper-cli -m models/ggml-base.en.bin -f audio.wav -ovtt  # VTT subtitles
./build/bin/whisper-cli -m models/ggml-base.en.bin -f audio.wav -oj    # JSON with timestamps
```
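The JSON output is easy to post-process. Below is a sketch that converts segments into SRT cues. It assumes the JSON carries a `transcription` array whose entries have millisecond `offsets` and a `text` field, which matches recent whisper.cpp releases; verify the schema for your version before relying on it:

```python
import json

def ms_to_srt(ms: int) -> str:
    """Format milliseconds as an SRT timestamp: HH:MM:SS,mmm."""
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def json_to_srt(doc: str) -> str:
    """Convert whisper.cpp-style JSON (assumed schema) to SRT text."""
    segments = json.loads(doc)["transcription"]
    cues = []
    for i, seg in enumerate(segments, start=1):
        start = ms_to_srt(seg["offsets"]["from"])
        end = ms_to_srt(seg["offsets"]["to"])
        cues.append(f"{i}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)

sample = '{"transcription": [{"offsets": {"from": 0, "to": 2500}, "text": " Hello world"}]}'
print(json_to_srt(sample))
```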
Related on TokRepo
- AI tools for voice — More speech and voice tools on TokRepo.
- Local LLM tools — Browse local AI inference tools.
Common pitfalls
- Using the large model on hardware without a GPU leads to very slow transcription. Start with base or small for CPU-only setups.
- Audio files must be 16kHz 16-bit mono WAV. Convert other formats with ffmpeg before processing.
- Real-time streaming requires SDL2 for audio capture: configure the build with cmake -B build -DWHISPER_SDL2=ON to get the whisper-stream binary, and make sure your microphone input is set up correctly.
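Because the wrong sample rate or channel count is the most common source of silent failures, a quick pre-flight check can help. This is an illustrative helper using only the Python standard library, not part of whisper.cpp, and it assumes plain PCM WAV input:

```python
import wave

def is_whisper_ready(path: str) -> bool:
    """True if the WAV file is 16 kHz, 16-bit, mono PCM."""
    with wave.open(path, "rb") as w:
        return (w.getframerate() == 16000
                and w.getsampwidth() == 2   # 16-bit samples
                and w.getnchannels() == 1)  # mono
```

If the check fails, convert the file with ffmpeg before passing it to whisper-cli.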
Frequently Asked Questions
Does whisper.cpp require a GPU?
No. whisper.cpp runs on CPU by default. GPU acceleration via CUDA (NVIDIA), Metal (Apple), and Core ML (Apple) is optional and provides significant speedups. Even a Raspberry Pi can run the tiny model.
How does whisper.cpp compare to the original Python Whisper?
whisper.cpp provides the same transcription quality (it uses the same model weights) but runs without Python dependencies. It is faster on CPU and uses less memory. The tradeoff is that it requires manual compilation.
What audio formats does whisper.cpp accept?
whisper.cpp requires 16kHz 16-bit mono WAV input. Convert other formats using ffmpeg: ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav.
Can whisper.cpp transcribe in real time?
Yes. The whisper-stream binary captures audio from your microphone and transcribes it in real time. This works with the tiny and base models on modern hardware.
What output formats does whisper.cpp support?
whisper.cpp outputs plain text, SRT subtitles, VTT subtitles, JSON with timestamps, and CSV. Choose the format with the -otxt, -osrt, -ovtt, -oj, or -ocsv flags.
Source & Thanks
- GitHub: ggerganov/whisper.cpp — 37,000+ stars, MIT License
- By Georgi Gerganov (also creator of llama.cpp)