# WhisperX: 70x Faster Speech Recognition

> WhisperX provides 70x realtime speech recognition with word-level timestamps and speaker diarization. 21K+ GitHub stars. Batched inference, under 8GB VRAM. BSD-2-Clause.

## Install

Save the commands under Quick Use as a script file and run it; the first command installs the package.

## Quick Use

```bash
# Install
pip install whisperx

# Transcribe with word timestamps + speaker labels
whisperx audio.mp3 --model large-v2 --diarize --language en

# Or in Python
python -c "
import whisperx

# Load the Whisper model and the audio, then transcribe in batches
model = whisperx.load_model('large-v2', device='cuda')
audio = whisperx.load_audio('audio.mp3')
result = model.transcribe(audio, batch_size=16)

# Align for word-level timestamps
model_a, metadata = whisperx.load_align_model(language_code='en', device='cuda')
result = whisperx.align(result['segments'], model_a, metadata, audio, device='cuda')
print(result['segments'])
"
```

---

## Intro

WhisperX is an automatic speech recognition tool that reaches 70x realtime transcription by running batched inference on OpenAI's Whisper large-v2 model. It adds accurate word-level timestamps via wav2vec2 alignment and speaker diarization, so the output carries per-word timing and speaker labels. Large models run in under 8GB of GPU memory, and voice activity detection (VAD) preprocessing filters out silence and noise before transcription. The project has 21,000+ GitHub stars and is released under the BSD-2-Clause license.
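To illustrate how per-segment timing and speaker labels might be used for subtitling, here is a minimal sketch that turns a list of segments into SRT text. It does not call WhisperX; it assumes each segment is a dict with `start`, `end`, and `text` keys plus an optional `speaker` key, which is how the aligned output is commonly shaped. The helper names `srt_timestamp` and `segments_to_srt` and the sample data are hypothetical, chosen only for this example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments; prefix the speaker label when present.

    Assumed segment shape (not verified against any WhisperX version):
    {"start": float, "end": float, "text": str, "speaker": str (optional)}
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        speaker = seg.get("speaker")
        if speaker:
            text = f"[{speaker}] {text}"
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample data standing in for result["segments"]
sample_segments = [
    {"start": 0.5, "end": 2.25, "text": " Hello there.", "speaker": "SPEAKER_00"},
    {"start": 2.5, "end": 4.0, "text": " Hi."},
]

print(segments_to_srt(sample_segments))
```

In practice the list passed to `segments_to_srt` would be the `result['segments']` value from the Quick Use snippet, provided its keys match the assumed shape.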

**Best for**: Developers building transcription, subtitling, or meeting analysis with speaker identification

**Works with**: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf

**Performance**: 70x realtime with large-v2, under 8GB VRAM

---

## Key Features

- **70x realtime**: Batched inference for dramatically faster transcription
- **Word-level timestamps**: Precise timing via wav2vec2 alignment
- **Speaker diarization**: Identify who said what using pyannote-audio
- **VAD preprocessing**: Voice activity detection filters silence/noise
- **Under 8GB VRAM**: Runs large models on consumer GPUs
- **CLI + Python API**: Command-line tool and programmatic access

---

## FAQ

**Q: What is WhisperX?**

A: WhisperX is a speech recognition tool with 21K+ stars providing 70x realtime transcription, word timestamps, and speaker diarization. Under 8GB VRAM. BSD-2-Clause.

**Q: How do I install WhisperX?**

A: `pip install whisperx`. Run `whisperx audio.mp3 --model large-v2 --diarize` for the full pipeline.

---

## Source & Thanks

> Created by [Max Bain](https://github.com/m-bain). Licensed under BSD-2-Clause.
>
> [m-bain/whisperX](https://github.com/m-bain/whisperX) (21,000+ GitHub stars)

---

Source: https://tokrepo.com/en/workflows/c43ad870-8c99-471a-898e-b07140faf532

Author: Script Depot