# Whisper: OpenAI Speech-to-Text

> OpenAI's open-source speech recognition model. Transcribe audio/video to text with optional word-level timestamps in 99 languages. Well suited to subtitle generation.

## Quick Use

Install the package, then transcribe a file (or save the commands as a script file and run it):

```bash
pip install openai-whisper
whisper audio.mp3 --model medium --language en --output_format srt
```

---

## Intro

OpenAI's Whisper is an open-source speech recognition model trained on 680,000 hours of multilingual data. It transcribes audio to text in 99 languages, can produce word-level timestamps on request, generates SRT/VTT subtitles, and copes with accents, background noise, and technical jargon. The repository has 75,000+ GitHub stars, and many AI subtitle generation pipelines build on it.

**Best for**: Subtitle generation, podcast transcription, video content indexing, multilingual transcription

**Works with**: Python 3.8+, FFmpeg

**Setup time**: 3 minutes (+ model download)

---

## Models

| Model | Parameters | Relative speed (vs. large) | Accuracy | VRAM |
|-------|-----------|----------------------------|----------|------|
| tiny | 39M | ~10x | Good | ~1GB |
| base | 74M | ~7x | Better | ~1GB |
| small | 244M | ~4x | Good+ | ~2GB |
| medium | 769M | ~2x | Great | ~5GB |
| large-v3 | 1.5B | 1x | Best | ~10GB |

Speed is relative to the large model, not to audio duration; actual throughput depends on your hardware.

## Python API

```python
import whisper

# Load a model by name; weights are downloaded on first use
model = whisper.load_model("medium")

# word_timestamps=True adds per-word timing inside each segment
result = model.transcribe("audio.mp3", word_timestamps=True)

# Print each segment with its start and end time in seconds
for segment in result["segments"]:
    print(f"[{segment['start']:.1f}s - {segment['end']:.1f}s] {segment['text']}")
```

## Output Formats

```bash
whisper audio.mp3 --output_format srt   # SubRip subtitles
whisper audio.mp3 --output_format vtt   # WebVTT subtitles
whisper audio.mp3 --output_format json  # Detailed JSON (add --word_timestamps True for word-level timing)
whisper audio.mp3 --output_format txt   # Plain text
```

## FAQ

**Q: What is Whisper?**
A: OpenAI's open-source speech recognition model that transcribes audio to text in 99 languages, with optional word-level timestamps. It has 75,000+ GitHub stars.

**Q: Is Whisper free?**
A: Yes. Whisper is MIT-licensed and runs locally on your machine, so there are no API costs.

**Q: What languages does Whisper support?**
A: 99 languages including English, Chinese, Spanish, French, German, Japanese, Korean, Arabic, and more.

---

## Source & Thanks

> Created by [OpenAI](https://github.com/openai). Licensed under MIT.
>
> [whisper](https://github.com/openai/whisper): ⭐ 75,000+

---

Source: https://tokrepo.com/en/workflows/eb0f9dd6-2172-4c9f-aca9-97846b0f4d86

Author: Script Depot
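
---

## Custom SRT Export

The Python API section prints `result["segments"]` with start and end times in seconds. If you want to post-process those segments before writing subtitles (the CLI's `--output_format srt` already covers the plain case), a minimal sketch of turning them into SRT text might look like the following. The helper names `format_srt_time` and `segments_to_srt` are hypothetical, and the sample data is hand-written in the shape of Whisper's segments rather than real model output.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from dicts that have 'start', 'end', and 'text' keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        text = seg["text"].strip()  # segment text often has a leading space
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues
    return "\n".join(blocks)


# Hand-written sample in the shape of result["segments"] (not real output)
sample_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
    {"start": 2.5, "end": 3661.25, "text": " Thanks for listening."},
]

srt_text = segments_to_srt(sample_segments)
print(srt_text)
```

To use it with a real transcription, pass `result["segments"]` in place of `sample_segments` and write the returned string to a `.srt` file.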