# Whisper — OpenAI Speech-to-Text

> OpenAI's open-source speech recognition model. Transcribe audio/video to text with word-level timestamps in 99 languages. Essential for subtitle generation.

## Install

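Install the `openai-whisper` package with pip, as in the first command under Quick Use. Whisper also needs FFmpeg available on your `PATH`.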

## Quick Use

```bash
pip install openai-whisper
whisper audio.mp3 --model medium --language en --output_format srt
```

---

## Intro

OpenAI's Whisper is an open-source speech recognition model trained on 680,000 hours of multilingual audio. It transcribes audio to text with word-level timestamps in 99 languages, writes SRT/VTT subtitles, and copes well with accents, background noise, and technical jargon. The repository has 75,000+ GitHub stars, and Whisper underpins many AI subtitle generation pipelines.

- **Best for**: Subtitle generation, podcast transcription, video content indexing, multilingual transcription
- **Works with**: Python 3.8+, FFmpeg
- **Setup time**: 3 minutes (+ model download)

---

## Models

| Model | Parameters | Relative speed (large = 1x) | Accuracy | VRAM |
|-------|-----------|-----------------------------|----------|------|
| tiny | 39M | ~10x | Good | ~1GB |
| base | 74M | ~7x | Good+ | ~1GB |
| small | 244M | ~4x | Better | ~2GB |
| medium | 769M | ~2x | Great | ~5GB |
| large-v3 | 1550M | 1x | Best | ~10GB |
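
The table can be turned into a small lookup for choosing a model that fits the GPU at hand. This is only an illustrative sketch: `MODEL_VRAM_GB` and `pick_model` are hypothetical names, not part of the `whisper` package, and the numbers are the approximate VRAM figures from the table above.

```python
# Approximate VRAM needs in GB, copied from the table above.
# Ordered from smallest to largest model.
MODEL_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large-v3", 10),
]


def pick_model(available_vram_gb: float) -> str:
    """Return the largest model whose approximate VRAM need fits.

    Hypothetical helper for illustration; not a whisper API.
    """
    best = None
    for name, needed_gb in MODEL_VRAM_GB:
        if needed_gb <= available_vram_gb:
            best = name  # later entries are larger, so keep overwriting
    if best is None:
        raise ValueError(
            f"No model fits in {available_vram_gb} GB of VRAM"
        )
    return best
```

The returned name can be passed to `whisper.load_model(...)`. Treat the thresholds as rough guidance, since real memory use also depends on audio length and decoding options.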

## Python API

```python
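import whisper

# Downloads the model on first use, then loads it from the local cache.
model = whisper.load_model("medium")

# word_timestamps=True adds a "words" list to each segment in the result.
result = model.transcribe("audio.mp3", word_timestamps=True)

# Print each segment with its start and end time in seconds.
for segment in result["segments"]:
    print(f"[{segment['start']:.1f}s - {segment['end']:.1f}s] {segment['text']}")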
```
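
The loop above prints whole segments. When `word_timestamps=True` is set, each segment also carries a `words` list, and a minimal sketch for walking it might look like the following. `iter_words` is a hypothetical helper name, and it takes the result dictionary as an argument so it can be tried without loading a model.

```python
from typing import Iterator, Tuple


def iter_words(result: dict) -> Iterator[Tuple[str, float, float]]:
    """Yield (word, start_seconds, end_seconds) for every word in a result.

    Assumes `result` comes from model.transcribe(..., word_timestamps=True).
    Segments without a "words" list are skipped silently.
    """
    for segment in result.get("segments", []):
        for word in segment.get("words", []):
            # Word strings usually carry a leading space; strip it.
            yield word["word"].strip(), word["start"], word["end"]


# Hand-written sample data in the same shape, for illustration only.
sample_result = {
    "segments": [
        {
            "start": 0.0,
            "end": 1.2,
            "text": " Hello world",
            "words": [
                {"word": " Hello", "start": 0.0, "end": 0.5},
                {"word": " world", "start": 0.6, "end": 1.2},
            ],
        }
    ]
}

for text, start, end in iter_words(sample_result):
    print(f"{start:.2f}-{end:.2f}: {text}")
```

With a real transcription, pass the `result` returned by `model.transcribe` instead of `sample_result`.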

## Output Formats

```bash
whisper audio.mp3 --output_format srt   # SubRip subtitles
whisper audio.mp3 --output_format vtt   # WebVTT subtitles
whisper audio.mp3 --output_format json  # Detailed JSON (add --word_timestamps True for per-word timings)
whisper audio.mp3 --output_format txt   # Plain text
```
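
The CLI writes these files for you. If you already hold the segments in Python (for example after filtering or editing them), a rough sketch of building SRT text by hand is shown below. `srt_timestamp` and `segments_to_srt` are hypothetical helpers written for this example, not Whisper's own subtitle writer, and they assume each segment has `start`, `end`, and `text` keys as in the Python API example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from a list of segment dicts.

    Each cue is: index, "start --> end" line, text, then a blank line.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

For instance, `segments_to_srt(result["segments"])` returns a string that can be written to a `.srt` file with UTF-8 encoding.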

## FAQ

**Q: What is Whisper?**
A: Whisper is OpenAI's open-source speech recognition model. It transcribes audio to text in 99 languages with word-level timestamps, and its repository has 75,000+ GitHub stars.

**Q: Is Whisper free?**
A: Yes. Whisper is MIT-licensed and runs locally on your machine, so there are no API costs.

**Q: What languages does Whisper support?**
A: 99 languages including English, Chinese, Spanish, French, German, Japanese, Korean, Arabic, and more.

---

## Source & Thanks

> Created by [OpenAI](https://github.com/openai). Licensed under MIT.
> [whisper](https://github.com/openai/whisper) — ⭐ 75,000+

---
Source: https://tokrepo.com/en/workflows/eb0f9dd6-2172-4c9f-aca9-97846b0f4d86
Author: Script Depot