Scripts · Mar 29, 2026 · 2 min read

Whisper — OpenAI Speech-to-Text

OpenAI's open-source speech recognition model. Transcribe audio/video to text with word-level timestamps in 99 languages. Essential for subtitle generation.

Introduction

Whisper is OpenAI's open-source speech recognition model, trained on 680,000 hours of multilingual data. It transcribes audio to text in 99 languages, can produce word-level timestamps, writes SRT/VTT subtitles, and copes with accents, background noise, and technical jargon. With 75,000+ GitHub stars, it underpins many AI subtitle generation pipelines.

Best for: Subtitle generation, podcast transcription, video content indexing, multilingual transcription
Works with: Python 3.8+, FFmpeg
Setup time: 3 minutes (+ model download)
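
The two prerequisites listed above can be checked before installing anything. The following is a minimal sketch using only the standard library; `check_prerequisites` and its `which` parameter are hypothetical names introduced here for illustration, not part of Whisper.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum Python version stated on this page


def check_prerequisites(python_version=None, which=shutil.which):
    """Return a list of problems; an empty list means the basics are in place.

    `which` is injectable so the lookup can be replaced in tests (an assumption
    of this sketch, not something Whisper requires).
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    problems = []
    if tuple(python_version) < MIN_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} is older than "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}"
        )
    if which("ffmpeg") is None:  # shutil.which returns None when not on PATH
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

Running the script prints nothing when both prerequisites are met.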


Models

Model      Parameters   Relative speed (vs. large)   Accuracy   VRAM
tiny       39M          ~10x                         Good       ~1GB
base       74M          ~7x                          Better     ~1GB
small      244M         ~4x                          Good+      ~2GB
medium     769M         ~2x                          Great      ~5GB
large-v3   1.5B         ~1x                          Best       ~10GB
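
One way to use the table is to pick the largest model that fits the available GPU memory. The sketch below hard-codes the approximate VRAM figures from the table; `pick_model` is a hypothetical helper, and real memory use will vary with audio length and settings.

```python
# (model name, approximate VRAM in GB), smallest to largest, copied from the table
MODELS = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large-v3", 10),
]


def pick_model(vram_gb):
    """Return the largest model whose approximate VRAM need fits, or None."""
    best = None
    for name, needed_gb in MODELS:
        if needed_gb <= vram_gb:
            best = name  # later entries are larger, so keep overwriting
    return best


if __name__ == "__main__":
    print(pick_model(6))
```

The returned name can be passed straight to `whisper.load_model`.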

Python API

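import whisper

# Load a checkpoint by name; it is downloaded on first use.
model = whisper.load_model("medium")

# word_timestamps=True also attaches per-word timings under segment["words"].
result = model.transcribe("audio.mp3", word_timestamps=True)

# Print each segment with its start and end time in seconds.
for segment in result["segments"]:
    print(f"[{segment['start']:.1f}s - {segment['end']:.1f}s] {segment['text']}")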
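
The segments returned above carry everything needed to build subtitles by hand. As an illustration, here is a minimal sketch that turns a list of segment dicts (with the `start`, `end`, and `text` keys used above) into SubRip text. `srt_timestamp` and `segments_to_srt` are hypothetical helpers written for this page, not Whisper functions; the CLI's own SRT writer may differ in detail.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from dicts that have 'start', 'end', and 'text' keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):  # SRT cues are numbered from 1
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)  # one blank line between cues


if __name__ == "__main__":
    sample = [
        {"start": 0.0, "end": 2.5, "text": " Hello."},
        {"start": 2.5, "end": 4.0, "text": " World."},
    ]
    print(segments_to_srt(sample))
```

With a real transcription, `segments_to_srt(result["segments"])` gives text that can be written to a `.srt` file.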

Output Formats

whisper audio.mp3 --output_format srt   # SubRip subtitles
whisper audio.mp3 --output_format vtt   # WebVTT subtitles
whisper audio.mp3 --output_format json --word_timestamps True  # Detailed JSON with word timestamps
whisper audio.mp3 --output_format txt   # Plain text
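
The JSON output can be post-processed with the standard library. The sketch below assumes the file mirrors the dict returned by `model.transcribe`, and that word timestamps were requested (`--word_timestamps True` on the command line, or `word_timestamps=True` in Python) so that each segment has a `words` list with `word`, `start`, and `end` keys. `iter_words` and `load_words` are hypothetical helper names.

```python
import json


def iter_words(result):
    """Yield (word, start, end) tuples from a Whisper-style result dict."""
    for segment in result.get("segments", []):
        # "words" is absent when word timestamps were not requested
        for word in segment.get("words", []):
            yield word["word"].strip(), word["start"], word["end"]


def load_words(json_path):
    """Read a JSON transcript from disk and return its words as a list."""
    with open(json_path, encoding="utf-8") as handle:
        return list(iter_words(json.load(handle)))


if __name__ == "__main__":
    demo = {
        "segments": [
            {
                "start": 0.0,
                "end": 1.0,
                "text": " Hello world",
                "words": [
                    {"word": " Hello", "start": 0.0, "end": 0.4},
                    {"word": " world", "start": 0.5, "end": 1.0},
                ],
            }
        ]
    }
    for word, start, end in iter_words(demo):
        print(f"{start:.2f}-{end:.2f} {word}")
```

Segments without a `words` key are skipped rather than raising, so the same helper also works on output produced without word timestamps.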

FAQ

Q: What is Whisper? A: OpenAI's open-source speech recognition model. It transcribes audio to text in 99 languages and can produce word-level timestamps. The repository has 75,000+ GitHub stars.

Q: Is Whisper free? A: Yes. Whisper is MIT-licensed and runs locally on your machine. No API costs.

Q: What languages does Whisper support? A: 99 languages including English, Chinese, Spanish, French, German, Japanese, Korean, Arabic, and more.



Source and Acknowledgments

Created by OpenAI. Licensed under MIT. Repository: whisper (75,000+ GitHub stars).
