# F5-TTS — Flow Matching Text-to-Speech

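> F5-TTS is a diffusion-transformer text-to-speech (TTS) system trained with flow matching. It has 14.3K+ GitHub stars and offers multi-speaker synthesis, voice chat, a Gradio UI, and CLI inference, with a real-time factor (RTF) of about 0.04 on an L20 GPU. The code is MIT licensed; the pre-trained models are CC-BY-NC.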

## Install

Install the package from PyPI with `pip install f5-tts`. This is also the first command in the Quick Use block below.

## Quick Use

```bash
# Install
pip install f5-tts

# CLI inference
f5-tts_infer-cli --model F5TTS_v1_Base --ref_audio ref.wav --ref_text "Reference text" --gen_text "Text to generate"

# Or launch Gradio web UI
f5-tts_infer-gradio

# Voice chat with Qwen2.5
f5-tts_infer-gradio --voicechat
```
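For repeated use, the CLI call above can be wrapped in a small shell function that checks its inputs before invoking the tool. This is a minimal sketch, not part of F5-TTS: the function name `f5_say` and the `F5_MODEL` environment variable are hypothetical, and only the four flags shown in the Quick Use command are passed through.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the f5-tts_infer-cli call shown above.
# Assumptions: f5_say and F5_MODEL are illustrative names, not F5-TTS features.

f5_say() {
  # Expect exactly: reference audio path, reference transcript, text to speak.
  if [ "$#" -ne 3 ]; then
    echo "usage: f5_say REF_AUDIO REF_TEXT GEN_TEXT" >&2
    return 2
  fi

  local ref_audio="$1" ref_text="$2" gen_text="$3"
  local model="${F5_MODEL:-F5TTS_v1_Base}"   # default model from Quick Use

  # Fail early if the reference clip is missing.
  if [ ! -f "$ref_audio" ]; then
    echo "reference audio not found: $ref_audio" >&2
    return 1
  fi

  # Fail early if the CLI is not installed.
  if ! command -v f5-tts_infer-cli >/dev/null 2>&1; then
    echo "f5-tts_infer-cli not on PATH; run: pip install f5-tts" >&2
    return 127
  fi

  f5-tts_infer-cli \
    --model "$model" \
    --ref_audio "$ref_audio" \
    --ref_text "$ref_text" \
    --gen_text "$gen_text"
}
```

After sourcing the file, `f5_say ref.wav "Reference text" "Text to generate"` runs the same command as the Quick Use example. Quoting every argument keeps reference and generation text with spaces or punctuation intact.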

---

## Intro

F5-TTS is a text-to-speech system built on a diffusion transformer with ConvNeXt V2 and trained with flow matching, designed for fast training and inference. The project has 14,300+ GitHub stars. It provides multi-speaker and multi-style speech synthesis, voice chat powered by Qwen2.5-3B-Instruct, a Gradio web interface for inference and fine-tuning, and a CLI for inference. With Triton and TensorRT-LLM optimization, it reaches a real-time factor (RTF) of 0.0394 on an L20 GPU. The code is MIT licensed; the pre-trained models are released under CC-BY-NC.

- **Best for**: Researchers and developers who need high-quality multi-speaker TTS with voice cloning
- **Works with**: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
- **Optimized**: 0.04 RTF on an L20 GPU with TensorRT-LLM

---

## Key Features

- **Flow matching**: Diffusion transformer with ConvNeXt V2 for natural speech
- **Multi-speaker**: Multiple voices and speaking styles
- **Voice chat**: Interactive voice conversation powered by Qwen2.5-3B
- **Gradio UI**: Web interface for inference and fine-tuning
- **CLI inference**: Command-line tool with custom configs
- **Ultra-fast**: 0.0394 RTF on L20 GPU with TensorRT-LLM
- **Docker support**: Containerized deployment ready
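
To put the RTF figure in concrete terms: real-time factor is processing time divided by the duration of the audio produced, so values below 1 mean synthesis runs faster than playback. The sketch below is plain arithmetic with `awk`; the helper names `rtf` and `synth_seconds` are hypothetical, and the estimate only applies to a setup comparable to the one where 0.0394 was measured (L20 GPU with TensorRT-LLM).

```bash
# rtf PROCESSING_SECONDS AUDIO_SECONDS
# Prints the real-time factor; exits with status 1 if the audio length is not positive.
rtf() {
  awk -v p="$1" -v a="$2" 'BEGIN { if (a <= 0) exit 1; printf "%.4f\n", p / a }'
}

# synth_seconds RTF AUDIO_SECONDS
# Prints the estimated compute time for a clip of the given length.
synth_seconds() {
  awk -v r="$1" -v a="$2" 'BEGIN { printf "%.2f\n", r * a }'
}

rtf 0.394 10             # prints 0.0394
synth_seconds 0.0394 60  # prints 2.36
```

At the quoted RTF, one minute of speech takes roughly 2.4 seconds of compute. Slower hardware or an unoptimized runtime will give a higher RTF.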

---

## FAQ

**Q: What is F5-TTS?**
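A: F5-TTS is a diffusion-transformer text-to-speech system that uses flow matching, with 14.3K+ GitHub stars. It supports multi-speaker synthesis and voice chat, includes a Gradio UI, and reaches about 0.04 RTF on an L20 GPU. The code is MIT licensed and the models are CC-BY-NC.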

**Q: How do I install F5-TTS?**
A: Run `pip install f5-tts`. Then use `f5-tts_infer-cli` for command-line inference or `f5-tts_infer-gradio` to launch the web UI.

---

## Source & Thanks

> Created by [SWivid](https://github.com/SWivid). Code: MIT, Models: CC-BY-NC.
> [SWivid/F5-TTS](https://github.com/SWivid/F5-TTS) — 14,300+ GitHub stars

---
Source: https://tokrepo.com/en/workflows/093755c4-a497-4f6d-9e00-4c41cbd49c90
Author: Script Depot