Is F5-TTS — Flow Matching Text-to-Speech free to use?

Yes. F5-TTS — Flow Matching Text-to-Speech is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

Scripts · April 1, 2026 · 1 min read

F5-TTS — Flow Matching Text-to-Speech

Name: F5-TTS — Flow Matching Text-to-Speech
Author: TokRepo Featured

F5-TTS is a diffusion transformer TTS system that uses flow matching, with 14.3K+ GitHub stars. It offers multi-speaker synthesis, voice chat, a Gradio UI, and CLI inference, and reaches about 0.04 RTF on an L20 GPU. The code is MIT licensed.

TokRepo Featured · Community

Quick Start

Try it first, then decide whether to dig deeper

# Install
pip install f5-tts

# CLI inference
f5-tts_infer-cli --model F5TTS_v1_Base --ref_audio ref.wav --ref_text "Reference text" --gen_text "Text to generate"

# Or launch Gradio web UI
f5-tts_infer-gradio

# Voice chat with Qwen2.5
f5-tts_infer-gradio --voicechat
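
When the CLI call above needs to be issued from a script, one option is to assemble the argument list in Python and hand it to `subprocess`. The following is a minimal sketch, not part of F5-TTS: `build_infer_command` is a hypothetical helper name, and it uses only the flags shown in the listing above (`--model`, `--ref_audio`, `--ref_text`, `--gen_text`).

```python
import shlex


def build_infer_command(ref_audio, ref_text, gen_text, model="F5TTS_v1_Base"):
    """Return the f5-tts_infer-cli argument list for one synthesis request.

    Hypothetical helper: it only mirrors the flags shown in the Quick Start.
    """
    return [
        "f5-tts_infer-cli",
        "--model", model,
        "--ref_audio", ref_audio,
        "--ref_text", ref_text,
        "--gen_text", gen_text,
    ]


if __name__ == "__main__":
    cmd = build_infer_command("ref.wav", "Reference text", "Text to generate")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To actually run synthesis (requires f5-tts to be installed):
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the reference or generated text contains spaces or quotes.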

Introduction

F5-TTS is a diffusion transformer-based text-to-speech system that uses flow matching with a ConvNeXt V2 architecture and is optimized for fast training and inference. With 14,300+ GitHub stars, it offers multi-speaker and multi-style speech synthesis, voice chat powered by Qwen2.5-3B-Instruct, a Gradio web interface for inference and fine-tuning, and CLI inference. With Triton/TensorRT-LLM optimization, it achieves a real-time factor (RTF) of 0.0394 on an L20 GPU. The code is MIT licensed; the pre-trained models are released under CC-BY-NC.
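
To make the real-time factor concrete: RTF is the time spent synthesizing divided by the duration of the audio produced, so values below 1 mean faster than real time. The sketch below uses hypothetical function names and treats the 0.0394 figure quoted above as a given for an L20 GPU with Triton/TensorRT-LLM; other hardware and configurations will behave differently.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def estimated_synthesis_seconds(audio_seconds, rtf=0.0394):
    """Estimate compute time for a clip, assuming the quoted RTF holds."""
    return audio_seconds * rtf


if __name__ == "__main__":
    # At RTF 0.0394, a 10-second clip would take roughly 0.394 s to generate.
    print(f"{estimated_synthesis_seconds(10.0):.3f} s")
```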

Best for: Researchers and developers needing high-quality multi-speaker TTS with voice cloning
Works with: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
Optimized: 0.04 RTF on L20 GPU with TensorRT-LLM

Key Features

Flow matching: Diffusion transformer with ConvNeXt V2 for natural speech
Multi-speaker: Multiple voices and speaking styles
Voice chat: Interactive voice conversation powered by Qwen2.5-3B
Gradio UI: Web interface for inference and fine-tuning
CLI inference: Command-line tool with custom configs
Ultra-fast: 0.0394 RTF on L20 GPU with TensorRT-LLM
Docker support: Containerized deployment ready

FAQ

Q: What is F5-TTS? A: F5-TTS is a diffusion transformer TTS system that uses flow matching, with 14.3K+ GitHub stars. It provides multi-speaker synthesis, voice chat, and a Gradio UI, and reaches about 0.04 RTF on an L20 GPU. The code is MIT licensed and the models are CC-BY-NC.

Q: How do I install F5-TTS? A: Run pip install f5-tts. Then use f5-tts_infer-cli for command-line inference or f5-tts_infer-gradio for the web UI.
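
After installing, a quick way to confirm that the two commands named above ended up on your PATH is a small check like the one below. This is only a sketch: `find_entry_points` is a hypothetical helper, and it verifies nothing beyond whether the executables can be located.

```python
import shutil

# Command names taken from the install answer above.
ENTRY_POINTS = ("f5-tts_infer-cli", "f5-tts_infer-gradio")


def find_entry_points(names=ENTRY_POINTS):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_entry_points().items():
        print(f"{name}: {path if path else 'not found on PATH'}")
```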

Source & Thanks

Created by SWivid. Code: MIT, Models: CC-BY-NC. SWivid/F5-TTS — 14,300+ GitHub stars
