What is Coqui TTS — Deep Learning Text-to-Speech Engine?

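Coqui TTS is an open-source speech synthesis library whose pretrained models cover 1,100+ languages. Its flagship XTTS v2 model clones a voice in 16 languages and streams audio with under 200 ms latency. The project has 44K+ GitHub stars.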

Is Coqui TTS — Deep Learning Text-to-Speech Engine free to use?

Yes. Coqui TTS — Deep Learning Text-to-Speech Engine is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install Coqui TTS — Deep Learning Text-to-Speech Engine?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Coqui TTS — Deep Learning Text-to-Speech Engine

# Generate English speech
tts --text "Hello, welcome to TokRepo." --out_path output.wav

# XTTS v2: Chinese speech with voice cloning
tts --model_name tts_models/multilingual/multi-dataset/xtts_v2 \
    --text "你好，欢迎来到TokRepo。" \
    --speaker_wav reference_voice.wav \
    --language_idx zh-cn \
    --out_path output_zh.wav

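Introduction

Coqui TTS is one of the most comprehensive open-source speech synthesis libraries, with 44,900+ GitHub stars and support for 1,100+ languages. Its flagship XTTS v2 model clones a voice from just 6 seconds of reference audio and streams with latency under 200 ms. The library implements the major TTS architectures (VITS, Tacotron 2, Bark, Tortoise) behind a unified Python API and CLI.

Works with: Python, CUDA GPUs, and any application that needs speech synthesis. It suits developers adding voice to AI agents, chatbots, accessibility tools, or content-creation pipelines.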

Core Features

XTTS v2 Flagship Model

Supports 16 languages, clones a voice from 6 seconds of reference audio, and streams with latency under 200 ms.
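The same voice-cloning call can be made from Python. The sketch below assumes the upstream `TTS.api.TTS` class with its `tts_to_file` method and that `torch` is installed; `clone_voice`, `wav_duration_seconds`, and `MIN_REFERENCE_SECONDS` are hypothetical helper names introduced here for illustration, and the 6-second check is only a soft guideline based on the figure quoted above.

```python
import wave

# Model id as used in the CLI example on this page.
XTTS_V2 = "tts_models/multilingual/multi-dataset/xtts_v2"
# Assumption: treat the "6 seconds of reference audio" figure as a soft minimum.
MIN_REFERENCE_SECONDS = 6.0


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds (stdlib only)."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def clone_voice(text, speaker_wav, language="en", out_path="output.wav"):
    """Synthesize `text` in the voice of `speaker_wav` using XTTS v2."""
    # Heavy dependencies are imported lazily so the helpers above stay usable
    # on machines without Coqui TTS or torch installed.
    import torch
    from TTS.api import TTS

    if wav_duration_seconds(speaker_wav) < MIN_REFERENCE_SECONDS:
        print("warning: reference clip is shorter than 6 seconds")

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tts = TTS(XTTS_V2).to(device)  # downloads the model on first use
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,
        language=language,
        file_path=out_path,
    )
    return out_path


# Example call (not run here):
# clone_voice("你好，欢迎来到TokRepo。", "reference_voice.wav", "zh-cn", "output_zh.wav")
```

Note that `wav_duration_seconds` only reads uncompressed PCM WAV files; other reference formats would need a different duration check.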

Rich Model Library

VITS (very fast), YourTTS (multi-speaker), Bark (expressive), Tortoise (highest quality).
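Switching between these models mostly comes down to the `--model_name` value. The sketch below maps each family to a model id and builds the matching `tts` command. The ids are assumptions taken from the upstream model list as I recall it, so confirm them with `tts --list_models`; `MODEL_IDS` and `build_tts_command` are hypothetical names used only for this example.

```python
import shlex

# Assumed model ids; verify with `tts --list_models` before relying on them.
MODEL_IDS = {
    "vits": "tts_models/en/ljspeech/vits",
    "your_tts": "tts_models/multilingual/multi-dataset/your_tts",
    "bark": "tts_models/multilingual/multi-dataset/bark",
    "tortoise": "tts_models/en/multi-dataset/tortoise-v2",
}


def build_tts_command(text, model_key, out_path="output.wav"):
    """Return the `tts` CLI invocation for one of the models above as an argv list."""
    return [
        "tts",
        "--model_name", MODEL_IDS[model_key],
        "--text", text,
        "--out_path", out_path,
    ]


# Print a copy-pasteable command; pass the list to subprocess.run to execute it.
# Multi-speaker models such as YourTTS also need a speaker, e.g. --speaker_wav.
# print(shlex.join(build_tts_command("Hello, welcome to TokRepo.", "vits")))
```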

Streaming Synthesis

Outputs audio chunks in real time as they are generated, which suits conversational use.
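A rough way to see what streaming buys you is to time how long the first chunk takes to arrive. The sketch below assumes the upstream `Xtts` model class with `get_conditioning_latents` and `inference_stream`, a locally downloaded XTTS v2 checkpoint directory, and a CUDA GPU; `consume_stream` and `load_xtts` are hypothetical helpers, and the latency you observe depends on your hardware.

```python
import os
import time


def consume_stream(chunks, on_chunk):
    """Pass each chunk to `on_chunk`; return (seconds_to_first_chunk, chunk_count)."""
    start = time.perf_counter()
    first_latency = None
    count = 0
    for chunk in chunks:
        if first_latency is None:
            first_latency = time.perf_counter() - start
        on_chunk(chunk)  # e.g. write to an audio device or a websocket
        count += 1
    return first_latency, count


def load_xtts(checkpoint_dir):
    """Load XTTS v2 from a local checkpoint directory (assumed layout: config.json inside)."""
    # Imported lazily so consume_stream can be used without Coqui TTS installed.
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts

    config = XttsConfig()
    config.load_json(os.path.join(checkpoint_dir, "config.json"))
    model = Xtts.init_from_config(config)
    model.load_checkpoint(config, checkpoint_dir=checkpoint_dir)
    model.cuda()
    return model


# Example wiring (not run here):
# model = load_xtts("/path/to/xtts_v2/")
# latent, speaker = model.get_conditioning_latents(audio_path=["reference_voice.wav"])
# stream = model.inference_stream("Hello, welcome to TokRepo.", "en", latent, speaker)
# latency, n = consume_stream(stream, lambda chunk: None)
```

Computing the conditioning latents before starting the stream keeps that one-off cost out of the first-chunk measurement.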

Fine-Tuning

Fine-tune a model on your own voice data to build a custom voice.
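Fine-tuning starts with getting your recordings into a layout the training code can read. The sketch below writes an LJSpeech-style dataset index (a `wavs/` folder plus a pipe-separated `metadata.csv`), which is one layout that upstream's `ljspeech` formatter is understood to accept. `write_ljspeech_metadata` is a hypothetical helper, and the training run itself is left to the upstream recipes.

```python
from pathlib import Path


def write_ljspeech_metadata(dataset_dir, transcripts):
    """Write metadata.csv for clips stored as <dataset_dir>/wavs/<clip_id>.wav.

    `transcripts` maps a clip id (file name without .wav) to its transcript.
    Each row is `clip_id|text|normalized_text`; the raw text is reused for the
    normalized column here, which is an assumption made for simplicity.
    """
    dataset_dir = Path(dataset_dir)
    (dataset_dir / "wavs").mkdir(parents=True, exist_ok=True)
    meta_path = dataset_dir / "metadata.csv"
    with meta_path.open("w", encoding="utf-8", newline="") as f:
        for clip_id, text in transcripts.items():
            if "|" in text or "\n" in text:
                raise ValueError(f"transcript for {clip_id} contains '|' or a newline")
            f.write(f"{clip_id}|{text}|{text}\n")
    return meta_path


# Example call (not run here):
# write_ljspeech_metadata("my_voice", {"clip_0001": "Hello, welcome to TokRepo."})
```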

REST API Server

Start a TTS server with a single command and generate speech over HTTP.
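As a minimal client sketch, the code below requests speech from a locally running server. It assumes the server was started with upstream's `tts-server` command, listens on port 5002, and exposes a `/api/tts` endpoint that takes a `text` query parameter; these details may differ between versions, so check the server's own page or docs first. `build_tts_url` and `fetch_speech` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed server start command (run in a separate terminal):
#   tts-server --model_name tts_models/en/ljspeech/vits


def build_tts_url(text, host="localhost", port=5002):
    """Build the request URL for the assumed /api/tts endpoint."""
    return f"http://{host}:{port}/api/tts?{urlencode({'text': text})}"


def fetch_speech(text, out_path="server_output.wav", host="localhost", port=5002):
    """Request synthesized audio from the server and save the response body."""
    with urlopen(build_tts_url(text, host, port)) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
    return out_path


# Example call (not run here; needs a running server):
# fetch_speech("Hello, welcome to TokRepo.")
```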
