Question 1

What is CosyVoice — Multilingual Voice Generation with LLM-Based TTS?

Accepted Answer

CosyVoice is an open-source, LLM-based text-to-speech system developed by Alibaba's FunAudioLLM team. It supports 9 languages and more than 18 Chinese dialects, and offers zero-shot voice cloning, streaming synthesis, and fine-grained prosody control.

Question 2

Is CosyVoice — Multilingual Voice Generation with LLM-Based TTS free to use?

Accepted Answer

Yes. CosyVoice is freely available on TokRepo. The specific open-source license is listed in the Source & Thanks section of the asset page.

Question 3

How do I install CosyVoice — Multilingual Voice Generation with LLM-Based TTS?

Accepted Answer

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Related Assets

Zonos — Multilingual TTS with Voice Cloning

Fish Speech — Multilingual TTS for 80+ Languages

GPT-SoVITS — Few-Shot Voice Cloning and Text-to-Speech

Vapi — Voice AI Agent Platform with STT, LLM & TTS