What is ChatTTS — Expressive Text-to-Speech for Dialogue?

Generate natural conversational speech with laughter, pauses, and emotion. Optimized for dialogue scenarios. 39K+ GitHub stars.

Is ChatTTS — Expressive Text-to-Speech for Dialogue free to use?

Yes. ChatTTS — Expressive Text-to-Speech for Dialogue is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install ChatTTS — Expressive Text-to-Speech for Dialogue?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

ChatTTS — Expressive Text-to-Speech for Dialogue

ChatTTS is an open-source text-to-speech model with 39,000+ GitHub stars, optimized for generating natural, expressive conversational speech. Where traditional TTS often sounds flat or robotic, ChatTTS produces speech with natural laughter, pauses, filler words, and emotional variation, which makes it a good fit for chatbots, virtual assistants, podcasts, and audiobooks. Trained on 100,000+ hours of dialogue data, it supports fine-grained prosody control through special tokens and generates 24 kHz audio in both English and Chinese.

Works with: Python, PyTorch, CUDA GPUs (recommended), CPU (slower). Best for developers building conversational AI that needs natural-sounding speech output. Setup time: under 5 minutes.
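Since the model produces 24 kHz audio, the last step of most pipelines is writing samples to a file. The sketch below shows one way to do that with only the Python standard library. It assumes the output is a flat sequence of mono float samples in the range [-1.0, 1.0]. The names `save_wav` and `placeholder_tone` are hypothetical helpers introduced here, not part of ChatTTS, and the sine tone is only a stand-in for real model output.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # output rate stated in the description above


def save_wav(samples, path, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file.

    Returns the clip duration in seconds.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)       # mono
        wf.setsampwidth(2)       # 2 bytes = 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))
    return len(samples) / sample_rate


def placeholder_tone(seconds=0.5, freq=440.0, sample_rate=SAMPLE_RATE):
    """Stand-in for model output: a quiet sine tone (not speech)."""
    n = int(seconds * sample_rate)
    return [0.3 * math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]
```

With real output, the call would look like `save_wav(wav, "out.wav")` for each generated waveform, assuming each one has first been flattened to a one-dimensional sequence.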

## ChatTTS Features

### Natural Dialogue Speech

ChatTTS excels at conversational scenarios:

| Feature | Description |
|---------|-------------|
| **Laughter** | Insert `[laugh]` for natural laughing |
| **Pauses** | Insert pauses with `[uv_break]` |
| **Filler words** | Natural "um", "uh" generation |
| **Emotion** | Convey happiness, surprise, thoughtfulness |
| **Prosody** | Pitch, speed, and emphasis control |

### Prosody Control

```python
import ChatTTS

# Load the model once and reuse it for every call
chat = ChatTTS.Chat()
chat.load(compile=False)

texts = ["So, how was your weekend?"]  # example input

# Control speaking style with parameters
params_infer = ChatTTS.Chat.InferCodeParams(
    spk_emb=None,      # Speaker embedding (None = random)
    temperature=0.3,   # Lower = more stable, higher = more expressive
    top_P=0.7,
    top_K=20,
)

# Refine prosody
params_refine = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_2][laugh_0][break_6]',  # oral filler + no laugh + long breaks
)

wavs = chat.infer(
    texts,
    params_infer_code=params_infer,
    params_refine_text=params_refine,
)
```

### Speaker Consistency

```python
# Reuses `chat` (a loaded ChatTTS.Chat) from the previous example

# Generate a random speaker
rand_spk = chat.sample_random_speaker()

# Use the same speaker for multiple utterances
params = ChatTTS.Chat.InferCodeParams(spk_emb=rand_spk)

wavs = chat.infer(
    ["First sentence.", "Second sentence.", "Third sentence."],
    params_infer_code=params,
)
# All 3 outputs sound like the same person
```

### Performance

- **Speed**: ~5x real-time on GPU (about 5 seconds of audio per second of compute)
- **Quality**: 24 kHz, natural prosody, MOS competitive with commercial TTS
- **Languages**: English and Chinese
- **Model size**: ~800MB

### Special Tokens

```
[laugh]      - Insert laughter
[uv_break]   - Insert a pause
[oral_0-9]   - Filler word frequency (0=none, 9=very frequent)
[laugh_0-2]  - Laughter frequency
[break_0-7]  - Pause frequency and duration
```

---

## FAQ

**Q: What is ChatTTS?**
A: ChatTTS is an open-source TTS model with 39,000+ GitHub stars, optimized for natural conversational speech with laughter, pauses, and emotion. Trained on 100K+ hours of dialogue data.
**Q: How is ChatTTS different from Coqui TTS or Bark?**
A: ChatTTS is specifically optimized for dialogue: it excels at conversational prosody, laughter, and natural filler words. Coqui TTS is a general-purpose TTS toolkit. Bark generates creative audio but is slower. ChatTTS is the better fit when the output is chatbot or assistant speech.

**Q: Is ChatTTS free?**
A: The code is open-source under AGPL-3.0, and the upstream repository publishes the model weights under CC BY-NC 4.0, so it is free for non-commercial use. Commercial use requires complying with those terms or obtaining a separate commercial license.
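The special tokens described earlier are plain strings, so they can be assembled before the text reaches the model. The following is a minimal sketch of that idea, not ChatTTS code: `add_breaks` and `refine_prompt` are hypothetical helper names. Placing `[uv_break]` between sentences is one possible convention. The single-digit check is deliberately loose, since the upstream model may accept a narrower range for some tokens.

```python
import re

UV_BREAK = "[uv_break]"  # word-level pause token from the Special Tokens list


def add_breaks(text, token=UV_BREAK):
    """Insert a pause token between sentences (split on ., ! or ?)."""
    parts = [p.strip() for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p.strip()]
    return f" {token} ".join(parts)


def refine_prompt(oral, laugh, brk):
    """Build a sentence-level prompt such as '[oral_2][laugh_0][break_6]'.

    Only checks that each level is a single digit; the model may accept
    a narrower range for some tokens (assumption: check upstream docs).
    """
    for name, level in (("oral", oral), ("laugh", laugh), ("break", brk)):
        if not isinstance(level, int) or not 0 <= level <= 9:
            raise ValueError(f"{name} level must be an integer from 0 to 9, got {level!r}")
    return f"[oral_{oral}][laugh_{laugh}][break_{brk}]"
```

The string returned by `refine_prompt(2, 0, 6)` matches the `prompt` value used in the Prosody Control example, and the output of `add_breaks` can be passed as one of the input texts.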
