ChatTTS Features
Natural Dialogue Speech
ChatTTS excels at conversational scenarios:
| Feature | Description |
|---|---|
| Laughter | Insert [laugh] for natural laughing |
| Pauses | Control pause duration with [uv_break] |
| Filler words | Natural "um", "uh" generation |
| Emotion | Convey happiness, surprise, thoughtfulness |
| Prosody | Pitch, speed, and emphasis control |
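As a rough illustration of the Laughter and Pauses rows above, the sketch below builds an input string that carries the `[laugh]` and `[uv_break]` tokens inline. The helper names `join_with_pauses` and `add_laugh` are hypothetical and not part of ChatTTS; they only do string handling, and the result would still have to be passed to the model for synthesis.

```python
# Hypothetical helpers (not part of ChatTTS): plain string handling only.
LAUGH = "[laugh]"      # inline laughter token from the table above
PAUSE = "[uv_break]"   # inline pause token from the table above


def join_with_pauses(phrases):
    """Join non-empty phrases with a [uv_break] token between each pair."""
    cleaned = [p.strip() for p in phrases if p.strip()]
    return f" {PAUSE} ".join(cleaned)


def add_laugh(text):
    """Append a [laugh] token to the end of the text."""
    return f"{text.rstrip()} {LAUGH}"


line = add_laugh(join_with_pauses(["So I told him", "that was my sandwich"]))
print(line)  # So I told him [uv_break] that was my sandwich [laugh]
```

Keeping the tokens in constants avoids typos in the bracketed names. How reliably the model honors inline tokens may depend on the installed ChatTTS version and on whether text refinement rewrites the input, so it is worth listening to the output.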
Prosody Control
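# Load the model once (assumes the ChatTTS package is installed;
# load() follows recent releases and may differ in older versions)
import ChatTTS

chat = ChatTTS.Chat()
chat.load(compile=False)

texts = ["PUT YOUR TEXT HERE"]

# Control speaking style with parameters
params_infer = ChatTTS.Chat.InferCodeParams(
    spk_emb=None,     # Speaker embedding (None = random)
    temperature=0.3,  # Lower = more stable, higher = more expressive
    top_P=0.7,        # Nucleus (top-p) sampling threshold
    top_K=20,         # Sample only from the 20 most likely tokens
)

# Refine prosody
params_refine = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_2][laugh_0][break_6]',  # oral filler + no laugh + long breaks
)

wavs = chat.infer(
    texts,
    params_infer_code=params_infer,
    params_refine_text=params_refine,
)
Speaker Consistency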
# Generate a random speaker
rand_spk = chat.sample_random_speaker()

# Use the same speaker for multiple utterances
params = ChatTTS.Chat.InferCodeParams(spk_emb=rand_spk)
wavs = chat.infer(
    ["First sentence.", "Second sentence.", "Third sentence."],
    params_infer_code=params,
)
# All 3 outputs sound like the same person
Performance
- Speed: ~5x real-time on GPU (generates 5 seconds of audio per second)
- Quality: 24 kHz output, natural prosody, MOS competitive with commercial TTS
- Languages: English and Chinese
- Model size: ~800MB
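To make the speed and sample-rate figures above concrete, here is a small back-of-the-envelope calculation. It treats "~5x real-time" and 24 kHz as given. The function names are hypothetical, and the result is only an estimate, since real throughput varies with the GPU and the text.

```python
# Back-of-the-envelope estimate; both constants come from the figures above.
SAMPLE_RATE_HZ = 24_000      # output sample rate
SPEEDUP_VS_REALTIME = 5.0    # approximate GPU speed-up (assumed constant here)


def estimated_generation_seconds(audio_seconds, speedup=SPEEDUP_VS_REALTIME):
    """Approximate wall-clock time needed to synthesize audio_seconds of speech."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


def sample_count(audio_seconds, sample_rate=SAMPLE_RATE_HZ):
    """Number of audio samples in a clip of the given length."""
    return round(audio_seconds * sample_rate)


print(estimated_generation_seconds(60))  # 12.0
print(sample_count(60))                  # 1440000
```

Under these assumptions, one minute of speech takes roughly 12 seconds to generate and contains 1.44 million samples.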
Special Tokens
[laugh] - Insert laughter
[uv_break] - Insert a pause
[oral_0-9] - Filler word frequency (0=none, 9=very frequent)
[laugh_0-2] - Laughter frequency
[break_0-7] - Pause frequency and duration
FAQ
Q: What is ChatTTS? A: ChatTTS is an open-source text-to-speech (TTS) model with 39,000+ GitHub stars, optimized for natural conversational speech with laughter, pauses, and emotion. It was trained on 100K+ hours of dialogue data.
Q: How is ChatTTS different from Coqui TTS or Bark? A: ChatTTS is specifically optimized for dialogue: it excels at conversational prosody, laughter, and natural filler words. Coqui TTS is a general-purpose TTS toolkit. Bark generates creative audio but is slower. For chatbot and assistant speech, ChatTTS is the best fit of the three.
Q: Is ChatTTS free? A: Yes, for non-commercial use. ChatTTS is open-source under AGPL-3.0. Commercial use requires either complying with the AGPL terms or obtaining a commercial license.