# Dia — Realistic Dialogue Text-to-Speech Model

> Dia is a 1.6B parameter TTS model by Nari Labs that generates realistic dialogue audio from transcripts. 19.2K+ GitHub stars, Apache 2.0 licensed. Supports multi-speaker dialogue, non-verbal sounds, and voice cloning.

## Install

```bash
pip install git+https://github.com/nari-labs/dia.git
```

## Quick Use

Save as a script file and run:

```python
from dia.model import Dia

model = Dia.from_pretrained('nari-labs/Dia-1.6B')
text = '[S1] Hey, have you tried Dia yet? [S2] (laughs) Yeah, it sounds incredibly natural!'
output = model.generate(text)
model.save_audio('dialogue.wav', output)
print('Saved dialogue.wav')
```

Requires a GPU with PyTorch 2.0+ and CUDA 12.6. On an RTX 4090: 2.1x real-time, ~4.4GB VRAM.

---

## Intro

Dia is a 1.6 billion parameter text-to-speech model by Nari Labs that generates highly realistic dialogue audio directly from transcripts in a single pass. With 19,200+ GitHub stars and an Apache 2.0 license, Dia supports multi-speaker dialogue using `[S1]` and `[S2]` speaker tags, non-verbal sound generation (laughter, coughing, throat-clearing), and voice cloning through audio conditioning for emotion and tone control. It achieves 2.1x real-time speed on an RTX 4090 with just 4.4GB of VRAM.
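The transcript format above is plain text with inline tags, so it is easy to build programmatically. Below is a small helper sketch (hypothetical, not part of the Dia API) that turns a list of alternating utterances into a Dia-style transcript, assigning `[S1]`/`[S2]` tags and leaving non-verbal cues like `(laughs)` embedded in the text:

```python
def to_dia_transcript(turns):
    """Build a Dia-style transcript from a list of utterance strings.

    Dia expects [S1]/[S2] speaker tags; this helper alternates them,
    so turns[0] is speaker 1, turns[1] is speaker 2, and so on.
    Non-verbal cues such as (laughs) or (coughs) stay inline in the text.
    """
    parts = []
    for i, utterance in enumerate(turns):
        tag = "[S1]" if i % 2 == 0 else "[S2]"
        parts.append(f"{tag} {utterance}")
    return " ".join(parts)

transcript = to_dia_transcript([
    "Hey, have you tried Dia yet?",
    "(laughs) Yeah, it sounds incredibly natural!",
])
print(transcript)
# [S1] Hey, have you tried Dia yet? [S2] (laughs) Yeah, it sounds incredibly natural!
```

The resulting string can be passed directly to `model.generate(...)` as in the Quick Use snippet.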
**Best for**: Developers building conversational AI, podcast generation, audiobook creation, or voice interfaces

**Works with**: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf

**Requirements**: GPU with PyTorch 2.0+, CUDA 12.6, English language only

---

## Key Features

- **Multi-speaker dialogue**: Use `[S1]` and `[S2]` tags to generate natural conversations
- **Non-verbal sounds**: Laughter, coughing, sighing, and throat-clearing are built in
- **Voice cloning**: Condition on reference audio to match emotion and tone
- **Single-pass generation**: No multi-step pipeline; audio is generated directly from text
- **Fast inference**: 2.1x real-time on an RTX 4090, 4.4GB VRAM (bfloat16 with compilation)
- **1.6B parameters**: Large enough for quality, small enough to run locally

---

## FAQ

**Q: What is Dia?**
A: Dia is a 1.6B parameter text-to-speech model with 19.2K+ GitHub stars that generates realistic multi-speaker dialogue audio from transcripts. It supports non-verbal sounds and voice cloning. Apache 2.0 licensed by Nari Labs.

**Q: How do I install Dia?**
A: Run `pip install git+https://github.com/nari-labs/dia.git`. Requires a GPU with PyTorch 2.0+ and CUDA 12.6.

**Q: What languages does Dia support?**
A: Currently English only. The model generates dialogue audio with natural prosody, pauses, and non-verbal sounds.

---

## Source & Thanks

> Created by [Nari Labs](https://github.com/nari-labs). Licensed under Apache 2.0.
> [nari-labs/dia](https://github.com/nari-labs/dia) — 19,200+ GitHub stars

---

Source: https://tokrepo.com/en/workflows/86148916-edf9-4ed9-8348-205c9b535810
Author: Script Depot