Quick Use
- pip install livekit-plugins-<provider> for each provider you need
- Construct VoicePipelineAgent(stt=..., llm=..., tts=...); swap the class to swap the provider
- For fallback, wrap the providers in FallbackAdapter([primary, backup, emergency])
Intro
LiveKit's plugin architecture decouples your voice agent from any specific STT, LLM, or TTS provider. Every plugin implements the same interface, so swapping Deepgram → AssemblyAI → Groq Whisper is a one-line change. You can run A/B tests across providers in parallel, build fallback chains, and route specific rooms to specific stacks.
Best for: optimizing voice agent quality and cost, avoiding vendor lock-in, regulated multi-provider deployments.
Works with: every official LiveKit plugin (openai, anthropic, deepgram, assemblyai, cartesia, elevenlabs, silero, groq) plus community plugins.
Setup time: 5 minutes.
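To make the "same interface" idea concrete, here is a minimal sketch in plain Python. It does not use LiveKit's actual base classes: SpeechToText, FakeDeepgramSTT, FakeAssemblyAISTT, and Agent are hypothetical stand-ins, and the stub providers only format a string instead of calling a vendor API.

```python
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    """Hypothetical stand-in for a shared STT interface."""

    def transcribe(self, audio: bytes) -> str: ...


@dataclass
class FakeDeepgramSTT:
    """Stub provider; a real plugin would call the vendor API here."""
    model: str = "nova-3"

    def transcribe(self, audio: bytes) -> str:
        return f"[deepgram:{self.model}] {len(audio)} bytes"


@dataclass
class FakeAssemblyAISTT:
    """Second stub provider exposing the same method signature."""
    model: str = "universal-2"

    def transcribe(self, audio: bytes) -> str:
        return f"[assemblyai:{self.model}] {len(audio)} bytes"


@dataclass
class Agent:
    """Depends only on the interface, never on a concrete provider."""
    stt: SpeechToText

    def hear(self, audio: bytes) -> str:
        return self.stt.transcribe(audio)


agent = Agent(stt=FakeDeepgramSTT())
first = agent.hear(b"\x00" * 320)

# The one-line swap: the agent code itself does not change.
agent.stt = FakeAssemblyAISTT()
second = agent.hear(b"\x00" * 320)
```

Because Agent only calls transcribe(), any object with that method can fill the slot, which is the property that makes provider swaps and A/B tests cheap.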
Swap providers in one line
from livekit import agents
from livekit.plugins import anthropic, assemblyai, cartesia, deepgram, elevenlabs, openai

# Default stack
assistant = agents.VoicePipelineAgent(
    stt=deepgram.STT(model="nova-3"),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=cartesia.TTS(voice="alloy"),  # placeholder; pass a voice ID from your Cartesia account
)

# Swap STT to AssemblyAI
assistant.stt = assemblyai.STT(model="universal-2")
# Swap LLM to Anthropic
assistant.llm = anthropic.LLM(model="claude-3-5-sonnet-20241022")
# Swap TTS to ElevenLabs
assistant.tts = elevenlabs.TTS(voice="Adam", model="eleven_turbo_v2_5")

Fallback chain
from livekit.agents.tts import FallbackAdapter
from livekit.plugins import cartesia, elevenlabs, openai

primary_tts = cartesia.TTS(voice="alloy")
backup_tts = elevenlabs.TTS(voice="Adam")
emergency = openai.TTS(voice="alloy")
# Check your installed version: recent livekit-agents releases name this keyword attempt_timeout.
tts = FallbackAdapter([primary_tts, backup_tts, emergency], timeout=2.0)
# On primary timeout or error, falls back to backup; on backup error, to emergency.

A/B test in production
import random

from livekit import agents
from livekit.agents import JobContext
from livekit.plugins import anthropic, openai

async def entrypoint(ctx: JobContext):
    variant = "a" if random.random() < 0.5 else "b"
    llm = (openai.LLM(model="gpt-4o-mini") if variant == "a"
           else anthropic.LLM(model="claude-3-5-haiku-20241022"))
    assistant = agents.VoicePipelineAgent(stt=..., llm=llm, tts=...)
    assistant.start(ctx.room)
    # Log variant for offline analysis
    ctx.log.info("agent_started", variant=variant, room=ctx.room.name)

Provider strengths cheat sheet
| Slot | Best for | Provider |
|---|---|---|
| STT cheap + fast | English call centers | Deepgram Nova-3 |
| STT multilingual | Global voice apps | AssemblyAI Universal-2 |
| LLM cheap | Routing, short replies | gpt-4o-mini |
| LLM smart | Tool use, complex agents | claude-3-5-sonnet |
| TTS lowest latency | Sub-second targets | Cartesia Sonic |
| TTS most natural | Long monologues, accents | ElevenLabs Turbo v2.5 |
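One way to act on the table is to encode a few rows as named presets and pick one per room. The sketch below is an illustration only: Stack, PRESETS, ROUTES, stack_for_room, the preset names, and the room-name prefixes are all hypothetical, and the strings are the display names from the table rather than plugin constructor arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stack:
    """Provider choice for each slot, using display names from the table."""
    stt: str
    llm: str
    tts: str


# Hypothetical presets assembled from the cheat sheet rows.
PRESETS = {
    "fast_cheap": Stack(
        stt="Deepgram Nova-3", llm="gpt-4o-mini", tts="Cartesia Sonic"
    ),
    "quality": Stack(
        stt="AssemblyAI Universal-2",
        llm="claude-3-5-sonnet",
        tts="ElevenLabs Turbo v2.5",
    ),
}

# Hypothetical room-name prefixes mapped to preset names.
ROUTES = {"support-": "fast_cheap", "vip-": "quality"}


def stack_for_room(room_name: str, default: str = "fast_cheap") -> Stack:
    """Return the preset for the first matching prefix, else the default."""
    for prefix, preset in ROUTES.items():
        if room_name.startswith(prefix):
            return PRESETS[preset]
    return PRESETS[default]
```

In an agent entrypoint, the returned Stack would then be translated into the matching plugin constructors; keeping the routing table as data means a stack can be changed without touching agent code.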
FAQ
Q: Does swapping mid-call work?
A: Yes — you can reassign .stt, .llm, .tts after start. Existing audio in flight finishes on the old provider; new utterances route to the new one. Useful for routing high-value callers to a smarter LLM.
Q: How do I write a custom plugin?
A: Subclass livekit.agents.stt.STT, llm.LLM, or tts.TTS and implement the streaming methods. Most community plugins are <300 lines. Looking at livekit-plugins-deepgram is the fastest way to learn the interface.
Q: What about latency when fallback fires?
A: FallbackAdapter waits on the primary for up to timeout seconds. With timeout=2.0, a failed primary adds up to 2 s before the backup takes over. For tighter SLOs, use timeout=0.5: more healthy-but-slow requests get treated as failures (false positives), but tail latency drops.
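The latency cost of a fallback can be illustrated with plain asyncio. This is a sketch of the general timeout-then-fallback pattern, not LiveKit's FallbackAdapter implementation: slow_primary, healthy_backup, synthesize_with_fallback, and demo are hypothetical names, and the sleeps stand in for network calls.

```python
import asyncio
import time


async def slow_primary(text: str) -> str:
    await asyncio.sleep(1.0)  # simulates a hung provider
    return f"primary:{text}"


async def healthy_backup(text: str) -> str:
    await asyncio.sleep(0.01)  # simulates a fast, healthy provider
    return f"backup:{text}"


async def synthesize_with_fallback(text, providers, timeout):
    """Try each provider in order, giving each at most `timeout` seconds."""
    last_error = None
    for provider in providers:
        try:
            return await asyncio.wait_for(provider(text), timeout=timeout)
        except Exception as exc:  # includes asyncio.TimeoutError
            last_error = exc
    raise RuntimeError("all providers failed") from last_error


async def demo(timeout: float):
    start = time.perf_counter()
    result = await synthesize_with_fallback(
        "hi", [slow_primary, healthy_backup], timeout
    )
    return result, time.perf_counter() - start


# The primary needs 1.0 s, so a 0.2 s timeout abandons it and uses the backup.
result, elapsed = asyncio.run(demo(timeout=0.2))
```

The elapsed time is roughly the timeout plus the backup's own latency, which is why a smaller timeout tightens tail latency. The flip side is that a primary that would have answered in, say, 0.3 s is also abandoned.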
Source & Thanks
Built by LiveKit. Licensed under Apache-2.0.
livekit/agents — ⭐ 4,500+