# LiveKit Plugin Architecture: Swap STT/LLM/TTS Providers

> The LiveKit Agents plugin system lets you swap any STT, LLM, or TTS provider with a one-line change. It supports mid-call switching, fallback chains, and per-room routing.

## Install

Save the content below to `.claude/skills/` or append it to your `CLAUDE.md`:

## Quick Use

1. `pip install livekit-plugins-` for each provider you need
2. Construct `VoicePipelineAgent(stt=..., llm=..., tts=...)`; swap the class to swap the provider
3. For fallback, wrap the providers in `FallbackAdapter([primary, backup, emergency])`

---

## Intro

LiveKit's plugin architecture decouples your voice agent from any specific STT, LLM, or TTS provider. Every plugin implements the same interface, so swapping Deepgram → AssemblyAI → Groq Whisper is a one-line change. You can run A/B tests across providers in parallel, build fallback chains, and route specific rooms to specific stacks.

Best for: optimizing voice agent quality and cost, avoiding vendor lock-in, and regulated multi-provider deployments.

Works with: every official LiveKit plugin (openai, anthropic, deepgram, assemblyai, cartesia, elevenlabs, silero, groq) plus community plugins.

Setup time: 5 minutes.

---

### Swap providers in one line

```python
from livekit import agents
from livekit.plugins import anthropic, assemblyai, cartesia, deepgram, elevenlabs, openai

# Default stack
assistant = agents.VoicePipelineAgent(
    stt=deepgram.STT(model="nova-3"),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=cartesia.TTS(voice="alloy"),
)

# Swap STT to AssemblyAI
assistant.stt = assemblyai.STT(model="universal-2")

# Swap LLM to Anthropic
assistant.llm = anthropic.LLM(model="claude-3-5-sonnet-20241022")

# Swap TTS to ElevenLabs
assistant.tts = elevenlabs.TTS(voice="Adam", model="eleven_turbo_v2_5")
```

### Fallback chain

```python
from livekit.agents.tts import FallbackAdapter
from livekit.plugins import cartesia, elevenlabs, openai

primary_tts = cartesia.TTS(voice="alloy")
backup_tts = elevenlabs.TTS(voice="Adam")
emergency = openai.TTS(voice="alloy")

tts = FallbackAdapter([primary_tts, backup_tts, emergency], timeout=2.0)
# On primary timeout or error, falls back to backup; on backup error, falls back to emergency.
```

### A/B test in production

```python
import random

from livekit import agents
from livekit.agents import JobContext
from livekit.plugins import anthropic, openai


async def entrypoint(ctx: JobContext):
    # Assign each session to one of two LLM variants with equal probability
    variant = "a" if random.random() < 0.5 else "b"
    llm = (openai.LLM(model="gpt-4o-mini") if variant == "a"
           else anthropic.LLM(model="claude-3-5-haiku-20241022"))

    assistant = agents.VoicePipelineAgent(stt=..., llm=llm, tts=...)
    assistant.start(ctx.room)

    # Log variant for offline analysis
    ctx.log.info("agent_started", variant=variant, room=ctx.room.name)
```

### Provider strengths cheat sheet

| Slot | Best for | Provider |
|---|---|---|
| STT cheap + fast | English call centers | Deepgram Nova-3 |
| STT multilingual | Global voice apps | AssemblyAI Universal-2 |
| LLM cheap | Routing, short replies | gpt-4o-mini |
| LLM smart | Tool use, complex agents | claude-3-5-sonnet |
| TTS lowest latency | Sub-second targets | Cartesia Sonic |
| TTS most natural | Long monologues, accents | ElevenLabs Turbo v2.5 |

---

### FAQ

**Q: Does swapping mid-call work?**

A: Yes. You can reassign `.stt`, `.llm`, and `.tts` after start. Audio already in flight finishes on the old provider; new utterances route to the new one. This is useful for routing high-value callers to a smarter LLM.

**Q: How do I write a custom plugin?**

A: Subclass `livekit.agents.stt.STT`, `llm.LLM`, or `tts.TTS` and implement the streaming methods. Most community plugins are under 300 lines. Reading `livekit-plugins-deepgram` is the fastest way to learn the interface.

**Q: What about latency when fallback fires?**

A: FallbackAdapter waits up to `timeout` seconds for the primary. With `timeout=2.0`, a failed primary adds up to 2 s before the backup takes over. For tighter SLOs, use `timeout=0.5`: more slow-but-healthy responses get treated as failures (false positives), but tail latency drops.

---

## Source & Thanks

> Built by [LiveKit](https://github.com/livekit). Licensed under Apache-2.0.
>
> [livekit/agents](https://github.com/livekit/agents) — ⭐ 4,500+

---

Source: https://tokrepo.com/en/workflows/livekit-plugin-architecture-swap-stt-llm-tts-providers

Author: LiveKit
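
### Appendix: fallback timing sketch

The fallback FAQ above says a failed primary can add up to `timeout` seconds before the backup takes over. The sketch below illustrates that behavior in plain `asyncio`, with no LiveKit dependency. It is not how `FallbackAdapter` is implemented; `synthesize_with_fallback` and the three stub providers are hypothetical names introduced only for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence, Tuple

# Hypothetical provider type: an async callable that turns text into audio bytes.
Provider = Callable[[str], Awaitable[bytes]]


async def synthesize_with_fallback(
    providers: Sequence[Provider], text: str, timeout: float
) -> Tuple[int, bytes]:
    """Try providers in order; return (index, audio) from the first that succeeds."""
    last_error: Optional[Exception] = None
    for index, provider in enumerate(providers):
        try:
            # Each provider gets at most `timeout` seconds before we move on.
            audio = await asyncio.wait_for(provider(text), timeout)
            return index, audio
        except Exception as exc:  # covers timeouts and provider errors alike
            last_error = exc
    raise RuntimeError("all providers failed") from last_error


# Stub providers (assumptions for the demo, not real plugins).
async def slow_primary(text: str) -> bytes:
    await asyncio.sleep(1.0)  # simulates a hung provider
    return b"primary:" + text.encode()


async def failing_backup(text: str) -> bytes:
    raise ConnectionError("backup unavailable")


async def emergency(text: str) -> bytes:
    return b"emergency:" + text.encode()


if __name__ == "__main__":
    chain = [slow_primary, failing_backup, emergency]
    print(asyncio.run(synthesize_with_fallback(chain, "hello", timeout=0.05)))
```

In this sketch the worst-case added latency is the sum of the timeouts of every provider that hangs before one succeeds, which is why a shorter `timeout` tightens tail latency at the cost of abandoning providers that were merely slow.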