Skills · May 11, 2026 · 4 min read

LiveKit Plugin Architecture — Swap STT/LLM/TTS Providers

The LiveKit Agents plugin system lets you swap any STT/LLM/TTS provider with a one-line change: mid-call switches, fallback chains, per-room routing.

LiveKit · Community
Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, a per-adapter plan, and the raw content to help agents judge fit, risk, and next actions.

Needs Confirmation · 66/100 · Policy: confirm
Agent surface: any MCP/CLI agent
Type: Skill
Installation: Single
Trust: New
Entry point: Asset
Universal CLI command: npx tokrepo install dc087a87-ed99-4509-81c6-c84d16295672
Introduction
Introduction

LiveKit's plugin architecture decouples your voice agent from any specific STT, LLM, or TTS provider. Every plugin implements the same interface, so swapping Deepgram → AssemblyAI → Groq Whisper is a one-line change. Run A/B tests across providers in parallel, build fallback chains, route specific rooms to specific stacks. Best for: optimizing voice agent quality and cost, avoiding vendor lock-in, regulated multi-provider deployments. Works with: every official LiveKit plugin (openai, anthropic, deepgram, assemblyai, cartesia, elevenlabs, silero, groq) plus community plugins. Setup time: 5 minutes.


Swap providers in one line

from livekit import agents
from livekit.plugins import (anthropic, assemblyai, cartesia, deepgram,
                             elevenlabs, openai)

# Default stack
assistant = agents.VoicePipelineAgent(
    stt=deepgram.STT(model="nova-3"),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=cartesia.TTS(voice="alloy"),
)

# Swap STT to AssemblyAI
assistant.stt = assemblyai.STT(model="universal-2")

# Swap LLM to Anthropic
assistant.llm = anthropic.LLM(model="claude-3-5-sonnet-20241022")

# Swap TTS to ElevenLabs
assistant.tts = elevenlabs.TTS(voice="Adam", model="eleven_turbo_v2_5")

Fallback chain

from livekit.agents.tts import FallbackAdapter
from livekit.plugins import cartesia, elevenlabs, openai

primary_tts = cartesia.TTS(voice="alloy")
backup_tts  = elevenlabs.TTS(voice="Adam")
emergency   = openai.TTS(voice="alloy")

tts = FallbackAdapter([primary_tts, backup_tts, emergency], timeout=2.0)
# On primary timeout/error, drops to backup; on backup error, emergency.
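Under the hood, an adapter like this has to race the active provider against a timer. Below is a framework-agnostic sketch of that logic, with plain callables standing in for TTS clients; it is an illustration of the pattern, not FallbackAdapter's real implementation.

```python
import concurrent.futures


def synthesize_with_fallback(providers, text, timeout=2.0):
    """Try each provider in order; a slow or failing one falls through."""
    pool = concurrent.futures.ThreadPoolExecutor()
    try:
        for provider in providers:
            future = pool.submit(provider, text)
            try:
                # Give the provider `timeout` seconds before moving on.
                return future.result(timeout=timeout)
            except Exception:
                continue  # timed out or raised: next link in the chain
        raise RuntimeError("all providers failed")
    finally:
        pool.shutdown(wait=False)
```

The key design point is that callers never learn which link answered; the chain presents the same synthesize-style interface as a single provider, which is what lets it slot into the `tts=` parameter unchanged.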

A/B test in production

import logging
import random

from livekit import agents
from livekit.agents import JobContext
from livekit.plugins import anthropic, openai

async def entrypoint(ctx: JobContext):
    variant = "a" if random.random() < 0.5 else "b"
    llm = (openai.LLM(model="gpt-4o-mini") if variant == "a"
           else anthropic.LLM(model="claude-3-5-haiku-20241022"))

    assistant = agents.VoicePipelineAgent(stt=..., llm=llm, tts=...)
    assistant.start(ctx.room)

    # Log variant for offline analysis
    logging.info("agent_started variant=%s room=%s", variant, ctx.room.name)

Provider strengths cheat sheet

Slot                 Best for                   Provider
STT, cheap + fast    English call centers       Deepgram Nova-3
STT, multilingual    Global voice apps          AssemblyAI Universal-2
LLM, cheap           Routing, short replies     gpt-4o-mini
LLM, smart           Tool use, complex agents   claude-3-5-sonnet
TTS, lowest latency  Sub-second targets         Cartesia Sonic
TTS, most natural    Long monologues, accents   ElevenLabs Turbo v2.5
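The intro mentions routing specific rooms to specific stacks; combined with the cheat sheet, that usually means mapping a room-name convention to a provider set. A minimal sketch of that idea, where the "support-"/"sales-" prefixes and the stack labels are invented for illustration (swap in real plugin constructors at the call site):

```python
# Hypothetical per-room routing table: room-name prefix -> provider stack.
# Labels are placeholders, not real constructor arguments.
ROUTES = {
    "support-": {"stt": "deepgram/nova-3", "llm": "gpt-4o-mini",
                 "tts": "cartesia/sonic"},
    "sales-":   {"stt": "assemblyai/universal-2",
                 "llm": "claude-3-5-sonnet", "tts": "elevenlabs/turbo-v2.5"},
}
DEFAULT = {"stt": "deepgram/nova-3", "llm": "gpt-4o-mini",
           "tts": "cartesia/sonic"}


def pick_stack(room_name: str) -> dict:
    """Return the provider stack for a room, falling back to DEFAULT."""
    for prefix, stack in ROUTES.items():
        if room_name.startswith(prefix):
            return stack
    return DEFAULT
```

In an entrypoint you would call something like `pick_stack(ctx.room.name)` and construct the pipeline from the chosen stack; because every plugin shares the same interface, the routing table stays provider-agnostic.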

FAQ

Q: Does swapping mid-call work? A: Yes — you can reassign .stt, .llm, .tts after start. Existing audio in flight finishes on the old provider; new utterances route to the new one. Useful for routing high-value callers to a smarter LLM.

Q: How do I write a custom plugin? A: Subclass livekit.agents.stt.STT, llm.LLM, or tts.TTS and implement the streaming methods. Most community plugins are <300 lines. Looking at livekit-plugins-deepgram is the fastest way to learn the interface.
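To see why subclassing one base class is enough, it helps to look at the shared-interface idea in isolation. The sketch below uses an invented `Synthesizer` protocol and toy providers; LiveKit's real base classes (e.g. livekit.agents.tts.TTS) have different, streaming-oriented signatures, so treat this as the pattern, not the API.

```python
# Illustration of the plugin pattern: callers depend on one interface,
# so any conforming provider is interchangeable. Names are invented.
from typing import Protocol


class Synthesizer(Protocol):
    def synthesize(self, text: str) -> bytes: ...


class ProviderA:
    def synthesize(self, text: str) -> bytes:
        return b"A:" + text.encode()


class ProviderB:
    def synthesize(self, text: str) -> bytes:
        return b"B:" + text.encode()


def speak(tts: Synthesizer, text: str) -> bytes:
    # The pipeline only ever calls the interface, so swapping providers
    # is a one-line change where the object is constructed.
    return tts.synthesize(text)
```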

Q: What about latency when fallback fires? A: FallbackAdapter gives the primary up to timeout seconds. With timeout=2.0, a failed primary adds up to 2 s before the backup kicks in. For tighter SLOs, use timeout=0.5 — false positives go up, but tail latency drops.
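Those waits compound along the chain: every provider ahead of link i can burn its full timeout before link i even sees the request. A quick helper to budget for the worst case (the figures below come from the examples above):

```python
# Worst-case delay before each fallback link receives the request,
# assuming every earlier provider exhausts its full timeout budget.
def worst_case_delays(n_providers: int, timeout: float) -> list[float]:
    """Seconds of added latency before provider i is tried."""
    return [i * timeout for i in range(n_providers)]
```

With three providers and timeout=2.0, the emergency TTS can start as much as 4 s late; at timeout=0.5 that ceiling drops to 1 s.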


Quick Use

  1. pip install livekit-plugins-<provider> for each provider you need
  2. Construct VoicePipelineAgent(stt=..., llm=..., tts=...); swap the class to swap the provider
  3. For fallback, wrap providers in FallbackAdapter([primary, backup, emergency])



Source & Thanks

Built by LiveKit. Licensed under Apache-2.0.

livekit/agents — ⭐ 4,500+

