Skills · May 11, 2026 · 4 min read

LiveKit Plugin Architecture — Swap STT/LLM/TTS Providers

The LiveKit Agents plugin system lets you swap any STT, LLM, or TTS provider with a one-line change. It supports mid-call switching, fallback chains, and per-room routing.

Agent ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, install contract, metadata JSON, adapter-aware plan, and raw content links so agents can judge fit, risk, and next actions.

Needs Confirmation · 66/100 · Policy: confirm
Agent surface: Any MCP/CLI agent
Kind: Skill
Install: Single
Trust: New
Entrypoint: Asset
Universal CLI install command
npx tokrepo install dc087a87-ed99-4509-81c6-c84d16295672
Intro

LiveKit's plugin architecture decouples your voice agent from any specific STT, LLM, or TTS provider. Every plugin implements the same interface, so swapping Deepgram → AssemblyAI → Groq Whisper is a one-line change. Run A/B tests across providers in parallel, build fallback chains, and route specific rooms to specific stacks.

Best for: optimizing voice agent quality and cost, avoiding vendor lock-in, regulated multi-provider deployments.
Works with: every official LiveKit plugin (openai, anthropic, deepgram, assemblyai, cartesia, elevenlabs, silero, groq) plus community plugins.
Setup time: 5 minutes.


Swap providers in one line

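# Import paths as in livekit-agents 0.x, where VoicePipelineAgent
# lives in livekit.agents.pipeline.
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import (
    anthropic, assemblyai, cartesia, deepgram, elevenlabs, openai, silero,
)

# Default stack
assistant = VoicePipelineAgent(
    vad=silero.VAD.load(),  # the pipeline also needs a voice activity detector
    stt=deepgram.STT(model="nova-3"),
    llm=openai.LLM(model="gpt-4o-mini"),
    tts=cartesia.TTS(),  # plugin default voice; "alloy" is an OpenAI voice name
)

# Swap STT to AssemblyAI
assistant.stt = assemblyai.STT(model="universal-2")

# Swap LLM to Anthropic
assistant.llm = anthropic.LLM(model="claude-3-5-sonnet-20241022")

# Swap TTS to ElevenLabs
assistant.tts = elevenlabs.TTS(voice="Adam", model="eleven_turbo_v2_5")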
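
Per-room routing, mentioned in the intro, could be sketched as a lookup from room name to provider stack. The sketch below is plain Python with no LiveKit imports. The `StackConfig` type, the `ROUTES` table, the room-name prefixes, and `pick_stack` are all hypothetical names chosen for illustration; the provider and model strings simply reuse the ones on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackConfig:
    """Hypothetical description of one STT/LLM/TTS combination."""
    stt: str
    llm: str
    tts: str


# Hypothetical routing table: room-name prefix -> stack.
ROUTES = {
    "support-": StackConfig("deepgram:nova-3", "openai:gpt-4o-mini", "cartesia"),
    "sales-": StackConfig(
        "assemblyai:universal-2",
        "anthropic:claude-3-5-sonnet-20241022",
        "elevenlabs",
    ),
}
DEFAULT_STACK = StackConfig("deepgram:nova-3", "openai:gpt-4o-mini", "cartesia")


def pick_stack(room_name: str) -> StackConfig:
    """Return the stack for the first matching prefix, else the default."""
    for prefix, stack in ROUTES.items():
        if room_name.startswith(prefix):
            return stack
    return DEFAULT_STACK
```

Inside an entrypoint, `pick_stack(ctx.room.name)` would select the config, and a small factory of your own would turn each string into the matching plugin object before the agent is constructed.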

Fallback chain

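from livekit.agents.tts import FallbackAdapter
from livekit.plugins import cartesia, elevenlabs, openai

primary_tts = cartesia.TTS()  # plugin default voice; "alloy" is an OpenAI voice name
backup_tts  = elevenlabs.TTS(voice="Adam")
emergency   = openai.TTS(voice="alloy")

# attempt_timeout is the time budget for each attempt, in seconds
tts = FallbackAdapter([primary_tts, backup_tts, emergency], attempt_timeout=2.0)
# On primary timeout/error, drops to backup; on backup error, emergency.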
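
To make the fallback behaviour concrete, here is a minimal, library-free sketch of the pattern: try each provider in order, give each attempt a time budget, and move on when an attempt times out or raises. It illustrates the general technique and is not LiveKit's FallbackAdapter implementation. `synthesize_with_fallback`, the three fake providers, and `demo` are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

# A "provider" here is any async callable that turns text into audio bytes.
Provider = Callable[[str], Awaitable[bytes]]


async def synthesize_with_fallback(
    providers: Sequence[Provider], text: str, attempt_timeout: float
) -> bytes:
    """Try each provider in order; skip any that time out or raise."""
    last_error: Optional[Exception] = None
    for provider in providers:
        try:
            return await asyncio.wait_for(provider(text), attempt_timeout)
        except Exception as exc:  # a timeout is also an Exception subclass
            last_error = exc
    raise RuntimeError("all providers failed") from last_error


async def slow_primary(text: str) -> bytes:
    await asyncio.sleep(1.0)  # simulates a provider that hangs
    return b"primary:" + text.encode()


async def failing_backup(text: str) -> bytes:
    raise ConnectionError("backup unavailable")


async def emergency(text: str) -> bytes:
    return b"emergency:" + text.encode()


async def demo() -> bytes:
    chain = [slow_primary, failing_backup, emergency]
    return await synthesize_with_fallback(chain, "hi", attempt_timeout=0.05)
```

Running `asyncio.run(demo())` returns `b"emergency:hi"`: the primary exceeds its 0.05 s budget, the backup raises, and the last provider answers.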

A/B test in production

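import logging
import random

from livekit.agents import JobContext
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import anthropic, openai

logger = logging.getLogger("ab-test")

async def entrypoint(ctx: JobContext):
    await ctx.connect()  # join the room before starting the agent

    # Assign each job to a variant with equal probability
    variant = "a" if random.random() < 0.5 else "b"
    llm = (openai.LLM(model="gpt-4o-mini") if variant == "a"
           else anthropic.LLM(model="claude-3-5-haiku-20241022"))

    assistant = VoicePipelineAgent(vad=..., stt=..., llm=llm, tts=...)
    assistant.start(ctx.room)

    # Log variant for offline analysis
    logger.info("agent_started", extra={"variant": variant, "room": ctx.room.name})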
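
Because `random.random()` picks a fresh variant for every job, a room that reconnects may land in a different arm. One common alternative, sketched here as an assumption rather than anything LiveKit provides, is to hash a stable key such as the room name so that assignment is deterministic. `assign_variant`, `salt`, and `share_a` are hypothetical names.

```python
import hashlib


def assign_variant(room_name: str, salt: str = "llm-ab-v1", share_a: float = 0.5) -> str:
    """Deterministically map a room name to variant "a" or "b"."""
    digest = hashlib.sha256(f"{salt}:{room_name}".encode("utf-8")).digest()
    # Interpret the first 4 bytes as an integer and scale it to [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "a" if bucket < share_a else "b"
```

Changing `salt` reshuffles every room into a new bucket, which is useful when starting a fresh experiment; `share_a` sets the fraction of rooms routed to variant "a".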

Provider strengths cheat sheet

| Slot               | Best for                 | Provider               |
|--------------------|--------------------------|------------------------|
| STT cheap + fast   | English call centers     | Deepgram Nova-3        |
| STT multilingual   | Global voice apps        | AssemblyAI Universal-2 |
| LLM cheap          | Routing, short replies   | gpt-4o-mini            |
| LLM smart          | Tool use, complex agents | claude-3-5-sonnet      |
| TTS lowest latency | Sub-second targets       | Cartesia Sonic         |
| TTS most natural   | Long monologues, accents | ElevenLabs Turbo v2.5  |

FAQ

Q: Does swapping mid-call work? A: Yes — you can reassign .stt, .llm, .tts after start. Existing audio in flight finishes on the old provider; new utterances route to the new one. Useful for routing high-value callers to a smarter LLM.

Q: How do I write a custom plugin? A: Subclass livekit.agents.stt.STT, llm.LLM, or tts.TTS and implement the streaming methods. Most community plugins are <300 lines. Looking at livekit-plugins-deepgram is the fastest way to learn the interface.

Q: What about latency when fallback fires? A: FallbackAdapter gives each provider up to its configured timeout (the attempt_timeout argument of livekit.agents.tts.FallbackAdapter) before moving on. With a 2.0 s timeout, a failed primary adds up to 2 s before the backup kicks in. For tighter SLOs, use 0.5 s: false positives go up, but tail latency drops.
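
The latency cost in the last answer is simple arithmetic: in the worst case, each failed provider consumes its full timeout before the next one starts. A small sketch, assuming one attempt per provider and ignoring any retries the adapter may perform (`worst_case_delay` is a hypothetical helper):

```python
def worst_case_delay(chain_length: int, timeout_s: float, failed: int) -> float:
    """Seconds spent waiting on `failed` providers that each time out once."""
    if not 0 <= failed <= chain_length:
        raise ValueError("failed must be between 0 and chain_length")
    return failed * timeout_s


# Three-provider chain from this page:
print(worst_case_delay(3, 2.0, failed=1))  # 2.0 (primary down, backup answers)
print(worst_case_delay(3, 2.0, failed=2))  # 4.0 (only the emergency provider answers)
print(worst_case_delay(3, 0.5, failed=2))  # 1.0 (same outage with a 0.5 s timeout)
```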


Quick Use

  1. Run pip install livekit-plugins-<provider> for each provider you need.
  2. Construct VoicePipelineAgent(stt=..., llm=..., tts=...); swap the plugin class to swap the provider.
  3. For fallback, wrap the providers in FallbackAdapter([primary, backup, emergency]).

Source & Thanks

Built by LiveKit. Licensed under Apache-2.0.

livekit/agents — ⭐ 4,500+
