Scripts · Mar 31, 2026 · 2 min read

LiveKit Agents — Build Real-Time Voice AI Agents

Framework for building real-time voice AI agents. STT, LLM, TTS pipeline with sub-second latency. Supports OpenAI, Anthropic, Deepgram, ElevenLabs. 9.9K+ stars.

TokRepo Featured · Community
Quick Use

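pip install livekit-agents livekit-plugins-openai livekit-plugins-silero

from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli
from livekit.agents.voice import AgentSession, Agent
from livekit.plugins import openai, silero

async def entrypoint(ctx: JobContext):
    # Join the room and subscribe to audio tracks only.
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
    session = AgentSession(
        stt=openai.STT(),
        llm=openai.LLM(),
        tts=openai.TTS(),
        vad=silero.VAD.load(),
    )
    # start() expects the agent itself; the room is passed by keyword.
    # The instructions string is a placeholder prompt.
    agent = Agent(instructions="You are a helpful voice assistant.")
    await session.start(agent=agent, room=ctx.room)

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))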

Intro

LiveKit Agents is a framework for building real-time, multimodal voice AI agents. It provides a pipeline architecture for Speech-to-Text, LLM processing, and Text-to-Speech with sub-second end-to-end latency. It is built on LiveKit's WebRTC infrastructure for production-grade voice communication, and it supports the OpenAI Realtime API, Anthropic, Deepgram, ElevenLabs, and more. 9,900+ GitHub stars.

Best for: Developers building voice assistants, phone agents, video call AI, and conversational interfaces

Works with: OpenAI, Anthropic, Deepgram, ElevenLabs, Azure, Google, Cartesia, AssemblyAI


Key Features

Voice Pipeline

Modular STT → LLM → TTS pipeline with automatic voice activity detection (VAD):

  • STT: OpenAI Whisper, Deepgram, Google, Azure, AssemblyAI
  • LLM: OpenAI (including Realtime), Anthropic, Google, Ollama
  • TTS: OpenAI, ElevenLabs, Cartesia, Azure, Google
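
The modular design can be pictured with a small plain-Python sketch. It does not use LiveKit's plugin interfaces: the VoicePipeline class, the stage signatures, and the stub stages are all hypothetical stand-ins, assumed here only to show how a VAD gate and swappable STT, LLM, and TTS stages fit together.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass
class VoicePipeline:
    """Hypothetical pipeline: each stage is just a callable and can be swapped."""
    vad: Callable[[bytes], bool]   # does this chunk contain speech?
    stt: Callable[[bytes], str]    # audio -> transcript
    llm: Callable[[str], str]      # transcript -> reply text
    tts: Callable[[str], bytes]    # reply text -> audio

    def handle_frame(self, audio: bytes) -> Optional[bytes]:
        """Run one audio chunk through the stages; skip it if the VAD hears no speech."""
        if not self.vad(audio):
            return None
        return self.tts(self.llm(self.stt(audio)))


# Stub stages (assumptions for illustration): "audio" is just UTF-8 text here.
def energy_vad(audio: bytes) -> bool:
    return max(audio, default=0) > 10

def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")

def echo_llm(text: str) -> str:
    return f"You said: {text}"

def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(vad=energy_vad, stt=fake_stt, llm=echo_llm, tts=fake_tts)

# Swapping one stage leaves the other three untouched.
shouting = replace(pipeline, llm=lambda text: text.upper())
```

In the real framework the equivalent swap would be a different plugin object passed to the session, but the plugin constructors are not shown here.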

Sub-Second Latency

Optimized for real-time conversation, with streaming at every stage. Turn detection and interruption handling are built in.
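
To make "streaming at every stage" concrete, here is a minimal asyncio sketch. It assumes nothing about LiveKit's internals: llm_tokens, tts_chunks, and speak are hypothetical names, and the stages are stubs. The point is that the TTS stage consumes tokens as they arrive instead of waiting for the full reply, and that playback can stop the moment an interruption flag is set.

```python
import asyncio
from typing import AsyncIterator, List


async def llm_tokens(prompt: str) -> AsyncIterator[str]:
    """Stand-in LLM that yields its reply one word at a time."""
    for word in f"You asked about {prompt}".split():
        await asyncio.sleep(0)  # hand control back to the loop, as a network stream would
        yield word


async def tts_chunks(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stand-in TTS that emits one audio chunk per token as soon as it arrives."""
    async for token in tokens:
        yield token.encode("utf-8")


async def speak(prompt: str, interrupted: asyncio.Event) -> List[bytes]:
    """Play chunks as they are produced; stop as soon as the user interrupts."""
    played: List[bytes] = []
    async for chunk in tts_chunks(llm_tokens(prompt)):
        if interrupted.is_set():
            break
        played.append(chunk)
    return played


async def demo():
    flag = asyncio.Event()
    full = await speak("weather", flag)   # nobody interrupts
    flag.set()                            # simulate the user talking over the agent
    cut = await speak("weather", flag)    # playback stops before the first chunk
    return full, cut
```

Because each stage is an async generator, the first audio chunk is available after the first token rather than after the whole reply, which is where most of the latency saving in a streamed pipeline comes from.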

Multimodal

Beyond voice — supports video input, screen sharing, and data channels for rich agent interactions.

Production Infrastructure

Built on LiveKit's WebRTC platform — handles scaling, room management, recording, and telephony (SIP/PSTN).

Function Calling

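Agents can call tools mid-conversation. In the 1.x Python API, tools are declared with the function_tool decorator:

from livekit.agents import RunContext, function_tool

@function_tool
async def check_weather(context: RunContext, city: str) -> str:
    """Look up the current weather for a city."""
    return f"It's 72F and sunny in {city}"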
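
What happens between the model asking for a tool and the function running can be sketched in plain Python. This is not LiveKit's implementation: the TOOLS registry, the tool decorator, and the JSON call shape are hypothetical, assumed only to illustrate dispatching a model-emitted call by name.

```python
import asyncio
import json
from typing import Awaitable, Callable, Dict

# Hypothetical registry mapping tool names to coroutines.
TOOLS: Dict[str, Callable[..., Awaitable[str]]] = {}


def tool(fn):
    """Register a coroutine under its function name (illustrative helper)."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
async def check_weather(city: str) -> str:
    return f"It's 72F and sunny in {city}"


async def dispatch(tool_call_json: str) -> str:
    """Run the tool named in a model-emitted call and return its result as text."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        # Return the error as text so the model can recover in its next turn.
        return f"Unknown tool: {call['name']}"
    return await fn(**call["arguments"])


result = asyncio.run(
    dispatch('{"name": "check_weather", "arguments": {"city": "Paris"}}')
)
```

The result string would then be fed back to the LLM so it can phrase a spoken answer.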

FAQ

Q: What is LiveKit Agents?
A: A framework for building real-time voice AI agents with STT/LLM/TTS pipelines and sub-second latency. Built on LiveKit's WebRTC infrastructure. 9.9K+ stars.

Q: Can I build a phone agent with LiveKit Agents?
A: Yes, LiveKit supports SIP/PSTN telephony integration, so you can build agents that answer phone calls.


Source & Thanks

Created by LiveKit. Licensed under Apache 2.0. livekit/agents — 9,900+ GitHub stars
