Skills · March 30, 2026 · 1 min read

LiveKit Agents — Build Real-Time Voice AI Agents

Framework for building real-time voice AI agents. STT, LLM, TTS pipeline with sub-second latency. Supports OpenAI, Anthropic, Deepgram, ElevenLabs. 9.9K+ stars.

LiveKit · Community

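Agent ready

Directly installable by agents

This asset is installable. The agent first selects the current runtime, reviews the install plan, and then runs the matching command.

Native · 98/100 · Policy: allowed

Agent entry point

Any MCP/CLI agent

Type

Skill

Install

Single

Trust

Trust level: Community

Entry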

LiveKit Agents — Build Real-Time Voice AI Agents

Direct install command

npx -y tokrepo@latest install 804ee888-b285-4369-891e-15f424f587ed --target codex

Do a dry run first to confirm the install plan, then run this command.

TL;DR

LiveKit Agents connects STT, LLM, and TTS into a real-time voice pipeline over WebRTC for building voice AI agents.

§01

What it is

LiveKit Agents is an open-source Python framework for building real-time voice AI agents. It provides a pipeline architecture that chains Speech-to-Text, LLM processing, and Text-to-Speech with sub-second end-to-end latency, all running over LiveKit's WebRTC infrastructure.

The framework is designed for developers building voice assistants, phone agents, video call AI participants, and conversational interfaces. It supports multiple providers including OpenAI, Anthropic, Deepgram, and ElevenLabs.

§02

How it saves time or tokens

Without LiveKit Agents, building a voice AI pipeline requires stitching together separate STT, LLM, and TTS services, managing WebRTC connections, handling voice activity detection, and dealing with audio streaming protocols. LiveKit Agents abstracts all of this into a modular pipeline where you pick your providers and the framework handles the real-time orchestration.

The plugin system means swapping providers is usually a one-line code change. Moving from OpenAI TTS to ElevenLabs requires changing only the tts argument in your AgentSession constructor, once the ElevenLabs plugin is installed and its API key is set.
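The reason a provider swap stays small is that the session depends on an interface rather than on a concrete provider. The sketch below illustrates that design with stand-in classes; `TTS`, `FakeOpenAITTS`, `FakeElevenLabsTTS`, and `VoicePipeline` are hypothetical names invented for illustration and are not part of the LiveKit API.

```python
from dataclasses import dataclass
from typing import Protocol


class TTS(Protocol):
    """Hypothetical interface: anything that turns text into audio bytes."""

    def synthesize(self, text: str) -> bytes: ...


class FakeOpenAITTS:
    """Stand-in for one provider plugin (not the real OpenAI plugin)."""

    def synthesize(self, text: str) -> bytes:
        return b"openai:" + text.encode()


class FakeElevenLabsTTS:
    """Stand-in for a second provider exposing the same method."""

    def synthesize(self, text: str) -> bytes:
        return b"elevenlabs:" + text.encode()


@dataclass
class VoicePipeline:
    """Toy session object that only knows about the TTS interface."""

    tts: TTS

    def say(self, text: str) -> bytes:
        return self.tts.synthesize(text)


# Swapping providers touches only the constructor argument.
pipeline = VoicePipeline(tts=FakeOpenAITTS())
pipeline = VoicePipeline(tts=FakeElevenLabsTTS())
print(pipeline.say("hello"))
```

In the real framework the same idea applies to the stt, llm, tts, and vad arguments of AgentSession, with the plugin packages supplying the concrete classes.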

§03

How to use

Install the framework and your chosen plugins: pip install livekit-agents livekit-plugins-openai livekit-plugins-silero.
Define an entrypoint function that creates an AgentSession with your STT, LLM, TTS, and VAD providers.
Run the agent with cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) and connect clients via LiveKit rooms.

§04

Example

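from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli
from livekit.agents.voice import AgentSession, Agent
from livekit.plugins import openai, silero


async def entrypoint(ctx: JobContext):
    # Join the room and subscribe to audio tracks only.
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    # Each pipeline stage is a plugin instance; API keys come from the environment.
    session = AgentSession(
        stt=openai.STT(),
        llm=openai.LLM(),
        tts=openai.TTS(),
        vad=silero.VAD.load(),
    )

    # start() takes the Agent to run and the room to attach to.
    # The instructions string is a placeholder prompt.
    await session.start(
        agent=Agent(instructions="You are a helpful voice assistant."),
        room=ctx.room,
    )


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))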

§05

Related on TokRepo

AI agent tools -- frameworks and platforms for building AI agents
Voice tools -- speech and audio AI tools

§06

Common pitfalls

WebRTC requires proper TURN/STUN server configuration for production deployments behind firewalls or NAT; the local development setup may not reflect real network conditions.
Voice activity detection (VAD) tuning is critical -- too sensitive and the agent interrupts users mid-sentence, too conservative and response latency increases.
Each provider plugin has its own API key requirements; ensure all keys are set in environment variables before starting the agent.
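A small preflight check can catch the missing-key pitfall before the worker starts. This is a minimal sketch using only the standard library. The variable names in `REQUIRED_ENV_VARS` are assumptions for a LiveKit plus OpenAI setup, so verify them against the documentation of the plugins you actually use; `missing_env_vars` and `require_env` are hypothetical helpers.

```python
import os
from typing import Iterable, Mapping, Optional

# Assumed variable names for a LiveKit + OpenAI setup; adjust per plugin docs.
REQUIRED_ENV_VARS = [
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
]


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> list:
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


def require_env(names: Iterable[str], env: Optional[Mapping[str, str]] = None) -> None:
    """Raise a single error listing every missing variable."""
    missing = missing_env_vars(names, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))


# Typical use, placed before starting the worker:
#     require_env(REQUIRED_ENV_VARS)
```

Reporting all missing variables at once avoids the fix-one-rerun-fail-again loop that happens when each plugin fails separately at first use.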

FAQ

What AI providers does LiveKit Agents support?

LiveKit Agents supports OpenAI (including the Realtime API), Anthropic, Deepgram, ElevenLabs, Azure, Google, Cartesia, AssemblyAI, and Silero for voice activity detection. The plugin architecture makes adding new providers straightforward.

What is the typical end-to-end latency?

LiveKit Agents achieves sub-second end-to-end latency from user speech to agent response in typical configurations. The exact latency depends on your choice of STT, LLM, and TTS providers and their respective API response times.
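Because the total is the sum of the stages, a rough latency budget helps show where the time goes. The figures below are hypothetical placeholders, not measurements of any provider; replace them with numbers from your own traces.

```python
# Hypothetical per-stage latencies in milliseconds (illustrative, not measured).
STAGE_LATENCY_MS = {
    "vad_endpointing": 300,
    "stt_final_transcript": 150,
    "llm_first_token": 250,
    "tts_first_audio": 150,
    "network_round_trip": 50,
}


def total_latency_ms(stages: dict) -> int:
    """Sum the stage latencies for a sequential pipeline."""
    return sum(stages.values())


def slowest_stage(stages: dict) -> str:
    """Name of the stage that contributes the most latency."""
    return max(stages, key=stages.get)


total = total_latency_ms(STAGE_LATENCY_MS)
print(f"total: {total} ms, slowest stage: {slowest_stage(STAGE_LATENCY_MS)}")
```

With these placeholder values the budget comes to 900 ms, which leaves little headroom under one second, so a slower choice in any single stage can push the response past it.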

Can I use LiveKit Agents for phone calls?

Yes. LiveKit provides SIP integration that connects phone calls to LiveKit rooms. Your voice agent handles the audio the same way whether the caller is on a phone line or a WebRTC browser client.

Do I need to self-host LiveKit server?

You can either self-host the open-source LiveKit server or use LiveKit Cloud as a managed service. The Agents framework works with both deployment options.

How does voice activity detection work?

LiveKit Agents commonly uses Silero VAD, loaded through the silero plugin, to detect when a user starts and stops speaking. These speech boundaries control when audio is sent to the STT provider and keep the agent from processing background noise or partial utterances.
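The sensitivity trade-off can be seen in a toy endpoint detector. Note that this is a simple energy threshold, not Silero's neural model, and `detect_end_of_speech` with its parameters is a hypothetical helper written only to show how the required silence duration changes behavior.

```python
from typing import Optional, Sequence


def detect_end_of_speech(
    frame_energies: Sequence[float],
    threshold: float,
    min_silence_frames: int,
) -> Optional[int]:
    """Return the frame index where end of speech is declared, or None.

    A frame counts as speech when its energy is at or above the threshold.
    End of speech is declared after min_silence_frames consecutive quiet
    frames that follow at least one speech frame.
    """
    speaking = False
    silent_run = 0
    for index, energy in enumerate(frame_energies):
        if energy >= threshold:
            speaking = True
            silent_run = 0
        elif speaking:
            silent_run += 1
            if silent_run >= min_silence_frames:
                return index
    return None


# Two bursts of speech separated by a short pause, then real silence.
energies = [0.0, 0.8, 0.9, 0.1, 0.1, 0.7, 0.8, 0.1, 0.1, 0.1, 0.1]

eager = detect_end_of_speech(energies, threshold=0.5, min_silence_frames=2)
patient = detect_end_of_speech(energies, threshold=0.5, min_silence_frames=3)
print(eager, patient)
```

The eager setting declares the turn over at frame 4, during the short pause before the speaker resumes, while the patient setting waits until frame 9 and so adds delay before the agent can respond.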


Source and credits

Created by LiveKit. Licensed under Apache 2.0. livekit/agents — 9,900+ GitHub stars

