What is LiveKit Agents — Python Framework for Voice AI?

LiveKit Agents is a Python framework for real-time voice AI. Pluggable STT/LLM/TTS, VAD, barge-in. Run on LiveKit Cloud or self-host.

Is LiveKit Agents — Python Framework for Voice AI free to use?

Yes. LiveKit Agents — Python Framework for Voice AI is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install LiveKit Agents — Python Framework for Voice AI?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

LiveKit Agents — Python Framework for Voice AI

简介

LiveKit Agents 是专为实时语音 AI 打造的 Python 框架 —— STT、LLM、TTS 串起来，VAD、回合结束检测、打断处理开箱即用。在 LiveKit Cloud 上跑或自托管 LiveKit Server（WebRTC）。适合电话语音 agent、浏览器语音聊天、应用内语音 copilot —— 任何往返延迟 <1.5 秒重要的场景。兼容 Python 3.10+、任何 STT（Deepgram / AssemblyAI / Groq Whisper）、任何 LLM（OpenAI / Anthropic / Llama）、任何 TTS（Cartesia / ElevenLabs / Deepgram）。装机时间 10 分钟。

安装

pip install livekit-agents \
  livekit-plugins-openai livekit-plugins-deepgram livekit-plugins-cartesia livekit-plugins-silero

最小可用语音 agent

import asyncio
from livekit import agents
from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli
from livekit.plugins import openai, deepgram, cartesia, silero

async def entrypoint(ctx: JobContext):
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    assistant = agents.VoicePipelineAgent(
        vad=silero.VAD.load(),
        stt=deepgram.STT(model="nova-3", language="multi"),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=cartesia.TTS(voice="alloy"),
        chat_ctx=agents.llm.ChatContext().append(
            role="system",
            text="你是一个有帮助的语音助手。回复短一些 —— 不超过 2 句。",
        ),
    )
    assistant.start(ctx.room)
    await assistant.say("你好！有什么可以帮你的？", allow_interruptions=True)

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

Function calling（对话中途用工具）

from livekit.agents.llm import function_tool

@function_tool
async def get_weather(location: str) -> str:
    '''拿当前天气。'''
    return await my_weather_api(location)

assistant = agents.VoicePipelineAgent(
    ...,
    fnc_ctx=agents.llm.FunctionContext(tools=[get_weather]),
)

延迟预算

阶段	典型	紧
VAD 检测句末	200–500ms	200ms
STT（Deepgram Nova-3）	60–250ms	100ms
LLM（gpt-4o-mini 流式）	300–800ms	400ms
TTS 首音频（Cartesia）	75–200ms	100ms
网络 + WebRTC	50–150ms	80ms
总往返		~880ms

本地用 CLI 跑

python agent.py dev   # 连 LiveKit Cloud dev URL，监听代码变更
python agent.py start # 生产 worker 模式

FAQ

Q: LiveKit Agents vs Vapi vs Retell？ A: Vapi 和 Retell 是托管 turnkey 语音 agent 平台 —— 上线快、栈固定、灵活性低。LiveKit Agents 是自带组件 —— 自己挑 STT/LLM/TTS、部到自己基建、每阶段优化。要控制权或规模化成本优化就选 LiveKit。

Q: 不用 WebRTC 行吗？ A: 电话场景可以 —— LiveKit 有 SIP trunk。仅 HTTP 环境不行 —— 框架建在 LiveKit room 模型上。备选：直接在 Twilio Media Streams 上自建流水线，或用 Vapi 这种托管方案。

Q: 打断怎么处理？ A: VAD 检测用户开口；框架取消正在播的 TTS、把 assistant 最后未完成的发言从聊天历史里截断、把新用户音频路由到 STT。通过 silero.VAD.load(min_speech_duration=...) 调激进度。

LiveKit Agents — Python Framework for Voice AI

这个资产会安全暂存

简介

安装

最小可用语音 agent

Function calling（对话中途用工具）

延迟预算

本地用 CLI 跑

FAQ

来源与感谢

讨论

相关资产

LiveKit Agents — Build Real-Time Voice AI Agents

LiveKit Token Server — Sign JWTs for Room Access

LiveKit Plugin Architecture — Swap STT/LLM/TTS Providers

DearPyGui — High-Performance Python GUI Framework with GPU Rendering