Key Features
Voice Pipeline
Modular STT → LLM → TTS pipeline with automatic voice activity detection (VAD):
- STT: OpenAI Whisper, Deepgram, Google, Azure, AssemblyAI
- LLM: OpenAI (including Realtime), Anthropic, Google, Ollama
- TTS: OpenAI, ElevenLabs, Cartesia, Azure, Google
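The modularity above can be pictured as three swappable stages wired in sequence. The sketch below is illustrative only, not the LiveKit API: `VoicePipeline` and the stub providers are hypothetical names standing in for real STT/LLM/TTS integrations.

```python
# Hypothetical sketch (NOT the LiveKit API): a modular STT -> LLM -> TTS
# pipeline where each stage is a swappable callable, mirroring how the
# framework lets you mix providers per stage.
from dataclasses import dataclass
from typing import Callable

@dataclass
class VoicePipeline:
    stt: Callable[[bytes], str]   # audio in -> transcript
    llm: Callable[[str], str]     # transcript -> reply text
    tts: Callable[[str], bytes]   # reply text -> audio out

    def run(self, audio: bytes) -> bytes:
        transcript = self.stt(audio)
        reply = self.llm(transcript)
        return self.tts(reply)

# Stub providers stand in for Whisper / Claude / ElevenLabs etc.
pipeline = VoicePipeline(
    stt=lambda audio: audio.decode(),
    llm=lambda text: f"You said: {text}",
    tts=lambda text: text.encode(),
)
print(pipeline.run(b"hello"))  # b'You said: hello'
```

Because each stage is just an interface, swapping Deepgram for Whisper (or Cartesia for ElevenLabs) changes one argument, not the pipeline.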
Sub-Second Latency
Optimized for real-time conversation with streaming at every stage. Turn detection and interruption handling are built in.
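Stage-to-stage streaming is what keeps time-to-first-audio low: TTS can start on the first LLM token instead of waiting for the full reply. A minimal sketch of that idea, using plain async generators (the function names here are illustrative, not LiveKit APIs):

```python
# Hypothetical sketch of per-stage streaming: TTS consumes LLM tokens as
# they arrive, so audio synthesis overlaps with text generation.
import asyncio
from typing import AsyncIterator

async def llm_stream(prompt: str) -> AsyncIterator[str]:
    for token in prompt.split():   # stand-in for a model's token stream
        await asyncio.sleep(0)     # yield control, as a network call would
        yield token

async def tts_stream(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    async for token in tokens:     # synthesize each chunk as it arrives
        yield token.encode()

async def main() -> list[bytes]:
    chunks = []
    async for chunk in tts_stream(llm_stream("hello world")):
        chunks.append(chunk)       # first audio chunk is ready after one token
    return chunks

print(asyncio.run(main()))  # [b'hello', b'world']
```

Turn detection and interruption handling slot into the same loop: an interruption simply cancels the in-flight generators.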
Multimodal
Beyond voice — supports video input, screen sharing, and data channels for rich agent interactions.
Production Infrastructure
Built on LiveKit's WebRTC platform — handles scaling, room management, recording, and telephony (SIP/PSTN).
Function Calling
Agents can use tools mid-conversation:
@agent.tool
async def check_weather(city: str) -> str:
    return f"It's 72F and sunny in {city}"

FAQ
Q: What is LiveKit Agents? A: A framework for building real-time voice AI agents with STT/LLM/TTS pipelines and sub-second latency. Built on LiveKit's WebRTC infrastructure. 9.9K+ stars.
Q: Can I build a phone agent with LiveKit Agents? A: Yes, LiveKit supports SIP/PSTN telephony integration, so you can build agents that answer phone calls.