Scripts · Apr 9, 2026 · 2 min read

OpenAI Realtime Agents — Voice AI Agent Patterns

Advanced agentic patterns for voice AI built on the OpenAI Realtime API. Chat-supervisor and sequential handoff patterns with WebRTC streaming. MIT licensed, 6,800+ stars.

Introduction

OpenAI Realtime Agents is an official OpenAI demo, with 6,800+ GitHub stars, that showcases advanced agentic patterns for voice AI. It demonstrates two key patterns: Chat-Supervisor, in which a realtime voice agent delegates complex tasks to a smarter text model such as GPT-4.1, and Sequential Handoff, in which specialized agents transfer users between each other based on intent. It is built with the OpenAI Agents SDK and WebRTC voice streaming, and it is best suited to developers building voice-enabled AI applications, customer service bots, or multi-agent voice systems.

See more agent frameworks on TokRepo Agent Toolkit.


OpenAI Realtime Agents — Voice AI Agent Architecture

Two Agent Patterns

1. Chat-Supervisor Pattern

A realtime voice agent handles user conversation and basic tasks, while a more intelligent text-based supervisor model (GPT-4.1) handles complex tool calls and decision-making.

User ←→ [Voice Agent (realtime)] ←→ [Supervisor (GPT-4.1)]
              ↓                            ↓
         Basic tasks                 Complex reasoning
         Voice I/O                   Tool calls

When to use: When you need natural voice interaction but also complex reasoning that requires a more capable model.
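
The delegation step above can be sketched without any SDK. This is a minimal illustration, not the demo's actual code: `VoiceAgent`, `Supervisor`, `BASIC`, and `stubSupervisor` are hypothetical names, the "basic task" check is a simple keyword match assumed for the example, and the supervisor is a synchronous stub standing in for what would really be an asynchronous call to a text model such as GPT-4.1.

```typescript
// Hypothetical types for illustration; not the demo's API.
type Turn = { role: "user" | "assistant"; text: string };

interface Supervisor {
  // Stands in for a (normally asynchronous) call to a text model such as GPT-4.1.
  decide(history: Turn[], latest: string): string;
}

// Assumption: greetings and small talk count as "basic tasks".
const BASIC = /^(hi|hello|thanks|thank you|bye)\b/i;

class VoiceAgent {
  private history: Turn[] = [];
  private supervisor: Supervisor;

  constructor(supervisor: Supervisor) {
    this.supervisor = supervisor;
  }

  // Answer basic turns directly; delegate everything else to the supervisor.
  respond(userText: string): string {
    this.history.push({ role: "user", text: userText });
    const reply = BASIC.test(userText.trim())
      ? "Sure, I am here. What do you need?"
      : this.supervisor.decide(this.history, userText);
    this.history.push({ role: "assistant", text: reply });
    return reply;
  }
}

// Stub supervisor so the sketch runs without network access.
const stubSupervisor: Supervisor = {
  decide: (_history, latest) => `supervisor handled: ${latest}`,
};
```

In a real application the voice agent would keep speaking (for example, with a short filler phrase) while the supervisor call is in flight, so the user does not hear silence during the slower reasoning step.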

2. Sequential Handoff Pattern

Specialized agents transfer users between them based on detected intent. Inspired by the OpenAI Swarm pattern.

User → [Greeter Agent] → [Sales Agent] → [Support Agent]
                ↓                ↓               ↓
          Route intent     Handle sales     Handle support

When to use: When different conversation stages require different expertise (e.g., routing → sales → support).
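
A rough sketch of the handoff logic follows, again framework-free and not taken from the demo. The names `HandoffAgent`, `detectIntent`, `handoffAgents`, and `nextAgent` are hypothetical, the keyword-based intent detection is an assumption made for the example (a real agent would let the model decide when to transfer), and allowing the greeter to route straight to support is likewise assumed.

```typescript
type AgentName = "greeter" | "sales" | "support";

interface HandoffAgent {
  name: AgentName;
  // Agents this one is allowed to transfer the user to.
  handoffs: AgentName[];
  // Returns the agent to transfer to, or null to keep the conversation.
  route(utterance: string): AgentName | null;
}

// Hypothetical keyword-based intent detection, for illustration only.
function detectIntent(utterance: string): AgentName | null {
  const text = utterance.toLowerCase();
  if (/\b(buy|price|pricing|upgrade)\b/.test(text)) return "sales";
  if (/\b(broken|error|refund)\b/.test(text)) return "support";
  return null;
}

const handoffAgents: Record<AgentName, HandoffAgent> = {
  greeter: { name: "greeter", handoffs: ["sales", "support"], route: detectIntent },
  sales: {
    name: "sales",
    handoffs: ["support"],
    route: (u) => (detectIntent(u) === "support" ? "support" : null),
  },
  support: { name: "support", handoffs: [], route: () => null },
};

// Decide which agent owns the next turn; stay put unless a permitted handoff applies.
function nextAgent(current: AgentName, utterance: string): AgentName {
  const agent = handoffAgents[current];
  const target = agent.route(utterance);
  return target !== null && agent.handoffs.includes(target) ? target : current;
}
```

Keeping an explicit `handoffs` list per agent makes the allowed transitions easy to audit: an agent can only pass the user to the agents it names, regardless of what intent is detected.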

Key Technologies

Technology            Purpose
OpenAI Realtime API   Low-latency voice streaming
OpenAI Agents SDK     Multi-agent orchestration
WebRTC                Browser-based voice I/O
GPT-4.1               Text-based supervisor reasoning

Setup

git clone https://github.com/openai/openai-realtime-agents.git
cd openai-realtime-agents
npm i
export OPENAI_API_KEY=sk-your-key
npm run dev

FAQ

Q: What is OpenAI Realtime Agents? A: An official OpenAI demo showing how to build sophisticated voice AI agents using the Realtime API and Agents SDK, with patterns like chat-supervisor hierarchies and sequential handoffs.

Q: Is it free to use? A: The code is MIT licensed. You pay for OpenAI API usage (Realtime API pricing applies).

Q: Can I use this in production? A: It's a demo/reference implementation, not a production-ready product. Adapt its patterns and architecture for your own production applications.



Source and Acknowledgments

Created by OpenAI. Licensed under MIT.

openai-realtime-agents — ⭐ 6,800+

Thanks to Noah MacCallum, Ilan Bigio, and the OpenAI team for demonstrating production voice AI patterns.
