AutoGen — Microsoft’s Conversation-Based Multi-Agent Framework
Microsoft AutoGen models multi-agent systems as conversations between roles (AssistantAgent, UserProxyAgent, CodeExecutorAgent). Flexible, well-researched, and the reference implementation for chat-based agent coordination.
Why AutoGen
AutoGen is the academic-to-industrial bridge. Microsoft Research published the framework in 2023 and has iterated aggressively since — the current v0.4 (late 2024) is a ground-up rewrite with async actors, strongly-typed messages, and first-class distributed execution. It’s the framework of choice when you care about research provenance and long-term stability from a major vendor.
The mental model is agents as participants in a group chat. An AssistantAgent is the LLM side. A UserProxyAgent executes tool calls (including running code). A GroupChat manages the loop — picking the next speaker, injecting termination conditions, capturing the conversation. You compose behavior by configuring speaker selection, rather than wiring an explicit graph like LangGraph.
AutoGen shines on coding and research tasks where the path to solution isn’t known upfront. Give a pair of agents a goal; they brainstorm, critique, execute code, and iterate until done. The downside is less determinism than role-based frameworks — for pipelines with fixed structure, CrewAI gets you there with less orchestration code.
Quick Start — AutoGen v0.4 AssistantAgent + RoundRobinGroupChat
v0.4 is async-first — note the asyncio.run(). RoundRobinGroupChat rotates speakers; SelectorGroupChat uses an LLM to pick the next speaker. Termination conditions are composable (TextMention, MaxMessage, FunctionCall, etc.). v0.2 (still widely used) has a different API — match your install version to docs carefully.
# pip install -U "autogen-agentchat>=0.4" "autogen-ext[openai]"
import asyncio
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_ext.models.openai import OpenAIChatCompletionClient
async def main():
    llm = OpenAIChatCompletionClient(model="gpt-4o-mini")

    planner = AssistantAgent(
        name="planner",
        model_client=llm,
        system_message="Break the user's goal into 2-3 concrete steps. Hand off to writer.",
    )
    writer = AssistantAgent(
        name="writer",
        model_client=llm,
        system_message="Execute the plan in clear prose. Say 'APPROVE' when done.",
    )
    critic = AssistantAgent(
        name="critic",
        model_client=llm,
        system_message="Find one concrete improvement or respond 'APPROVE' if good enough.",
    )

    team = RoundRobinGroupChat(
        participants=[planner, writer, critic],
        termination_condition=TextMentionTermination("APPROVE"),
        max_turns=8,
    )

    result = await team.run(task="Write a 150-word intro to multi-agent frameworks.")
    for msg in result.messages:
        print(f"[{msg.source}] {msg.content[:120]}")

asyncio.run(main())

Key Features
Conversation-based coordination
Agents exchange messages. The GroupChat decides who speaks next. Termination conditions stop the loop. Natural for open-ended tasks where the solution path emerges through dialogue.
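The coordination pattern is easy to see in isolation. Below is a framework-free sketch of the round-robin loop with a text-mention termination condition; the plain functions stand in for LLM-backed agents, and all names are illustrative, not AutoGen API:

```python
# Minimal sketch of round-robin conversation coordination with a
# text-mention termination condition. Agents are plain callables
# standing in for LLM-backed agents; this is not AutoGen's API.

def run_round_robin(agents, task, stop_word="APPROVE", max_turns=8):
    """Rotate speakers until someone says the stop word or turns run out."""
    messages = [("user", task)]
    for turn in range(max_turns):
        name, respond = agents[turn % len(agents)]
        reply = respond(messages)      # each agent sees the full transcript
        messages.append((name, reply))
        if stop_word in reply:         # termination condition fires
            break
    return messages

# Toy agents: the critic approves on its first look.
agents = [
    ("writer", lambda msgs: "Here is a draft."),
    ("critic", lambda msgs: "Looks good. APPROVE"),
]
transcript = run_round_robin(agents, "Write an intro.")
for source, content in transcript:
    print(f"[{source}] {content}")
```

The real framework adds typed messages, async execution, and pluggable speaker selection on top of exactly this loop.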
v0.4 actor runtime
Production-grade async runtime: typed messages, distributed actors, tracing hooks. Built on lessons from v0.2 deployments inside Microsoft.
AutoGen Studio
Visual no-code UI for building and running agent workflows. Good for non-developer stakeholders and for quick prototyping before converting to code.
Code execution
CodeExecutorAgent (or UserProxyAgent with code_execution_config) runs generated Python in Docker sandboxes. Enables true "agent solves coding task" loops.
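A minimal wiring sketch, assuming autogen-ext's Docker executor extra (pip install "autogen-ext[docker]") and a running Docker daemon; module paths follow the v0.4 docs, so check them against your installed version:

```python
# Sketch: sandboxed code execution via the Docker executor (v0.4 API).
# Requires a running Docker daemon; nothing executes on the host.
import asyncio

from autogen_agentchat.agents import CodeExecutorAgent
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

async def main():
    executor = DockerCommandLineCodeExecutor(work_dir="coding")
    await executor.start()  # starts (and if needed pulls) the container
    try:
        runner = CodeExecutorAgent("runner", code_executor=executor)
        # Add `runner` to a team next to an AssistantAgent: when a message
        # contains a ```python fence, the runner executes it and replies
        # with the captured stdout/stderr.
    finally:
        await executor.stop()

asyncio.run(main())
```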
Selector-based speaker choice
SelectorGroupChat uses an LLM-based selector to pick the next speaker based on conversation state. More flexible than round-robin; non-deterministic.
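Construction differs from round-robin mainly in that a model client does the picking. A sketch reusing the Quick Start agents (constructor arguments follow the v0.4 docs; verify against your installed version):

```python
# Sketch: LLM-driven speaker selection instead of a fixed rotation.
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.teams import SelectorGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

llm = OpenAIChatCompletionClient(model="gpt-4o-mini")

# planner, writer, critic: AssistantAgents as defined in the Quick Start.
team = SelectorGroupChat(
    participants=[planner, writer, critic],
    model_client=llm,  # this client chooses the next speaker each turn
    termination_condition=MaxMessageTermination(10),
)
```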
Broad model support
First-class OpenAI, Azure OpenAI, Anthropic, Gemini, Ollama, LM Studio, Together, Fireworks. Configure via model_client; swap providers without changing agent logic.
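The provider swap in practice: only the client changes, the agents don't. A sketch assuming the matching autogen-ext extras are installed and each provider's API key (or local server, for Ollama) is available:

```python
# Provider swap sketch: agent code is unchanged; only the client differs.
# Assumes e.g. pip install "autogen-ext[openai]" "autogen-ext[anthropic]"
# "autogen-ext[ollama]" and the relevant credentials/local server.
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.anthropic import AnthropicChatCompletionClient
from autogen_ext.models.ollama import OllamaChatCompletionClient
from autogen_ext.models.openai import OpenAIChatCompletionClient

clients = {
    "openai": OpenAIChatCompletionClient(model="gpt-4o-mini"),
    "anthropic": AnthropicChatCompletionClient(model="claude-3-5-sonnet-20241022"),
    "ollama": OllamaChatCompletionClient(model="llama3.1"),  # local, no key
}

agent = AssistantAgent(
    "helper",
    model_client=clients["openai"],  # swap the key here, nothing else moves
    system_message="Answer concisely.",
)
```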
Comparison
| Framework | Abstraction | Determinism | Code Execution | Maintainer |
|---|---|---|---|---|
| AutoGen | Conversation | Low-medium | Built-in | Microsoft Research |
| CrewAI | Role + task | High | Via tools | CrewAI Inc |
| LangGraph | State graph | High | Via tool nodes | LangChain |
| OpenAI Swarm | Handoff tool calls | Medium | Via tools | OpenAI |
Use Cases
01. Agentic coding research
Plan → write code → run code → debug → repeat. AutoGen’s code execution + conversation model is the best-tested pattern for research prototypes and benchmark submissions.
02. Open-ended brainstorming
Debate-style tasks (critic vs. writer, bull vs. bear analyst) where the value is in multi-turn back-and-forth, not strict task chaining.
03. Distributed agent systems
v0.4’s actor runtime is production-ready for multi-process or multi-host deployments — rare in the multi-agent space. Useful when your agents call services that shouldn’t live in the same process.
Pricing & License
AutoGen: MIT licensed. Free. Maintained by Microsoft Research with active community contributions. No commercial SKU as of 2026.
Infra cost: your LLM API bills plus (optionally) a Docker host for code execution. AutoGen itself is a Python/.NET library; no server to operate.
Hidden cost: the conversation loop makes many LLM calls. Budget 5-20 calls per agent turn for realistic tasks. Use small/cheap models for intermediate turns and a capable one for final synthesis.
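Back-of-envelope budgeting for that call overhead. All numbers below are illustrative assumptions, not quoted provider prices; substitute your own rates:

```python
# Rough cost model for a conversation-loop task: cheap model for
# intermediate turns, capable model for the final synthesis call.
# Every number here is an illustrative assumption.
def estimate_cost(agent_turns, calls_per_turn, tokens_per_call,
                  price_per_mtok_cheap, price_per_mtok_final):
    intermediate_calls = agent_turns * calls_per_turn - 1
    final_calls = 1  # one synthesis call on the capable model
    cheap = intermediate_calls * tokens_per_call * price_per_mtok_cheap / 1e6
    final = final_calls * tokens_per_call * price_per_mtok_final / 1e6
    return round(cheap + final, 4)

# 8 agent turns x 10 calls, ~3k tokens/call,
# $0.15/Mtok intermediate, $5.00/Mtok final synthesis.
cost = estimate_cost(8, 10, 3_000, 0.15, 5.00)
print(f"estimated task cost: ${cost}")
```

The point of the exercise: intermediate chatter dominates, so routing it to a cheap model is where the savings are.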
Related Assets on TokRepo
AgentOps — Observability for AI Agents
Python SDK for AI agent monitoring. LLM cost tracking, session replay, benchmarking, and error analysis. Integrates with CrewAI, LangChain, AutoGen, and more. 5.4K+ stars.
AutoGen — Multi-Agent Conversation Framework
Microsoft framework for building multi-agent conversational AI systems. Agents chat with each other to solve tasks. Supports tool use, code execution, and human feedback. 56K+ stars.
Frequently Asked Questions
AutoGen v0.2 vs v0.4 — which should I use?
New projects: start with v0.4. It has the async runtime, typed messages, and better observability. v0.2 is still supported for existing projects and has a larger community example base — but Microsoft has signaled v0.4 is the future direction.
Does AutoGen support Claude / non-OpenAI models?
Yes. autogen-ext packages provide model clients for OpenAI, Anthropic, Azure OpenAI, Gemini, Ollama, LM Studio, HuggingFace. Configure via model_client — agent logic is model-agnostic.
How deterministic is AutoGen?
Less than CrewAI/LangGraph by design. The conversation loop is flexible, which means runs vary. For production: use RoundRobinGroupChat with max_turns caps and structured termination conditions. Log everything — non-determinism makes tracing critical.
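In v0.4, termination conditions compose with `|` (OR) and `&` (AND), which is the usual way to cap a flexible loop. A sketch assuming the Quick Start agents:

```python
# Sketch: composed termination — stop on explicit approval OR after
# 12 messages, whichever fires first (v0.4 condition composition).
from autogen_agentchat.conditions import (
    MaxMessageTermination,
    TextMentionTermination,
)
from autogen_agentchat.teams import RoundRobinGroupChat

stop = TextMentionTermination("APPROVE") | MaxMessageTermination(12)

# planner, writer, critic: AssistantAgents as defined in the Quick Start.
team = RoundRobinGroupChat(
    participants=[planner, writer, critic],
    termination_condition=stop,
)
```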
Is AutoGen Studio for production?
AutoGen Studio is a visual builder and demo tool. Use it to prototype and demo flows; convert to code for production. It is not designed as a runtime platform for customer-facing apps.
Can AutoGen agents execute shell commands?
Yes via the LocalCommandLineCodeExecutor (dangerous — sandbox!) or DockerCommandLineCodeExecutor (recommended). Never give shell access to agents running untrusted instructions; the Docker executor is the default for a reason.