Multi-Agent Framework
camel-ai — Role-Playing Multi-Agent Framework for Research

camel-ai was the first project to propose role-playing multi-agent collaboration: two agents play an AI user and an AI assistant, conversing with each other to complete a task. It is research-oriented, with a rich set of agent-society simulation examples.

Why choose it

camel-ai’s 2023 paper introduced the role-playing multi-agent pattern that many later frameworks adopted: two LLMs each assigned a role — one as "AI user" issuing instructions, one as "AI assistant" executing — converse to complete a task without a human in the loop. The pattern surfaced both impressive cooperative behavior and the hallucination dynamics that the whole field has been trying to fix since.

The library has evolved beyond the original 2-agent pattern. It now includes agent societies (3-50+ agents with different personas), memory modules, tool integration, and experiments with more diverse communication topologies. camel-ai stays closer to research than CrewAI/LangGraph — expect frequent paper-to-code ports and occasional API churn.

Use camel-ai when you want to reproduce or extend research, run agent societies at scale, or experiment with role-playing dynamics. For straightforward production workflows, simpler frameworks win on stability and community support.

Quick Start — Role-Playing Assistant + User

RolePlaying is the canonical camel-ai abstraction: one assistant agent, one user agent, and optionally a task-specify agent that refines the task prompt. step() advances one round; termination is by keyword ("CAMEL_TASK_DONE") or a round cap. For larger societies, use the Workforce module.

# pip install camel-ai
from camel.configs import ChatGPTConfig
from camel.models import ModelFactory
from camel.societies import RolePlaying
from camel.types import ModelPlatformType, ModelType

model = ModelFactory.create(
    model_platform=ModelPlatformType.OPENAI,
    model_type=ModelType.GPT_4O_MINI,
    model_config_dict=ChatGPTConfig(temperature=0.2).as_dict(),
)

society = RolePlaying(
    assistant_role_name="Python programmer",
    user_role_name="Senior data scientist",
    task_prompt="Write a script to fetch the weather for a city from a public API.",
    with_task_specify=True,
    assistant_agent_kwargs={"model": model},
    user_agent_kwargs={"model": model},
    task_specify_agent_kwargs={"model": model},
)

input_msg = society.init_chat()  # seed message built from the (specified) task prompt
for _ in range(6):  # cap rounds to bound cost
    assistant_response, user_response = society.step(input_msg)
    print(f"[assistant] {assistant_response.msg.content[:200]}")
    print(f"[user] {user_response.msg.content[:200]}")
    if "CAMEL_TASK_DONE" in user_response.msg.content:
        break
    input_msg = assistant_response.msg  # assistant's reply feeds the next round

Core capabilities

Role-playing pattern

The original "AI user + AI assistant" setup with built-in task specification, role description, and termination keyword. Still the clearest experimental bed for studying cooperative LLM behavior.

Agent Workforce

Multi-agent orchestration for teams of specialist agents — each with its own role, tools, and memory. Supports hierarchical coordination and task routing.
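A hedged sketch of wiring up a Workforce, based on the workforce examples in recent camel-ai versions; the import paths, add_single_agent_worker signature, and process_task call should be verified against your installed release. fetcher_agent and writer_agent are hypothetical ChatAgents built the same way as the quick-start model.

```python
from camel.societies.workforce import Workforce
from camel.tasks import Task

workforce = Workforce("Weather report team")

# Each worker is a ChatAgent wrapped with a description the
# coordinator uses for task routing.
workforce.add_single_agent_worker(
    "Fetches raw weather data from a public API", worker=fetcher_agent
)
workforce.add_single_agent_worker(
    "Summarizes the data into a readable report", worker=writer_agent
)

task = Task(content="Produce a weather report for Berlin.", id="0")
result = workforce.process_task(task)
print(result.result)
```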

Data generation

camel-ai ships utilities to generate synthetic instruction datasets using role-playing sessions — useful for fine-tuning smaller models on agent-style dialogue.
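The export step can be sketched framework-agnostically. Assuming each role-playing session yields a list of (role, content) turns — the transcript shape here is an illustrative assumption, and camel-ai's own utilities handle this for you — writing SFT-ready chat-format JSONL is a few lines:

```python
import json

# Hypothetical transcripts: one session = a list of (role, content) turns,
# as a role-playing run would produce. Shape is an assumption for illustration.
sessions = [
    [("user", "Write a weather-fetch script."),
     ("assistant", "Here is a Python script using requests...")],
]

# Convert to the chat-format JSONL commonly used for SFT fine-tuning.
with open("sft_data.jsonl", "w") as f:
    for turns in sessions:
        record = {"messages": [{"role": r, "content": c} for r, c in turns]}
        f.write(json.dumps(record) + "\n")
```

A quality filter (drop sessions that never reached the termination keyword, deduplicate, length-filter) would slot in before the write loop.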

Rich model support

OpenAI, Anthropic, Google Gemini, Mistral, HuggingFace, Azure, Together, vLLM, Ollama. Switch via ModelFactory without touching agent code.
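Switching providers is a change at model creation only; agent code is untouched. A sketch — the OPENAI/GPT_4O_MINI names appear in the quick start, while the Ollama enum and plain-string model name are assumptions to check against the camel.types module in your installed version:

```python
from camel.models import ModelFactory
from camel.types import ModelPlatformType, ModelType

# Hosted model, as in the quick start.
openai_model = ModelFactory.create(
    model_platform=ModelPlatformType.OPENAI,
    model_type=ModelType.GPT_4O_MINI,
)

# Local model via Ollama -- platform enum and model-name string are
# assumptions; check camel.types for what your version ships.
local_model = ModelFactory.create(
    model_platform=ModelPlatformType.OLLAMA,
    model_type="llama3.1",
)
```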

Tool ecosystem

Search, code execution, Python REPL, OpenAPI tools, RAG, embedding, and file-system tools. All wrappable into any ChatAgent.

Paper-driven development

New modules often implement a specific published pattern (task decomposition, society simulation, inter-agent communication). Unusually good if you want research-lineage features.

Comparison

Framework | Origin | Primary Use | Research vs Production | Strengths
camel-ai | 2023 research paper | Role-playing, agent societies | Research-leaning | Simulations, data generation
AutoGen | Microsoft Research | Conversation coordination | Both (v0.4 is production) | Coding, open-ended
MetaGPT | DeepWisdom | Software dev SOPs | Both | Code generation
CrewAI | CrewAI Inc | Role-based pipelines | Production-first | Fast to ship

Use cases

01. Agent-society research

Studying cooperation, disagreement, and emergent norms in LLM agents — camel-ai has the longest-running example set in the field.

02. Synthetic instruction data

Generate task-style training data by running thousands of role-playing sessions, filter by quality, and use for fine-tuning smaller models.

03. Agent benchmark reproductions

Replicating or extending agent-benchmark results — many papers ship camel-ai-based reference implementations.

Pricing and licensing

camel-ai: Apache 2.0 open source. Free. Maintained by a research community with core contributors across KAUST, UCL, and multiple universities.

Infra cost: LLM APIs you plug in. Running an agent society of 30 agents for 10 rounds can rack up meaningful cost — budget before kickoff and use cheap models for most agents.
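A back-of-envelope budget for the 30-agent, 10-round scenario above. Every figure (tokens per turn, per-token prices) is an illustrative assumption, not a measured camel-ai number, and the sketch ignores history growth across rounds:

```python
# All figures are illustrative assumptions, not measured camel-ai numbers.
agents = 30
rounds = 10
tokens_in_per_turn = 2_000   # prompt incl. history and system prompt
tokens_out_per_turn = 500    # completion
price_in_per_1m = 0.15       # USD per 1M input tokens (cheap-model tier)
price_out_per_1m = 0.60      # USD per 1M output tokens

turns = agents * rounds
cost = (turns * tokens_in_per_turn * price_in_per_1m
        + turns * tokens_out_per_turn * price_out_per_1m) / 1_000_000
print(f"~${cost:.2f} per simulation")  # ~$0.18 at these rates
```

Frontier-model pricing is roughly 10-50x higher per token, which is why reserving expensive models for a few key agents dominates the budget.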

Data generation cost: if you’re using camel-ai to generate fine-tuning data, expect a meaningful per-example API cost at gpt-4o-class quality. Smaller models produce faster, cheaper data that’s usually still useful for distillation.

FAQ

Is camel-ai production-ready?

Portions are (ChatAgent, Workforce), but the overall project leans research. If you need a stable production framework, CrewAI / LangGraph / AutoGen v0.4 are safer defaults; use camel-ai when you specifically want its research features.

camel-ai vs CrewAI?

CrewAI is production-first with opinionated roles and tasks. camel-ai is research-first with flexible role-playing and agent societies. Use CrewAI to ship; use camel-ai to study behavior or generate training data.

What is Agent Workforce?

camel-ai’s module for orchestrating teams of specialist agents with task routing, tool use, and shared memory. It’s camel-ai’s answer to CrewAI-style team coordination, layered on top of their role-playing primitives.

Does camel-ai generate training data automatically?

Yes. The CAMEL paper introduced synthetic instruction data generation; the library has utilities to run role-playing sessions at scale and export the transcripts in SFT/DPO-ready formats.

Which models work best?

gpt-4o, claude-3-5-sonnet, and similar frontier models yield the most useful role-playing dialogues. Smaller models work but often break role quickly — temperature and system prompt engineering matter more than with single-agent chat.
