TOKREPO · ARSENAL
Stable

Multi-Agent Frameworks

CAMEL, LangGraph, DeepAgents, GPT Researcher — frameworks for orchestrating teams of agents in production.

7 assets

What's in this pack

This pack collects the seven multi-agent frameworks that teams actually ship to production in 2026, not the demos that look good on Twitter and explode under load. Four are headline frameworks; the other three are production agents and orchestration layers built on the same patterns.

# Asset Type Best for
1 LangGraph stateful framework Production graph orchestration with checkpointing
2 CAMEL role-play framework Agent-to-agent dialogue, academic-grade
3 DeepAgents research framework Long-running planning + sub-agent spawning
4 GPT Researcher applied agent Topic in, research report out
5 Goose developer agent Terminal coding automation with MCP toolkits
6 Claude-Flow Claude Code skill Swarm orchestration with 64 specialized agents
7 OpenAI Agents SDK orchestration SDK Handoffs, guardrails, and tracing in Python

Why this pack matters

A single agent is a chat loop. Multi-agent is a system — and like every system, it needs structure (state machines, queues, retries) before it survives a real workload. The four frameworks here picked the structures that work. The other three assets show what those structures look like in shipped tools.

The frameworks each pick a different abstraction:

  • LangGraph treats orchestration as a state graph. You declare nodes (agents/tools) and edges (when to transition), and LangGraph handles checkpointing so a 30-minute run can resume after a crash. The closest thing to a default standard for production.
  • CAMEL focuses on agent-to-agent dialogue with explicit roles. Two agents play "user" and "assistant" or "research lead" and "writer" and converse until a goal is met. Strong on reproducibility and academic benchmarks.
  • DeepAgents is built for long-horizon tasks. The top agent plans, delegates sub-tasks to spawned sub-agents, each with their own context window. Designed to avoid the "one giant context" failure mode.
  • GPT Researcher is the applied case study. You give it a research question, it runs a sub-agent swarm to gather evidence and produces a long-form report with citations. Useful as both a tool and a reference architecture.
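To make the state-graph abstraction concrete, here is a dependency-free sketch of the pattern LangGraph uses — hypothetical names, not the real LangGraph API. Nodes are functions over a shared state dict, edges pick the next node, and a checkpoint file is written after every step so a crashed run can resume from where it stopped:

```python
import json

# Minimal state-graph runner in the spirit of LangGraph (hypothetical, not its API).
# Each node is a function state -> state; EDGES maps a node to its successor.

def plan(state):
    state["plan"] = ["gather", "write"]
    return state

def gather(state):
    state["evidence"] = f"notes on {state['topic']}"
    return state

def write(state):
    state["report"] = f"Report: {state['evidence']}"
    return state

NODES = {"plan": plan, "gather": gather, "write": write}
EDGES = {"plan": "gather", "gather": "write", "write": None}

def run(state, start="plan", checkpoint="run.json"):
    node = state.get("_next", start)      # resume from a checkpoint if present
    while node is not None:
        state = NODES[node](state)
        node = EDGES[node]
        state["_next"] = node
        with open(checkpoint, "w") as f:  # persist state after every step
            json.dump(state, f)
    return state

result = run({"topic": "agent frameworks"})
print(result["report"])
```

The real LangGraph adds conditional edges, streaming, and pluggable checkpointers, but the resume-from-saved-state loop above is the core idea that lets a 30-minute run survive a crash.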

Install in one command

# Install the entire pack
tokrepo install pack/multi-agent-frameworks

# Or install one at a time
tokrepo install langgraph
tokrepo install camel
tokrepo install deepagents
tokrepo install gpt-researcher

The TokRepo CLI installs each framework's adapter into your AI tool — Claude Code subagents into .claude/agents/, Cursor rules into .cursor/rules/, AGENTS.md entries for Codex CLI. Run pip / npm for the underlying libraries; TokRepo wires the prompts so your CLI knows when to invoke them.

Common pitfalls

  • Don't skip the budget. Multi-agent runs can fan out exponentially — one planner spawning five workers each spawning five sub-tasks burns 25× the tokens. Always cap depth and max-spawn count. DeepAgents bakes this in; with LangGraph and CAMEL you set it yourself.
  • Don't share an LLM client across threads naively. Most SDKs aren't fully thread-safe under high concurrency. Use process-level pools or async with bounded concurrency (e.g. asyncio.Semaphore(8)).
  • Trace everything. Multi-agent debugging without traces is impossible. Pair this pack with the LLM Observability pack — Langfuse and AgentOps both have first-class LangGraph integrations.
  • Beware role drift. In CAMEL-style dialogue, agents sometimes forget who they are around turn 8-10. Add a system reminder every N turns or pin the role in every message.
  • Multi-agent ≠ better. Try a single Claude Sonnet 4.5 with extended thinking before reaching for a multi-agent system. Anthropic's 2025 post on its multi-agent research system reported that multi-agent runs burn roughly 15× the tokens of a simple chat — many tasks people throw at multi-agent setups do fine with one agent + tools.

When this pack alone isn't enough

Multi-agent shines on tasks with parallelizable subproblems (research, code review, content generation across topics). It loses on:

  • Sequential, deeply-stateful tasks. Refactoring a codebase end-to-end is one agent's job — splitting it across multiple agents creates more coordination overhead than it saves.
  • Latency-sensitive workflows. Each hop between agents adds a round-trip. If you're under a 5-second SLA, stay single-agent.
  • Cost-sensitive workflows. A multi-agent run typically costs 3-10× a single-agent run for the same task. Worth it for quality on hard problems; not worth it for "summarize this email."

The right way to adopt this pack: start with GPT Researcher as the simplest finished example, then graduate to LangGraph or DeepAgents when you need to write your own orchestration.

What's inside

7 assets in this pack

Script#01
CAMEL — Multi-Agent Framework at Scale

CAMEL is a multi-agent framework for studying scaling laws of AI agents. 16.6K+ GitHub stars. Up to 1M agents, RAG, memory systems, data generation. Apache 2.0.

by Script Depot·267 views
$ tokrepo install camel-multi-agent-framework-scale-23732313
Script#02
LangGraph — Build Stateful AI Agents as Graphs

LangChain framework for building resilient, stateful AI agents as graphs. Supports cycles, branching, persistence, human-in-the-loop, and streaming. 28K+ stars.

by LangChain·238 views
$ tokrepo install langgraph-build-stateful-ai-agents-graphs-cc1a6ed2
Script#03
DeepAgents — Multi-Step Agent Framework by LangChain

Agent harness built on LangGraph by the LangChain team. Features planning tools, filesystem backend, and sub-agent spawning for complex multi-step tasks like codebase refactoring. 16.5K+ stars.

by LangChain·174 views
$ tokrepo install deepagents-multi-step-agent-framework-langchain-ac820f80
Script#04
GPT Researcher — Autonomous Research Report Agent

AI agent that generates detailed research reports from a single query. Searches multiple sources, synthesizes findings, and cites references.

by TokRepo Curated·339 views
$ tokrepo install gpt-researcher-autonomous-research-report-agent-23330210
Script#05
Goose — AI Developer Agent by Block

Open-source AI developer agent by Block (Square). Goose automates coding tasks with extensible toolkits, session memory, and MCP server support in your terminal.

by Agent Toolkit·157 views
$ tokrepo install goose-ai-developer-agent-block-dedbb70b
Skill#06
Claude-Flow — Multi-Agent Orchestration for Claude Code

Layers swarm and hive-mind multi-agent orchestration on top of Claude Code with 64 specialized agents, SQLite memory, and parallel execution.

by Skill Factory·74 views
$ tokrepo install claude-flow-multi-agent-orchestration-claude-code-34ff4f3b
Script#07
OpenAI Agents SDK — Build Multi-Agent Systems in Python

Official OpenAI Python SDK for building multi-agent systems with handoffs, guardrails, and tracing. Agents delegate to specialists, enforce safety rules, and produce observable traces. 8,000+ stars.

by OpenAI·83 views
$ tokrepo install openai-agents-sdk-build-multi-agent-systems-python-38035d0b
FAQ

Frequently asked questions

Is LangGraph free?

Yes, LangGraph is open-source under MIT and you only pay for the LLM tokens. There's a paid LangGraph Cloud for managed deployment with checkpointing and traces, but the OSS library is fully featured. CAMEL, DeepAgents, and GPT Researcher are also OSS — no paid tier is required to ship.

Does this work with Cursor or Codex CLI?

The frameworks are ordinary Python libraries, not Claude Code-specific. Any agent CLI that runs Python tools can drive them. The TokRepo CLI installs the right wiring for your tool — for Codex CLI it ships AGENTS.md instructions explaining when to invoke the framework, for Cursor it adds rules. The underlying Python install is unchanged.

How does LangGraph compare to CAMEL?

LangGraph is structure-first: you draw a state machine and the agents fit into it. CAMEL is dialogue-first: you assign roles and let agents converse. LangGraph wins for production reliability and checkpointing; CAMEL wins for research, simulations, and any case where the conversation itself is the artifact. Many production setups use LangGraph for orchestration and call CAMEL for specific dialogue tasks.
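The dialogue-first style can be sketched in a few lines — a toy CAMEL-style role-play loop with scripted roles standing in for LLM-backed agents (hypothetical code, not the CAMEL API). The lead keeps asking, the writer keeps drafting, and the loop ends when the lead signals the task is done:

```python
# Toy role-play loop in the CAMEL style (hypothetical, not the real CAMEL API).
# Each "agent" is a function over the shared transcript; a real setup calls an LLM.

def research_lead(history):
    if len(history) >= 4:
        return "TASK_DONE"              # the lead decides the goal is met
    return f"Lead: please expand on point {len(history) // 2 + 1}"

def writer(history):
    return f"Writer: draft for '{history[-1]}'"

def role_play(max_turns=10):
    history = []
    for _ in range(max_turns):          # max_turns guards against endless chatter
        ask = research_lead(history)
        if ask == "TASK_DONE":
            break
        history.append(ask)
        history.append(writer(history))
    return history

transcript = role_play()
for line in transcript:
    print(line)
```

Note that here the transcript itself is the output — exactly the case where CAMEL fits better than a state graph.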

What's the difference vs the Memory Layer pack?

Memory is about what an agent remembers between sessions. Multi-agent is about how multiple agents coordinate within one task. They're orthogonal: a multi-agent system often needs a shared memory layer (Mem0/Zep) so the workers don't have to re-discover facts the planner already knew. We recommend installing both packs if you're building anything serious.

When should I NOT use a multi-agent framework?

When the task is sequential and stateful (refactor this file), latency-sensitive (chat UIs under 3s), or simple enough for one Claude/GPT call. Anthropic's own multi-agent research blog notes that single-agent + extended thinking beats most multi-agent setups on cost. Reach for multi-agent when the task naturally parallelizes (research many sources) or requires distinct expert roles.

MORE FROM THE ARSENAL

12 packs · 80+ hand-picked assets
