LangGraph — State-Machine Framework for Production Agents

LangGraph models agents as directed graphs of nodes and edges with explicit state — the most production-ready way to build reliable multi-step AI agents with checkpoints, human-in-the-loop, and deterministic control flow.

Why LangGraph

LangGraph trades a steeper learning curve for dramatically better production reliability. Instead of hoping an LLM loop terminates, you declare the graph: nodes (LLM calls, tools, routing functions), edges (static or conditional transitions), and a typed State that flows between them. The runtime handles checkpointing to any store (memory, SQLite, Postgres, Redis), resumption, and human-in-the-loop interrupts out of the box.

This is the framework you reach for when "my agent needs to pause for a human approval" or "my 20-step workflow must resume exactly where it left off after a crash". CrewAI and AutoGen can solve these problems; LangGraph is designed around them.

The ecosystem is notable too. LangGraph integrates natively with LangChain, LangSmith (observability), LangMem (memory), and the entire LangChain tool set. If your stack already has LangChain components, LangGraph is the zero-friction add-on. If it doesn't, you still get the full benefit, but you are buying into a sizeable ecosystem.

Quick Start — Multi-Agent Supervisor Graph

Supervisor pattern: a routing node inspects state and picks the next worker; workers return control to the supervisor; the supervisor eventually routes to END. InMemorySaver checkpoints state in memory; swap in SqliteSaver or PostgresSaver in production for durable resume. Thread IDs isolate concurrent conversations.

# pip install langgraph langchain-openai
from typing import Literal, TypedDict, Annotated
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END, START
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.messages import HumanMessage, AIMessage

llm = ChatOpenAI(model="gpt-4o-mini")

class State(TypedDict):
    messages: Annotated[list, lambda a, b: a + b]  # reducer: node returns are appended
    next: str  # plain field: last write wins

# Two specialized workers
def researcher(state: State) -> State:
    resp = llm.invoke([HumanMessage(f"Research briefly: {state['messages'][-1].content}")])
    return {"messages": [AIMessage(content=f"[researcher] {resp.content}")], "next": ""}

def writer(state: State) -> State:
    resp = llm.invoke([HumanMessage(f"Polish into 2 sentences: {state['messages'][-1].content}")])
    return {"messages": [AIMessage(content=f"[writer] {resp.content}")], "next": ""}

# Supervisor picks the next worker (or FINISH) based on the latest message
def supervisor(state: State) -> dict:
    last = state["messages"][-1].content if state["messages"] else ""
    resp = llm.invoke([HumanMessage(
        f"Pick the next worker (researcher, writer, or FINISH): {last}")]).content.upper()
    next_ = "researcher" if "RESEARCH" in resp else ("writer" if "WRITE" in resp else "FINISH")
    # Return a partial update only; re-returning state["messages"] would
    # duplicate every message through the list-append reducer on State.messages
    return {"next": next_}

def route(state: State) -> Literal["researcher", "writer", "__end__"]:
    return state["next"] if state["next"] != "FINISH" else END

graph = StateGraph(State)
graph.add_node("supervisor", supervisor)
graph.add_node("researcher", researcher)
graph.add_node("writer", writer)
graph.add_edge(START, "supervisor")
graph.add_conditional_edges("supervisor", route,
                            {"researcher": "researcher", "writer": "writer", END: END})
graph.add_edge("researcher", "supervisor")
graph.add_edge("writer", "supervisor")

app = graph.compile(checkpointer=InMemorySaver())
cfg = {"configurable": {"thread_id": "demo"}}
for event in app.stream({"messages": [HumanMessage("The state of AI agents in 2026")], "next": ""},
                        config=cfg):
    print(event)

Key Features

Typed State

Define State as a TypedDict or Pydantic model. LangGraph tracks state transformations across nodes, catches schema drift at compile time, and lets you reason about what each node reads and writes.
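As a simplified model of what the reducer annotation does (the `apply_update` helper is illustrative, not LangGraph API): fields annotated with a reducer merge a node's partial return into the existing value, while plain fields are overwritten.

```python
from typing import Annotated, TypedDict, get_type_hints

def append(a: list, b: list) -> list:
    return a + b

class State(TypedDict):
    messages: Annotated[list, append]  # reducer: node returns are appended
    next: str                          # plain field: last write wins

def apply_update(state: dict, update: dict) -> dict:
    """Illustrative helper (not LangGraph API): merge a node's partial
    return into the state, honoring per-field reducers."""
    hints = get_type_hints(State, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        meta = getattr(hints[key], "__metadata__", ())
        reducer = meta[0] if meta else None
        merged[key] = reducer(state[key], value) if reducer else value
    return merged

state = {"messages": ["hi"], "next": ""}
state = apply_update(state, {"messages": ["[researcher] ..."], "next": "writer"})
print(state["messages"])  # both messages kept; "next" overwritten
```

This is why nodes in the quick start can return only the fields they change.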

Conditional edges

add_conditional_edges(source, router_fn, {label: target}) lets any node dynamically pick the next node. More flexible than static graphs; still auditable via the compiled graph structure.
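The router is just a function over state; a minimal sketch of how the `{label: target}` mapping resolves a router's return value (the field names here are illustrative, not LangGraph API):

```python
def route(state: dict) -> str:
    """Router function: inspect state, return an edge label (illustrative)."""
    return "writer" if state.get("draft_ready") else "researcher"

# The {label: target} mapping passed to add_conditional_edges resolves the
# router's return value to a concrete node name:
edge_map = {"researcher": "researcher", "writer": "writer"}
next_node = edge_map[route({"draft_ready": True})]
print(next_node)  # writer
```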

Checkpointers

InMemory, SQLite, Postgres, Redis. State persists between invocations. Resume from any checkpoint — essential for long-running agents, retries, and debugging.
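Conceptually, a checkpointer is a durable map from (thread_id, step) to a state snapshot, which is why resume-from-checkpoint works at all. The class below is an illustrative stand-in, not the LangGraph checkpointer interface:

```python
from collections import defaultdict

class ToyCheckpointer:
    """Illustrative stand-in (not LangGraph API): one state snapshot per
    (thread_id, step), the way a durable backend would store them."""
    def __init__(self):
        self._store = defaultdict(dict)  # thread_id -> {step: state}

    def put(self, thread_id: str, step: int, state: dict) -> None:
        self._store[thread_id][step] = dict(state)

    def latest(self, thread_id: str):
        steps = self._store[thread_id]
        if not steps:
            return None
        step = max(steps)
        return step, steps[step]

cp = ToyCheckpointer()
cp.put("demo", 1, {"messages": ["hi"], "next": "researcher"})
cp.put("demo", 2, {"messages": ["hi", "[researcher] ..."], "next": "writer"})

# After a crash, resume from the latest snapshot for this thread:
step, state = cp.latest("demo")
print(step, state["next"])  # 2 writer
```

In real code you only swap the backend class at compile time; the graph itself is unchanged.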

Human-in-the-loop

interrupt() pauses execution; resume via Command(resume=value). First-class approval flows, clarification requests, and state edits without bespoke machinery.
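The pause-and-resume mechanics can be sketched with a plain Python generator; this models the idea only and is not the actual interrupt()/Command API:

```python
def approval_node():
    """Conceptual sketch of interrupt()/Command(resume=...) mechanics,
    modeled with a generator; not the actual LangGraph API."""
    proposal = "Refund $40 to customer #123"
    # "interrupt": surface the proposal and suspend until a value is sent back
    decision = yield {"awaiting_approval": proposal}
    if decision == "approve":
        yield {"status": "refund issued"}
    else:
        yield {"status": "cancelled"}

run = approval_node()
paused = next(run)             # execution stops at the "interrupt"
print(paused["awaiting_approval"])
# ...a human reviews, then the graph is resumed with their decision:
resumed = run.send("approve")  # analogous to Command(resume="approve")
print(resumed["status"])       # refund issued
```

In LangGraph the suspended state is persisted by the checkpointer, so the approval can arrive hours later on a different process.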

LangGraph Studio

Desktop app for visually debugging running graphs: see state, inspect messages, step through execution, edit state and re-run. It is the closest thing the LLM agent world has to a traditional debugger.

Streaming + subgraphs

Stream messages and state updates in real-time; compose graphs as subgraphs (a node can be another compiled graph). Supports deeply nested agent architectures.

Comparison

             | Control Flow              | Checkpointing                        | Debugging        | Learning Curve
LangGraph    | Graph / state machine     | First-class (multiple backends)      | LangGraph Studio | Medium
CrewAI       | Sequential / hierarchical | Via Crew Flows                       | Traces           | Low
AutoGen      | Conversation loop         | Built in since v0.4; limited in v0.2 | Trace logs       | Medium
OpenAI Swarm | Handoff tool calls        | Not built-in                         | Logs             | Low

Use Cases

01. Production agent services

User-facing agents that must survive crashes, retries, and deployments. Checkpointing makes LangGraph the default choice for agents in production SaaS apps.

02. Human-approval workflows

Compliance-sensitive tasks (purchases, emails, refunds) where an agent proposes and a human approves. interrupt() + Command resume is the cleanest pattern in the ecosystem.

03. Complex multi-agent architectures

Supervisor + workers, hierarchical teams, tournament-style agent pools. When your system has >3 agents with non-trivial control flow, graph-based modeling prevents emergent chaos.

Pricing & License

LangGraph: MIT open source. Free to self-host.

LangGraph Platform: managed deployment service from LangChain — hosted agents with auto-scaling, webhooks, cron triggers, and a studio UI. Paid tiers; see langchain.com/langgraph-platform.

LangSmith (optional): observability SaaS. Free dev tier. Complements but doesn’t replace LangGraph — deploying to LangGraph Platform does not require LangSmith.


Frequently Asked Questions

Do I need LangChain to use LangGraph?

Not strictly — you can build nodes from bare functions and any LLM SDK. But LangChain integrations (tools, retrievers, loaders) make LangGraph far more productive; most real projects use both.

LangGraph vs CrewAI?

Different bets. CrewAI: role-based, fast to ship, simpler. LangGraph: graph-based, more setup, production-grade reliability. If a non-engineer can describe your pipeline as "A then B then C with roles", start with CrewAI. If your pipeline has loops, conditional branches, or human approvals, start with LangGraph.

How hard is it to debug?

Better than most. LangGraph Studio gives a step-through UI with state inspection. Combined with LangSmith traces, you can pinpoint which node produced which output. Still harder than debugging non-agent code — that is inherent to agentic systems.

Can LangGraph handle high throughput?

Yes. The runtime is async and stateless between invocations (state lives in the checkpointer). Scale horizontally behind any load balancer; bind a Postgres checkpointer for durability.

What about the older LangChain Agents API?

Use LangGraph for new agent work. LangChain Agents (initialize_agent, AgentExecutor) are in maintenance mode. LangChain itself remains the go-to for non-agent building blocks (prompts, retrievers, tools).
