LangGraph — State-Machine Framework for Production Agents
LangGraph models agents as directed graphs of nodes and edges with explicit state — the most production-ready way to build reliable multi-step AI agents with checkpoints, human-in-the-loop, and deterministic control flow.
Why LangGraph
LangGraph trades a steeper learning curve for dramatically better production reliability. Instead of hoping an LLM loop terminates, you declare the graph: nodes (LLM calls, tools, routing functions), edges (static or conditional transitions), and a typed State that flows between them. The runtime handles checkpointing to any store (memory, SQLite, Postgres, Redis), resumption, and human-in-the-loop interrupts out of the box.
This is the framework you reach for when "my agent needs to pause for a human approval" or "my 20-step workflow must resume exactly where it left off after a crash". CrewAI and AutoGen can solve these problems; LangGraph is designed around them.
The ecosystem is notable too. LangGraph integrates natively with LangChain, LangSmith (observability), LangMem (memory), and the entire LangChain tool set. If your stack already has LangChain components, LangGraph is the zero-friction add-on. If it doesn’t, you still get the full feature set, but you are buying into a sizeable ecosystem.
Quick Start — Multi-Agent Supervisor Graph
Supervisor pattern: a routing node inspects state and picks the next worker, workers return control to the supervisor, supervisor eventually returns END. The InMemorySaver checkpoints state; swap for SqliteSaver / PostgresSaver in production for durable resume. Thread IDs isolate concurrent conversations.
```python
# pip install langgraph langchain-openai
from typing import Literal, TypedDict, Annotated

from langchain_core.messages import AIMessage, HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import END, START, StateGraph

llm = ChatOpenAI(model="gpt-4o-mini")

class State(TypedDict):
    messages: Annotated[list, lambda a, b: a + b]  # additive reducer: node updates append
    next: str

# Two specialized workers
def researcher(state: State) -> dict:
    resp = llm.invoke([HumanMessage(f"Research briefly: {state['messages'][-1].content}")])
    return {"messages": [AIMessage(content=f"[researcher] {resp.content}")]}

def writer(state: State) -> dict:
    resp = llm.invoke([HumanMessage(f"Polish into 2 sentences: {state['messages'][-1].content}")])
    return {"messages": [AIMessage(content=f"[writer] {resp.content}")]}

# Supervisor routes to the right worker
def supervisor(state: State) -> dict:
    last = state["messages"][-1].content if state["messages"] else ""
    resp = llm.invoke([HumanMessage(
        f"Pick the next worker (researcher, writer, or FINISH): {last}")]).content.upper()
    next_ = "researcher" if "RESEARCH" in resp else ("writer" if "WRITE" in resp else "FINISH")
    return {"next": next_}  # partial update only: returning state["messages"] here
                            # would duplicate every message via the additive reducer

def route(state: State) -> Literal["researcher", "writer", "__end__"]:
    return state["next"] if state["next"] != "FINISH" else END

graph = StateGraph(State)
graph.add_node("supervisor", supervisor)
graph.add_node("researcher", researcher)
graph.add_node("writer", writer)
graph.add_edge(START, "supervisor")
graph.add_conditional_edges("supervisor", route,
                            {"researcher": "researcher", "writer": "writer", END: END})
graph.add_edge("researcher", "supervisor")
graph.add_edge("writer", "supervisor")

app = graph.compile(checkpointer=InMemorySaver())
cfg = {"configurable": {"thread_id": "demo"}}
for event in app.stream({"messages": [HumanMessage("The state of AI agents in 2026")], "next": ""},
                        config=cfg):
    print(event)
```
Key Features
Typed State
Define State as a TypedDict or Pydantic model. LangGraph merges each node's partial update into the shared state via per-field reducers, flags schema mismatches when the graph is compiled, and lets you reason about what each node reads and writes.
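Conceptually, the reducer attached via Annotated decides how a node's partial update merges into the shared state. The sketch below is a toy re-implementation of that merge rule, not LangGraph's internals:

```python
import operator
from typing import Annotated, TypedDict, get_type_hints

class State(TypedDict):
    messages: Annotated[list, operator.add]  # reducer: concatenate updates
    next: str                                # no reducer: last write wins

def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into state, honoring per-field reducers."""
    hints = get_type_hints(State, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata:          # a reducer was attached via Annotated
            merged[key] = metadata[0](state[key], value)
        else:                 # plain field: overwrite
            merged[key] = value
    return merged

state = {"messages": ["hi"], "next": ""}
state = apply_update(state, {"messages": ["[researcher] findings"], "next": "writer"})
print(state["messages"])  # ['hi', '[researcher] findings'] — appended, not replaced
```

This is why nodes in the quick start return only the keys they change: the reducer, not the node, is responsible for accumulation.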
Conditional edges
add_conditional_edges(source, router_fn, {label: target}) lets any node dynamically pick the next node. More flexible than static graphs; still auditable via the compiled graph structure.
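Stripped of the framework, the resolution step is a dictionary lookup on the router's return value. A toy sketch (node names mirror the quick start; this is not LangGraph internals):

```python
# Toy resolution of a conditional edge: router output -> path map -> next node.
END = "__end__"

def route(state: dict) -> str:
    return state["next"] if state["next"] != "FINISH" else END

path_map = {"researcher": "researcher", "writer": "writer", END: END}

def next_node(state: dict) -> str:
    return path_map[route(state)]

print(next_node({"next": "writer"}))   # writer
print(next_node({"next": "FINISH"}))   # __end__
```

Because every reachable target must appear in the path map, the compiled graph remains statically auditable even though routing is dynamic.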
Checkpointers
InMemory, SQLite, Postgres, Redis. State persists between invocations. Resume from any checkpoint — essential for long-running agents, retries, and debugging.
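The contract is simple: persist state per thread_id after each step, and resume from the latest checkpoint. The toy `TinySqliteSaver` below illustrates that idea in stdlib Python; in practice you would use LangGraph's own SqliteSaver or PostgresSaver rather than anything like this:

```python
import json
import sqlite3

class TinySqliteSaver:
    """Toy checkpointer: one row per (thread_id, step). Illustrative only."""
    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT, step INTEGER, state TEXT, PRIMARY KEY (thread_id, step))")

    def put(self, thread_id: str, step: int, state: dict) -> None:
        self.db.execute("INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
                        (thread_id, step, json.dumps(state)))
        self.db.commit()

    def latest(self, thread_id: str):
        row = self.db.execute(
            "SELECT step, state FROM checkpoints WHERE thread_id = ? "
            "ORDER BY step DESC LIMIT 1", (thread_id,)).fetchone()
        return (row[0], json.loads(row[1])) if row else (None, None)

saver = TinySqliteSaver()
saver.put("demo", 1, {"messages": ["hi"], "next": "writer"})
saver.put("demo", 2, {"messages": ["hi", "draft"], "next": "FINISH"})
step, state = saver.latest("demo")  # after a crash, pick up exactly here
print(step, state["next"])  # 2 FINISH
```

Thread IDs double as conversation isolation: each `thread_id` has its own checkpoint history, so concurrent users never share state.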
Human-in-the-loop
interrupt() pauses execution; resume via Command(resume=value). First-class approval flows, clarification requests, and state edits without bespoke machinery.
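The pause/resume shape is the same as a Python generator: `yield` plays the role of interrupt(), and `send()` plays the role of Command(resume=...). A toy illustration only — not the LangGraph API, whose interrupt() additionally persists state through the checkpointer so the pause can span processes:

```python
def purchase_flow(amount: float):
    """Agent proposes a purchase; a human approves amounts over $100."""
    if amount > 100:
        approved = yield {"interrupt": f"Approve purchase of ${amount}?"}  # pause here
        if not approved:
            return "rejected by human"
    return f"purchased for ${amount}"

flow = purchase_flow(250.0)
prompt = next(flow)            # runs until the interrupt point
print(prompt["interrupt"])     # Approve purchase of $250.0?
try:
    flow.send(True)            # human approves; execution resumes at the pause
except StopIteration as done:
    print(done.value)          # purchased for $250.0
```

The key property, which LangGraph gives you durably: the approval can arrive minutes or days later, and execution continues from exactly the paused step.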
LangGraph Studio
Desktop app for visually debugging running graphs — see state, inspect messages, step through execution, edit state and re-run. Closest the LLM agent world has to a traditional debugger.
Streaming + subgraphs
Stream messages and state updates in real-time; compose graphs as subgraphs (a node can be another compiled graph). Supports deeply nested agent architectures.
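The composability works because a compiled graph is itself a callable from state to state, so a parent graph can use one as a node. A toy sketch of that idea (the `compile_graph` helper here is illustrative, not LangGraph's API):

```python
from typing import Callable, Dict, List

State = Dict[str, list]
Node = Callable[[State], State]

def compile_graph(nodes: List[Node]) -> Node:
    """Toy 'compile': chain nodes into a single state -> state callable."""
    def run(state: State) -> State:
        for node in nodes:
            state = node(state)
        return state
    return run

# Inner "research" subgraph: two steps
search = lambda s: {**s, "log": s["log"] + ["searched"]}
summarize = lambda s: {**s, "log": s["log"] + ["summarized"]}
research_subgraph = compile_graph([search, summarize])

# Parent graph uses the compiled subgraph as one of its nodes
write = lambda s: {**s, "log": s["log"] + ["wrote"]}
parent = compile_graph([research_subgraph, write])

print(parent({"log": []})["log"])  # ['searched', 'summarized', 'wrote']
```

In LangGraph the same shape holds with full checkpointing and streaming flowing through the nested graph, which is what makes deeply nested agent teams tractable.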
Comparison
| Framework | Control Flow | Checkpointing | Debugging | Learning Curve |
|---|---|---|---|---|
| LangGraph | Graph / state machine | First-class (multiple backends) | LangGraph Studio | Medium |
| CrewAI | Sequential / hierarchical | Via Crew Flows | Traces | Low |
| AutoGen | Conversation loop | v0.4 has it; older v0.2 limited | Trace logs | Medium |
| OpenAI Swarm | Handoff tool calls | Not built-in | Logs | Low |
Use Cases
01. Production agent services
User-facing agents that must survive crashes, retries, and deployments. Checkpointing makes LangGraph the default choice for agents in production SaaS apps.
02. Human-approval workflows
Compliance-sensitive tasks (purchases, emails, refunds) where an agent proposes and a human approves. interrupt() + Command resume is the cleanest pattern in the ecosystem.
03. Complex multi-agent architectures
Supervisor + workers, hierarchical teams, tournament-style agent pools. When your system has >3 agents with non-trivial control flow, graph-based modeling prevents emergent chaos.
Pricing & License
LangGraph: MIT open source. Free to self-host.
LangGraph Platform: managed deployment service from LangChain — hosted agents with auto-scaling, webhooks, cron triggers, and a studio UI. Paid tiers; see langchain.com/langgraph-platform.
LangSmith (optional): observability SaaS. Free dev tier. Complements but doesn’t replace LangGraph — deploying to LangGraph Platform does not require LangSmith.
Related Assets on TokRepo
LangGraph — Build Stateful AI Agents as Graphs
LangChain framework for building resilient, stateful AI agents as graphs. Supports cycles, branching, persistence, human-in-the-loop, and streaming. 28K+ stars.
DeepAgents — Multi-Step Agent Framework by LangChain
Agent harness built on LangGraph by the LangChain team. Features planning tools, filesystem backend, and sub-agent spawning for complex multi-step tasks like codebase refactoring. 16,500+ stars.
Frequently Asked Questions
Do I need LangChain to use LangGraph?
Not strictly — you can build nodes from bare functions and any LLM SDK. But LangChain integrations (tools, retrievers, loaders) make LangGraph far more productive; most real projects use both.
LangGraph vs CrewAI?
Different bets. CrewAI: role-based, fast to ship, simpler. LangGraph: graph-based, more setup, production-grade reliability. If a non-engineer can describe your pipeline as "A then B then C with roles", start with CrewAI. If your pipeline has loops, conditional branches, or human approvals, start with LangGraph.
How hard is it to debug?
Better than most. LangGraph Studio gives a step-through UI with state inspection. Combined with LangSmith traces, you can pinpoint which node produced which output. Still harder than debugging non-agent code — that is inherent to agentic systems.
Can LangGraph handle high throughput?
Yes. The runtime is async and stateless between invocations (state lives in the checkpointer). Scale horizontally behind any load balancer; bind a Postgres checkpointer for durability.
What about the older LangChain Agents API?
Use LangGraph for new agent work. LangChain Agents (initialize_agent, AgentExecutor) are in maintenance mode. LangChain itself remains the go-to for non-agent building blocks (prompts, retrievers, tools).