[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"pack-detail-agent-deployment-templates-en":3,"seo:pack:agent-deployment-templates:en":104},{"code":4,"message":5,"data":6},200,"Operation successful",{"pack":7},{"slug":8,"icon":9,"tone":10,"status":11,"status_label":12,"title":13,"description":14,"items":15,"install_cmd":103},"agent-deployment-templates","🚢","#1F2937","new","New · this week","Agent Deployment Templates","Ten picks for devs putting an AI agent into production: FastAPI agent skeletons (Agno, PydanticAI), serverless targets (Modal, Replicate), sandbox runtimes (E2B, Daytona), state store and queue (Upstash, LangGraph), and a Kubernetes deploy target — wired in a deliberate order so the agent survives the first 1,000 real requests.",[16,28,36,46,54,62,70,78,88,95],{"id":17,"uuid":18,"slug":19,"title":20,"description":21,"author_name":22,"view_count":23,"vote_count":24,"lang_type":25,"type":26,"type_label":27},299,"f73bc89d-cd16-46bb-af95-3a921a0de059","agno-production-ai-agent-runtime-f73bc89d","Agno — Production AI Agent Runtime","Agno is a runtime for building and managing agentic software at scale. 39.1K+ GitHub stars. Stateful agents, FastAPI serving, 100+ integrations, tracing. Apache 2.0.","Agno",144,0,"en","skill","Skill",{"id":29,"uuid":30,"slug":31,"title":32,"description":33,"author_name":34,"view_count":35,"vote_count":24,"lang_type":25,"type":26,"type_label":27},39,"0313bf39-8bbe-4a50-9445-e5ee8e7280fe","pydanticai-type-safe-ai-agent-framework-0313bf39","PydanticAI — Type-Safe AI Agent Framework","Build production-grade AI agents with type safety, structured outputs, and multi-model support. 
By the creators of Pydantic.","Pydantic",99,{"id":37,"uuid":38,"slug":39,"title":40,"description":41,"author_name":42,"view_count":43,"vote_count":24,"lang_type":25,"type":44,"type_label":45},2805,"fad6cf5b-d22c-4802-9762-ca47112a05ff","modal-sandboxes-secure-cloud-code-execution-for-ai-agents","Modal Sandboxes — Secure Cloud Code Execution for AI Agents","Modal Sandboxes spin up secure Linux environments for agent-generated code in seconds. Custom images, GPUs, persistent volumes from any Modal Function.","Modal",71,"agent","Agent",{"id":47,"uuid":48,"slug":49,"title":50,"description":51,"author_name":52,"view_count":53,"vote_count":24,"lang_type":25,"type":26,"type_label":27},3112,"154d3162-9681-440c-b4d9-825b073b04a5","modal-examples-serverless-llm-jobs-on-modal","modal-examples — Serverless LLM Jobs on Modal","Learn production patterns for serverless jobs (LLM inference, data pipelines) using Modal’s official examples. Run one and adapt it to your workload.","Script Depot",34,{"id":55,"uuid":56,"slug":57,"title":58,"description":59,"author_name":60,"view_count":61,"vote_count":24,"lang_type":25,"type":26,"type_label":27},2807,"406d216d-018b-4242-8a26-a4a8df47bb4c","replicate-cog-containerize-ml-models-with-one-yaml-file","Replicate Cog — Containerize ML Models with One YAML File","Cog is Replicate's open-source tool to wrap an ML model in a Docker container. One cog.yaml + predict.py gives you a portable, GPU-aware HTTP model.","Replicate",37,{"id":63,"uuid":64,"slug":65,"title":66,"description":67,"author_name":68,"view_count":69,"vote_count":24,"lang_type":25,"type":26,"type_label":27},3089,"d5ecafac-0501-4f42-adec-5d8b2ac6141a","e2b-secure-sandboxes-for-ai-code","E2B — Secure Sandboxes for AI Code","E2B runs AI-generated code in isolated cloud sandboxes. 
Install the Python\u002FJS SDK, set `E2B_API_KEY`, then execute commands safely inside a sandbox.","Agent Toolkit",92,{"id":71,"uuid":72,"slug":73,"title":74,"description":75,"author_name":76,"view_count":77,"vote_count":24,"lang_type":25,"type":26,"type_label":27},2813,"3b7e7e34-396e-424e-a2f8-e47decaee4cd","daytona-sdk-programmable-dev-sandboxes-for-ai-agents","Daytona SDK — Programmable Dev Sandboxes for AI Agents","Daytona SDK spawns Linux dev environments in 90 ms. Run agent-generated code, browser automation, ML jobs. Snapshot + fork to branch execution.","Daytona",76,{"id":79,"uuid":80,"slug":81,"title":82,"description":83,"author_name":84,"view_count":85,"vote_count":24,"lang_type":25,"type":86,"type_label":87},691,"e0ed3953-1666-435f-8a4b-f81b4d1447bb","upstash-mcp-serverless-redis-kafka-ai-agents-e0ed3953","Upstash MCP — Serverless Redis & Kafka for AI Agents","MCP server for Upstash serverless Redis and Kafka. Give AI agents access to caching, rate limiting, pub\u002Fsub, and message queues with zero infrastructure. Pay-per-request pricing. 2,000+ stars.","MCP Hub",110,"mcp","MCP",{"id":4,"uuid":89,"slug":90,"title":91,"description":92,"author_name":93,"view_count":94,"vote_count":24,"lang_type":25,"type":26,"type_label":27},"cc1a6ed2-0d82-4379-94f4-15632b4d4967","langgraph-build-stateful-ai-agents-graphs-cc1a6ed2","LangGraph — Build Stateful AI Agents as Graphs","LangChain framework for building resilient, stateful AI agents as graphs. Supports cycles, branching, persistence, human-in-the-loop, and streaming. 
28K+ stars.","LangChain",452,{"id":96,"uuid":97,"slug":98,"title":99,"description":100,"author_name":101,"view_count":102,"vote_count":24,"lang_type":25,"type":26,"type_label":27},3150,"3f94c7c7-7f5e-4e7e-8a42-d3c4fd46eaff","agent-sandbox-run-agents-safely-on-kubernetes","Agent Sandbox — Run Agents Safely on Kubernetes","Agent Sandbox provides Kubernetes-first guardrails for agent workloads: resource limits, isolation, and repeatable environments so failures stay contained.","AI Open Source",58,"tokrepo install pack\u002Fagent-deployment-templates",{"pageType":105,"pageKey":8,"locale":25,"title":106,"metaDescription":107,"h1":13,"tldr":108,"bodyMarkdown":109,"faq":110,"schema":126,"internalLinks":131,"citations":144,"wordCount":157,"generatedAt":158},"pack","Agent Deployment Templates — 10 Picks for Shipping an AI Agent to Production","Agno, PydanticAI, Modal Sandboxes, Modal examples, Replicate Cog, E2B, Daytona, Upstash, LangGraph, Agent Sandbox on Kubernetes — a deliberate stack that wires agent skeleton → state store → sandbox runtime → queue → deploy target. Open-source first. Install via TokRepo.","Ten picks that take an AI agent from `python main.py` on your laptop to a serving HTTP endpoint that survives the first 1,000 real requests. Two FastAPI agent skeletons (Agno, PydanticAI), two serverless targets (Modal, Replicate Cog), two sandbox runtimes for untrusted tool calls (E2B, Daytona), a state store + queue (Upstash Redis\u002FKafka), a stateful-graph framework (LangGraph), and a Kubernetes deploy pattern (Agent Sandbox). Open-source-first; hosted SaaS only where it earns its bill.","## What's in this pack\n\nThis is the stack a working engineer would assemble the *week before* shipping an AI agent to real users — not the heroic post-launch scramble when the first OOM kill takes the service down. 
Every pick here is a **deployment template** in the literal sense: clone a repo, set a few env vars, and you have an agent that handles concurrent requests, persists state, sandboxes untrusted code, and recovers from process death. Open-source-first, runs cheaply, and each layer plugs into the next.\n\n| # | Pick | Layer | What it does |\n|---|---|---|---|\n| 1 | Agno | agent skeleton (FastAPI) | Production agent runtime with FastAPI serving, sessions, integrations |\n| 2 | PydanticAI | agent skeleton (typed) | Type-safe agent framework — Pydantic models as the I\u002FO contract |\n| 3 | Modal Sandboxes | serverless sandbox | Run agent-generated code in isolated cloud sandboxes |\n| 4 | modal-examples | serverless template | Reference repo for serverless LLM jobs on Modal |\n| 5 | Replicate Cog | serverless template (container) | One YAML file → containerized model with HTTP + webhook API |\n| 6 | E2B | sandbox runtime | Secure cloud sandboxes for AI-generated code — Python\u002FJS SDK |\n| 7 | Daytona SDK | sandbox runtime | Programmable dev sandboxes — snapshots, reproducible workspaces |\n| 8 | Upstash (Redis + Kafka) | state store + queue | Serverless Redis for sessions, Kafka for the work queue |\n| 9 | LangGraph | stateful agent graphs | Build agents as graphs with explicit state + checkpoints |\n| 10 | Agent Sandbox on Kubernetes | deploy target | Pattern + manifests for running agents safely on a k8s cluster |\n\n## Install in this order (skeleton → state → sandbox → queue → deploy target)\n\nThe order is deliberate. **Don't pick a deploy target first.** You'll end up rewriting the agent to fit the platform's quirks. Get the skeleton, state, and sandbox right locally; the deploy target is the last decision.\n\n1. **Pick one agent skeleton.** If you want a batteries-included runtime with FastAPI serving, sessions, and tracing already wired, pick **Agno**. 
If you want a smaller, type-first surface where Pydantic models are the I\u002FO contract and you assemble the HTTP layer yourself, pick **PydanticAI**. Either way, the goal is a `\u002Frun` endpoint that accepts a request and returns a typed response. Build this locally first.\n2. **Add a state store before you add tools.** As soon as an agent has a session, you need somewhere to put it — pick **Upstash Redis** (serverless, pay-per-request, no idle cost) for session\u002Fcache and **Upstash Kafka** (or any managed queue) for the work queue if turns can take more than 30 seconds. Don't write \"state\" to local disk; the next pod restart erases it.\n3. **Wrap untrusted tool calls in a sandbox.** The moment your agent executes generated code, runs shell commands, or browses the web, you need isolation. **E2B** is the lowest-friction choice: `from e2b_code_interpreter import Sandbox`, create a sandbox, call `sbx.run_code(...)` on it, and you're done. **Daytona SDK** is the alternative when you need persistent, snapshot-able dev workspaces (e.g., long-lived coding agents). **Modal Sandboxes** is the same primitive co-located with Modal compute, which matters if you're also deploying on Modal.\n4. **Pick a serverless template if requests are bursty.** **Modal** (and `modal-examples` as the reference repo) gives you a Python decorator that becomes an HTTP endpoint with GPU access, scale-to-zero, and per-second billing — ideal for agents whose requests arrive in clusters. **Replicate Cog** packages a model + handler as one container with `cog.yaml`; great if you also serve a model and want a single deploy artifact.\n5. **For long-running or stateful flows, use LangGraph.** When the agent is a multi-step graph (plan → search → reflect → answer) with branches and human-in-the-loop, **LangGraph** gives you explicit state + checkpoints — meaning a crashed turn can resume instead of restarting. Pair its checkpointer with your Redis from step 2.\n6. 
**Pick the deploy target last.** Three realistic paths: **(a) Serverless** — wrap the FastAPI app in a Modal function decorated with `@modal.asgi_app()`, or build the container with Cog and ship it to Replicate with `cog push`. **(b) PaaS** — push the FastAPI app to Fly\u002FRender\u002FRailway behind a simple Dockerfile; cheapest for steady traffic. **(c) Kubernetes** — when you need multi-tenancy, gVisor isolation, or you've outgrown the PaaS box, use **Agent Sandbox** as the reference pattern for running agents safely on k8s (pod-per-session, sandbox per tool call, network policies that deny by default).\n\n## How the pieces fit\n\n```\n[client]\n   │  HTTP \u002Frun\n   ▼\n[FastAPI agent skeleton]  ← Agno or PydanticAI\n   │\n   ├─ session\u002Fcache  ──▶  Upstash Redis\n   │\n   ├─ background work ──▶  Upstash Kafka  ──▶  worker (same image)\n   │\n   ├─ tool: run_code ──▶  E2B \u002F Daytona \u002F Modal Sandbox\n   │\n   ├─ graph state    ──▶  LangGraph checkpointer (Redis)\n   │\n   ▼\n[deploy target]  ── Modal @asgi_app  \u002F  Replicate Cog  \u002F  k8s (Agent Sandbox)\n```\n\nThe four-tool combo **agent skeleton + state store + sandbox + deploy target** is the minimum viable production agent. Skip any one and you'll feel it within a week: no state → users hate the amnesia, no sandbox → an `rm -rf` from a hallucinated tool call ruins your day, no skeleton → you reinvent FastAPI middleware badly, no deploy target → you can't actually serve traffic.\n\n## Tradeoffs you'll hit\n\n- **Agno vs PydanticAI** — Agno is the bigger framework with sessions, FastAPI app, integrations, tracing already wired; the cost is opinions you have to live with. PydanticAI is smaller and type-first; you bring the HTTP layer. For a team shipping in two weeks: Agno. For a team that already has FastAPI conventions: PydanticAI.\n- **E2B vs Daytona vs Modal Sandbox** — E2B is the fastest to integrate for ephemeral code execution (Python SDK, secure by default). 
Daytona shines when you need persistent, snapshot-able workspaces (long-lived coding agents). Modal Sandbox is the right pick if your compute already lives on Modal — same auth, same billing, lower latency to your model calls.\n- **Modal vs Replicate Cog vs k8s** — Modal scales to zero, bills per-second, and treats Python as the deploy unit; ideal for bursty agent traffic. Replicate Cog is one container with `cog.yaml`; ideal when you also serve a model with the agent. Kubernetes (via Agent Sandbox patterns) is the right answer when you need real multi-tenancy, gVisor-level isolation, or you've outgrown the managed-platform box.\n- **LangGraph vs handwritten state machine** — For one-shot agents (single LLM call + tool), don't pull in LangGraph; it's overhead. For multi-turn graphs with branches, retries, and human-in-the-loop, LangGraph's checkpointer earns its weight by making crashes resumable.\n- **Upstash Redis vs self-hosted Redis** — Upstash is serverless and pay-per-request; great until you exceed ~10M commands\u002Fmonth, where a $20 Redis VM gets cheaper. The migration is one URL change. Don't optimize early.\n\n## Common pitfalls\n\n- **Writing session state to local disk or memory.** The next pod restart erases it and users blame you for amnesia. State goes to Redis (or your DB) from day one, not after the first incident.\n- **Tool calls without a sandbox.** The first time an agent hallucinates `subprocess.run(['rm', '-rf', '\u002F'])` and your service runs it, you've lost the production cluster. E2B\u002FDaytona\u002FModal Sandbox is not optional once tools include shell or code execution.\n- **Serverless for long agent turns.** Most serverless platforms have a max execution time (Lambda 15min, Vercel 5min, Modal up to 24h). 
If your agent turn can take 30+ minutes, either pick Modal (long timeouts) or push the work to a queue and let the HTTP request return a job ID.\n- **No request-level timeout in the FastAPI skeleton.** Without a timeout, one hung LLM call exhausts your worker pool. Set explicit timeouts at the HTTP boundary, at the LLM client, and at the tool call — three layers.\n- **Logging the full prompt + response.** It feels useful in dev. In prod it leaks PII into log aggregators that aren't compliant. Truncate, redact, or sample before logging — and pair with LLM-observability tooling (Langfuse, Phoenix) for full traces under access control.",[111,114,117,120,123],{"q":112,"a":113},"Do I really need all ten of these? It looks like a lot.","You need one from each *layer*, not all ten. The pack lists alternatives within layers (two skeletons, three sandbox runtimes, three deploy paths) so you can pick what fits your scale. The minimum viable production agent for a solo dev is: Agno (skeleton) + Upstash Redis (state) + E2B (sandbox) + Modal (deploy) — four picks, deploys in an afternoon. Add LangGraph when the agent grows into a multi-step graph. Add Agent Sandbox on Kubernetes when you outgrow the managed platform box.",{"q":115,"a":116},"What does a realistic monthly bill look like for this stack?","For a small agent serving a few thousand requests a day: Modal ~$5-50\u002Fmo (pay-per-second compute, scales to zero), Upstash Redis free tier or ~$10\u002Fmo, E2B free tier (100h\u002Fmo) or ~$30\u002Fmo for steady use, no charge for the open-source skeleton or LangGraph. Total: $15-100\u002Fmo end-to-end. The variable is LLM cost, which usually dwarfs infra; the picks here are designed so infra stays a rounding error relative to model spend.",{"q":118,"a":119},"How does this overlap with the LLM Observability pack?","Different layers. This pack covers *deployment* — how the agent process exists, persists, and serves traffic. 
LLM Observability (Langfuse, Phoenix, AgentOps) covers *prompts, traces, and eval scores* — the application-semantic layer. Wire both. The agent skeleton emits OpenTelemetry from the start; the observability stack ingests it. Most teams add this pack first (you can't observe an agent you can't deploy) and the observability pack the same week.",{"q":121,"a":122},"Why pick E2B over Modal Sandboxes if I'm already on Modal?","If your compute is already on Modal, **Modal Sandboxes** is the right pick — same auth, same billing, lower latency to your model calls, no extra vendor. E2B wins when you're deploying on a different target (Fly, Replicate, k8s) and want a sandbox that doesn't drag a second cloud account behind it. Daytona wins when sandboxes need to live for hours\u002Fdays (persistent dev workspaces for coding agents) rather than seconds.",{"q":124,"a":125},"Can I use this stack for a long-running, multi-step research agent (not a chat agent)?","Yes — that's actually the case the stack is shaped for. Use LangGraph for the graph with checkpoints (so a crash mid-turn resumes), put the checkpointer state in Upstash Redis, push each long step into Upstash Kafka so it runs in a worker pod, sandbox any tool that executes code in E2B or Daytona, and deploy on Modal (long timeouts) or Kubernetes via Agent Sandbox if you need real multi-tenancy. 
The HTTP `\u002Frun` endpoint returns a job ID immediately; clients poll or subscribe for results.",{"@context":127,"@type":128,"name":13,"description":129,"numberOfItems":130,"inLanguage":25},"https:\u002F\u002Fschema.org","ItemList","Ten open-source-first picks that take an AI agent from laptop to a serving HTTP endpoint: skeleton, state store, sandbox runtime, queue, deploy target.",10,[132,136,140],{"url":133,"anchor":134,"reason":135},"\u002Fen\u002Fpacks\u002Fdeploy-monitor-observability","Deploy + Monitor + Observability Stack","Once the agent is live, this pack wires the deploy → traces → logs → alerts pipeline around it",{"url":137,"anchor":138,"reason":139},"\u002Fen\u002Fpacks\u002Fagent-memory-layer","Agent Memory Layer pack","Companion pack for the longer-lived memory that sits behind session state — vector stores, semantic memory, retrieval",{"url":141,"anchor":142,"reason":143},"\u002Fen\u002Fai-tools-for\u002Fdevops","DevOps tools for AI agents","Broader catalog of deploy targets, container tools, and runtime patterns curated for agent workloads",[145,149,153],{"claim":146,"source_name":147,"source_url":148},"E2B provides secure cloud sandboxes for executing AI-generated code","E2B documentation","https:\u002F\u002Fe2b.dev\u002Fdocs",{"claim":150,"source_name":151,"source_url":152},"Modal Sandboxes let you run untrusted code in isolated cloud containers","Modal Sandboxes docs","https:\u002F\u002Fmodal.com\u002Fdocs\u002Fguide\u002Fsandbox",{"claim":154,"source_name":155,"source_url":156},"LangGraph builds stateful agents as graphs with checkpointing","LangGraph documentation","https:\u002F\u002Flangchain-ai.github.io\u002Flanggraph\u002F",920,"2026-05-22T12:00:00Z"]