# Replicate Webhooks: Async Notifications for Slow Models

> Replicate Webhooks let async predictions notify your server when they are ready. Skip polling for slow models (FLUX, video generation). Requests are HMAC-signed so you can verify their origin.

## Install

Save this document to `.claude/skills/` or append it to your `CLAUDE.md`.

## Quick Use

1. Retrieve your webhook signing secret: `GET /v1/webhooks/default/secret`
2. Pass `webhook` and `webhook_events_filter` in your `predictions.create` call
3. Implement HMAC-SHA256 signature verification in your handler

---

## Intro

Replicate Webhooks let you start a prediction and have Replicate POST to your server when it is done, so no polling is needed. This matters most for slow models (FLUX image generation, video generation, large LLMs), where a prediction can run for tens of seconds to minutes.

Best for: production apps with async UI, agents that fire-and-forget, and queue-based pipelines. Works with: Replicate API, any HTTP endpoint. Setup time: 5 minutes.

---

### Start a prediction with webhook

```typescript
import Replicate from "replicate";

const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });

// Example route handler; where this lives is up to your app.
export async function POST() {
  const prediction = await replicate.predictions.create({
    // Official models are addressed by `model` ("owner/name");
    // `version` expects a version ID instead.
    model: "black-forest-labs/flux-schnell",
    input: { prompt: "A misty forest at dawn, photorealistic" },
    webhook: "https://yourapp.com/webhooks/replicate",
    webhook_events_filter: ["completed"], // or ["start", "output", "logs", "completed"]
  });

  return Response.json({ predictionId: prediction.id });
}
```

### Receive the webhook (Next.js example)

```typescript
// app/webhooks/replicate/route.ts
import crypto from "node:crypto";

export async function POST(req: Request) {
  // Read the signature headers and the raw body (sign exactly what was sent)
  const signatureHeader = req.headers.get("webhook-signature");
  const timestamp = req.headers.get("webhook-timestamp");
  const id = req.headers.get("webhook-id");
  const body = await req.text();

  if (!signatureHeader || !timestamp || !id) {
    return Response.json({ error: "missing signature headers" }, { status: 400 });
  }

  // The secret looks like "whsec_<base64>"; the HMAC key is the decoded part after the prefix
  const secret = process.env.REPLICATE_WEBHOOK_SECRET!;
  const key = Buffer.from(secret.replace(/^whsec_/, ""), "base64");
  const expected = crypto
    .createHmac("sha256", key)
    .update(`${id}.${timestamp}.${body}`)
    .digest();

  // The header holds space-separated "v1,<base64>" entries; accept any match, compared in constant time
  const valid = signatureHeader.split(" ").some((entry) => {
    const candidate = Buffer.from(entry.split(",")[1] ?? "", "base64");
    return candidate.length === expected.length && crypto.timingSafeEqual(candidate, expected);
  });
  if (!valid) {
    return Response.json({ error: "invalid signature" }, { status: 401 });
  }

  // Process the prediction. saveImageUrl and flagFailure stand in for your own persistence code.
  const prediction = JSON.parse(body);
  if (prediction.status === "succeeded") {
    await saveImageUrl(prediction.id, prediction.output);
  } else {
    await flagFailure(prediction.id, prediction.error);
  }

  return Response.json({ ok: true });
}
```

### Get the webhook secret

```bash
# Retrieve your default webhook signing secret
curl -s https://api.replicate.com/v1/webhooks/default/secret \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN"

# Returns: { "key": "whsec_..." }
# Store as REPLICATE_WEBHOOK_SECRET env var
```

### Why this beats polling

- **Polling**: At a 5-second interval, each completion is noticed up to 5 seconds late (about 2.5 seconds on average), and every poll is an API request that counts against your quota
- **Webhooks**: Replicate calls your handler as soon as the prediction completes, typically within about a second, with zero polling load

---

### FAQ

**Q: Are webhooks idempotent?**
A: Not by themselves. Replicate may deliver the same webhook more than once (for example, when retrying after a transient error), so your handler must be idempotent. Use the `webhook-id` header (a per-event unique ID) for deduplication on your side.

**Q: Can I get streaming output via webhooks?**
A: Partly. Set `webhook_events_filter: ["output"]` to receive an event each time the prediction produces output (tokens, frames, partial results). Replicate throttles `output` and `logs` events (at most once every 500 ms, per its API documentation), so they suit coarse progress updates in a UI rather than token-by-token streaming.

**Q: What if my webhook endpoint is down?**
A: Replicate retries the terminal (`completed`) webhook several times with exponential backoff when your endpoint is unreachable or returns a non-2xx response. The retry window is short: Replicate's documentation puts the final retry at about a minute after the prediction completes, so check the current schedule there rather than counting on long-lived retries. After that, the prediction is still available via `GET /v1/predictions/{id}`, but the webhook is not replayed.

---

## Source & Thanks

> Built by [Replicate](https://github.com/replicate). Commercial product.
>
> [replicate.com/docs/webhooks](https://replicate.com/docs/topics/webhooks): Webhooks documentation

---
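### Appendix: replay-safe, idempotent handling

The FAQ recommends deduplicating on `webhook-id`. Below is a minimal sketch of that idea, plus a freshness check on `webhook-timestamp` to reject replayed requests. `isFreshTimestamp`, `SeenDeliveries`, the 5-minute tolerance, and the 24-hour TTL are illustrative assumptions and not part of the Replicate client. The sketch also assumes `webhook-timestamp` carries Unix seconds. The in-memory map only works within a single process; a shared store (a database unique index, Redis) would be needed across instances.

```typescript
// Hypothetical helpers for illustration; names and defaults are assumptions.

/** Returns true when the webhook-timestamp header (Unix seconds) is close to "now". */
function isFreshTimestamp(
  timestampHeader: string,
  nowMs: number = Date.now(),
  toleranceSeconds: number = 300, // assumed 5-minute window; tune for your app
): boolean {
  const sentAt = Number(timestampHeader);
  if (!Number.isFinite(sentAt)) return false;
  return Math.abs(nowMs / 1000 - sentAt) <= toleranceSeconds;
}

/** In-memory record of webhook-id values that have already been processed. */
class SeenDeliveries {
  private readonly seen = new Map<string, number>(); // webhook-id -> expiry (ms)
  private readonly ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  /** Returns true the first time an id is seen, false for repeats within the TTL. */
  markIfNew(webhookId: string, nowMs: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    for (const [id, expiry] of this.seen) {
      if (expiry <= nowMs) this.seen.delete(id);
    }
    if (this.seen.has(webhookId)) return false;
    this.seen.set(webhookId, nowMs + this.ttlMs);
    return true;
  }
}
```

In a handler, both checks would run after signature verification: respond with a 4xx when `isFreshTimestamp(timestamp)` is false, and acknowledge with a 2xx but skip processing when `markIfNew(id)` returns false, so a redelivered event is not retried again and not processed twice.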
Source: https://tokrepo.com/en/workflows/replicate-webhooks-async-notifications-for-slow-models
Author: Replicate