Skills · May 7, 2026 · 3 min read

Replicate Webhooks — Async Notifications for Slow Models

Replicate Webhooks let async predictions notify your server when ready. Skip polling for slow models (FLUX, video gen). HMAC-signed for verifiable origin.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and raw content so that agents can evaluate compatibility, risk, and next steps.

Stage only · 17/100
Agent surface: Any MCP/CLI agent
Type: Skill
Installation: Stage only
Trust: New
Entry: Asset
Universal CLI command:
npx tokrepo install 39c66b39-d832-457b-89f8-308f599fc64b

Introduction

Replicate Webhooks let you start a prediction and have Replicate POST to your server when it's done — no polling needed. Critical for slow models (FLUX image generation, video generation, large LLMs) where the prediction can run for tens of seconds to minutes. Best for: production apps with async UI, agents that fire-and-forget, and queue-based pipelines. Works with: Replicate API, any HTTP endpoint. Setup time: 5 minutes.


Start a prediction with webhook

import Replicate from "replicate";

const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });

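// Hypothetical route handler that starts the prediction and returns its ID.
export async function POST() {
  const prediction = await replicate.predictions.create({
    // Official models are addressed by name via `model`; `version` expects a version ID.
    model: "black-forest-labs/flux-schnell",
    input: { prompt: "A misty forest at dawn, photorealistic" },
    webhook: "https://yourapp.com/webhooks/replicate",
    webhook_events_filter: ["completed"], // or ["start", "output", "logs", "completed"]
  });

  // Respond immediately; the result arrives later at the webhook URL.
  return Response.json({ predictionId: prediction.id });
}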

Receive the webhook (Next.js example)

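// app/webhooks/replicate/route.ts
import crypto from "node:crypto";

export async function POST(req: Request) {
  // Read the signature headers and the raw body (the signature covers the exact text sent)
  const signature = req.headers.get("webhook-signature");
  const timestamp = req.headers.get("webhook-timestamp");
  const id = req.headers.get("webhook-id");
  const body = await req.text();

  if (!signature || !timestamp || !id) {
    return Response.json({ error: "missing signature headers" }, { status: 400 });
  }

  // The signing key is the base64-decoded part of the secret after the "whsec_" prefix
  const secret = process.env.REPLICATE_WEBHOOK_SECRET!;
  const key = Buffer.from(secret.replace(/^whsec_/, ""), "base64");

  const signedContent = `${id}.${timestamp}.${body}`;
  const expected = crypto.createHmac("sha256", key).update(signedContent).digest();

  // The header holds space-separated "v1,<base64 signature>" entries; compare in constant time
  const valid = signature.split(" ").some((entry) => {
    const candidate = Buffer.from(entry.split(",")[1] ?? "", "base64");
    return candidate.length === expected.length && crypto.timingSafeEqual(candidate, expected);
  });
  if (!valid) {
    return Response.json({ error: "invalid signature" }, { status: 401 });
  }

  // Process the prediction (saveImageUrl and flagFailure are your own application helpers)
  const prediction = JSON.parse(body);
  if (prediction.status === "succeeded") {
    await saveImageUrl(prediction.id, prediction.output);
  } else if (prediction.status === "failed" || prediction.status === "canceled") {
    await flagFailure(prediction.id, prediction.error);
  }

  return Response.json({ ok: true });
}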
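
The handler reads the webhook-timestamp header because it is part of the signed content. A common extra safeguard is to reject requests whose timestamp is far from the current time, which limits replay of a captured request. The sketch below is one way to do that; the helper name isFreshTimestamp and the 5-minute tolerance are assumptions for illustration, not values taken from Replicate's documentation.

```typescript
// Hypothetical helper: accept a webhook only if its timestamp is close to "now".
// The default tolerance of 5 minutes is an assumption; tune it for your app.
export function isFreshTimestamp(
  timestampHeader: string | null,
  nowMs: number = Date.now(),
  toleranceSeconds: number = 5 * 60,
): boolean {
  if (!timestampHeader) return false;
  // The header is expected to hold whole seconds since the Unix epoch.
  if (!/^\d+$/.test(timestampHeader)) return false;
  const sentAtSeconds = Number(timestampHeader);
  const nowSeconds = Math.floor(nowMs / 1000);
  // Allow clock skew in either direction, up to the tolerance.
  return Math.abs(nowSeconds - sentAtSeconds) <= toleranceSeconds;
}
```

In the route handler this check would run alongside the signature check, returning a 4xx response when it fails.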

Get the webhook secret

# Retrieve the signing secret for your account's default webhook
curl -s -X GET https://api.replicate.com/v1/webhooks/default/secret \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN"

# Returns: { "key": "whsec_..." }
# Store as REPLICATE_WEBHOOK_SECRET env var

Why this beats polling

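  • Polling: at 5-second intervals a finished prediction is noticed about 2.5 seconds late on average and up to 5 seconds late in the worst case, and every poll is an API request that counts against your quota
  • Webhooks: Replicate calls your handler once the prediction completes, typically within moments, with no polling load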
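
To make the polling cost concrete, here is a small arithmetic sketch. It assumes polls fire at fixed intervals starting one interval after the prediction is created, and that the completion time falls uniformly at random between two polls; pollingCost is a hypothetical helper, not part of any Replicate client.

```typescript
// Hypothetical helper that estimates what fixed-interval polling costs.
export function pollingCost(predictionSeconds: number, intervalSeconds: number) {
  // Polls happen at t = interval, 2 * interval, ...; the first poll at or
  // after completion is the one that observes the finished prediction.
  const requests = Math.max(1, Math.ceil(predictionSeconds / intervalSeconds));
  return {
    requests,
    // On average the result waits half an interval before it is noticed.
    averageAddedLatencySeconds: intervalSeconds / 2,
    // Worst case: the prediction finishes just after a poll.
    worstCaseAddedLatencySeconds: intervalSeconds,
  };
}

// A 60-second prediction polled every 5 seconds: 12 requests, 2.5 s average delay.
console.log(pollingCost(60, 5));
```

With a webhook, the same prediction costs one inbound request to your server and no outbound polling requests.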

FAQ

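Q: Are webhooks idempotent? A: Not automatically. Replicate may deliver the same webhook more than once (for example, when it retries after a transient error), so your handler must be idempotent. Use the webhook-id header, which identifies the event and stays the same across retries, to deduplicate on your side.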
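
A minimal sketch of deduplication keyed on webhook-id follows. It keeps IDs in process memory, which only works for a single server instance; a real deployment would more likely use a shared store such as a database unique constraint. The class name SeenWebhookIds and the 24-hour retention are assumptions for illustration.

```typescript
// Hypothetical in-memory deduplicator keyed by the webhook-id header.
export class SeenWebhookIds {
  // Maps webhook id -> expiry time in milliseconds since the epoch.
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  /** Returns true the first time an id is seen, false for a duplicate. */
  markIfNew(id: string, nowMs: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    this.seen.forEach((expiresAt, key) => {
      if (expiresAt <= nowMs) this.seen.delete(key);
    });
    if (this.seen.has(id)) return false;
    this.seen.set(id, nowMs + this.ttlMs);
    return true;
  }
}
```

In the handler, a false return from markIfNew would mean the event was already processed, so the handler can respond with 200 and skip the side effects.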

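Q: Can I get streaming output via webhooks? A: Yes, in increments. Include "output" in webhook_events_filter (for example ["output", "completed"], so you still receive the final result) to get a webhook as the model produces tokens, frames, or partial results. Replicate throttles output and logs events rather than sending one per token, so this suits progress updates in a UI for slow models.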
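
Once output events are enabled, the same endpoint receives payloads for predictions that are still running, so the handler should branch on status instead of treating every non-success as a failure. The sketch below shows one possible mapping; the status values are Replicate's prediction statuses, while the Action names and the actionFor function are hypothetical.

```typescript
type PredictionStatus = "starting" | "processing" | "succeeded" | "failed" | "canceled";

interface PredictionEvent {
  id: string;
  status: PredictionStatus;
  output?: unknown;
  error?: unknown;
}

// Hypothetical action names for what the application does with each event.
type Action = "ignore" | "update-progress" | "finalize-success" | "finalize-failure";

export function actionFor(event: PredictionEvent): Action {
  switch (event.status) {
    case "processing":
      // Partial output is present only once the model has produced something.
      return event.output == null ? "ignore" : "update-progress";
    case "succeeded":
      return "finalize-success";
    case "failed":
    case "canceled":
      return "finalize-failure";
    default:
      // "starting": nothing to show yet.
      return "ignore";
  }
}
```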

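Q: What if my webhook endpoint is down? A: Replicate retries the terminal (completed) webhook several times with exponential backoff, but only over a short window after the prediction finishes, so do not count on delivery after a long outage; check Replicate's webhook documentation for the current retry schedule. The prediction remains available via GET /v1/predictions/{id}, and missed webhooks are not replayed later.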
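
Because missed webhooks are not replayed, a periodic reconciliation pass is a common fallback: find predictions that were started but never reported back, then fetch each one with GET /v1/predictions/{id}. The sketch below covers only the selection step. The PendingPrediction shape, the findStalePredictions name, and the grace period are assumptions about how your app tracks in-flight predictions.

```typescript
// Hypothetical record your app stores when it creates a prediction.
interface PendingPrediction {
  id: string;
  startedAtMs: number;
}

/**
 * Returns the ids of predictions that have waited longer than graceMs
 * without a webhook, oldest first, so they can be fetched directly.
 */
export function findStalePredictions(
  pending: PendingPrediction[],
  nowMs: number,
  graceMs: number,
): string[] {
  return pending
    .filter((p) => nowMs - p.startedAtMs > graceMs)
    .sort((a, b) => a.startedAtMs - b.startedAtMs)
    .map((p) => p.id);
}
```

A scheduled job could call this every few minutes and pass each returned ID to the same processing code the webhook handler uses, which is another reason to keep that code idempotent.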


Quick Use

  1. Retrieve your webhook signing secret: GET /v1/webhooks/default/secret
  2. Pass webhook and webhook_events_filter in your predictions.create call
  3. Implement HMAC-SHA256 signature verification in your handler


Source & Thanks

Built by Replicate. Commercial product.

replicate.com/docs/webhooks — Webhooks documentation
