Skills · May 7, 2026 · 3 min read

Replicate Webhooks — Async Notifications for Slow Models

Replicate Webhooks let async predictions notify your server when ready. Skip polling for slow models (FLUX, video gen). HMAC-signed for verifiable origin.

Universal CLI command
npx tokrepo install 39c66b39-d832-457b-89f8-308f599fc64b
Introduction

Replicate Webhooks let you start a prediction and have Replicate POST to your server when it's done — no polling needed. Critical for slow models (FLUX image generation, video generation, large LLMs) where the prediction can run for tens of seconds to minutes. Best for: production apps with async UI, agents that fire-and-forget, and queue-based pipelines. Works with: Replicate API, any HTTP endpoint. Setup time: 5 minutes.


Start a prediction with webhook

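// e.g. app/api/generate/route.ts (hypothetical path; any server-side handler works)
import Replicate from "replicate";

const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });

export async function POST() {
  const prediction = await replicate.predictions.create({
    // Official models are addressed with `model`; use `version` only with a version ID
    model: "black-forest-labs/flux-schnell",
    input: { prompt: "A misty forest at dawn, photorealistic" },
    webhook: "https://yourapp.com/webhooks/replicate",
    webhook_events_filter: ["completed"], // or any of "start", "output", "logs", "completed"
  });

  // Return immediately; the result arrives later at the webhook URL
  return Response.json({ predictionId: prediction.id });
}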

Receive the webhook (Next.js example)

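// app/webhooks/replicate/route.ts
import crypto from "node:crypto";

export async function POST(req: Request) {
  // Read the signature headers and the raw body (verify before parsing JSON)
  const signature = req.headers.get("webhook-signature");
  const timestamp = req.headers.get("webhook-timestamp");
  const id = req.headers.get("webhook-id");
  const body = await req.text();

  if (!signature || !timestamp || !id) {
    return Response.json({ error: "missing signature headers" }, { status: 400 });
  }

  // The HMAC key is the base64-decoded part of the secret after the "whsec_" prefix
  const key = Buffer.from(
    process.env.REPLICATE_WEBHOOK_SECRET!.replace(/^whsec_/, ""),
    "base64",
  );
  const expected = crypto
    .createHmac("sha256", key)
    .update(`${id}.${timestamp}.${body}`)
    .digest();

  // The header holds space-separated "v1,<base64>" entries; compare each in constant time
  const valid = signature.split(" ").some((entry) => {
    const candidate = Buffer.from(entry.split(",")[1] ?? "", "base64");
    return candidate.length === expected.length && crypto.timingSafeEqual(candidate, expected);
  });
  if (!valid) {
    return Response.json({ error: "invalid signature" }, { status: 401 });
  }

  // Process the prediction (saveImageUrl and flagFailure are your own app's helpers)
  const prediction = JSON.parse(body);
  if (prediction.status === "succeeded") {
    await saveImageUrl(prediction.id, prediction.output);
  } else {
    await flagFailure(prediction.id, prediction.error);
  }

  return Response.json({ ok: true });
}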
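
A signature check alone does not stop someone from replaying an old, validly signed request. The sketch below adds a freshness check on the webhook-timestamp header. It assumes the header carries seconds since the Unix epoch; the function name isFreshTimestamp and the 5-minute tolerance are illustrative choices, not part of Replicate's API.

```typescript
// Hypothetical helper: accept a webhook only if its timestamp is recent.
// Assumption: `webhook-timestamp` is an integer number of seconds since the epoch.
export function isFreshTimestamp(
  timestampHeader: string | null,
  nowMs: number = Date.now(),
  toleranceSeconds: number = 300, // assumed tolerance of 5 minutes
): boolean {
  // Reject missing or non-numeric values outright
  if (!timestampHeader || !/^\d+$/.test(timestampHeader)) return false;

  const sentAtSeconds = Number(timestampHeader);
  // Use the absolute difference so modest clock skew in either direction is tolerated
  const ageSeconds = Math.abs(nowMs / 1000 - sentAtSeconds);
  return ageSeconds <= toleranceSeconds;
}
```

In the handler, a check like this would run alongside the HMAC comparison, rejecting the request with a 4xx status when it fails.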

Get the webhook secret

# Retrieve the webhook signing secret for your account (this is a GET request)
curl -s https://api.replicate.com/v1/webhooks/default/secret \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN"

# Returns: { "key": "whsec_..." }
# Store as REPLICATE_WEBHOOK_SECRET env var

Why this beats polling

  • Polling: at 5-second intervals, a finished prediction waits about 2.5 seconds on average (up to 5 seconds) before you notice it, and every poll is an API request that counts against your quota
  • Webhooks: Replicate calls your handler shortly after completion, with zero polling load
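
The polling numbers can be worked out with a little arithmetic. The sketch below assumes the prediction finishes at a uniformly random point within a polling interval, so the added latency averages half the interval; the function name pollingCost is illustrative.

```typescript
// Rough cost model for polling a single prediction.
// Assumption: completion time is uniformly distributed within a polling interval.
export function pollingCost(intervalSeconds: number, predictionSeconds: number) {
  return {
    // On average the result sits for half an interval before the next poll sees it
    meanAddedLatencySeconds: intervalSeconds / 2,
    // Worst case: the prediction finishes just after a poll
    worstAddedLatencySeconds: intervalSeconds,
    // Polls issued up to and including the first one at or after completion
    requests: Math.ceil(predictionSeconds / intervalSeconds),
  };
}

// A 60-second prediction polled every 5 seconds
console.log(pollingCost(5, 60));
```

For a 60-second prediction polled every 5 seconds, that is 12 requests and 2.5 seconds of average added latency, compared with a single inbound request for a webhook.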

FAQ

Q: Are webhooks idempotent? A: Replicate may retry a webhook on transient errors, so handlers must be idempotent. Use the webhook-id header (a per-event unique ID) for deduplication on your side.
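
A minimal sketch of deduplication keyed on the webhook-id header is shown below. The class name SeenWebhookIds and the in-memory Map are assumptions for illustration only; with more than one server instance you would likely want a shared store such as a database table with a unique constraint.

```typescript
// Hypothetical in-memory dedupe store keyed by the `webhook-id` header value.
export class SeenWebhookIds {
  // Maps webhook id -> expiry time in milliseconds
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns true the first time an id is seen, false for a duplicate delivery
  firstTime(id: string, nowMs: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound
    for (const [key, expiresAt] of this.seen) {
      if (expiresAt <= nowMs) this.seen.delete(key);
    }
    if (this.seen.has(id)) return false;
    this.seen.set(id, nowMs + this.ttlMs);
    return true;
  }
}
```

In the handler, a duplicate would simply return a 200 response without reprocessing, so Replicate stops retrying.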

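Q: Can I get streaming output via webhooks? A: Yes. Include "output" in webhook_events_filter, for example ["output", "completed"], to receive incremental output events as the model produces tokens, frames, or partial results. Replicate throttles these events (its documentation cites at most one every 500 ms), so treat each as a periodic snapshot. Useful for streaming UI updates from slow models.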
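
Once intermediate events are enabled, a handler can no longer treat every non-succeeded status as a failure, because in-progress events arrive with a status such as "processing". One way to route events is sketched below; the function name routeWebhook and the action labels are hypothetical, while the status strings are the ones Replicate predictions report.

```typescript
// Hypothetical action labels your handler might dispatch on
export type WebhookAction = "update-progress" | "save-result" | "flag-failure";

// Map a prediction status from the webhook body to an action
export function routeWebhook(status: string): WebhookAction {
  switch (status) {
    case "succeeded":
      return "save-result";
    case "failed":
    case "canceled":
      return "flag-failure";
    default:
      // "starting" and "processing" carry partial output or logs
      return "update-progress";
  }
}
```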

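Q: What if my webhook endpoint is down? A: Replicate retries failed deliveries with exponential backoff, but only for a limited window, so check the webhooks documentation for the current schedule rather than counting on a long replay period. Once retries stop, the webhook won't replay, but the prediction is still available via GET /v1/predictions/{id}.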
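
Because missed webhooks are not replayed indefinitely, a small reconciliation job is a common safety net. The sketch below assumes your app records each prediction it starts; the PendingPrediction shape, the overdue and fetchPrediction helpers, and the grace period are all hypothetical. Only the GET /v1/predictions/{id} endpoint comes from Replicate's API.

```typescript
// Hypothetical record your app stores when it creates a prediction
export interface PendingPrediction {
  id: string;
  createdAtMs: number;
}

export interface PredictionSnapshot {
  id: string;
  status: string;
  output?: unknown;
  error?: unknown;
}

const TERMINAL_STATUSES = new Set(["succeeded", "failed", "canceled"]);

// True once a prediction can no longer change state
export function isTerminal(status: string): boolean {
  return TERMINAL_STATUSES.has(status);
}

// Select predictions whose webhook has not arrived within the grace period
export function overdue(
  pending: PendingPrediction[],
  nowMs: number,
  graceMs: number,
): PendingPrediction[] {
  return pending.filter((p) => nowMs - p.createdAtMs > graceMs);
}

// Fetch the current state of one prediction directly from the API
export async function fetchPrediction(id: string, token: string): Promise<PredictionSnapshot> {
  const res = await fetch(`https://api.replicate.com/v1/predictions/${id}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`Replicate returned ${res.status} for ${id}`);
  return (await res.json()) as PredictionSnapshot;
}
```

A scheduled job could call overdue on the stored records, fetch each one, and run the same success or failure handling as the webhook for any snapshot where isTerminal returns true.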


Quick Use

  1. Retrieve your webhook signing secret: GET /v1/webhooks/default/secret
  2. Pass webhook and webhook_events_filter in your predictions.create call
  3. Implement HMAC-SHA256 signature verification in your handler

Source & Thanks

Built by Replicate. Commercial product.

replicate.com/docs/webhooks — Webhooks documentation
