Scripts · Mar 29, 2026 · 1 min read

Cloudflare Workers AI — Serverless AI Inference

Run AI models at the edge with Cloudflare Workers. Text generation, image generation, speech-to-text, translation, embeddings — all serverless with global distribution.

TokRepo Featured · Community
Quick Use

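# In a terminal: scaffold a new Worker project
npx wrangler init my-ai-app
cd my-ai-app

// In the Worker entry file (for example src/index.js).
// Requires a Workers AI binding named AI in the Wrangler config
// (in wrangler.toml: an [ai] section with binding = "AI").
export default {
  async fetch(request, env) {
    // Run a text-generation model through the AI binding
    const response = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
      prompt: "What is the capital of France?"
    });
    // The model output is returned to the caller as JSON
    return Response.json(response);
  }
};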
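
The quick-use handler sends a fixed prompt. As a minimal sketch of taking the prompt from the caller instead, the function below reads a `q` query parameter and passes it to the same model using the chat-style `messages` input. The parameter name `q`, the helper names `handleRequest` and `jsonResponse`, the system prompt, and the status codes are illustrative assumptions, not part of the original snippet.

```javascript
// Hypothetical helper: answers GET /?q=<question> with the model's reply.
// `env.AI` is assumed to be a Workers AI binding named AI.
async function handleRequest(request, env) {
  const url = new URL(request.url);
  const question = url.searchParams.get("q");

  // Reject requests that carry no question (assumed convention for this sketch)
  if (!question || question.trim() === "") {
    return jsonResponse({ error: "Missing query parameter: q" }, 400);
  }

  try {
    const result = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
      messages: [
        { role: "system", content: "You are a concise assistant." },
        { role: "user", content: question },
      ],
    });
    return jsonResponse(result, 200);
  } catch (err) {
    // Report inference failures as a 502 instead of an unhandled exception
    return jsonResponse({ error: "Inference failed" }, 502);
  }
}

// Small JSON response helper built on the standard Response constructor
function jsonResponse(body, status) {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "content-type": "application/json" },
  });
}
```

In a Worker, this could be wired up as `export default { fetch: handleRequest }`.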

Intro

Cloudflare Workers AI lets you run AI models serverlessly at the edge — close to your users worldwide. No GPU management, no infrastructure. Just deploy and scale automatically.

Best for: AI-powered APIs, edge inference, low-latency AI features
Works with: Cloudflare Workers, Pages

Available Models

  • Text Generation: Llama 3, Mistral, Gemma, Phi
  • Image Generation: Stable Diffusion, SDXL
  • Speech-to-Text: Whisper
  • Translation: M2M-100
  • Embeddings: BGE, all-MiniLM
  • Image Classification: ResNet
  • Text Classification: DistilBERT (sentiment analysis)
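
The task categories above are all reached through the same `env.AI.run(model, input)` call; what changes is the model ID and the input shape. The sketch below covers two of them, translation and embeddings. The model IDs (`@cf/meta/m2m100-1.2b`, `@cf/baai/bge-base-en-v1.5`) and the input and output field names are assumptions based on Cloudflare's model catalog and should be checked against the current listing; the helper names are hypothetical.

```javascript
// Hypothetical helper: translate text with M2M-100.
// Assumes the model accepts { text, source_lang, target_lang }
// and returns { translated_text }.
async function translate(env, text, sourceLang, targetLang) {
  const out = await env.AI.run("@cf/meta/m2m100-1.2b", {
    text,
    source_lang: sourceLang,
    target_lang: targetLang,
  });
  return out.translated_text;
}

// Hypothetical helper: embed a batch of strings with a BGE model.
// Assumes the model accepts { text: string[] } and returns
// { data: number[][] }, one vector per input string.
async function embed(env, texts) {
  const out = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: texts });
  return out.data;
}

// Cosine similarity between two equal-length numeric vectors
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Inside a `fetch` handler, `const [a, b] = await embed(env, ["edge inference", "serverless AI"])` followed by `cosineSimilarity(a, b)` would give a rough semantic-similarity score for the two strings.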

Pricing

  • Free tier: 10,000 neurons/day (neurons are Cloudflare's unit for measuring AI compute usage)
  • Pay-as-you-go: Usage-based pricing per model
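
To get a feel for what the daily free allocation covers, the arithmetic can be sketched as below. Per-request neuron costs depend on the model and the input size and are not given here, so the figure of 25 neurons per request is a placeholder chosen only for illustration; the function name is hypothetical.

```javascript
// Daily free allocation stated in the pricing list above
const FREE_NEURONS_PER_DAY = 10000;

// Estimate how many requests fit in the daily free allocation.
// `neuronsPerRequest` must come from your own measurements or the
// provider's pricing table; this function does not know real costs.
function requestsWithinFreeTier(neuronsPerRequest, freeNeurons = FREE_NEURONS_PER_DAY) {
  if (!(neuronsPerRequest > 0)) {
    throw new RangeError("neuronsPerRequest must be a positive number");
  }
  return Math.floor(freeNeurons / neuronsPerRequest);
}

// Placeholder cost of 25 neurons per request, for illustration only
console.log(requestsWithinFreeTier(25)); // 400
```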
