# Cloudflare Workers AI: Deploy AI Apps at the Edge

> Run AI models on Cloudflare's global edge network. Workers AI provides serverless inference for LLMs, embeddings, image generation, and speech-to-text at low latency.

## Install

Save in your project root:

## Quick Use

```bash
npm create cloudflare@latest my-ai-app
cd my-ai-app
```

```typescript
// Assumes a Workers AI binding named `AI` is configured for this Worker.
// The `Ai` type comes from the Cloudflare Workers type definitions.
interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Send a chat-style prompt to Llama 3.1 8B Instruct.
    const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      messages: [{ role: "user", content: "What is Cloudflare?" }],
    });
    return Response.json(response);
  },
};
```

```bash
npx wrangler deploy
```

## What is Cloudflare Workers AI?

Workers AI lets you run AI models on Cloudflare's global edge network, which spans 300+ cities worldwide. It provides serverless inference for LLMs, text embeddings, image generation, speech-to-text, and more, with no GPU management, automatic scaling, and pay-per-request pricing.

**Answer-Ready**: Cloudflare Workers AI provides serverless AI inference on a global edge network (300+ cities). Run Llama, Mistral, Stable Diffusion, and Whisper models with no GPU management, auto-scaling, and pay-per-request pricing.

**Best for**: Developers building AI features who want low-latency, serverless deployment.

**Works with**: Llama 3, Mistral, Stable Diffusion, Whisper, BAAI embeddings.

**Setup time**: Under 5 minutes.

## Core Features

### 1. Pre-Built Model Catalog

| Category | Models |
|----------|--------|
| Text Generation | Llama 3.1 (8B/70B), Mistral 7B, Gemma |
| Embeddings | BAAI bge-base, bge-large |
| Image Generation | Stable Diffusion XL, FLUX.1 |
| Speech-to-Text | Whisper |
| Translation | Meta M2M-100 |
| Classification | BERT, DistilBERT |

### 2. Vectorize (Built-In Vector DB)

```typescript
// Inside a Worker handler: `env` carries the AI and Vectorize bindings.
const index = env.VECTORIZE_INDEX;

// Embed a document and insert it into the index
const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
  text: ["document text here"],
});
await index.upsert([{ id: "doc1", values: embedding.data[0], metadata: { title: "..." } }]);

// Embed the search text (placeholder), then query for the 5 nearest vectors
const queryEmbedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
  text: ["search text here"],
});
const queryVector = queryEmbedding.data[0];
const results = await index.query(queryVector, { topK: 5 });
```

### 3. AI Gateway

Route, cache, and monitor AI API calls:

```typescript
// Replace {account} with your account ID and my-gateway with your gateway name.
const response = await fetch(
  "https://gateway.ai.cloudflare.com/v1/{account}/my-gateway/openai/chat/completions",
  {
    method: "POST",
    headers: { "Authorization": "Bearer sk-...", "Content-Type": "application/json" },
    body: JSON.stringify({ model: "gpt-4o", messages: [/* ... */] }),
  },
);
```

Features: caching, rate limiting, fallbacks, analytics, logging.

### 4. Edge Deployment

Models run on Cloudflare's GPU fleet across 300+ cities:

- P50 latency: < 50ms for embeddings
- Auto-scaling: 0 to millions of requests
- No cold starts for popular models

### 5. Pay-Per-Request Pricing

| Resource | Free Tier | Paid |
|----------|-----------|------|
| Neurons (compute) | 10,000/day | $0.011 per 1,000 |
| Vectorize queries | 30M/mo | $0.01 per 1M |
| Storage | 5M vectors | $0.05 per 1M |

## FAQ

**Q: Can I use my own fine-tuned models?**
A: Yes, via LoRA adapters on supported base models.

**Q: How does it compare to AWS Bedrock?**
A: Workers AI is edge-native (lower latency globally), simpler to use, and cheaper for small-to-medium workloads. Bedrock offers more enterprise models.

**Q: Is there a free tier?**
A: Yes, 10,000 neurons/day are free, enough for ~100-200 LLM requests.

## Source & Thanks

> Created by [Cloudflare](https://developers.cloudflare.com/workers-ai/).
>
> Documentation: [developers.cloudflare.com/workers-ai](https://developers.cloudflare.com/workers-ai/)

## Quick Use

```bash
npm create cloudflare@latest my-ai-app
```

Deploy an AI app to an edge network spanning 300+ cities worldwide in 5 minutes.

## What is Workers AI?

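Cloudflare Workers AI runs AI model inference on a global edge network, with no GPUs to manage, automatic scaling, and pay-per-request pricing.

**One-line summary**: Cloudflare Workers AI provides serverless AI inference in 300+ cities worldwide, supporting models such as Llama, Stable Diffusion, and Whisper.

**Who it's for**: Developers who need low-latency, serverless AI deployment.

## Core Features

### 1. Pre-Built Model Catalog

Text generation, embeddings, image generation, speech-to-text, and more.

### 2. Built-In Vector Database

Vectorize provides embedding storage and querying.

### 3. AI Gateway

Route, cache, and monitor AI API calls.

### 4. Edge Deployment

GPU clusters in 300+ cities worldwide, with P50 latency < 50ms.

## FAQ

**Q: Is there a free tier?**
A: Yes, 10,000 neurons per day are free.

**Q: How does it compare to AWS Bedrock?**
A: Workers AI is edge-native with lower latency, simpler to use, and cheaper for small-to-medium workloads.

## Source & Thanks

> [Cloudflare Workers AI](https://developers.cloudflare.com/workers-ai/)

---

Source: https://tokrepo.com/en/workflows/bd8d0961-db4e-4890-828e-095163614679

Author: AI Open Source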
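
The neuron pricing quoted in this document (10,000 neurons free per day, then $0.011 per 1,000) can be turned into a rough daily cost estimate. The sketch below is only an illustration under stated assumptions: the function name `estimateDailyCostUsd` and the `neuronsPerRequest` parameter are hypothetical, the constants are copied from the figures above rather than from a live price list, and it assumes the free daily allocation is subtracted before billing. The FAQ's "~100-200 LLM requests" figure implies roughly 50-100 neurons per request, so 75 is used as an example value.

```typescript
// Figures as quoted in this document; verify against current pricing before relying on them.
const FREE_NEURONS_PER_DAY = 10_000;
const USD_PER_1000_NEURONS = 0.011;

/**
 * Estimate one day's Workers AI cost in USD.
 * `neuronsPerRequest` is a hypothetical average; measure your real usage.
 * Assumes the free daily allocation is deducted before billing.
 */
function estimateDailyCostUsd(requestsPerDay: number, neuronsPerRequest: number): number {
  const neuronsUsed = requestsPerDay * neuronsPerRequest;
  // Only neurons beyond the free daily allocation are billed.
  const billableNeurons = Math.max(0, neuronsUsed - FREE_NEURONS_PER_DAY);
  return (billableNeurons / 1000) * USD_PER_1000_NEURONS;
}

// Example: compare a small and a larger daily workload at ~75 neurons per request.
const smallWorkload = estimateDailyCostUsd(100, 75);
const largerWorkload = estimateDailyCostUsd(1000, 75);
console.log({ smallWorkload, largerWorkload });
```

Under these assumptions, a workload that stays inside the free allocation costs nothing, and cost grows linearly with every neuron beyond it.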