# Cloudflare AI Gateway — LLM Proxy, Cache & Analytics

> Free proxy gateway for LLM API calls with caching, rate limiting, cost tracking, and fallback routing across providers. Reduce costs by up to 95% with response caching. 7,000+ stars.

## Install

No installation is needed — AI Gateway is a hosted proxy, so the only change in your project is the API base URL.

## Quick Use

1. Sign up at [dash.cloudflare.com](https://dash.cloudflare.com) (free tier available)
2. Navigate to AI > AI Gateway > Create Gateway
3. Replace your API base URL:

```python
from openai import OpenAI

# Before
client = OpenAI(base_url="https://api.openai.com/v1")

# After — route through Cloudflare AI Gateway
client = OpenAI(
    base_url="https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_name}/openai"
)

# Same API key, same code — now with caching, logging, and analytics
```

---

## Intro

Cloudflare AI Gateway is a free proxy that sits between your application and LLM providers, adding caching, rate limiting, cost analytics, and fallback routing; the project has 7,000+ GitHub stars. Route API calls through the gateway without changing your code — just swap the base URL. Cached responses can reduce LLM costs by up to 95% for repeated queries.

Best for teams running LLM applications in production who need cost control and observability.

Works with: OpenAI, Anthropic, Google AI, Azure, Workers AI, HuggingFace.

Setup time: under 5 minutes.

---

## Key Features

### Response Caching

Cache identical LLM requests so you never pay twice for the same answer:

```
First call:  "Summarize this doc" → hits API  → $0.03 → cached
Second call: same prompt          → cache hit → $0.00 → <10ms
```

Configurable TTL from 1 minute to 30 days.
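The gateway applies this caching server-side at Cloudflare's edge. As a rough mental model only, the idea can be sketched locally: canonicalize the request payload, hash it into a cache key, and serve the stored response until the TTL expires. The class and names below are illustrative assumptions, not the gateway's internals:

```python
import hashlib
import json
import time


class ResponseCache:
    """Minimal sketch of request-keyed response caching with a TTL."""

    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (stored_at, response)

    def _key(self, payload):
        # Canonicalize the request (sorted keys) so that byte-identical
        # prompts always hash to the same cache key.
        blob = json.dumps(payload, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()

    def get(self, payload):
        key = self._key(payload)
        hit = self._store.get(key)
        if hit is None:
            return None  # cache miss: the caller pays the provider
        stored_at, response = hit
        if time.time() - stored_at > self.ttl:
            del self._store[key]  # entry expired; treat as a miss
            return None
        return response

    def put(self, payload, response):
        self._store[self._key(payload)] = (time.time(), response)


cache = ResponseCache(ttl_seconds=60)
req = {"model": "gpt-4o", "messages": [{"role": "user", "content": "Summarize this doc"}]}

assert cache.get(req) is None          # first call: miss — hits the API
cache.put(req, "summary text")
assert cache.get(req) == "summary text"  # same prompt again: served from cache
```

Note that only byte-identical requests hit the cache — any change to the prompt, model, or parameters produces a different key, which is why caching pays off most for repeated queries.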
### Cost Analytics

Real-time dashboard showing:

- Total requests and tokens per model
- Cost breakdown by provider
- Cache hit rate
- Error rate and latency percentiles

### Rate Limiting

Protect your API budget:

```
Rules:
- Max 100 requests/minute per user
- Max $50/day total spend
- Alert at 80% budget threshold
```

### Provider Fallbacks

Automatic failover between providers:

```json
{
  "providers": ["openai", "anthropic", "azure"],
  "fallback": true,
  "retry": {
    "attempts": 3,
    "backoff": "exponential"
  }
}
```

If OpenAI is down, requests automatically route to Anthropic.

### Logging & Debugging

Every request is logged with full details:

- Input/output tokens
- Latency breakdown
- Model used
- Cache status
- Error details

### Supported Providers

| Provider | Endpoint Pattern |
|----------|------------------|
| OpenAI | `/{gateway}/openai` |
| Anthropic | `/{gateway}/anthropic` |
| Google AI | `/{gateway}/google-ai-studio` |
| Azure | `/{gateway}/azure-openai` |
| HuggingFace | `/{gateway}/huggingface` |
| Workers AI | `/{gateway}/workers-ai` |

### Key Stats

- 7,000+ GitHub stars
- Free tier available
- Up to 95% cost reduction with caching
- 6+ provider integrations
- Real-time analytics dashboard

### FAQ

**Q: What is Cloudflare AI Gateway?**
A: A free proxy gateway that adds caching, rate limiting, analytics, and fallback routing to LLM API calls without code changes — just swap the base URL.

**Q: Is AI Gateway free?**
A: Yes — the free tier includes 10,000 requests/day; paid plans cover higher volume.

**Q: Does it add latency?**
A: Minimal — the Cloudflare edge network adds <5ms. Cache hits return in <10ms, versus 500ms+ for a typical API call.

---

## Source & Thanks

> Created by [Cloudflare](https://github.com/cloudflare). Licensed under Apache 2.0.
>
> [ai-gateway](https://github.com/cloudflare/ai-gateway) — ⭐ 7,000+

Thanks to Cloudflare for making LLM cost control accessible to every developer.

---

Source: https://tokrepo.com/en/workflows/b1962c77-9ecf-4a84-87b1-e7d4b677dabe
Author: AI Open Source