Key Features
Response Caching
Cache identical LLM requests to avoid paying twice:
First call: "Summarize this doc" → hits API → $0.03 → cached
Second call: same prompt → cache hit → $0.00 → <10ms

Configurable TTL from 1 minute to 30 days.
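The cache only helps when requests are byte-identical, since the lookup is keyed on the request itself. A toy sketch of that lookup logic (an illustration of the idea, not the gateway's actual implementation), keyed on model plus prompt:

```python
import hashlib
import json

cache: dict[str, str] = {}

def cached_completion(model: str, prompt: str) -> tuple[str, bool]:
    """Return (response, was_cache_hit). The upstream API call is faked."""
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if key in cache:
        return cache[key], True           # cache hit: no API cost
    response = f"summary-of:{prompt}"     # stand-in for the real API call
    cache[key] = response                 # store for identical future requests
    return response, False                # cache miss: billed as usual

_, hit1 = cached_completion("gpt-4o-mini", "Summarize this doc")
_, hit2 = cached_completion("gpt-4o-mini", "Summarize this doc")
# hit1 is False (paid call); hit2 is True (served from cache)
```

Any change to the prompt or model produces a different key, so a reworded prompt is a fresh, billed request.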
Cost Analytics
Real-time dashboard showing:
- Total requests and tokens per model
- Cost breakdown by provider
- Cache hit rate
- Error rate and latency percentiles
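All of these metrics fall out of per-request records. A sketch of how cache hit rate, total cost, and a latency percentile could be computed from such records (the field names here are illustrative, not the gateway's log schema):

```python
from statistics import quantiles

# Illustrative per-request records; real gateway logs have their own schema.
requests = [
    {"model": "gpt-4o-mini", "cost": 0.03, "cached": False, "latency_ms": 540},
    {"model": "gpt-4o-mini", "cost": 0.00, "cached": True,  "latency_ms": 8},
    {"model": "claude-3-haiku", "cost": 0.02, "cached": False, "latency_ms": 610},
    {"model": "gpt-4o-mini", "cost": 0.00, "cached": True,  "latency_ms": 9},
]

hit_rate = sum(r["cached"] for r in requests) / len(requests)   # 0.5
total_cost = sum(r["cost"] for r in requests)                   # 0.05
latencies = [r["latency_ms"] for r in requests]
p95_latency = quantiles(latencies, n=100, method="inclusive")[94]
```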
Rate Limiting
Protect your API budget:
Rules:
- Max 100 requests/minute per user
- Max $50/day total spend
- Alert at 80% budget threshold

Provider Fallbacks
Automatic failover between providers:
```json
{
  "providers": ["openai", "anthropic", "azure"],
  "fallback": true,
  "retry": { "attempts": 3, "backoff": "exponential" }
}
```

If OpenAI is down, requests automatically route to Anthropic.
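The retry-then-fall-through behaviour described by that config can be sketched as a loop over providers with exponential backoff between attempts (a simplified illustration, not Cloudflare's implementation):

```python
import time

def call_with_fallback(providers, send, attempts=3, base_delay=0.5):
    """Try each provider in order; retry each one with exponential backoff.

    `send` is a callable (provider) -> response that raises on failure.
    """
    last_error = None
    for provider in providers:
        for attempt in range(attempts):
            try:
                return send(provider)
            except Exception as exc:
                last_error = exc
                # exponential backoff: base_delay, 2x, 4x, ...
                time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("all providers failed") from last_error

# Simulate OpenAI being down: requests fall through to Anthropic.
def send(provider):
    if provider == "openai":
        raise ConnectionError("openai unavailable")
    return f"ok:{provider}"

result = call_with_fallback(["openai", "anthropic", "azure"], send, base_delay=0)
# result == "ok:anthropic"
```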
Logging & Debugging
Every request logged with full details:
- Input/output tokens
- Latency breakdown
- Model used
- Cache status
- Error details
Supported Providers
| Provider | Endpoint Pattern |
|---|---|
| OpenAI | /{gateway}/openai |
| Anthropic | /{gateway}/anthropic |
| Google AI | /{gateway}/google-ai-studio |
| Azure | /{gateway}/azure-openai |
| HuggingFace | /{gateway}/huggingface |
| Workers AI | /{gateway}/workers-ai |
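Every provider shares the same URL shape, so switching providers is just a path change. A small helper building the endpoint from the pattern above, assuming the `gateway.ai.cloudflare.com/v1/{account_id}` prefix of Cloudflare's hosted endpoint (the account ID and gateway name are placeholders):

```python
GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1"

def gateway_url(account_id: str, gateway: str, provider: str) -> str:
    """Build the per-provider endpoint following the /{gateway}/{provider} pattern."""
    return f"{GATEWAY_BASE}/{account_id}/{gateway}/{provider}"

url = gateway_url("ACCOUNT_ID", "my-gateway", "openai")
# url == "https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/my-gateway/openai"
```

In client libraries that accept a custom base URL, passing this string in place of the provider's default endpoint is typically the only change needed.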
Key Stats
- 7,000+ GitHub stars
- Free tier available
- Up to 95% cost reduction with caching
- 6+ provider integrations
- Real-time analytics dashboard
FAQ
Q: What is Cloudflare AI Gateway?
A: A free proxy gateway that adds caching, rate limiting, analytics, and fallback routing to LLM API calls without code changes; just swap the base URL.

Q: Is AI Gateway free?
A: Yes, the free tier includes 10,000 requests/day. Paid plans cover higher volume.

Q: Does it add latency?
A: Minimal: the Cloudflare edge network adds <5ms. Cache hits return in <10ms vs 500ms+ for API calls.