Architecture Comparison
LiteLLM — Self-Hosted Proxy
Your App → LiteLLM (your server) → OpenAI / Anthropic / Azure / etc.
- What it is: Open-source Python proxy you deploy yourself
- Key value: Full control, load balancing, spend tracking
- Deploy: Docker, Kubernetes, or bare metal
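Because LiteLLM exposes an OpenAI-compatible endpoint, any OpenAI-style client works by pointing its base URL at the proxy. A minimal sketch, assuming the proxy runs at its default port 4000, that a model alias `gpt-4o` is mapped in its `config.yaml`, and that the virtual key shown is a placeholder:

```python
import json
import urllib.request

BASE_URL = "http://localhost:4000"  # your self-hosted LiteLLM proxy (default port)

# Standard OpenAI-style chat payload; "gpt-4o" is assumed to be an alias
# configured in the proxy's config.yaml model list.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    f"{BASE_URL}/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer sk-litellm-virtual-key",  # placeholder virtual key
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send the request; omitted here since it
# requires a running proxy.
```

The point is that routing, load balancing, and spend tracking all happen server-side: the application code above never changes when you swap or add providers behind the proxy.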
OpenRouter — Unified API
Your App → OpenRouter (their servers) → 200+ models
- What it is: Managed API gateway with one key for all models
- Key value: One API key, 200+ models, smart routing
- Deploy: Nothing to deploy — use their API
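The "one key, many models" idea looks like this in practice: OpenRouter speaks the OpenAI chat-completions format, and models are addressed as `provider/model`. A sketch with a placeholder API key:

```python
import json
import urllib.request

# Models use "provider/model" naming; switching providers means changing
# only this string -- the key, URL, and request shape stay the same.
payload = {
    "model": "anthropic/claude-3.5-sonnet",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    "https://openrouter.ai/api/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer sk-or-your-key",  # placeholder: one key for all providers
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send it; omitted to keep the sketch offline.
```

There is nothing to deploy: the only per-provider detail left in your code is the model string.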
Cloudflare AI Gateway — Edge Cache
Your App → CF Edge (global CDN) → Any LLM provider
- What it is: Edge proxy that caches and logs LLM requests
- Key value: Response caching, cost reduction, global edge
- Deploy: Configure in Cloudflare dashboard
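Adopting the gateway is a URL change, not a code rewrite: requests go to a Cloudflare edge URL that embeds your account and gateway identifiers, with the final path segment naming the upstream provider. A sketch of the URL pattern, with placeholder account and gateway values:

```python
# Placeholders -- real values come from the Cloudflare dashboard.
ACCOUNT_ID = "your-account-id"
GATEWAY = "your-gateway-slug"

# Gateway URL pattern; the trailing segment ("openai" here) selects which
# upstream provider the edge forwards to.
base_url = f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY}/openai"
endpoint = f"{base_url}/chat/completions"

# Point your existing OpenAI-compatible client at base_url; identical
# repeated requests can then be answered from the edge cache instead of
# reaching the provider at all.
```

Authentication still uses your own provider API key; the gateway sits in the middle for caching, logging, and rate limiting.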
Feature Matrix
| Feature | LiteLLM | OpenRouter | CF Gateway |
|---|---|---|---|
| Self-hosted | Yes | No | No |
| Models | 100+ (via keys) | 200+ (one key) | Any (pass-through) |
| Load balancing | Yes | Automatic | No |
| Fallbacks | Yes | Yes | No |
| Response caching | Yes (Redis, optional) | No | Yes (edge, up to 95% savings) |
| Spend tracking | Yes (Postgres) | Yes (dashboard) | Yes (dashboard) |
| Rate limiting | Yes | Per-key | Yes |
| Latency added | ~5ms (your server) | ~20ms | ~5ms (edge) |
| Open-source | Yes (MIT) | No | Partial |
| Free tier | Yes (self-host) | Limited credits | 10K req/day |
Pricing Comparison
LiteLLM
- Software: Free (open-source)
- Cost: Your server + direct API provider pricing
- Example: ~$20/mo VPS plus provider API usage at list prices (no gateway markup)
OpenRouter
- Pass-through: Most models at provider pricing
- Some models: Small markup (5-15%)
- Free models: Select open-source models at $0
Cloudflare AI Gateway
- Free tier: 10,000 requests/day
- Cache hits: $0 (no API call made)
- Potential savings: Up to 95% for repeated queries
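The "up to 95%" figure follows directly from the cache hit rate, since cache hits cost $0. A back-of-envelope estimate with illustrative numbers (the per-call cost below is an assumption, not Cloudflare or provider pricing):

```python
# Illustrative inputs -- tune to your own traffic and provider pricing.
requests_per_day = 100_000
cache_hit_rate = 0.95        # fraction of repeated queries served from edge cache
cost_per_api_call = 0.002    # assumed provider cost per request, USD

misses = requests_per_day * (1 - cache_hit_rate)
cost_without_cache = requests_per_day * cost_per_api_call
cost_with_cache = misses * cost_per_api_call  # only misses reach the provider

savings = 1 - cost_with_cache / cost_without_cache
print(f"without cache: ${cost_without_cache:.2f}/day")
print(f"with cache:    ${cost_with_cache:.2f}/day")
print(f"savings:       {savings:.0%}")
```

Savings scale linearly with the hit rate, which is why this pays off most for workloads where many users ask the same questions.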
When to Use Each
Use LiteLLM When:
- You need full control over routing logic
- Data sovereignty requires self-hosting
- You want custom load balancing rules
- Your team manages its own infrastructure
Use OpenRouter When:
- You want maximum model access with minimum setup
- You are prototyping and need to try many models
- You do not want to manage API keys per provider
- You need smart routing (cheapest/fastest)
Use Cloudflare AI Gateway When:
- Many users ask similar questions (high cache hit rate)
- You need global edge distribution
- Cost reduction is the primary goal
- You already use Cloudflare
Combine All Three
Many production setups stack gateways:
App → CF AI Gateway (cache) → LiteLLM (load balance) → Providers
↓ (fallback)
OpenRouter (200+ models)
FAQ
Q: Can I use multiple gateways together? A: Yes, they stack well. A common pattern is CF Gateway for caching, LiteLLM for routing, and OpenRouter as a fallback provider.
Q: Which gateway adds the least latency? A: Cloudflare AI Gateway (~5ms, edge) and LiteLLM (~5ms, your server) add minimal latency. OpenRouter adds ~20ms due to their proxy.
Q: Which is best for a small team just starting out? A: OpenRouter for simplest setup. Add Cloudflare AI Gateway when you want caching. Add LiteLLM when you need full control.