# LLM Gateway Comparison — LiteLLM vs OpenRouter vs CF

> In-depth comparison of LLM API gateways: LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache). Architecture, pricing, and when to use each.

## Quick Use

| Need | Best Gateway |
|------|-------------|
| Self-hosted, full control | **LiteLLM** |
| Fastest setup, many models | **OpenRouter** |
| Caching + cost reduction | **Cloudflare AI Gateway** |
| All three combined | LiteLLM (proxy) → OpenRouter (models) → CF (cache) |

---

## Intro

Every team running LLM applications faces the same question: which gateway should sit between my app and the model providers? This guide compares the three leading options — LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache) — across architecture, pricing, features, and ideal use cases. It is aimed at engineering teams choosing their LLM infrastructure stack.

Each gateway solves a different problem, and many production setups combine two or three.

---

## Architecture Comparison

### LiteLLM — Self-Hosted Proxy

```
Your App → LiteLLM (your server) → OpenAI / Anthropic / Azure / etc.
```

- **What it is**: Open-source Python proxy you deploy yourself
- **Key value**: Full control, load balancing, spend tracking
- **Deploy**: Docker, Kubernetes, or bare metal

### OpenRouter — Unified API

```
Your App → OpenRouter (their servers) → 200+ models
```

- **What it is**: Managed API gateway with one key for all models
- **Key value**: One API key, 200+ models, smart routing
- **Deploy**: Nothing to deploy — use their API

### Cloudflare AI Gateway — Edge Cache

```
Your App → CF Edge (global CDN) → Any LLM provider
```

- **What it is**: Edge proxy that caches and logs LLM requests
- **Key value**: Response caching, cost reduction, global edge
- **Deploy**: Configure in the Cloudflare dashboard

## Feature Matrix

| Feature | LiteLLM | OpenRouter | CF Gateway |
|---------|---------|------------|------------|
| Self-hosted | Yes | No | No |
| Models | 100+ (via your keys) | 200+ (one key) | Any (pass-through) |
| Load balancing | Yes | Automatic | No |
| Fallbacks | Yes | Yes | No |
| Response caching | No | No | Yes (up to 95% savings) |
| Spend tracking | Yes (Postgres) | Yes (dashboard) | Yes (dashboard) |
| Rate limiting | Yes | Per-key | Yes |
| Latency added | ~5ms (your server) | ~20ms | ~5ms (edge) |
| Open-source | Yes (MIT) | No | Partial |
| Free tier | Yes (self-host) | Limited credits | 10K req/day |

## Pricing Comparison

### LiteLLM

- **Software**: Free (open-source)
- **Cost**: Your server plus direct provider API pricing
- **Example**: $20/mo VPS + provider costs with no markup

### OpenRouter

- **Pass-through**: Most models at provider pricing
- **Some models**: Small markup (5-15%)
- **Free models**: Select open-source models at $0

### Cloudflare AI Gateway

- **Free tier**: 10,000 requests/day
- **Cache hits**: $0 (no upstream API call is made)
- **Potential savings**: Up to 95% for repeated queries

## When to Use Each

### Use LiteLLM When:

- You need full control over routing logic
- Data sovereignty requires self-hosting
- You want custom load balancing rules
- Your
team manages its own infrastructure

### Use OpenRouter When:

- You want maximum model access with minimum setup
- You are prototyping and need to try many models
- You do not want to manage API keys per provider
- You need smart routing (cheapest/fastest)

### Use Cloudflare AI Gateway When:

- Many users ask similar questions (high cache hit rate)
- You need global edge distribution
- Cost reduction is the primary goal
- You already use Cloudflare

### Combine All Three

Many production setups stack gateways:

```
App → CF AI Gateway (cache) → LiteLLM (load balance) → Providers
                                    ↓ (fallback)
                             OpenRouter (200+ models)
```

## FAQ

**Q: Can I use multiple gateways together?**
A: Yes, they stack well. A common pattern is CF Gateway for caching, LiteLLM for routing, and OpenRouter as a fallback provider.

**Q: Which gateway adds the least latency?**
A: Cloudflare AI Gateway (~5ms, edge) and LiteLLM (~5ms, your server) add minimal latency. OpenRouter adds ~20ms due to their proxy.

**Q: Which is best for a small team just starting out?**
A: OpenRouter for the simplest setup. Add Cloudflare AI Gateway when you want caching. Add LiteLLM when you need full control.

---

## Source & Thanks

> Comparison based on official documentation and community benchmarks as of April 2026.
>
> Related assets on TokRepo: [LiteLLM](https://tokrepo.com), [OpenRouter](https://tokrepo.com), [Cloudflare AI Gateway](https://tokrepo.com)

---

Source: https://tokrepo.com/en/workflows/27fc09fd-0f35-4c66-b033-aaf970b53d8e
Author: Prompt Lab
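
## Appendix: Switching Gateways in Code

All three gateways can front an OpenAI-compatible chat endpoint, so in practice switching between them is mostly a base-URL change. The sketch below shows that pattern; the URL shapes follow each product's documented conventions, but `localhost:4000`, `ACCOUNT_ID`, and `GATEWAY_ID` are placeholders you must replace with your own deployment's values, and the helper itself is illustrative, not an official API.

```python
def gateway_config(gateway: str, api_key: str,
                   account_id: str = "ACCOUNT_ID",
                   gateway_id: str = "GATEWAY_ID") -> dict:
    """Return the base URL and auth header for a given LLM gateway."""
    if gateway == "litellm":
        # Self-hosted LiteLLM proxy -- host/port depend on your deployment
        # (4000 is the proxy's default port).
        base_url = "http://localhost:4000/v1"
    elif gateway == "openrouter":
        # OpenRouter's OpenAI-compatible endpoint.
        base_url = "https://openrouter.ai/api/v1"
    elif gateway == "cloudflare":
        # Cloudflare AI Gateway path for an OpenAI upstream.
        base_url = (f"https://gateway.ai.cloudflare.com/v1/"
                    f"{account_id}/{gateway_id}/openai")
    else:
        raise ValueError(f"unknown gateway: {gateway}")
    return {"base_url": base_url,
            "headers": {"Authorization": f"Bearer {api_key}"}}

# Example: point any OpenAI-compatible client at a gateway by base URL,
# e.g. openai.OpenAI(base_url=cfg["base_url"], api_key=...).
cfg = gateway_config("openrouter", api_key="sk-or-...")
print(cfg["base_url"])  # https://openrouter.ai/api/v1
```

Because the request/response shape stays the same across all three, you can move from OpenRouter (fastest start) to LiteLLM or CF Gateway later without rewriting application code.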