Prompts · Apr 6, 2026 · 4 min read

LLM Gateway Comparison — LiteLLM vs OpenRouter vs CF

In-depth comparison of LLM API gateways: LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache). Architecture, pricing, and when to use each.

Prompt Lab · Community
Quick Use

Use it first, then decide how deep to go

The table below gives the short answer; the rest of the guide explains the trade-offs behind it.

| Need | Best Gateway |
|------|--------------|
| Self-hosted, full control | LiteLLM |
| Fastest setup, many models | OpenRouter |
| Caching + cost reduction | Cloudflare AI Gateway |
| All three combined | LiteLLM (proxy) → OpenRouter (models) → CF (cache) |


Intro

Every team running LLM applications faces the same question: which gateway should sit between my app and the model providers? This guide compares the three leading options — LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache) — across architecture, pricing, features, and ideal use cases. It is aimed at engineering teams choosing their LLM infrastructure stack. Each gateway solves a different problem, and many production setups combine two or three.


Architecture Comparison

LiteLLM — Self-Hosted Proxy

Your App → LiteLLM (your server) → OpenAI / Anthropic / Azure / etc.
  • What it is: Open-source Python proxy you deploy yourself
  • Key value: Full control, load balancing, spend tracking
  • Deploy: Docker, Kubernetes, or bare metal

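Because LiteLLM's proxy exposes an OpenAI-compatible endpoint, your app code barely changes: you point the base URL at your own server instead of the provider. A minimal sketch, assuming the proxy's default port 4000; the model alias and key are placeholders you would define in your proxy config:

```python
import json
import urllib.request

def build_proxy_request(prompt: str,
                        base_url: str = "http://localhost:4000",   # LiteLLM's default proxy port
                        model: str = "gpt-4o",                     # alias from your proxy config (placeholder)
                        api_key: str = "sk-litellm-placeholder"):  # virtual key issued by the proxy (placeholder)
    """Build an OpenAI-compatible chat request aimed at a self-hosted LiteLLM proxy."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_proxy_request("Hello")
# Sending is left out of the sketch; urllib.request.urlopen(req) would dispatch it.
```

Because the proxy owns the upstream provider keys, rotating or swapping providers happens in its config, not in application code.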
OpenRouter — Unified API

Your App → OpenRouter (their servers) → 200+ models
  • What it is: Managed API gateway with one key for all models
  • Key value: One API key, 200+ models, smart routing
  • Deploy: Nothing to deploy — use their API
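OpenRouter also speaks the OpenAI chat-completions wire format, so switching to it is mostly a matter of URL and key. A hedged sketch; the API key is a placeholder, and the `"openrouter/auto"` model string asks OpenRouter to pick a model for you:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def openrouter_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a chat request for OpenRouter's OpenAI-compatible endpoint.
    The model string selects the provider ("openai/gpt-4o", "anthropic/...");
    "openrouter/auto" lets OpenRouter route the request for you."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = openrouter_request("openrouter/auto", "Hello", "sk-or-placeholder")
```

The single key is the whole point: one credential, one endpoint, and the model string does the rest.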

Cloudflare AI Gateway — Edge Cache

Your App → CF Edge (global CDN) → Any LLM provider
  • What it is: Edge proxy that caches and logs LLM requests
  • Key value: Response caching, cost reduction, global edge
  • Deploy: Configure in Cloudflare dashboard

Feature Matrix

| Feature | LiteLLM | OpenRouter | CF Gateway |
|---------|---------|------------|------------|
| Self-hosted | Yes | No | No |
| Models | 100+ (via keys) | 200+ (one key) | Any (pass-through) |
| Load balancing | Yes | Automatic | No |
| Fallbacks | Yes | Yes | No |
| Response caching | No | No | Yes (up to 95% savings) |
| Spend tracking | Yes (Postgres) | Yes (dashboard) | Yes (dashboard) |
| Rate limiting | Yes | Per-key | Yes |
| Latency added | ~5ms (your server) | ~20ms | ~5ms (edge) |
| Open-source | Yes (MIT) | No | Partial |
| Free tier | Yes (self-host) | Limited credits | 10K req/day |

Pricing Comparison

LiteLLM

  • Software: Free (open-source)
  • Cost: Your server + direct API provider pricing
  • Example: $20/mo VPS + provider costs at wholesale

OpenRouter

  • Pass-through: Most models at provider pricing
  • Some models: Small markup (5-15%)
  • Free models: Select open-source models at $0

Cloudflare AI Gateway

  • Free tier: 10,000 requests/day
  • Cache hits: $0 (no API call made)
  • Potential savings: Up to 95% for repeated queries
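The savings math is straightforward: only cache misses reach the provider, so your bill scales with the miss rate. A back-of-envelope sketch with illustrative numbers (100K requests/month at $0.01 per call are assumptions, not benchmarks):

```python
def monthly_llm_cost(requests: int, cost_per_call: float, cache_hit_rate: float) -> float:
    """Cache hits are served free from the edge; only misses incur API cost."""
    misses = requests * (1.0 - cache_hit_rate)
    return misses * cost_per_call

baseline = monthly_llm_cost(100_000, 0.01, 0.0)   # no caching: $1000
cached   = monthly_llm_cost(100_000, 0.01, 0.95)  # 95% hit rate: about $50
```

The 95% figure is a best case for workloads with highly repetitive queries; a chat app with unique prompts per user will sit far lower.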

When to Use Each

Use LiteLLM When:

  • You need full control over routing logic
  • Data sovereignty requires self-hosting
  • You want custom load balancing rules
  • Your team manages its own infrastructure

Use OpenRouter When:

  • You want maximum model access with minimum setup
  • You are prototyping and need to try many models
  • You do not want to manage API keys per provider
  • You need smart routing (cheapest/fastest)

Use Cloudflare AI Gateway When:

  • Many users ask similar questions (high cache hit rate)
  • You need global edge distribution
  • Cost reduction is the primary goal
  • You already use Cloudflare

Combine All Three

Many production setups stack gateways:

App → CF AI Gateway (cache) → LiteLLM (load balance) → Providers
                                    ↓ (fallback)
                               OpenRouter (200+ models)
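The fallback leg of the diagram above is just ordered retry logic. A minimal sketch with stub callables standing in for real clients (the names `flaky` and `backup` are hypothetical):

```python
def complete_with_fallback(prompt: str, providers):
    """Try each (name, callable) provider in order; return the first success.
    E.g. a LiteLLM deployment first, OpenRouter as the catch-all."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # in production, catch provider-specific errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Stubs standing in for real clients:
def flaky(prompt):
    raise TimeoutError("upstream timeout")

def backup(prompt):
    return f"echo: {prompt}"

result = complete_with_fallback("hi", [("litellm", flaky), ("openrouter", backup)])
# result == "echo: hi" -- the first provider failed, the second answered
```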

FAQ

Q: Can I use multiple gateways together?
A: Yes, they stack well. A common pattern is CF Gateway for caching, LiteLLM for routing, and OpenRouter as a fallback provider.

Q: Which gateway adds the least latency?
A: Cloudflare AI Gateway (~5ms, edge) and LiteLLM (~5ms, your server) add minimal latency. OpenRouter adds ~20ms due to their proxy.

Q: Which is best for a small team just starting out?
A: OpenRouter for the simplest setup. Add Cloudflare AI Gateway when you want caching. Add LiteLLM when you need full control.



Source & Thanks

Comparison based on official documentation and community benchmarks as of April 2026.

Related assets on TokRepo: LiteLLM, OpenRouter, Cloudflare AI Gateway
