# LLM Gateway Comparison — LiteLLM vs OpenRouter vs CF

> In-depth comparison of LLM API gateways: LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache). Architecture, pricing, and when to use each.

## Quick Use

| Need | Best Gateway |
|------|-------------|
| Self-hosted, full control | **LiteLLM** |
| Fastest setup, many models | **OpenRouter** |
| Caching + cost reduction | **Cloudflare AI Gateway** |
| All three combined | LiteLLM (proxy) → OpenRouter (models) → CF (cache) |

---

## Intro

Every team running LLM applications faces the same question: which gateway should sit between my app and the model providers? This guide compares the three leading options — LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache) — across architecture, pricing, features, and ideal use cases. It is aimed at engineering teams choosing their LLM infrastructure stack.

Each gateway solves a different problem, and many production setups combine two or three.

---

## Architecture Comparison

### LiteLLM — Self-Hosted Proxy

```
Your App → LiteLLM (your server) → OpenAI / Anthropic / Azure / etc.
```

- **What it is**: Open-source Python proxy you deploy yourself
- **Key value**: Full control, load balancing, spend tracking
- **Deploy**: Docker, Kubernetes, or bare metal

### OpenRouter — Unified API

```
Your App → OpenRouter (their servers) → 200+ models
```

- **What it is**: Managed API gateway with one key for all models
- **Key value**: One API key, 200+ models, smart routing
- **Deploy**: Nothing to deploy — use their API

### Cloudflare AI Gateway — Edge Cache

```
Your App → CF Edge (global CDN) → Any LLM provider
```

- **What it is**: Edge proxy that caches and logs LLM requests
- **Key value**: Response caching, cost reduction, global edge
- **Deploy**: Configure in the Cloudflare dashboard

## Feature Matrix

| Feature | LiteLLM | OpenRouter | CF Gateway |
|---------|---------|------------|------------|
| Self-hosted | Yes | No | No |
| Models | 100+ (via your keys) | 200+ (one key) | Any (pass-through) |
| Load balancing | Yes | Automatic | No |
| Fallbacks | Yes | Yes | No |
| Response caching | No | No | Yes (up to 95% savings) |
| Spend tracking | Yes (Postgres) | Yes (dashboard) | Yes (dashboard) |
| Rate limiting | Yes | Per-key | Yes |
| Latency added | ~5ms (your server) | ~20ms | ~5ms (edge) |
| Open-source | Yes (MIT) | No | Partial |
| Free tier | Yes (self-host) | Limited credits | 10K req/day |

## Pricing Comparison

### LiteLLM

- **Software**: Free (open-source)
- **Cost**: Your server plus direct provider API pricing
- **Example**: $20/mo VPS + provider costs with no markup

### OpenRouter

- **Pass-through**: Most models at provider pricing
- **Some models**: Small markup (5-15%)
- **Free models**: Select open-source models at $0

### Cloudflare AI Gateway

- **Free tier**: 10,000 requests/day
- **Cache hits**: $0 (no upstream API call is made)
- **Potential savings**: Up to 95% for repeated queries

## When to Use Each

### Use LiteLLM When:

- You need full control over routing logic
- Data sovereignty requires self-hosting
- You want custom load balancing rules
- Your
team manages its own infrastructure

### Use OpenRouter When:

- You want maximum model access with minimum setup
- You are prototyping and need to try many models
- You do not want to manage API keys per provider
- You need smart routing (cheapest/fastest)

### Use Cloudflare AI Gateway When:

- Many users ask similar questions (high cache hit rate)
- You need global edge distribution
- Cost reduction is the primary goal
- You already use Cloudflare

### Combine All Three

Many production setups stack gateways:

```
App → CF AI Gateway (cache) → LiteLLM (load balance) → Providers
                                    ↓ (fallback)
                             OpenRouter (200+ models)
```

## FAQ

**Q: Can I use multiple gateways together?**
A: Yes, they stack well. A common pattern is CF Gateway for caching, LiteLLM for routing, and OpenRouter as a fallback provider.

**Q: Which gateway adds the least latency?**
A: Cloudflare AI Gateway (~5ms, edge) and LiteLLM (~5ms, your server) add minimal latency. OpenRouter adds ~20ms due to their proxy.

**Q: Which is best for a small team just starting out?**
A: OpenRouter for the simplest setup. Add Cloudflare AI Gateway when you want caching. Add LiteLLM when you need full control.

---

## Source & Thanks

> Comparison based on official documentation and community benchmarks as of April 2026.
>
> Related assets on TokRepo: [LiteLLM](https://tokrepo.com), [OpenRouter](https://tokrepo.com), [Cloudflare AI Gateway](https://tokrepo.com)

---

Source: https://tokrepo.com/en/workflows/27fc09fd-0f35-4c66-b033-aaf970b53d8e
Author: Prompt Lab
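
## Appendix: Switching Gateways in Code

All three gateways can front an OpenAI-compatible chat endpoint, so in practice switching between them is mostly a base-URL change. The sketch below shows that pattern; the URL shapes follow each product's documented conventions, but `localhost:4000`, `ACCOUNT_ID`, and `GATEWAY_ID` are placeholders you must replace with your own deployment's values, and the helper itself is illustrative, not an official API.

```python
def gateway_config(gateway: str, api_key: str,
                   account_id: str = "ACCOUNT_ID",
                   gateway_id: str = "GATEWAY_ID") -> dict:
    """Return the base URL and auth header for a given LLM gateway."""
    if gateway == "litellm":
        # Self-hosted LiteLLM proxy -- host/port depend on your deployment
        # (4000 is the proxy's default port).
        base_url = "http://localhost:4000/v1"
    elif gateway == "openrouter":
        # OpenRouter's OpenAI-compatible endpoint.
        base_url = "https://openrouter.ai/api/v1"
    elif gateway == "cloudflare":
        # Cloudflare AI Gateway path for an OpenAI upstream.
        base_url = (f"https://gateway.ai.cloudflare.com/v1/"
                    f"{account_id}/{gateway_id}/openai")
    else:
        raise ValueError(f"unknown gateway: {gateway}")
    return {"base_url": base_url,
            "headers": {"Authorization": f"Bearer {api_key}"}}

# Example: point any OpenAI-compatible client at a gateway by base URL,
# e.g. openai.OpenAI(base_url=cfg["base_url"], api_key=...).
cfg = gateway_config("openrouter", api_key="sk-or-...")
print(cfg["base_url"])  # https://openrouter.ai/api/v1
```

Because the request/response shape stays the same across all three, you can move from OpenRouter (fastest start) to LiteLLM or CF Gateway later without rewriting application code.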