Cette page est affichée en anglais. Une traduction française est en cours.

PromptsApr 6, 2026·4 min de lecture

LLM Gateway Comparison — LiteLLM vs OpenRouter vs CF

In-depth comparison of LLM API gateways: LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache). Architecture, pricing, and when to use each.

Prompt Lab · Community

Prêt pour agents

Installation agent prête

Cet actif peut être installé après choix du runtime, vérification du plan et exécution de la commande adaptée.

Native · 96/100Policy : autoriser

Surface agent

Tout agent MCP/CLI

Type

Prompt

Installation

Single

Confiance

Confiance : Community

Point d'entrée

LLM Gateway Comparison — LiteLLM vs OpenRouter vs CF

Commande d'installation directe

npx -y tokrepo@latest install 27fc09fd-0f35-4c66-b033-aaf970b53d8e --target codex

À exécuter après confirmation du plan en dry-run.

TL;DR

Compare LiteLLM, OpenRouter, and Cloudflare AI Gateway for routing, caching, and cost control.

§01

What it is

This guide compares three LLM API gateways: LiteLLM (a self-hosted proxy that unifies 100+ LLM providers behind an OpenAI-compatible API), OpenRouter (a managed unified API with usage-based pricing), and Cloudflare AI Gateway (an edge-layer cache and rate limiter for LLM calls).

The comparison targets engineering teams deciding how to route LLM traffic across multiple providers with fallback, load balancing, and cost visibility.

§02

How it saves time or tokens

Choosing the right gateway before building saves weeks of refactoring. LiteLLM gives you full control and zero markup but requires self-hosting. OpenRouter gives you instant access to many models with no infrastructure but adds a small fee per token. Cloudflare AI Gateway sits in front of any provider and caches repeated requests, reducing token spend.

This comparison condenses the decision into architecture, pricing, and use-case fit.

§03

How to use

Assess your requirements: self-hosted vs managed, number of providers, caching needs
If self-hosted with full control: pick LiteLLM and deploy as a Docker container
If managed with broad model access: pick OpenRouter and use their unified API key
If you need edge caching on top of existing provider keys: add Cloudflare AI Gateway

§04

Example

# LiteLLM: unified call across providers
from litellm import completion

# Same interface, different providers
response = completion(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Hello'}]
)

# Switch to Anthropic with one line change
response = completion(
    model='claude-sonnet-4-20250514',
    messages=[{'role': 'user', 'content': 'Hello'}]
)

# Fallback chain
response = completion(
    model='gpt-4o',
    fallbacks=['claude-sonnet-4-20250514', 'gemini/gemini-pro'],
    messages=[{'role': 'user', 'content': 'Hello'}]
)

§05

Related on TokRepo

AI gateway providers -- Compare all gateway providers on TokRepo
LiteLLM deep-dive -- Detailed LiteLLM page

§06

Common pitfalls

LiteLLM requires managing your own provider API keys and infrastructure; do not assume it handles billing
OpenRouter adds a markup on top of provider pricing; for high-volume use, self-hosted LiteLLM is cheaper
Cloudflare AI Gateway only caches identical requests; slight prompt variations produce cache misses

Questions fréquentes

Which gateway has the lowest latency?+

LiteLLM adds minimal latency since it runs alongside your app. OpenRouter adds network hop latency to their proxy. Cloudflare AI Gateway sits at the edge and can reduce latency for cached responses but adds a hop for cache misses. For latency-sensitive apps, self-hosted LiteLLM is fastest.

Can I use multiple gateways together?+

Yes. A common pattern is LiteLLM for routing and fallback, with Cloudflare AI Gateway in front for caching and rate limiting. OpenRouter can be one of LiteLLM's backend providers, giving you access to models you do not have direct API keys for.

How does pricing compare across gateways?+

LiteLLM is free and open-source; you pay only provider costs. OpenRouter adds a small percentage markup on provider pricing. Cloudflare AI Gateway is free for the first 10K requests per month, then usage-based. Total cost depends on your volume and provider mix.

Do these gateways support streaming?+

All three support streaming. LiteLLM proxies SSE streams from providers. OpenRouter streams via their API. Cloudflare AI Gateway passes through streaming responses. Streaming behavior is transparent to your application.

Which gateway is best for a startup?+

Start with OpenRouter for fast prototyping without managing infrastructure. Move to LiteLLM when you want cost control and self-hosting. Add Cloudflare AI Gateway when you need edge caching or rate limiting for production traffic.

Sources citées (3)

LiteLLM GitHub— LiteLLM unifies 100+ LLM providers behind an OpenAI-compatible API
OpenRouter— OpenRouter provides a unified API for accessing multiple LLM providers
Cloudflare Docs— Cloudflare AI Gateway provides caching, rate limiting, and observability for AI …

En lien sur TokRepo

AI gateway providers LiteLLM Cloudflare AI Gateway

🙏

Source et remerciements

Comparison based on official documentation and community benchmarks as of April 2026.

Related assets on TokRepo: LiteLLM, OpenRouter, Cloudflare AI Gateway

Fil de discussion

Connectez-vous pour rejoindre la discussion.

Aucun commentaire pour l'instant. Soyez le premier à partager votre avis.

Actifs similaires

Cursor vs Claude Code vs Codex — AI Coding Compared

In-depth comparison of the three leading AI coding tools in 2026. Covers pricing, context window, MCP support, agent capabilities, and best use cases for each platform.

Prompts

Prompt Lab

LLM Gateway Comparison — Proxy Your AI Requests

Compare top LLM gateway and proxy tools for routing AI requests. Covers LiteLLM, Bifrost, Portkey, and OpenRouter for cost optimization, failover, and multi-provider access.

Skills

Agent Toolkit

LiteLLM Proxy — Unified Gateway for 100+ LLM APIs

LiteLLM Proxy maps 100+ LLM providers (Anthropic, OpenAI, Bedrock, Vertex) to one OpenAI-compatible endpoint. Auth, rate limit, cost track, fallbacks.

Workflows

LiteLLM (BerriAI)

Cloudflare AI Gateway — LLM Proxy, Cache & Analytics

Free proxy gateway for LLM API calls with caching, rate limiting, cost tracking, and fallback routing across providers. Reduce costs up to 95% with response caching. 7,000+ stars.

Skills

Cloudflare