Why Choose It
LiteLLM is the "one SDK for every LLM" answer, plus a full Proxy server for teams that want a hosted gateway they control. The SDK alone normalizes inputs and outputs: completion(model="claude-3-5-sonnet", messages=[...]) works identically to the OpenAI call. The Proxy adds routing, budgets, key management, logging, and a Swagger UI.
It’s the most popular OSS gateway (25K+ GitHub stars) and the standard reference for framework-agnostic multi-model access. LangChain, LlamaIndex, and CrewAI all support LiteLLM as a model provider out of the box. If you’ve read "point it at any OpenAI-compatible endpoint" in a dozen READMEs, LiteLLM is how most of those setups work.
What you give up: polish. The dashboard exists but is functional, not beautiful. Observability is present but not deep — most teams pair LiteLLM Proxy with Langfuse or Helicone for traces. For the free-and-open price, you trade UX for control.
Quick Start — SDK or Proxy
The SDK is the fastest path to multi-provider support — no server to run. The Proxy is a small FastAPI server that exposes OpenAI-compatible endpoints; point any OpenAI SDK at it. Config-driven routing means you change providers or load-balance strategies without touching app code.
# Option A: SDK only (no server needed)
# pip install litellm
# Provider keys are read from the environment, e.g. ANTHROPIC_API_KEY
from litellm import completion

resp = completion(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello from LiteLLM"}],
)
print(resp.choices[0].message.content)
# Option B: Run the Proxy for team use
# pip install 'litellm[proxy]'
# litellm --config config.yaml --port 4000
#
# config.yaml:
# model_list:
#   - model_name: fast
#     litellm_params:
#       model: gpt-4o-mini
#       api_key: os.environ/OPENAI_KEY
#   - model_name: fast            # same name = one load-balanced group
#     litellm_params:
#       model: claude-3-5-haiku-20241022
#       api_key: os.environ/ANTHROPIC_KEY
# router_settings:
#   routing_strategy: usage-based-routing-v2
# Now call the proxy as if it were OpenAI
from openai import OpenAI
client = OpenAI(base_url="http://localhost:4000", api_key="sk-proxy-token")
r = client.chat.completions.create(model="fast", messages=[{"role":"user","content":"hi"}])
# Proxy load-balances between gpt-4o-mini and claude-3-5-haiku based on usage.
Core Capabilities
100+ providers
OpenAI, Anthropic, Gemini, Bedrock, Azure, Vertex, Ollama, Together, Fireworks, Anyscale, Groq, Mistral, Cohere, HuggingFace, and many more. All through the same completion() signature.
Proxy server
Production-grade FastAPI server: routing, load-balancing, retries, caching, key management, and user budgets. Deploy with Docker; expose as an internal OpenAI-compatible endpoint.
Budgets & rate limits
Per-user, per-team, per-key budgets enforced at the Proxy. Alerts on 80% / 100% spend. Essential for multi-tenant or internal platform-as-a-service setups.
Langfuse / Helicone / Sentry hooks
Native callback integrations. Pair LiteLLM Proxy with Langfuse for traces, Helicone for observability, Sentry for errors. Configure via proxy YAML.
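A minimal proxy config sketch for these callbacks. The `litellm_settings` keys below follow the documented callback config; verify the exact names against the current LiteLLM docs, and note that Langfuse and Sentry credentials are read from environment variables:

```yaml
litellm_settings:
  success_callback: ["langfuse"]   # send traces for successful calls to Langfuse
  failure_callback: ["sentry"]     # send errors to Sentry
# Requires LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY and SENTRY_DSN
# to be set in the proxy's environment.
```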
Fallback & retry
Declarative fallback lists: try Claude, fall back to GPT-4o, then to gpt-4o-mini. Exponential backoff built in. Configurable per route.
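As a sketch, a fallback chain in the proxy config might look like the following. The `fallbacks` and `num_retries` keys are taken from the documented config format, but their exact placement has shifted across versions, so check the current docs:

```yaml
litellm_settings:
  num_retries: 2                   # retries with exponential backoff
  fallbacks:
    # try claude-3-5-sonnet first, then gpt-4o, then gpt-4o-mini
    - {"claude-3-5-sonnet-20241022": ["gpt-4o", "gpt-4o-mini"]}
```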
Custom auth & RBAC
Proxy generates virtual keys per user; role-based access controls which models and budgets each user can hit. Integrates with your existing SSO via OIDC.
Comparison
| Tool | License | Deployment | Dashboard | Best For |
|---|---|---|---|---|
| LiteLLM (this) | MIT (SDK + Proxy) | Self-host | Functional | Teams wanting an OSS gateway + unified SDK |
| Portkey | Gateway Apache 2.0; cloud proprietary | Managed + self-host | Polished | Teams wanting managed UX |
| OpenRouter | Proprietary | Managed only | Web UI | Quick multi-model experiments |
| Cloudflare AI Gateway | Proprietary | Managed only | Web UI | Edge caching, simple setup |
Real-World Use Cases
01. Internal AI platform
Platform team runs LiteLLM Proxy; product teams hit one OpenAI-compatible endpoint. Central control over providers, keys, budgets; no central code deploys when a team wants a new model.
02. Multi-model apps
Agents that route between fast/cheap and slow/powerful models. LiteLLM’s unified completion() signature means the routing logic is 10 lines, not an integration per provider.
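Those "10 lines" can be sketched in plain Python. The model names and the length/keyword heuristic below are illustrative, not part of LiteLLM; the point is that routing reduces to picking a model string:

```python
# Route each request to a cheap or powerful model, then make one unified call.
# Swap the heuristic for your own criteria (token count, task type, user tier).
CHEAP = "gpt-4o-mini"
POWERFUL = "anthropic/claude-3-5-sonnet-20241022"

def pick_model(prompt: str) -> str:
    """Long or explicitly multi-step prompts go to the powerful model."""
    if len(prompt) > 500 or "step by step" in prompt.lower():
        return POWERFUL
    return CHEAP

# With LiteLLM, the actual call is the same regardless of which model wins:
# from litellm import completion
# resp = completion(model=pick_model(prompt),
#                   messages=[{"role": "user", "content": prompt}])
```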
03. Local + cloud hybrid
Use Ollama for dev and cheap inference, OpenAI/Claude for production. Same code path — switch via the model name.
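One way to express that switch, assuming a hypothetical `APP_ENV` variable (the env var name and model strings are illustrative):

```python
import os

def model_for_env() -> str:
    """Local Ollama model in dev, hosted model in production; same code path."""
    if os.environ.get("APP_ENV", "dev") == "production":
        return "anthropic/claude-3-5-sonnet-20241022"
    return "ollama/llama3.1"

# from litellm import completion
# resp = completion(model=model_for_env(), messages=[...])
```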
Pricing & License
LiteLLM: MIT license, free. The open-source project is maintained by BerriAI and a growing community; no paid SKU is required to run it. For commercial support, litellm.ai offers hosted and enterprise tiers with SLAs.
Operational cost: a small VM for the Proxy (a 2 vCPU / 4 GB instance comfortably handles typical internal traffic), plus your underlying LLM spend. No per-request gateway fees.
What you pay in hidden complexity: self-hosting means you own uptime, upgrades, and debugging. For teams that want to "pay and forget", Portkey or Cloudflare lower the ops burden at the cost of the control a self-hosted, MIT-licensed gateway gives you.
Related TokRepo Assets
LLM Gateway Comparison — Proxy Your AI Requests
Compare top LLM gateway and proxy tools for routing AI requests. Covers LiteLLM, Bifrost, Portkey, and OpenRouter for cost optimization, failover, and multi-provider access.
LiteLLM — Unified Proxy for 100+ LLM APIs
Python SDK and proxy server to call 100+ LLM APIs in OpenAI format. Cost tracking, guardrails, load balancing, logging. Supports Bedrock, Azure, Anthropic, Vertex, and more. 42K+ stars.
LiteLLM — Universal LLM API Gateway, 100+ Providers
Unified API proxy for 100+ LLM providers including OpenAI, Anthropic, Bedrock, Azure, and Vertex AI. Drop-in OpenAI replacement with load balancing and spend tracking. 18,000+ GitHub stars.
LLM Gateway Comparison — LiteLLM vs OpenRouter vs CF
In-depth comparison of LLM API gateways: LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache). Architecture, pricing, and when to use each.
FAQ
LiteLLM SDK vs LiteLLM Proxy — which do I need?
SDK for single apps: you want unified completion() calls, no server. Proxy for teams / internal platform: multiple apps share the gateway, centralized keys and budgets, OpenAI-compatible endpoint for tools that want one.
Does LiteLLM add latency?
SDK: ~0 (in-process). Proxy: 3-10ms hot-path overhead. Caching and load-balancing can save far more than they cost on realistic traffic.
How does LiteLLM compare to OpenRouter?
OpenRouter is a managed SaaS with pay-per-token pricing across providers. LiteLLM is self-hosted with BYO-keys. Use OpenRouter for fast experimentation or when you want one invoice; use LiteLLM when you want control over keys, budgets, and data flow.
Is LiteLLM production-ready?
Yes — it is deployed in production by many large organizations; see the GitHub README for the list of adopters. Expected caveats: the project moves fast, so watch the changelog for occasional breaking changes and upgrade in staging before production.
Does it work with Claude Code / Cursor / Cline?
Yes. Any tool that accepts an OpenAI-compatible endpoint (base URL + API key) works. Point Cursor or Cline at your LiteLLM Proxy, and the tool’s "OpenAI" integration now routes through your multi-provider gateway.
How do I add a new provider?
LiteLLM’s /providers list covers most mainstream LLMs. For new or custom ones, register a generic OpenAI-compatible endpoint in the model_list config — no code change needed.
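A sketch of that config, using the documented pattern of an `openai/` prefix plus `api_base` for generic OpenAI-compatible endpoints. The model names and URL below are placeholders for your own deployment:

```yaml
model_list:
  - model_name: my-model               # the name your apps will call
    litellm_params:
      model: openai/served-model-name  # openai/ prefix = generic OpenAI-compatible
      api_base: https://llm.internal.example.com/v1
      api_key: os.environ/MY_MODEL_KEY
```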