Quick Use
- Save the YAML config below as `config.yaml`
- Run: `docker run -p 4000:4000 -v $(pwd)/config.yaml:/app/config.yaml ghcr.io/berriai/litellm:main-stable --config /app/config.yaml`
- Point any OpenAI SDK at `http://localhost:4000`
Intro
LiteLLM Proxy is a self-hostable gateway that exposes 100+ LLM providers behind one OpenAI-compatible endpoint. Point any OpenAI SDK at the proxy and route to Anthropic / Bedrock / Vertex / Together / Groq / local Ollama with no client code changes. It adds team-level auth, rate limits, cost tracking, and fallbacks.

- Best for: enterprises with multi-team, multi-provider LLM use, or anyone who wants to swap models without rewriting clients.
- Works with: any OpenAI SDK (Python, Node, Go, etc.), Claude Code, and Cursor (via a custom OpenAI base URL).
- Setup time: 5 minutes (Docker).
Run the proxy
```yaml
# config.yaml
model_list:
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: codestral
    litellm_params:
      model: ollama/codestral
      api_base: http://localhost:11434

general_settings:
  master_key: sk-1234
  database_url: postgresql://...

router_settings:
  routing_strategy: simple-shuffle
  fallbacks:
    - claude-3-5-sonnet: ["gpt-4o"]  # if Claude rate-limits, try GPT
```

```bash
docker run -p 4000:4000 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e ANTHROPIC_API_KEY \
  -e OPENAI_API_KEY \
  ghcr.io/berriai/litellm:main-stable \
  --config /app/config.yaml
```
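Once the container is up, you can smoke-test routing with a raw curl before touching any SDK. A minimal sketch, authenticating with the master_key from the config above (in practice you'd use a virtual key instead):

```bash
# Should return an OpenAI-format chat completion, routed to Anthropic
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "ping"}]
  }'
```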
Use from any OpenAI SDK

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-virtual-key-for-team-acme",  # generated via /key/generate
)

# Same API, but routed through LiteLLM
resp = client.chat.completions.create(
    model="claude-3-5-sonnet",  # name from config.yaml
    messages=[{"role": "user", "content": "Hello"}],
)
```
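Streaming goes through the same surface: set `stream` and the proxy relays OpenAI-style SSE chunks regardless of the upstream provider. A sketch with curl, reusing the placeholder virtual key from the example above:

```bash
# -N disables curl buffering; chunks arrive as OpenAI-format SSE
curl -N http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-virtual-key-for-team-acme" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
```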
Per-team budgets and rate limits

```bash
# Generate a key for team Acme with a $50/mo budget
curl -X POST http://localhost:4000/key/generate \
-H "Authorization: Bearer sk-1234" \
-d '{"team_id": "acme", "max_budget": 50, "rpm_limit": 100}'
# Returns: {"key": "sk-acme-xyz123", ...}
```

LiteLLM tracks every call per team, blocks once the budget is hit, and exposes a /spend dashboard.
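To check how much of a key's budget has been consumed, you can query the key back. A sketch, assuming the proxy's /key/info endpoint (which takes the key as a query parameter and requires master-key auth):

```bash
# Returns the key's metadata, including spend so far and max_budget
curl "http://localhost:4000/key/info?key=sk-acme-xyz123" \
  -H "Authorization: Bearer sk-1234"
```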
Connect Claude Code / Cursor
```bash
# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=sk-acme-xyz123

# Cursor: Settings > Custom OpenAI Base URL: http://localhost:4000/v1
```

FAQ
Q: Is LiteLLM Proxy free? A: Yes. It's open-source under the MIT license, and you can self-host it with Docker for free. BerriAI also offers a hosted version (LiteLLM Cloud) with SLAs and managed observability for teams that don't want to run infra.
Q: How does this differ from OpenRouter or Portkey? A: OpenRouter is a hosted-only proxy. Portkey is a hosted gateway with observability. LiteLLM is the only one of the three that is primarily open-source and self-hosted, giving you full control over auth, routing, and data. All three speak the OpenAI format.
Q: Will my non-OpenAI providers really 'just work' through an OpenAI SDK? A: Yes for chat/completions. Tool calls work too; LiteLLM normalizes the format (see the sketch below). Edge cases: streaming usage stats and provider-specific extensions (Anthropic prompt caching, OpenAI logprobs) need explicit support, and most are already wired up. Check litellm.ai/docs/providers for your provider.
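For example, an OpenAI-format tools request can be routed to Claude; the proxy translates the tool schema to Anthropic's format and translates the tool call back. A sketch, using the model name and master key from config.yaml above:

```bash
# OpenAI-style tool definition, sent to an Anthropic model via the proxy
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```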
Source & Thanks
Built by BerriAI. Licensed under MIT.
BerriAI/litellm — ⭐ 17,000+