Workflows · May 7, 2026 · 4 min read

LiteLLM Proxy — Unified Gateway for 100+ LLM APIs

LiteLLM Proxy maps 100+ LLM providers (Anthropic, OpenAI, Bedrock, Vertex) to one OpenAI-compatible endpoint, with auth, rate limiting, cost tracking, and fallbacks built in.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and raw content so agents can evaluate compatibility, risk, and next steps.

Stage only · 17/100
Agent surface: Any MCP/CLI agent
Type: Skill
Install: Stage only
Trust: New
Input: Asset

Universal CLI command
npx tokrepo install 0f113965-1adc-4435-982b-fb613fa4d157
Introduction

LiteLLM Proxy is a self-hostable gateway that exposes 100+ LLM providers as one OpenAI-compatible endpoint. Point any OpenAI SDK at the proxy and route to Anthropic / Bedrock / Vertex / Together / Groq / local Ollama with no code changes. It adds team-level auth, rate limits, cost tracking, and fallbacks. Best for: enterprises with multi-team / multi-provider LLM use, or anyone who wants to swap models without rewriting clients. Works with: any OpenAI SDK (Python, Node, Go, etc.), Claude Code, and Cursor (via a custom OpenAI base URL). Setup time: 5 minutes (Docker).


Run the proxy

# config.yaml
model_list:
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: codestral
    litellm_params:
      model: ollama/codestral
      api_base: http://localhost:11434

general_settings:
  master_key: sk-1234
  database_url: postgresql://...

router_settings:
  routing_strategy: simple-shuffle
  fallbacks:
    - claude-3-5-sonnet: ["gpt-4o"]  # if Claude rate-limits, try GPT

# Start the proxy
docker run -p 4000:4000 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e ANTHROPIC_API_KEY \
  -e OPENAI_API_KEY \
  ghcr.io/berriai/litellm:main-stable \
  --config /app/config.yaml
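
Once the container is up, you can sanity-check it with the proxy's /health endpoint (authenticated with the master key), which reports which configured deployments are reachable. A minimal check, assuming the defaults from the config above:

# List healthy/unhealthy deployments defined in config.yaml
curl http://localhost:4000/health \
  -H "Authorization: Bearer sk-1234"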

Use from any OpenAI SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-virtual-key-for-team-acme",  # generated via /key/generate
)

# Same API, but routed through LiteLLM
resp = client.chat.completions.create(
    model="claude-3-5-sonnet",   # name from config.yaml
    messages=[{"role": "user", "content": "Hello"}],
)
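
Streaming behaves the same as it does against OpenAI directly; a short sketch reusing the client from above:

# Stream the response; chunks arrive in the OpenAI delta format
stream = client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[{"role": "user", "content": "Write a haiku about proxies"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)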

Per-team budgets and rate limits

# Generate a key for team Acme with $50/mo budget
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"team_id": "acme", "max_budget": 50, "rpm_limit": 100}'

# Returns: {"key": "sk-acme-xyz123", ...}

LiteLLM tracks every call per team, blocks once the budget is hit, and exposes a /spend dashboard.
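
To see how much of a key's budget has been consumed, the proxy can report per-key metadata; a minimal sketch using the /key/info endpoint, with the key value returned by /key/generate above:

# Inspect a virtual key: spend so far, max_budget, rpm_limit, team_id
curl "http://localhost:4000/key/info?key=sk-acme-xyz123" \
  -H "Authorization: Bearer sk-1234"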

Connect Claude Code / Cursor

# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=sk-acme-xyz123

# Cursor — Settings > Custom OpenAI Base URL: http://localhost:4000/v1
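
Before wiring up an editor, it's worth confirming the virtual key routes end to end through the OpenAI-compatible endpoint; a quick check:

# Verify the team key resolves and routes to the configured provider
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-acme-xyz123" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-3-5-sonnet", "messages": [{"role": "user", "content": "ping"}]}'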

FAQ

Q: Is LiteLLM Proxy free? A: Yes — open-source under MIT license. Self-host with Docker for free. BerriAI also offers a hosted version (LiteLLM Cloud) with SLAs and managed observability for teams that don't want to run infra.

Q: How does this differ from OpenRouter or Portkey? A: OpenRouter is a hosted-only proxy. Portkey is a hosted gateway with observability. LiteLLM is the only one of the three that is primarily an open-source, self-hosted option, giving you full control over auth, routing, and data. All three speak the OpenAI format.

Q: Will my non-OpenAI providers really 'just work' through OpenAI SDK? A: Yes for chat/completions. Tool calls work too (LiteLLM normalizes the format). Edge cases: streaming usage stats and provider-specific extensions (Anthropic prompt caching, OpenAI logprobs) need explicit support — most are wired up. Check litellm.ai/docs/providers for your provider.
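
Since tool calls are normalized to the OpenAI format, function calling against Claude looks exactly like it does against GPT; a minimal sketch reusing the client from above, with a hypothetical get_weather tool:

# OpenAI-format tool definition; LiteLLM translates it for Anthropic
resp = client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
print(resp.choices[0].message.tool_calls)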


Quick Use

  1. Save the YAML config above as config.yaml
  2. docker run -p 4000:4000 -v $(pwd)/config.yaml:/app/config.yaml -e ANTHROPIC_API_KEY -e OPENAI_API_KEY ghcr.io/berriai/litellm:main-stable --config /app/config.yaml
  3. Point any OpenAI SDK at http://localhost:4000


Source & Thanks

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — ⭐ 17,000+
