Is LiteLLM Proxy — Unified Gateway for 100+ LLM APIs free to use?

Yes. LiteLLM Proxy — Unified Gateway for 100+ LLM APIs is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install LiteLLM Proxy — Unified Gateway for 100+ LLM APIs?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Esta página se muestra en inglés. Una traducción al español está en curso.

WorkflowsMay 7, 2026·4 min de lectura

LiteLLM Proxy — Unified Gateway for 100+ LLM APIs

Name: LiteLLM Proxy — Unified Gateway for 100+ LLM APIs
Author: LiteLLM (BerriAI)

LiteLLM Proxy maps 100+ LLM providers (Anthropic, OpenAI, Bedrock, Vertex) to one OpenAI-compatible endpoint. Auth, rate limit, cost track, fallbacks.

LiteLLM (BerriAI) · Community

Listo para agents

Este activo puede ser leído e instalado directamente por agents

TokRepo expone un comando CLI universal, contrato de instalación, metadata JSON, plan según adaptador y contenido raw para que los agents evalúen compatibilidad, riesgo y próximos pasos.

Stage only · 17/100Stage only

Superficie agent

Cualquier agent MCP/CLI

Tipo

Skill

Instalación

Stage only

Confianza

Confianza: New

Entrada

Asset

Comando CLI universal

npx tokrepo install 0f113965-1adc-4435-982b-fb613fa4d157

contrato de instalación JSON de metadata plan adaptador contenido raw

Introducción

LiteLLM Proxy is a self-hostable gateway that exposes 100+ LLM providers as one OpenAI-compatible endpoint. Point any OpenAI SDK at the proxy and route to Anthropic / Bedrock / Vertex / Together / Groq / local Ollama with no code changes. Adds team-level auth, rate limits, cost tracking, and fallbacks. Best for: enterprises with multi-team / multi-provider LLM use, or anyone who wants to swap models without rewriting clients. Works with: any OpenAI SDK (Python, Node, Go, etc), Claude Code, Cursor (via custom OpenAI base URL). Setup time: 5 minutes (Docker compose).

Run the proxy

# config.yaml
model_list:
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: codestral
    litellm_params:
      model: ollama/codestral
      api_base: http://localhost:11434

general_settings:
  master_key: sk-1234
  database_url: postgresql://...

router_settings:
  routing_strategy: simple-shuffle
  fallbacks:
    - claude-3-5-sonnet: ["gpt-4o"]  # if Claude rate-limits, try GPT

docker run -p 4000:4000 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e ANTHROPIC_API_KEY \
  -e OPENAI_API_KEY \
  ghcr.io/berriai/litellm:main-stable \
  --config /app/config.yaml

Use from any OpenAI SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-virtual-key-for-team-acme",  # generated via /key/generate
)

# Same API, but routed through LiteLLM
resp = client.chat.completions.create(
    model="claude-3-5-sonnet",   # name from config.yaml
    messages=[{"role": "user", "content": "Hello"}],
)

Per-team budgets and rate limits

# Generate a key for team Acme with $50/mo budget
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -d '{"team_id": "acme", "max_budget": 50, "rpm_limit": 100}'

# Returns: {"key": "sk-acme-xyz123", ...}

LiteLLM tracks every call per team, blocks once the budget is hit, and exposes a /spend dashboard.

Connect Claude Code / Cursor

# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=sk-acme-xyz123

# Cursor — Settings > Custom OpenAI Base URL: http://localhost:4000/v1

FAQ

Q: Is LiteLLM Proxy free? A: Yes — open-source under MIT license. Self-host with Docker for free. BerriAI also offers a hosted version (LiteLLM Cloud) with SLAs and managed observability for teams that don't want to run infra.

Q: How does this differ from OpenRouter or Portkey? A: OpenRouter is a hosted-only proxy. Portkey is a hosted gateway with observability. LiteLLM is the only one that's primarily open-source self-host — full control over auth, routing, and data. All three speak OpenAI format.

Q: Will my non-OpenAI providers really 'just work' through OpenAI SDK? A: Yes for chat/completions. Tool calls work too (LiteLLM normalizes the format). Edge cases: streaming usage stats and provider-specific extensions (Anthropic prompt caching, OpenAI logprobs) need explicit support — most are wired up. Check litellm.ai/docs/providers for your provider.

Quick Use

Save the YAML config below as config.yaml
docker run -p 4000:4000 -v $(pwd)/config.yaml:/app/config.yaml ghcr.io/berriai/litellm:main-stable --config /app/config.yaml
Point any OpenAI SDK at http://localhost:4000

Intro

Run the proxy

# config.yaml
model_list:
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: codestral
    litellm_params:
      model: ollama/codestral
      api_base: http://localhost:11434

general_settings:
  master_key: sk-1234
  database_url: postgresql://...

router_settings:
  routing_strategy: simple-shuffle
  fallbacks:
    - claude-3-5-sonnet: ["gpt-4o"]  # if Claude rate-limits, try GPT

docker run -p 4000:4000 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e ANTHROPIC_API_KEY \
  -e OPENAI_API_KEY \
  ghcr.io/berriai/litellm:main-stable \
  --config /app/config.yaml

Use from any OpenAI SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-virtual-key-for-team-acme",  # generated via /key/generate
)

# Same API, but routed through LiteLLM
resp = client.chat.completions.create(
    model="claude-3-5-sonnet",   # name from config.yaml
    messages=[{"role": "user", "content": "Hello"}],
)

Per-team budgets and rate limits

# Generate a key for team Acme with $50/mo budget
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -d '{"team_id": "acme", "max_budget": 50, "rpm_limit": 100}'

# Returns: {"key": "sk-acme-xyz123", ...}

LiteLLM tracks every call per team, blocks once the budget is hit, and exposes a /spend dashboard.

Connect Claude Code / Cursor

# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=sk-acme-xyz123

# Cursor — Settings > Custom OpenAI Base URL: http://localhost:4000/v1

FAQ

Source & Thanks

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — ⭐ 17,000+

🙏

Fuente y agradecimientos

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — ⭐ 17,000+

Discusión

Inicia sesión para unirte a la discusión.

Aún no hay comentarios. Sé el primero en compartir tus ideas.

Activos relacionados

LLM Gateway Comparison — Proxy Your AI Requests

Compare top LLM gateway and proxy tools for routing AI requests. Covers LiteLLM, Bifrost, Portkey, and OpenRouter for cost optimization, failover, and multi-provider access.

Workflows

Agent Toolkit

Cloudflare AI Gateway — LLM Proxy, Cache & Analytics

Free proxy gateway for LLM API calls with caching, rate limiting, cost tracking, and fallback routing across providers. Reduce costs up to 95% with response caching. 7,000+ stars.

Configs

Cloudflare

LiteLLM — Unified Proxy for 100+ LLM APIs

Python SDK and proxy server to call 100+ LLM APIs in OpenAI format. Cost tracking, guardrails, load balancing, logging. Supports Bedrock, Azure, Anthropic, Vertex, and more. 42K+ stars.

Scripts

Script Depot

LLM Gateway Comparison — LiteLLM vs OpenRouter vs CF

In-depth comparison of LLM API gateways: LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache). Architecture, pricing, and when to use each.

Prompts

Prompt Lab