
LiteLLM Proxy — Unified Gateway for 100+ LLM APIs

LiteLLM Proxy maps 100+ LLM providers (Anthropic, OpenAI, Bedrock, Vertex) to one OpenAI-compatible endpoint, and adds auth, rate limiting, cost tracking, and fallbacks.

Introduction

LiteLLM Proxy is a self-hostable gateway that exposes 100+ LLM providers as one OpenAI-compatible endpoint. Point any OpenAI SDK at the proxy and route to Anthropic / Bedrock / Vertex / Together / Groq / local Ollama with no code changes. It adds team-level auth, rate limits, cost tracking, and fallbacks. Best for: enterprises with multi-team, multi-provider LLM use, or anyone who wants to swap models without rewriting clients. Works with: any OpenAI SDK (Python, Node, Go, etc.), Claude Code, and Cursor (via a custom OpenAI base URL). Setup time: 5 minutes with Docker.


Run the proxy

# config.yaml
model_list:
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: codestral
    litellm_params:
      model: ollama/codestral
      api_base: http://localhost:11434

general_settings:
  master_key: sk-1234
  database_url: postgresql://...

router_settings:
  routing_strategy: simple-shuffle
  fallbacks:
    - claude-3-5-sonnet: ["gpt-4o"]  # if Claude rate-limits, try GPT
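
# Start the proxy: mounts the config and passes provider keys through from the environment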
docker run -p 4000:4000 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e ANTHROPIC_API_KEY \
  -e OPENAI_API_KEY \
  ghcr.io/berriai/litellm:main-stable \
  --config /app/config.yaml
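
To verify the proxy is up, you can list the configured model aliases through the standard OpenAI SDK (a minimal sketch, assuming the proxy is on localhost:4000 with the master key from config.yaml):

from openai import OpenAI

# Admin calls can authenticate with the master key from general_settings
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# GET /models returns the model_name aliases from config.yaml
for model in client.models.list():
    print(model.id)  # claude-3-5-sonnet, gpt-4o, codestral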

Use from any OpenAI SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-virtual-key-for-team-acme",  # generated via /key/generate
)

# Same API, but routed through LiteLLM
resp = client.chat.completions.create(
    model="claude-3-5-sonnet",   # name from config.yaml
    messages=[{"role": "user", "content": "Hello"}],
)
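
Because routing happens server-side, switching providers is just a different model name. A sketch reusing the same client to hit the local Ollama model from config.yaml:

# Same client, different backend: "codestral" routes to local Ollama
resp = client.chat.completions.create(
    model="codestral",
    messages=[{"role": "user", "content": "Write a binary search in Python"}],
)
print(resp.choices[0].message.content)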

Per-team budgets and rate limits

# Generate a key for team Acme with a $50 monthly budget
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"team_id": "acme", "max_budget": 50, "budget_duration": "30d", "rpm_limit": 100}'

# Returns: {"key": "sk-acme-xyz123", ...}

LiteLLM logs every call per key and team, rejects requests once the budget is exhausted, and exposes spend data through its admin UI and /spend endpoints.
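
Spend can also be read programmatically. A minimal sketch against the /key/info management endpoint (response field names assumed from the LiteLLM docs; verify against your version):

import requests

# Inspect a virtual key's accumulated spend and limits (requires the master key)
info = requests.get(
    "http://localhost:4000/key/info",
    params={"key": "sk-acme-xyz123"},
    headers={"Authorization": "Bearer sk-1234"},
).json()

print(info["info"]["spend"], "USD used of", info["info"]["max_budget"])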

Connect Claude Code / Cursor

# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_API_KEY=sk-acme-xyz123

# Cursor — Settings > Custom OpenAI Base URL: http://localhost:4000/v1

FAQ

Q: Is LiteLLM Proxy free? A: Yes — open-source under MIT license. Self-host with Docker for free. BerriAI also offers a hosted version (LiteLLM Cloud) with SLAs and managed observability for teams that don't want to run infra.

Q: How does this differ from OpenRouter or Portkey? A: OpenRouter is a hosted-only proxy. Portkey is a hosted gateway with observability. LiteLLM is the only one that's primarily open-source self-host — full control over auth, routing, and data. All three speak OpenAI format.

Q: Will my non-OpenAI providers really 'just work' through OpenAI SDK? A: Yes for chat/completions. Tool calls work too (LiteLLM normalizes the format). Edge cases: streaming usage stats and provider-specific extensions (Anthropic prompt caching, OpenAI logprobs) need explicit support — most are wired up. Check litellm.ai/docs/providers for your provider.
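
For the tool-call normalization mentioned above, the standard OpenAI tools payload is all you send, whatever the backend. A sketch reusing the client from earlier (get_weather is a hypothetical tool for illustration):

# OpenAI-format tool definition; LiteLLM translates it for Anthropic, Ollama, etc.
resp = client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
print(resp.choices[0].message.tool_calls)  # normalized OpenAI-format tool calls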


Quick Use

  1. Save the YAML config from the "Run the proxy" section above as config.yaml
  2. docker run -p 4000:4000 -e ANTHROPIC_API_KEY -e OPENAI_API_KEY -v $(pwd)/config.yaml:/app/config.yaml ghcr.io/berriai/litellm:main-stable --config /app/config.yaml
  3. Point any OpenAI SDK at http://localhost:4000

Source & Thanks

Built by BerriAI. Licensed under MIT.

BerriAI/litellm — ⭐ 17,000+
