Scripts · Mar 31, 2026 · 2 min read

LiteLLM — Unified Proxy for 100+ LLM APIs

Python SDK and proxy server to call 100+ LLM APIs in OpenAI format. Cost tracking, guardrails, load balancing, logging. Supports Bedrock, Azure, Anthropic, Vertex, and more. 42K+ stars.

TokRepo Picks · Community
Quick Use

Use it first, then decide how deep to go

Everything below is copy-paste ready: install, make one call, then run the proxy.

pip install litellm

# Use as SDK (requires ANTHROPIC_API_KEY in your environment)
python -c "
from litellm import completion
resp = completion(model='anthropic/claude-sonnet-4-20250514', messages=[{'role':'user','content':'Hi'}])
print(resp.choices[0].message.content)
"

Or run as a proxy server:

litellm --model anthropic/claude-sonnet-4-20250514
# Now call http://localhost:4000 with OpenAI format
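Because the proxy speaks the OpenAI wire format, any HTTP client can talk to it. A minimal stdlib sketch (the `/v1/chat/completions` path follows the standard OpenAI-compatible layout; the placeholder bearer token assumes the proxy has no auth configured):

```python
import json
import urllib.request

# Standard OpenAI chat-completions payload, aimed at the local proxy.
payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hi"}],
}
req = urllib.request.Request(
    "http://localhost:4000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        # Any bearer token works unless the proxy enforces virtual keys.
        "Authorization": "Bearer sk-anything",
    },
)
# With the proxy running:
# resp = json.load(urllib.request.urlopen(req))
# print(resp["choices"][0]["message"]["content"])
```

The same applies to the official `openai` client: point `base_url` at `http://localhost:4000` and leave the rest of your code unchanged.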

Intro

LiteLLM is a Python SDK and AI Gateway proxy to call 100+ LLM APIs using the OpenAI format. Write your code once, switch providers by changing one string. Includes cost tracking, rate limiting, guardrails, load balancing, fallbacks, and logging. Supports OpenAI, Anthropic, Azure, AWS Bedrock, Google Vertex, Cohere, HuggingFace, Ollama, and 90+ more. 42,000+ GitHub stars.

Best for: Teams managing multiple LLM providers behind one unified API, with cost control and observability
Works with: OpenAI, Anthropic, Google, Azure, AWS Bedrock, Ollama, and 100+ other providers


Key Features

Unified API

One format for all providers — just change the model string:

# OpenAI
completion(model="gpt-4o", messages=messages)
# Anthropic
completion(model="anthropic/claude-sonnet-4-20250514", messages=messages)
# Bedrock
completion(model="bedrock/anthropic.claude-3", messages=messages)
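The provider is simply a prefix on the model string; bare names are treated as OpenAI models. A tiny illustrative helper (not part of the LiteLLM API) showing the convention:

```python
def parse_model(model: str) -> tuple[str, str]:
    """Split a LiteLLM-style model string into (provider, model_name)."""
    if "/" in model:
        provider, name = model.split("/", 1)
        return provider, name
    return "openai", model  # bare names default to OpenAI

print(parse_model("gpt-4o"))                              # OpenAI by default
print(parse_model("anthropic/claude-sonnet-4-20250514"))  # explicit provider
```

Switching providers is therefore a one-string diff in application code.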

Proxy Server (AI Gateway)

Deploy as a centralized gateway for your team:

  • Cost tracking per user, team, and API key
  • Rate limiting and budget caps
  • Load balancing across providers
  • Fallbacks — auto-retry with backup models
  • Guardrails — content filtering, PII detection

100+ Providers

OpenAI, Anthropic, Azure, AWS Bedrock, Google Vertex, Cohere, Mistral, Ollama, vLLM, Together, Replicate, HuggingFace, and many more.

Observability

Built-in logging to Langfuse, Helicone, Lunary, and custom callbacks.
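Custom callbacks are plain functions that LiteLLM invokes with the request kwargs, the response, and start/end times. A sketch of a cost-logging callback (the `response_cost` field and the use of plain floats for the timestamps are assumptions for illustration; the real API passes datetimes, so verify against your version's callback docs):

```python
def log_cost(kwargs, completion_response, start_time, end_time):
    """Illustrative success callback: one log line per completed call."""
    model = kwargs.get("model", "unknown")
    cost = kwargs.get("response_cost") or 0.0
    return f"{model}: ${cost:.6f} in {end_time - start_time:.2f}s"

# Registering (with litellm installed):
#   import litellm
#   litellm.success_callback = [log_cost]       # custom function
#   litellm.success_callback = ["langfuse"]     # or a built-in sink
```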


FAQ

Q: What is LiteLLM? A: A Python SDK and proxy server to call 100+ LLM APIs using the OpenAI format with cost tracking, load balancing, and guardrails. 42K+ GitHub stars.

Q: How is LiteLLM different from OpenRouter? A: LiteLLM is self-hosted — you run the proxy and hold the provider API keys — while OpenRouter is a managed service. Self-hosting keeps cost tracking, rate limiting, and team key management under your own control.



Source & Thanks

Created by BerriAI. Licensed under MIT. BerriAI/litellm — 42,000+ GitHub stars
