Skills · March 30, 2026 · 1 min read

LiteLLM — Unified Proxy for 100+ LLM APIs

Python SDK and proxy server to call 100+ LLM APIs in OpenAI format. Cost tracking, guardrails, load balancing, logging. Supports Bedrock, Azure, Anthropic, Vertex, and more. 42K+ stars.

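Agent ready

Review before installing

This asset needs review first. The copied instructions ask the agent to do a dry run, list the items it would write, and continue only after confirmation.

Needs Confirmation · 66/100 · Policy: needs confirmation
Agent entry
Any MCP/CLI agent
Type
Skill
Install
Single
Trust
Trust level: Established
Entry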
LiteLLM — Unified Proxy for 100+ LLM APIs
Command to review first
npx -y tokrepo@latest install d11eb1fe-cfa0-4da0-ac2d-b6a77abc1b8c --target codex

Do a dry run first and confirm the items to be written before running this command.

TL;DR
A Python proxy that translates OpenAI-format API calls to 100+ LLM providers for seamless model switching.
§01

What it is

LiteLLM is a Python library and proxy server that provides a unified API interface for calling over 100 LLM providers. You write your code once using the OpenAI SDK format, and LiteLLM translates the call to whichever provider you specify: Anthropic Claude, Google Gemini, Azure OpenAI, AWS Bedrock, Ollama, Groq, Together AI, and many more. Switching between providers is a one-line change in the model name.

The library can be used as a Python SDK (direct import), as a standalone proxy server (OpenAI-compatible endpoint), or as a gateway for managing keys, budgets, and rate limits across teams. It is designed for developers and organizations that use multiple LLM providers and want a consistent interface without vendor lock-in.

§02

How it saves time or tokens

Without LiteLLM, calling different providers requires different SDKs, different request formats, and different response parsing. Switching from OpenAI to Claude means rewriting API calls, changing authentication, and adjusting response handling. LiteLLM handles all provider differences behind a single completion() call.

The proxy server mode adds operational value: centralized API key management, per-user budgets, request logging, model fallbacks (try Claude, fall back to GPT-4 if it fails), and load balancing across multiple model deployments. These features save significant engineering time for teams running LLM operations at scale.

§03

How to use

  1. Install LiteLLM:

```bash
pip install litellm
```

  2. Use as a Python SDK:

```python
from litellm import completion

# Call Claude (reads ANTHROPIC_API_KEY from the environment)
response = completion(
    model='anthropic/claude-sonnet-4-20250514',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)

# Switch to GPT-4o by changing one line (reads OPENAI_API_KEY)
response = completion(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)

# Responses follow the OpenAI format
print(response.choices[0].message.content)
```

  3. Or run as a proxy server for team-wide access.
§04

Example

Running LiteLLM as a proxy server:

```bash
# Start the proxy
litellm --model anthropic/claude-sonnet-4-20250514

# The proxy listens on localhost:4000 by default
# Any OpenAI-compatible client can connect
curl http://localhost:4000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
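The same request can be built from Python. The following is a minimal sketch using only the standard library, assuming the proxy from the example above is listening on localhost:4000 without a required API key. `build_chat_request` and `send` are hypothetical helper names for illustration, not part of LiteLLM.

```python
import json
import urllib.request

# Assumed local proxy address, matching the curl example
PROXY_URL = "http://localhost:4000/v1/chat/completions"


def build_chat_request(model: str, prompt: str, url: str = PROXY_URL) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-format chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the first choice's text (needs a running proxy)."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Building the request does not touch the network
req = build_chat_request("anthropic/claude-sonnet-4-20250514", "Hello!")
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect before any provider is billed; call `send(req)` only once the proxy is running.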

Proxy configuration with fallbacks and load balancing:

```yaml
# litellm_config.yaml
model_list:
  - model_name: default
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: sk-ant-...
  - model_name: default
    litellm_params:
      model: gpt-4o
      api_key: sk-...

router_settings:
  routing_strategy: simple-shuffle  # load balance across the "default" deployments
  num_retries: 2                    # retry failed requests
```
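
| Feature | SDK Mode | Proxy Mode |
| --- | --- | --- |
| Provider translation | Yes | Yes |
| Streaming | Yes | Yes |
| Fallbacks | Yes | Yes |
| Budget management | No | Yes |
| Key management | No | Yes |
| Request logging | Basic | Full |
| Team access control | No | Yes |
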
§06

Common pitfalls

  • Not setting the correct environment variable for each provider. LiteLLM reads API keys from environment variables: ANTHROPIC_API_KEY for Claude, OPENAI_API_KEY for OpenAI, GEMINI_API_KEY for Google. Missing keys produce authentication errors that may be confusing when switching providers.
  • Assuming all providers support all features. Some providers do not support streaming, function calling, vision inputs, or JSON mode. LiteLLM translates what it can, but if a provider does not support a feature, the call fails. Check the provider's capabilities before relying on advanced features.
  • Running the proxy in production without rate limits or budgets. Without them, a single client can exhaust your API budget. Configure per-user or per-team rate limits and budgets in the proxy to prevent runaway costs.
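The first pitfall can be caught before any request is sent. The following is a minimal pre-flight sketch, assuming only the three environment variable names listed above; the provider labels used as dictionary keys and the `missing_api_keys` helper are hypothetical and not part of LiteLLM.

```python
import os

# Environment variable names as listed in the pitfalls above. The keys are
# illustrative provider labels chosen for this sketch.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def missing_api_keys(providers, environ=os.environ):
    """Return the env var names that are unset or empty for the given providers."""
    missing = []
    for provider in providers:
        var = PROVIDER_ENV_VARS[provider]
        if not environ.get(var):
            missing.append(var)
    return missing


# Example with an explicit mapping instead of the real environment
fake_env = {"OPENAI_API_KEY": "sk-test"}
print(missing_api_keys(["anthropic", "openai"], environ=fake_env))
```

Running a check like this at startup turns a provider-specific authentication error into a clear message naming the variable that is missing.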

FAQ

How many LLM providers does LiteLLM support?

LiteLLM supports over 100 providers including OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Google Vertex AI, Cohere, Mistral, Groq, Together AI, Perplexity, Ollama, vLLM, and many more. The full list is maintained in the LiteLLM documentation and grows with each release.

Can I use LiteLLM with existing OpenAI SDK code?

Yes. The proxy server exposes an OpenAI-compatible API. Point your existing OpenAI SDK client's base URL at http://localhost:4000 instead of api.openai.com, and the rest of your code works unchanged. The proxy translates each request to whichever backend model you configure.

Does LiteLLM support streaming responses?

Yes. Pass stream=True in the completion call, and LiteLLM streams the response from any provider that supports streaming. The stream follows the OpenAI SSE format, so a client that handles OpenAI streaming can consume LiteLLM streams regardless of the backend provider.
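To make the wire format concrete, here is a minimal sketch of reading OpenAI-style SSE lines with the standard library. It assumes each event is a `data:` line carrying a JSON chunk with `choices[].delta.content`, and that the stream ends with `data: [DONE]`; `iter_stream_text` and the sample lines are hypothetical and written for illustration.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # The first chunk often carries only the role, with no content
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines, not captured from a real provider
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In practice the OpenAI SDK or LiteLLM's own streaming iterator does this parsing for you; the sketch only shows what a compatible client has to handle.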

How do fallbacks work?

Configure multiple models under the same model_name in the proxy config. Requests are load balanced across those deployments, and when a call fails (rate limit, error, timeout) LiteLLM retries it, up to num_retries times, so another deployment in the group can serve it. This provides resilience against provider outages without changing client code.
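The idea behind this answer can be sketched in plain Python. This is an illustration of random selection with retries, not LiteLLM's implementation: `call_with_retries`, `flaky`, and `healthy` are hypothetical names, deployments are modeled as plain callables, and a uniform random choice stands in for the simple-shuffle strategy.

```python
import random


def call_with_retries(deployments, request, num_retries=2, rng=random):
    """Pick a deployment at random and retry on failure, preferring untried ones."""
    if not deployments:
        raise ValueError("at least one deployment is required")
    remaining = list(deployments)
    last_error = None
    for _ in range(num_retries + 1):  # first attempt plus retries
        if not remaining:
            remaining = list(deployments)  # every deployment tried once; start over
        deployment = rng.choice(remaining)
        remaining.remove(deployment)
        try:
            return deployment(request)
        except Exception as exc:  # a real router would be more selective
            last_error = exc
    raise last_error


def flaky(request):
    """Stand-in for a deployment that is currently failing."""
    raise TimeoutError("provider timed out")


def healthy(request):
    """Stand-in for a deployment that answers normally."""
    return f"ok: {request}"


print(call_with_retries([flaky, healthy], "Hello!"))
```

Because a failed deployment is set aside before the next attempt, the caller gets an answer whenever at least one deployment in the group is healthy and enough retries are allowed.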

Is LiteLLM suitable for production use?

Yes. The proxy mode is designed for production with features like key management, budget controls, request logging, and automatic retries. Deploy the proxy as a Docker container or systemd service. Many organizations use LiteLLM as their central LLM gateway for team-wide access.

Source and credits

Created by BerriAI. Licensed under MIT. BerriAI/litellm — 42,000+ GitHub stars
