Scripts · Apr 7, 2026 · 1 min read

Portkey AI Gateway — Unified API for 200+ LLMs

Route, load-balance, and fallback across 200+ LLMs with a single API. Built-in caching, guardrails, observability, and budget controls for production AI apps.

Script Depot · Community
Quick Use

Use it first, then decide how deep to go

Copy the install command and snippet below to make your first request through the gateway.

pip install portkey-ai

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_KEY",
    virtual_key="openai-xxx",  # Your OpenAI key stored in Portkey
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

What is Portkey?

Portkey is an AI gateway that sits between your app and LLM providers. It provides a unified OpenAI-compatible API to route requests across 200+ models with automatic fallbacks, load balancing, caching, and cost tracking.

Answer-Ready: Portkey AI Gateway is a unified API layer for 200+ LLMs that provides routing, fallbacks, caching, guardrails, and cost tracking for production AI applications.

Core Features

1. Automatic Fallbacks

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_KEY",
    config={
        "strategy": {"mode": "fallback"},
        "targets": [
            {"virtual_key": "openai-key", "override_params": {"model": "gpt-4o"}},
            {"virtual_key": "anthropic-key", "override_params": {"model": "claude-sonnet-4-20250514"}},
        ],
    },
)
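The gateway applies this fallback server-side, but the strategy itself is simple: try each target in order and return the first success. A minimal sketch of that logic, assuming a stand-in `send` callable rather than Portkey internals:

```python
def call_with_fallback(targets, send):
    """Try each target in order; return the first successful response.

    `targets` mirrors the config above; `send` is any callable that
    raises on provider failure (a hypothetical stand-in for the
    gateway's upstream call).
    """
    errors = []
    for target in targets:
        try:
            return send(target)
        except Exception as exc:  # a real gateway matches specific error classes
            errors.append((target["virtual_key"], exc))
    raise RuntimeError(f"all targets failed: {errors}")

targets = [
    {"virtual_key": "openai-key", "override_params": {"model": "gpt-4o"}},
    {"virtual_key": "anthropic-key", "override_params": {"model": "claude-sonnet-4-20250514"}},
]

def flaky_send(target):
    # Simulate the primary provider being down.
    if target["virtual_key"] == "openai-key":
        raise TimeoutError("primary provider down")
    return f"response from {target['override_params']['model']}"

print(call_with_fallback(targets, flaky_send))
```

With the primary failing, the request transparently lands on the second target, which is exactly what the config above does for you in production.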

2. Load Balancing

config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "openai-1", "weight": 0.7},
        {"virtual_key": "openai-2", "weight": 0.3},
    ],
}
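The weights describe a traffic split, not a strict rotation: each request is routed by weighted random selection. A toy illustration of the idea (not Portkey's actual routing code):

```python
import random

config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "openai-1", "weight": 0.7},
        {"virtual_key": "openai-2", "weight": 0.3},
    ],
}

def pick_target(targets, rng):
    # Weighted random choice: ~70% of traffic goes to openai-1 here.
    keys = [t["virtual_key"] for t in targets]
    weights = [t["weight"] for t in targets]
    return rng.choices(keys, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded for reproducibility
picks = [pick_target(config["targets"], rng) for _ in range(10_000)]
share = picks.count("openai-1") / len(picks)
print(f"openai-1 share: {share:.2f}")  # close to 0.70
```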

3. Semantic Caching

Cache similar requests to reduce costs and latency:

config = {
    "cache": {"mode": "semantic", "max_age": 3600}
}
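Unlike exact-match caching, semantic mode serves a cached response when a new prompt is *similar enough* to a previous one, typically by comparing embeddings. A toy sketch of the mechanism, using bag-of-words cosine similarity in place of a real embedding model and an assumed similarity threshold:

```python
import math
import time
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words counts. A real gateway uses an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold=0.8, max_age=3600):
        self.threshold = threshold  # similarity cutoff (illustrative value)
        self.max_age = max_age      # seconds, mirrors the config above
        self.entries = []           # (embedding, response, stored_at)

    def get(self, prompt):
        vec = embed(prompt)
        now = time.time()
        for emb, response, stored_at in self.entries:
            if now - stored_at < self.max_age and cosine(vec, emb) >= self.threshold:
                return response  # cache hit: the LLM call is skipped entirely
        return None

    def put(self, prompt, response):
        self.entries.append((embed(prompt), response, time.time()))

cache = SemanticCache()
cache.put("what is the capital of France", "Paris")
print(cache.get("what is the capital of France?"))  # near-duplicate -> hit
```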

4. Guardrails

Add input/output checks:

config = {
    "input_guardrails": ["pii-detection", "prompt-injection"],
    "output_guardrails": ["toxicity-check"],
}
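Conceptually, an input guardrail is a check that runs before the request reaches the model and can block or flag it. Portkey's hosted guardrails are far more thorough than this, but a toy PII check shows the shape of the contract:

```python
import re

# Toy input guardrail: flag obvious PII before a prompt leaves the app.
# The patterns and verdict format here are illustrative, not Portkey's.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def check_input(text):
    findings = [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]
    return {"verdict": "block" if findings else "pass", "findings": findings}

print(check_input("Email me at jane@example.com"))  # blocked: contains an email
print(check_input("Summarize this article"))        # passes
```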

5. Cost & Usage Tracking

Real-time dashboard showing spend per model, per user, per feature.
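Under the hood, per-request cost is just token counts multiplied by the provider's per-token price. A sketch of the arithmetic, with placeholder prices (not current rates):

```python
# Illustrative per-million-token prices -- placeholders, not current rates.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def request_cost(model, input_tokens, output_tokens):
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

cost = request_cost("gpt-4o", input_tokens=1_200, output_tokens=350)
print(f"${cost:.6f}")  # $0.006500
```

A gateway dashboard aggregates exactly this per-request figure across models, users, and features.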

Supported Providers

OpenAI, Anthropic, Google Gemini, Mistral, Cohere, Azure OpenAI, AWS Bedrock, Groq, Together AI, Fireworks, Ollama, and 190+ more.

FAQ

Q: Is it open source? A: Yes, the gateway is open source. Managed cloud version available.

Q: Latency overhead? A: < 5ms for direct routing. Semantic caching adds ~10ms but saves a full LLM call on a hit.

Q: OpenAI SDK compatible? A: Yes, drop-in replacement — change base URL and add Portkey headers.
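The drop-in swap amounts to two changes: point the SDK at the gateway's base URL and attach Portkey headers. A sketch of those two pieces (the base URL and header names are written from memory of Portkey's docs; verify against the current documentation before use):

```python
# The two ingredients of OpenAI SDK compatibility, as plain values.
PORTKEY_BASE_URL = "https://api.portkey.ai/v1"

def portkey_headers(portkey_api_key, virtual_key):
    return {
        "x-portkey-api-key": portkey_api_key,
        "x-portkey-virtual-key": virtual_key,
    }

# With the OpenAI SDK you would then construct (not run here):
#   client = OpenAI(base_url=PORTKEY_BASE_URL,
#                   default_headers=portkey_headers("pk-...", "openai-xxx"))
headers = portkey_headers("YOUR_PORTKEY_KEY", "openai-xxx")
print(sorted(headers))
```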
