AI Gateway

Portkey — AI Gateway with Prompt Management & Observability

Portkey is an end-to-end LLM control plane: gateway for routing and fallback, prompt manager for versioning, and an observability suite with cost tracking and guardrails — all behind a single API.

Why Portkey

Portkey is the answer to "I want one product, not four". Its single API covers the full set of production LLM concerns: routing (pick the right model for each request), reliability (retries, fallbacks, load balancing across providers), prompt management (versioned prompts with A/B testing), observability (traces, cost breakdown, user attribution), and guardrails (PII redaction, schema enforcement).

The bet is that these concerns are intertwined enough that separating them across tools creates integration debt. In practice, teams that adopt Portkey replace three or four point tools with one control plane, and the savings in wiring and data-model drift are real.

The counter-bet is vendor lock-in: your prompt registry, traces, and routing configs all live in Portkey. Their self-hosted option (the Gateway is Apache 2.0 open source) mitigates this for the inline path, but the SaaS-side features (prompt management, analytics UI) are proprietary. For teams that want pure OSS, LiteLLM + Langfuse is the standard alternative.

Quick Start — Portkey SDK with Fallback, Cache & Metadata

virtual_key is Portkey’s per-provider key vault — you rotate keys centrally instead of redeploying apps. config is a JSON policy (fallback, retry, cache, load-balance, guardrails) applied inline. metadata lets you attribute every request to user/team/feature for cost analysis.

# pip install portkey-ai
from portkey_ai import Portkey

client = Portkey(
    api_key="pk-...",
    virtual_key="openai-prod",   # maps to your OpenAI key in Portkey vault
    config={
        # Fallback: try Claude first, then OpenAI on error
        "strategy": {"mode": "fallback"},
        "targets": [
            {"virtual_key": "anthropic-prod", "override_params": {"model": "claude-3-5-sonnet-20241022"}},
            {"virtual_key": "openai-prod", "override_params": {"model": "gpt-4o-mini"}},
        ],
        # Cache identical requests for 10 minutes
        "cache": {"mode": "simple", "max_age": 600},
    },
)

resp = client.chat.completions.create(
    messages=[{"role": "user", "content": "Why is AI gateway a category?"}],
    # Attach custom metadata for later filtering in dashboards
    metadata={"user_id": "william", "tier": "pro"},
)
print(resp.choices[0].message.content)

# Portkey dashboard now shows: latency, token cost, which target served,
# cache hit/miss, and a full prompt/response trace.

Key Features

Virtual keys

Store provider keys in Portkey vault; your app only sees virtual keys. Rotate, disable, or swap providers without redeploy.
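A minimal sketch of what this buys you (the virtual key names below are placeholders, not real vault entries): swapping or rotating a provider is a one-string change in client construction, because the provider secret never leaves Portkey's vault.

```python
# Sketch: the app holds only the Portkey API key plus virtual key names
# (placeholders below); real provider secrets stay in the Portkey vault.
base = {"api_key": "pk-...", "virtual_key": "openai-prod"}

# Swapping providers, or cutting over to a re-issued key, is a one-string
# change -- no provider secret ships with the deploy.
canary = {**base, "virtual_key": "anthropic-staging"}

# Portkey(**base) / Portkey(**canary) would construct two clients that differ
# only in which vault entry the gateway resolves upstream.
```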

Strategy-based routing

Declarative JSON configs for fallback, retry, load-balance, conditional routing. No custom code — change strategy in dashboard, gateway picks it up.
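For illustration, a load-balance policy in the same inline-config style as the quick start. The schema is modeled on the fallback config shown above; the weights, retry key, and virtual key names are assumptions to verify against Portkey's config reference.

```python
# Sketch of a declarative routing policy (assumed schema, modeled on the
# fallback config in the quick start; virtual key names are placeholders).
lb_config = {
    # Weighted load balancing: ~70% of traffic to the cheap model,
    # ~30% to the stronger one.
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "openai-prod",
         "weight": 0.7,
         "override_params": {"model": "gpt-4o-mini"}},
        {"virtual_key": "anthropic-prod",
         "weight": 0.3,
         "override_params": {"model": "claude-3-5-sonnet-20241022"}},
    ],
    # Retry transient upstream failures before giving up.
    "retry": {"attempts": 3},
}

# Passed as Portkey(config=lb_config). Changing the strategy in the dashboard
# instead means the gateway picks it up with no code change.
```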

Prompt registry with versioning

Store prompts as first-class resources with version history and A/B test support. Reference by ID from code; edit without redeploy.
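As a sketch of referencing a prompt by ID: the `prompt_id` and variables below are invented placeholders, and the commented call shape (`client.prompts.completions.create`) is the portkey-ai SDK's prompt-render API as this author understands it — check it against the current docs.

```python
# Sketch: referencing a registry prompt by ID. The ID and variables are
# hypothetical placeholders, not real registry entries.
prompt_call = {
    "prompt_id": "pp-onboarding-welcome",            # pinned version or latest
    "variables": {"user_name": "Ada", "plan": "pro"},
}

# With a configured client, something like
#     client.prompts.completions.create(**prompt_call)
# renders the stored prompt server-side and runs the completion. The prompt
# text never ships with the app, so editing v2 -> v3 in the dashboard changes
# behavior without a redeploy.
```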

Cost & user attribution

Every request carries metadata. Dashboards break down spend by user, team, prompt, or model. Essential for per-tenant pricing and cost allocation.

Guardrails

Built-in PII redaction, JSON schema validation, profanity detection, competitor mentions. Wrap calls with guardrail configs; violations are logged and optionally blocked.
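A sketch of how guardrails attach to a config. The guardrail IDs are placeholders and the `input_guardrails`/`output_guardrails` keys are assumptions about the config schema — confirm against Portkey's guardrail documentation.

```python
# Sketch (assumed config keys): guardrails referenced by ID from a config,
# applied before the request and after the response. IDs are placeholders.
guarded_config = {
    "strategy": {"mode": "fallback"},
    "targets": [{"virtual_key": "openai-prod"}],
    # Run PII redaction on the incoming prompt...
    "input_guardrails": ["pii-redact-v1"],
    # ...and JSON-schema validation on the model's output.
    "output_guardrails": ["json-schema-check-v1"],
}
# A violation lands in the trace either way; whether it blocks the request or
# only flags it is part of the guardrail's own definition in the dashboard.
```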

OSS gateway + paid cloud

The core Gateway is Apache 2.0 — self-host for compliance. Portkey Cloud adds prompt management, observability UI, and team features.
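To make the split concrete, a sketch of pointing the SDK at a self-hosted gateway. The `npx` launch command and port 8787 reflect the OSS gateway's documented defaults as this author recalls them; treat both as assumptions to verify.

```python
# Sketch: run the OSS gateway locally (e.g. `npx @portkey-ai/gateway`, which
# serves an OpenAI-compatible endpoint; the port is an assumption), then
# point the SDK at it instead of Portkey Cloud.
SELF_HOSTED_GATEWAY = "http://localhost:8787/v1"

client_kwargs = {
    "base_url": SELF_HOSTED_GATEWAY,   # inline routing/retry/cache run here
    "api_key": "pk-...",
    "virtual_key": "openai-prod",
}
# Portkey(**client_kwargs) keeps the hot path on your own infrastructure;
# the prompt registry and analytics UI remain Cloud-only features.
```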

Comparison

|  | Scope | Deployment | Prompt Mgmt | Self-host Option |
| --- | --- | --- | --- | --- |
| Portkey | Gateway + observability + prompts + guardrails | Cloud + self-host | Yes (first-class) | Gateway OSS; cloud UI proprietary |
| Cloudflare AI Gateway | Gateway + basic logs | Managed only | No | No |
| LiteLLM + Langfuse | Gateway (LiteLLM) + observability (Langfuse) | Self-host both | Via Langfuse | Yes (both OSS) |
| Kong AI Gateway | Enterprise gateway | Self-host | Via plugins | Enterprise |

Use Cases

01. Multi-team organizations

Central platform team runs Portkey; product teams hit it with virtual keys. Policy (which models, what cost caps) enforced centrally; teams ship autonomously.

02. Regulated industries

Guardrails + self-hosted Gateway + audit logs meet common compliance requirements (PII redaction, tenant isolation). The SaaS-side features remain optional.

03. Prompt-heavy applications

Products with dozens or hundreds of distinct prompts benefit enormously from versioning and A/B testing. Portkey’s prompt registry is the most mature in the gateway category.

Pricing & License

Portkey Gateway OSS: Apache 2.0. Self-host for free. Includes all routing, caching, and guardrail logic. Does not include Portkey Cloud UI or prompt registry.

Portkey Cloud: free dev tier, then paid plans by request volume. Enterprise tier adds SSO, SOC 2, on-prem deploy, and dedicated support. See portkey.ai/pricing.

What you save: organizations typically replace Helicone + a prompt management tool + custom routing code with Portkey. The ROI calculation is usually about engineering hours saved, not per-request cost.

Frequently Asked Questions

Portkey vs Cloudflare AI Gateway?

Cloudflare is free, edge-fast, and observability-light. Portkey is paid but broader (prompt management, guardrails, deeper observability). Rule of thumb: if Cloudflare’s logs answer your questions, stay there; if you want to manage prompts or enforce guardrails, Portkey is worth the migration.

Can I use only the OSS Portkey Gateway?

Yes. The Gateway is Apache 2.0 on GitHub. You lose the Cloud UI, prompt registry, and analytics — but the inline routing, retry, fallback, caching, and guardrail logic are all in the OSS binary.

Does Portkey support local / self-hosted models?

Yes. Any OpenAI-compatible endpoint (Ollama, vLLM, LM Studio, Together AI, Fireworks, Anyscale) works as a target. You can route between managed and self-hosted models based on request metadata.
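For instance, a conditional-routing sketch that sends pro-tier traffic to a managed model and everything else to a local OpenAI-compatible server. The `conditional` strategy, `query`/`$eq` operators, and `custom_host` target key are modeled on Portkey's conditional-routing docs but unverified here; names and IDs are placeholders.

```python
# Sketch (assumed schema): route by request metadata between a managed model
# and a local OpenAI-compatible server. Names and IDs are placeholders.
routing_config = {
    "strategy": {
        "mode": "conditional",
        "conditions": [
            # Pro-tier users (see the metadata attached in the quick start)
            # get the managed frontier model...
            {"query": {"metadata.tier": {"$eq": "pro"}}, "then": "managed"},
        ],
        "default": "local",            # ...everyone else hits self-hosted
    },
    "targets": [
        {"name": "managed", "virtual_key": "openai-prod"},
        # Any OpenAI-compatible endpoint (Ollama, vLLM, ...) can be a target.
        {"name": "local",
         "custom_host": "http://localhost:11434/v1",
         "override_params": {"model": "llama3.1"}},
    ],
}
```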

How does Portkey’s observability compare to Langfuse?

Portkey is broader but shallower. It covers traces, costs, user attribution, and dashboards — sufficient for most teams. Langfuse goes deeper on nested spans, evaluation loops, and dataset-based testing. Heavy evaluation users pair Portkey gateway with Langfuse traces.

Is there a latency overhead?

Typically 5-15ms added on the hot path — the proxy does some policy evaluation and metric emission. Cache hits save hundreds of ms, so net latency is usually neutral or better on realistic traffic.
