Why Choose It
Portkey is the answer to "I want one product, not four." Its single API covers the full set of production LLM concerns: routing (picking the right model for each request), reliability (retries, fallbacks, load balancing across providers), prompt management (versioned prompts with A/B testing), observability (traces, cost breakdowns, user attribution), and guardrails (PII redaction, schema enforcement).
The bet is that these concerns are intertwined enough that separating them across tools creates integration debt. In practice teams that adopt Portkey replace 3-4 point tools with one control plane, and the savings in wiring and data-model drift are real.
The counter-bet is vendor lock-in: your prompt registry, traces, and routing configs all live in Portkey. Their self-hosted option (the Gateway is Apache 2.0 open source) mitigates this for the inline path, but the SaaS-side features (prompt management, analytics UI) are proprietary. For teams that want pure OSS, LiteLLM + Langfuse is the standard alternative.
Quick Start — OpenAI SDK + Portkey Headers
virtual_key is Portkey’s per-provider key vault — you rotate keys centrally instead of redeploying apps. config is a JSON policy (fallback, retry, cache, load-balance, guardrails) applied inline. metadata lets you attribute every request to user/team/feature for cost analysis.
# pip install portkey-ai
from portkey_ai import Portkey

client = Portkey(
    api_key="pk-...",
    virtual_key="openai-prod",  # maps to your OpenAI key in Portkey vault
    config={
        # Fallback: try Claude first, then OpenAI on error
        "strategy": {"mode": "fallback"},
        "targets": [
            {"virtual_key": "anthropic-prod", "override_params": {"model": "claude-3-5-sonnet-20241022"}},
            {"virtual_key": "openai-prod", "override_params": {"model": "gpt-4o-mini"}},
        ],
        # Cache identical requests for 10 minutes
        "cache": {"mode": "simple", "max_age": 600},
    },
)

resp = client.chat.completions.create(
    messages=[{"role": "user", "content": "Why is AI gateway a category?"}],
    # Attach custom metadata for later filtering in dashboards
    metadata={"user_id": "william", "tier": "pro"},
)
print(resp.choices[0].message.content)
# Portkey dashboard now shows: latency, token cost, which target served,
# cache hit/miss, and a full prompt/response trace.

Core Capabilities
Virtual keys
Store provider keys in Portkey vault; your app only sees virtual keys. Rotate, disable, or swap providers without redeploy.
Strategy-based routing
Declarative JSON configs for fallback, retry, load-balance, conditional routing. No custom code — change strategy in dashboard, gateway picks it up.
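As a sketch of what such a declarative config looks like, here is a weighted load-balancing strategy with retries. The shape mirrors the fallback config in the quick start above; the virtual key names, weights, and status codes are placeholders:

```python
# Illustrative Portkey routing config -- virtual key names and weights
# are placeholders. ~75% of traffic goes to OpenAI, ~25% to Anthropic,
# with retries on transient errors before a request is considered failed.
loadbalance_config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "openai-prod", "weight": 0.75},
        {"virtual_key": "anthropic-prod", "weight": 0.25},
    ],
    "retry": {"attempts": 3, "on_status_codes": [429, 500, 502, 503]},
}

# Sanity-check the shape before handing it to the client.
assert sum(t["weight"] for t in loadbalance_config["targets"]) == 1.0
```

The same dict can be pasted into the dashboard as JSON or passed as the `config=` argument to the client, which is what makes the strategy swappable without a deploy.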
Prompt registry with versioning
Store prompts as first-class resources with version history and A/B test support. Reference by ID from code; edit without redeploy.
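To sketch the calling contract: code ships only a prompt ID and template variables, while the model, parameters, and message template live in the registry. The ID and variable name below are hypothetical:

```python
# Hypothetical registry ID -- the real one comes from the Portkey dashboard.
PROMPT_ID = "pp-summarize-v3"

def prompt_request(article: str) -> dict:
    # Only the variables travel with the code. Model choice, temperature,
    # and the message template are edited in the registry and picked up
    # on the next request -- no redeploy.
    return {
        "prompt_id": PROMPT_ID,
        "variables": {"article": article},
    }

req = prompt_request("LLM gateways are becoming their own category...")
```

In the SDK these arguments map onto the prompt-completion call (`client.prompts.completions.create`), so an A/B test or prompt edit never touches application code.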
Cost & user attribution
Every request carries metadata. Dashboards break down spend by user, team, prompt, or model. Essential for per-tenant pricing and cost allocation.
Guardrails
Built-in PII redaction, JSON schema validation, profanity detection, competitor mentions. Wrap calls with guardrail configs; violations are logged and optionally blocked.
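A minimal sketch of wrapping calls with a guardrail config, assuming guardrail checks already created in the dashboard (the IDs below are placeholders, and the field names follow Portkey's input/output guardrail config shape):

```python
# Illustrative guarded config -- guardrail IDs are placeholders created in
# the Portkey dashboard. Input checks run before the LLM call (e.g. PII
# redaction); output checks run on the response (e.g. schema validation).
# A failed check is logged and, per the guardrail's settings, can block
# the request instead of merely flagging it.
guarded_config = {
    "virtual_key": "openai-prod",
    "input_guardrails": ["pii-redaction-check"],   # hypothetical ID
    "output_guardrails": ["json-schema-check"],    # hypothetical ID
}
```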
OSS gateway + paid cloud
The core Gateway is Apache 2.0 — self-host for compliance. Portkey Cloud adds prompt management, observability UI, and team features.
Comparison
| Tool | Scope | Deployment | Prompt Mgmt | Self-host Option |
|---|---|---|---|---|
| Portkey (this) | Gateway + observability + prompts + guardrails | Cloud + self-host | Yes (first-class) | Gateway OSS; cloud UI proprietary |
| Cloudflare AI Gateway | Gateway + basic logs | Managed only | No | No |
| LiteLLM + Langfuse | Gateway (LiteLLM) + observability (Langfuse) | Self-host both | Via Langfuse | Yes (both OSS) |
| Kong AI Gateway | Enterprise gateway | Self-host | Via plugins | Enterprise |
Real-World Use Cases
01. Multi-team organizations
Central platform team runs Portkey; product teams hit it with virtual keys. Policy (which models, what cost caps) enforced centrally; teams ship autonomously.
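One way to express such a central policy is conditional routing on request metadata. A sketch, assuming Portkey's conditional-strategy config shape (the metadata key, target names, and virtual keys are placeholders):

```python
# Route pro-tier requests to a stronger model, everyone else to the cheap
# default. Product teams only tag requests with metadata; the platform
# team owns this config centrally and can change it without any team
# redeploying.
conditional_config = {
    "strategy": {
        "mode": "conditional",
        "conditions": [
            {"query": {"metadata.tier": {"$eq": "pro"}},
             "then": "strong-model"},
        ],
        "default": "cheap-model",
    },
    "targets": [
        {"name": "strong-model", "virtual_key": "anthropic-prod"},
        {"name": "cheap-model", "virtual_key": "openai-prod"},
    ],
}
```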
02. Regulated industries
Guardrails + self-hosted Gateway + audit logs meet common compliance requirements (PII redaction, tenant isolation). The SaaS-side features remain optional.
03. Prompt-heavy applications
Products with dozens or hundreds of distinct prompts benefit enormously from versioning and A/B testing. Portkey’s prompt registry is the most mature in the gateway category.
Pricing & Licensing
Portkey Gateway OSS: Apache 2.0. Self-host for free. Includes all routing, caching, and guardrail logic. Does not include Portkey Cloud UI or prompt registry.
Portkey Cloud: free dev tier, then paid plans by request volume. Enterprise tier adds SSO, SOC 2, on-prem deploy, and dedicated support. See portkey.ai/pricing.
What you save: organizations typically replace Helicone + a prompt management tool + custom routing code with Portkey. The ROI calculation is usually about engineering hours saved, not per-request cost.
Related TokRepo Assets
Portkey AI Gateway — Route to 250+ LLMs
Portkey AI Gateway routes to 250+ LLMs with sub-1ms latency, 40+ guardrails, retries, fallbacks, and caching. 11.1K+ stars. Apache 2.0.
LLM Gateway Comparison — Proxy Your AI Requests
Compare top LLM gateway and proxy tools for routing AI requests. Covers LiteLLM, Bifrost, Portkey, and OpenRouter for cost optimization, failover, and multi-provider access.
Portkey AI Gateway — Unified API for 200+ LLMs
Route, load-balance, and fallback across 200+ LLMs with a single API. Built-in caching, guardrails, observability, and budget controls for production AI apps.
FAQ
Portkey vs Cloudflare AI Gateway?
Cloudflare is free, edge-fast, and observability-light. Portkey is paid but broader (prompt management, guardrails, deeper observability). Rule of thumb: if Cloudflare’s logs answer your questions, stay there; if you want to manage prompts or enforce guardrails, Portkey is worth the migration.
Can I use only the OSS Portkey Gateway?
Yes. The Gateway is Apache 2.0 on GitHub. You lose the Cloud UI, prompt registry, and analytics — but the inline routing, retry, fallback, caching, and guardrail logic are all in the OSS binary.
Does Portkey support local / self-hosted models?
Yes. Any OpenAI-compatible endpoint (Ollama, vLLM, LM Studio, Together AI, Fireworks, Anyscale) works as a target. You can route between managed and self-hosted models based on request metadata.
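A sketch of such a hybrid setup, assuming Portkey's provider/custom-host target fields (the local URL and model names are illustrative): prefer a self-hosted Ollama model, and fall back to a managed one when the local endpoint errors:

```python
# Illustrative hybrid config -- URL and model names are assumptions.
hybrid_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        # Self-hosted first: an Ollama server on its default local port.
        {"provider": "ollama",
         "custom_host": "http://localhost:11434",
         "override_params": {"model": "llama3.1"}},
        # Managed fallback if the local endpoint is down or errors out.
        {"virtual_key": "openai-prod",
         "override_params": {"model": "gpt-4o-mini"}},
    ],
}
```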
How does Portkey’s observability compare to Langfuse?
Portkey is broader but shallower. It covers traces, costs, user attribution, and dashboards — sufficient for most teams. Langfuse goes deeper on nested spans, evaluation loops, and dataset-based testing. Heavy evaluation users pair Portkey gateway with Langfuse traces.
Is there a latency overhead?
Typically 5-15ms added on the hot path — the proxy does some policy evaluation and metric emission. Cache hits save hundreds of ms, so net latency is usually neutral or better on realistic traffic.