Why Choose It
Portkey is the answer to "I want one product, not four." Its single API covers the full set of production LLM concerns: routing (picking the right model for each request), reliability (retries, fallbacks, load balancing across providers), prompt management (versioned prompts with A/B testing), observability (traces, cost breakdowns, user attribution), and guardrails (PII redaction, schema enforcement).
The bet is that these concerns are intertwined enough that separating them across tools creates integration debt. In practice teams that adopt Portkey replace 3-4 point tools with one control plane, and the savings in wiring and data-model drift are real.
The counter-bet is vendor lock-in: your prompt registry, traces, and routing configs all live in Portkey. Their self-hosted option (the Gateway is Apache 2.0 open source) mitigates this for the inline path, but the SaaS-side features (prompt management, analytics UI) are proprietary. For teams that want pure OSS, LiteLLM + Langfuse is the standard alternative.
Quick Start — OpenAI SDK + Portkey Headers
virtual_key is Portkey’s per-provider key vault — you rotate keys centrally instead of redeploying apps. config is a JSON policy (fallback, retry, cache, load-balance, guardrails) applied inline. metadata lets you attribute every request to user/team/feature for cost analysis.
# pip install portkey-ai
from portkey_ai import Portkey

client = Portkey(
    api_key="pk-...",
    virtual_key="openai-prod",  # maps to your OpenAI key in Portkey vault
    config={
        # Fallback: try Claude first, then OpenAI on error
        "strategy": {"mode": "fallback"},
        "targets": [
            {"virtual_key": "anthropic-prod", "override_params": {"model": "claude-3-5-sonnet-20241022"}},
            {"virtual_key": "openai-prod", "override_params": {"model": "gpt-4o-mini"}},
        ],
        # Cache identical requests for 10 minutes
        "cache": {"mode": "simple", "max_age": 600},
    },
)

resp = client.chat.completions.create(
    messages=[{"role": "user", "content": "Why is AI gateway a category?"}],
    # Attach custom metadata for later filtering in dashboards
    metadata={"user_id": "william", "tier": "pro"},
)
print(resp.choices[0].message.content)
# Portkey dashboard now shows: latency, token cost, which target served,
# cache hit/miss, and a full prompt/response trace.

Core Capabilities
Virtual keys
Store provider keys in Portkey vault; your app only sees virtual keys. Rotate, disable, or swap providers without redeploy.
Strategy-based routing
Declarative JSON configs for fallback, retry, load-balance, conditional routing. No custom code — change strategy in dashboard, gateway picks it up.
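As a sketch of what such a declarative config looks like, here is a weighted load-balancing strategy with retries. The shape mirrors the fallback config in the quick start above; the virtual key names, weights, and status codes are placeholders:

```python
# Illustrative Portkey routing config -- virtual key names and weights
# are placeholders. ~75% of traffic goes to OpenAI, ~25% to Anthropic,
# with retries on transient errors before a request is considered failed.
loadbalance_config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "openai-prod", "weight": 0.75},
        {"virtual_key": "anthropic-prod", "weight": 0.25},
    ],
    "retry": {"attempts": 3, "on_status_codes": [429, 500, 502, 503]},
}

# Sanity-check the shape before handing it to the client.
assert sum(t["weight"] for t in loadbalance_config["targets"]) == 1.0
```

The same dict can be pasted into the dashboard as JSON or passed as the `config=` argument to the client, which is what makes the strategy swappable without a deploy.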
Prompt registry with versioning
Store prompts as first-class resources with version history and A/B test support. Reference by ID from code; edit without redeploy.
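To sketch the calling contract: code ships only a prompt ID and template variables, while the model, parameters, and message template live in the registry. The ID and variable name below are hypothetical:

```python
# Hypothetical registry ID -- the real one comes from the Portkey dashboard.
PROMPT_ID = "pp-summarize-v3"

def prompt_request(article: str) -> dict:
    # Only the variables travel with the code. Model choice, temperature,
    # and the message template are edited in the registry and picked up
    # on the next request -- no redeploy.
    return {
        "prompt_id": PROMPT_ID,
        "variables": {"article": article},
    }

req = prompt_request("LLM gateways are becoming their own category...")
```

In the SDK these arguments map onto the prompt-completion call (`client.prompts.completions.create`), so an A/B test or prompt edit never touches application code.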
Cost & user attribution
Every request carries metadata. Dashboards break down spend by user, team, prompt, or model. Essential for per-tenant pricing and cost allocation.
Guardrails
Built-in PII redaction, JSON schema validation, profanity detection, competitor mentions. Wrap calls with guardrail configs; violations are logged and optionally blocked.
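A minimal sketch of wrapping calls with a guardrail config, assuming guardrail checks already created in the dashboard (the IDs below are placeholders, and the field names follow Portkey's input/output guardrail config shape):

```python
# Illustrative guarded config -- guardrail IDs are placeholders created in
# the Portkey dashboard. Input checks run before the LLM call (e.g. PII
# redaction); output checks run on the response (e.g. schema validation).
# A failed check is logged and, per the guardrail's settings, can block
# the request instead of merely flagging it.
guarded_config = {
    "virtual_key": "openai-prod",
    "input_guardrails": ["pii-redaction-check"],   # hypothetical ID
    "output_guardrails": ["json-schema-check"],    # hypothetical ID
}
```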
OSS gateway + paid cloud
The core Gateway is Apache 2.0 — self-host for compliance. Portkey Cloud adds prompt management, observability UI, and team features.
Comparison
| Tool | Scope | Deployment | Prompt Mgmt | Self-host Option |
|---|---|---|---|---|
| Portkey (this) | Gateway + observability + prompts + guardrails | Cloud + self-host | Yes (first-class) | Gateway OSS; cloud UI proprietary |
| Cloudflare AI Gateway | Gateway + basic logs | Managed only | No | No |
| LiteLLM + Langfuse | Gateway (LiteLLM) + observability (Langfuse) | Self-host both | Via Langfuse | Yes (both OSS) |
| Kong AI Gateway | Enterprise gateway | Self-host | Via plugins | Enterprise |
Real-World Use Cases
01. Multi-team organizations
Central platform team runs Portkey; product teams hit it with virtual keys. Policy (which models, what cost caps) enforced centrally; teams ship autonomously.
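One way to express such a central policy is conditional routing on request metadata. A sketch, assuming Portkey's conditional-strategy config shape (the metadata key, target names, and virtual keys are placeholders):

```python
# Route pro-tier requests to a stronger model, everyone else to the cheap
# default. Product teams only tag requests with metadata; the platform
# team owns this config centrally and can change it without any team
# redeploying.
conditional_config = {
    "strategy": {
        "mode": "conditional",
        "conditions": [
            {"query": {"metadata.tier": {"$eq": "pro"}},
             "then": "strong-model"},
        ],
        "default": "cheap-model",
    },
    "targets": [
        {"name": "strong-model", "virtual_key": "anthropic-prod"},
        {"name": "cheap-model", "virtual_key": "openai-prod"},
    ],
}
```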
02. Regulated industries
Guardrails + self-hosted Gateway + audit logs meet common compliance requirements (PII redaction, tenant isolation). The SaaS-side features remain optional.
03. Prompt-heavy applications
Products with dozens or hundreds of distinct prompts benefit enormously from versioning and A/B testing. Portkey’s prompt registry is the most mature in the gateway category.
Pricing & Licensing
Portkey Gateway OSS: Apache 2.0. Self-host for free. Includes all routing, caching, and guardrail logic. Does not include Portkey Cloud UI or prompt registry.
Portkey Cloud: free dev tier, then paid plans by request volume. Enterprise tier adds SSO, SOC 2, on-prem deploy, and dedicated support. See portkey.ai/pricing.
What you save: organizations typically replace Helicone + a prompt management tool + custom routing code with Portkey. The ROI calculation is usually about engineering hours saved, not per-request cost.
Related TokRepo Assets
Portkey AI Gateway — Route to 250+ LLMs
Portkey AI Gateway routes to 250+ LLMs with sub-1ms latency, 40+ guardrails, retries, fallbacks, and caching. 11.1K+ stars. Apache 2.0.
LLM Gateway Comparison — Proxy Your AI Requests
Compare top LLM gateway and proxy tools for routing AI requests. Covers LiteLLM, Bifrost, Portkey, and OpenRouter for cost optimization, failover, and multi-provider access.
Portkey AI Gateway — Unified API for 200+ LLMs
Route, load-balance, and fallback across 200+ LLMs with a single API. Built-in caching, guardrails, observability, and budget controls for production AI apps.
FAQ
Portkey vs Cloudflare AI Gateway?
Cloudflare is free, edge-fast, and observability-light. Portkey is paid but broader (prompt management, guardrails, deeper observability). Rule of thumb: if Cloudflare’s logs answer your questions, stay there; if you want to manage prompts or enforce guardrails, Portkey is worth the migration.
Can I use only the OSS Portkey Gateway?
Yes. The Gateway is Apache 2.0 on GitHub. You lose the Cloud UI, prompt registry, and analytics — but the inline routing, retry, fallback, caching, and guardrail logic are all in the OSS binary.
Does Portkey support local / self-hosted models?
Yes. Any OpenAI-compatible endpoint (Ollama, vLLM, LM Studio, Together AI, Fireworks, Anyscale) works as a target. You can route between managed and self-hosted models based on request metadata.
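A sketch of such a hybrid setup, assuming Portkey's provider/custom-host target fields (the local URL and model names are illustrative): prefer a self-hosted Ollama model, and fall back to a managed one when the local endpoint errors:

```python
# Illustrative hybrid config -- URL and model names are assumptions.
hybrid_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        # Self-hosted first: an Ollama server on its default local port.
        {"provider": "ollama",
         "custom_host": "http://localhost:11434",
         "override_params": {"model": "llama3.1"}},
        # Managed fallback if the local endpoint is down or errors out.
        {"virtual_key": "openai-prod",
         "override_params": {"model": "gpt-4o-mini"}},
    ],
}
```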
How does Portkey’s observability compare to Langfuse?
Portkey is broader but shallower. It covers traces, costs, user attribution, and dashboards — sufficient for most teams. Langfuse goes deeper on nested spans, evaluation loops, and dataset-based testing. Heavy evaluation users pair Portkey gateway with Langfuse traces.
Is there a latency overhead?
Typically 5-15ms added on the hot path — the proxy does some policy evaluation and metric emission. Cache hits save hundreds of ms, so net latency is usually neutral or better on realistic traffic.