
Helicone — Zero-Code-Change LLM Observability Platform

Helicone is an open-source LLM observability platform that provides request logging, cost tracking, user analytics, and prompt experiments. You only change your base URL; not a single line of business code is touched.

Why Choose It

Helicone’s pitch is "observability without SDK buy-in". Change your OpenAI base URL to helicone.ai’s proxy and every request is logged automatically — no manual spans, no code changes, no new library. This is the fastest way to add production-grade LLM observability to an existing codebase.

The platform then layers analytics on top of logs: per-user cost, per-feature latency, request outliers, model-switch A/B tests. You get most of what Langfuse offers for tracing, without having to instrument your code.

Where Helicone is weaker: deep agentic traces. A single LLM call is a flat log entry; a multi-step agent that calls 10 tools becomes 10 entries without nested relationships. Langfuse and Phoenix go deeper here — Helicone’s async-log model trades trace depth for zero integration cost.

Quick Start — Change One Line

The only required change is the base_url plus a Helicone-Auth header. Property headers (Helicone-User-Id, Helicone-Property-*) power the analytics dashboards. All request/response content is logged by default; redaction is configurable for compliance-sensitive data.

# pip install openai
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    base_url="https://oai.helicone.ai/v1",          # was api.openai.com
    default_headers={
        "Helicone-Auth": "Bearer sk-helicone-...",
        # Optional — feature/user tagging
        "Helicone-User-Id": "william@example.com",
        "Helicone-Property-Feature": "onboarding-chat",
    },
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi"}],
)

# Dashboard now shows: latency, token cost, user "william", feature "onboarding-chat".
# Works for Anthropic, Azure, AWS Bedrock, Gemini, Together, etc. — each has
# its own proxy host in Helicone docs.

Core Capabilities

Proxy-based logging

Zero SDK integration. Change base URL, Helicone logs everything. Works with every OpenAI-compatible client, plus native integrations for Anthropic, Azure, Bedrock, Gemini, Together.

User and feature analytics

Tag requests with user IDs and property headers. Dashboards slice latency, cost, and error rates by any dimension.
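Property headers can also be attached per request rather than on the client. A minimal sketch, assuming the OpenAI Python SDK's `extra_headers` parameter and the `client` from the quick start; the `helicone_properties` helper is a made-up convenience, not part of any SDK:

```python
def helicone_properties(user_id: str, **props) -> dict:
    """Build per-request Helicone analytics headers (hypothetical helper).

    Keyword arguments become Helicone-Property-* headers, the dimension
    the dashboards slice by.
    """
    headers = {"Helicone-User-Id": user_id}
    for key, value in props.items():
        headers[f"Helicone-Property-{key.replace('_', '-').title()}"] = value
    return headers

# Usage (assumes the `client` configured in the quick start):
# client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": "Hi"}],
#     extra_headers=helicone_properties("tenant-42", feature="billing-chat"),
# )
```

Per-request headers let one shared client serve many tenants or features without rebuilding the client for each.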

Prompt experiments

Built-in prompt versioning and A/B tests. Compare output quality and cost across prompt variants on real production traffic.

Caching & rate-limits (optional)

Helicone acts as a gateway too — enable caching or rate limits for your proxy path. Fewer features than a dedicated gateway but useful as a free layer.
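Opting a request into the cache is also header-driven. A sketch under the assumption that the `Helicone-Cache-Enabled` and `Cache-Control` header names match Helicone's current caching docs; verify before relying on them (header values must be strings):

```python
def cache_headers(max_age_s: int = 3600) -> dict:
    """Per-request opt-in to Helicone's proxy cache (header names assumed)."""
    return {
        "Helicone-Cache-Enabled": "true",         # turn on response caching
        "Cache-Control": f"max-age={max_age_s}",  # cache TTL in seconds
    }

# client.chat.completions.create(..., extra_headers=cache_headers(86400))
```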

OSS + cloud

Apache 2.0. Self-host the full stack for zero license cost; use Helicone Cloud for managed convenience.

Webhooks and alerts

Trigger workflows on cost thresholds, error spikes, or unusual user activity. Integrates with Slack, PagerDuty, and generic webhooks.

Comparison

|               | Integration Style       | Trace Depth               | Prompt Experiments | OSS?      |
|---------------|-------------------------|---------------------------|--------------------|-----------|
| Helicone      | Proxy (base URL change) | Per-request (flat)        | Yes                | Yes       |
| Langfuse      | SDK + OTEL              | Nested spans + eval loops | Yes                | Yes       |
| Arize Phoenix | OpenTelemetry           | Span-level + eval         | Via playground     | Yes       |
| Traceloop     | OTEL instrumentation    | Span-level                | Limited            | OSS agent |

Use Cases in Practice

01. Adding observability to legacy code

Apps where you can’t easily rewire SDK calls. Change base URL in one place, every LLM request logged. Common scenario: inherited codebase, contractor code, third-party libraries making LLM calls.

02. Cost allocation in multi-tenant apps

Tag each request with user/tenant ID via headers. Dashboards break down spend per tenant — critical for chargeback or per-tier pricing.

03. Quick prompt A/B tests

Ship prompt variants on a slice of traffic, compare outputs in the Helicone UI, roll winner out. No separate experimentation infra.
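The traffic split can live entirely in application code, with a property header tagging each arm so the Helicone dashboard can compare them. A sketch with made-up variant names and rollout logic:

```python
import random

# Hypothetical prompt variants for an A/B test.
PROMPTS = {
    "control": "You are a helpful assistant.",
    "variant-b": "You are a concise, friendly assistant.",
}

def pick_variant(rollout: float = 0.1, rng=random.random) -> str:
    """Send `rollout` fraction of traffic to the new variant."""
    return "variant-b" if rng() < rollout else "control"

# variant = pick_variant()
# client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "system", "content": PROMPTS[variant]},
#               {"role": "user", "content": "Hi"}],
#     extra_headers={"Helicone-Property-Prompt-Variant": variant},
# )
```

Filtering the dashboard by the Prompt-Variant property then shows cost and latency per arm on real traffic.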

Pricing & Licensing

Helicone: Apache 2.0 open source. Self-host for free — includes full proxy, analytics, and prompt experiments.

Helicone Cloud: free tier up to ~10K requests/month; usage-based paid plans beyond. Enterprise adds SSO, SOC 2, dedicated support.

Cost note: the proxy path itself is free — you pay for log storage and analytics compute. Self-hosting eliminates even that.

FAQ

Helicone vs Langfuse?

Helicone: proxy-based, zero code change, flat per-request logs. Langfuse: SDK-based, richer nested traces and evaluation loops. Pick Helicone for speed-to-value, Langfuse for depth.

Does the proxy add latency?

Typically 10-30 ms. Helicone logs asynchronously: once the request clears the proxy, logging happens in the background. Hot-path latency is dominated by your chosen LLM's upstream latency, not by Helicone.

Can I use Helicone as my only gateway?

Yes for observability + optional caching. For heavy routing/fallback logic, pair with a dedicated gateway (Portkey, LiteLLM, Cloudflare). Helicone handles the observability; the gateway handles reliability.

Is request content logged by default?

Yes — prompts and completions are stored. Configure field-level redaction or disable content logging entirely for compliance needs. Self-host if you need data to never leave your network.
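As a per-request sketch of that opt-out, sensitive calls can carry omit headers. The `Helicone-Omit-Request` / `Helicone-Omit-Response` names are taken from Helicone's docs at the time of writing; confirm against the current reference:

```python
def redaction_headers(omit_request: bool = True, omit_response: bool = True) -> dict:
    """Headers that tell Helicone not to store prompt and/or completion bodies."""
    headers = {}
    if omit_request:
        headers["Helicone-Omit-Request"] = "true"    # don't store the prompt
    if omit_response:
        headers["Helicone-Omit-Response"] = "true"   # don't store the completion
    return headers

# client.chat.completions.create(..., extra_headers=redaction_headers())
```

Metadata (latency, tokens, cost) is still logged, so dashboards keep working for redacted requests.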

Does Helicone support tool calls and streaming?

Yes. Tool calls are captured as part of the log entry. Streaming responses are buffered and logged after completion; the client still receives the SSE stream in real time.
