What is Helicone?
Helicone is an open-source LLM observability platform that works as a proxy between your app and LLM providers. With a one-line base URL change (no SDK needed), you get request logging, cost tracking, latency metrics, caching, rate limiting, and prompt versioning — for any LLM provider.
Answer-Ready: Helicone is an open-source LLM observability platform. One-line proxy integration (change base URL, no SDK) for request logging, cost tracking, caching, rate limiting, and prompt versioning across OpenAI, Anthropic, and all providers. 5k+ GitHub stars.
Best for: Teams running LLM apps in production who need observability without code changes. Works with: OpenAI, Anthropic, Google, Azure, any OpenAI-compatible API. Setup time: Under 1 minute.
Core Features
1. Zero-SDK Integration
Just change the base URL and pass your Helicone API key in the Helicone-Auth header:
from openai import OpenAI, AzureOpenAI
from anthropic import Anthropic

helicone_auth = {"Helicone-Auth": "Bearer hlc-..."}

# OpenAI
client = OpenAI(base_url="https://oai.helicone.ai/v1", default_headers=helicone_auth)
# Anthropic
client = Anthropic(base_url="https://anthropic.helicone.ai", default_headers=helicone_auth)
# Azure OpenAI
client = AzureOpenAI(azure_endpoint="https://oai.helicone.ai", api_version="2024-02-01", default_headers=helicone_auth)

2. Request Dashboard
Real-time dashboard showing:
- All requests with input/output
- Latency percentiles (p50, p95, p99)
- Token usage per model
- Cost breakdown per user/feature
- Error rates and patterns
- Geographic distribution
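The latency percentiles the dashboard reports (p50, p95, p99) are standard order statistics over request latencies. A minimal nearest-rank sketch, with made-up sample latencies for illustration:

```python
def percentile(values, p):
    """Nearest-rank percentile: smallest sample value covering p% of requests."""
    ordered = sorted(values)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds (one slow outlier)
latencies_ms = [120, 95, 210, 150, 3000, 130, 110, 180, 90, 140]
for p in (50, 95, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

Note how a single slow request drags p95 and p99 far above the median, which is exactly why dashboards report tail percentiles rather than averages.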
3. Cost Tracking
Dashboard view:
Today: $42.50 (1,250 requests)
This week: $285.30 (8,700 requests)
By model:
gpt-4o: $180 (63%)
claude-sonnet: $85 (30%)
gpt-4o-mini: $20 (7%)

4. Caching
# Enable caching with a header
client = OpenAI(
base_url="https://oai.helicone.ai/v1",
default_headers={
"Helicone-Auth": "Bearer hlc-...",
"Helicone-Cache-Enabled": "true",
},
)
# Identical requests return cached results instantly

5. Rate Limiting
headers = {
"Helicone-RateLimit-Policy": "10;w=60", # 10 requests per 60 seconds
}

6. Custom Properties
headers = {
"Helicone-Property-User": "user-123",
"Helicone-Property-Feature": "chat",
"Helicone-Property-Environment": "production",
}
# Filter and group by these properties in the dashboard

7. Prompt Versioning
headers = {
"Helicone-Prompt-Id": "customer-support-v3",
}
# Track performance per prompt version

Self-Hosting
git clone https://github.com/Helicone/helicone
docker compose up -d
# Dashboard at http://localhost:3000

FAQ
Q: Does it add latency? A: The Helicone proxy adds under 50 ms of overhead; requests are logged asynchronously, off the critical path.
Q: Is my data safe? A: Self-host for full data control. Cloud version is SOC 2 Type II compliant.
Q: Can I use it with Anthropic Claude? A: Yes, change the base URL to https://anthropic.helicone.ai.
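Putting the sections above together: since every feature is driven by headers, one dict can enable all of them for a client. The `helicone_headers` helper below is hypothetical (not part of Helicone); it only assembles header names shown in the sections above:

```python
def helicone_headers(api_key, user, feature, prompt_id, cache=True, rate_limit="10;w=60"):
    """Build a combined Helicone header dict for auth, properties, versioning,
    rate limiting, and (optionally) caching."""
    headers = {
        "Helicone-Auth": f"Bearer {api_key}",
        "Helicone-Property-User": user,
        "Helicone-Property-Feature": feature,
        "Helicone-Prompt-Id": prompt_id,
        "Helicone-RateLimit-Policy": rate_limit,
    }
    if cache:
        headers["Helicone-Cache-Enabled"] = "true"
    return headers

# Pass the result as default_headers when constructing the client
headers = helicone_headers("hlc-...", "user-123", "chat", "customer-support-v3")
```

Keeping header assembly in one place means a single code path controls observability for every request the client makes.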