# Helicone — LLM Observability and Prompt Management

> Open-source LLM observability platform. One-line proxy integration for request logging, cost tracking, caching, rate limiting, and prompt versioning across all providers.

## Quick Use

```python
# Just change the base URL — no SDK needed
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    base_url="https://oai.helicone.ai/v1",
    default_headers={"Helicone-Auth": "Bearer hlc-..."},
)

# All calls are now logged and tracked
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
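Hard-coding keys as in the snippet above is fine for a first test, but a deployed app would more likely read them from the environment. A minimal sketch, assuming the keys live in `OPENAI_API_KEY` and `HELICONE_API_KEY` (the second variable name and the helper `helicone_client_kwargs` are hypothetical, not part of Helicone or the OpenAI SDK):

```python
import os
from typing import Mapping

# Base URL taken from the Quick Use example above.
HELICONE_OPENAI_BASE_URL = "https://oai.helicone.ai/v1"


def helicone_client_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for OpenAI(...) from environment variables.

    OPENAI_API_KEY and HELICONE_API_KEY are assumed variable names.
    """
    required = ("OPENAI_API_KEY", "HELICONE_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "api_key": env["OPENAI_API_KEY"],
        "base_url": HELICONE_OPENAI_BASE_URL,
        "default_headers": {"Helicone-Auth": f"Bearer {env['HELICONE_API_KEY']}"},
    }
```

The result can be passed straight to the constructor: `client = OpenAI(**helicone_client_kwargs())`.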

## What is Helicone?

Helicone is an open-source LLM observability platform that sits as a proxy between your app and LLM providers. Changing one base URL (no SDK needed) gives you request logging, cost tracking, latency metrics, caching, rate limiting, and prompt versioning for OpenAI, Anthropic, and other supported providers.

**Answer-Ready**: Helicone is an open-source LLM observability platform. One-line proxy integration (change the base URL, no SDK) provides request logging, cost tracking, caching, rate limiting, and prompt versioning across OpenAI, Anthropic, and other supported providers. 5k+ GitHub stars.

**Best for**: Teams running LLM apps in production who need observability with minimal code changes (a base URL and an auth header). **Works with**: OpenAI, Anthropic, Google, Azure, any OpenAI-compatible API. **Setup time**: Under 1 minute.

## Core Features

### 1. Zero-SDK Integration
Just change the base URL:

```python
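from anthropic import Anthropic
from openai import AzureOpenAI, OpenAI

# API keys are read from each SDK's environment variables;
# add the Helicone-Auth header as in Quick Use.

# OpenAI
client = OpenAI(base_url="https://oai.helicone.ai/v1")

# Anthropic
client = Anthropic(base_url="https://anthropic.helicone.ai")

# Azure OpenAI (the SDK also needs an API version, e.g. via OPENAI_API_VERSION)
client = AzureOpenAI(azure_endpoint="https://oai.helicone.ai")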
```

### 2. Request Dashboard
Real-time dashboard showing:
- All requests with input/output
- Latency percentiles (p50, p95, p99)
- Token usage per model
- Cost breakdown per user/feature
- Error rates and patterns
- Geographic distribution
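To make the latency percentiles concrete, here is a minimal sketch of how p50/p95/p99 can be derived from a list of request latencies. It uses the nearest-rank method, which is an assumption; the source does not say how Helicone's dashboard computes them, and the sample values are made up.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds.
latencies_ms = [120, 95, 310, 150, 101, 99, 2050, 130, 140, 110]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
```

With only ten samples, one slow request dominates both p95 and p99, which is why tail percentiles are usually read over larger windows.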

### 3. Cost Tracking

```
Dashboard view:
  Today:     $42.50 (1,250 requests)
  This week: $285.30 (8,700 requests)
  By model:
    gpt-4o:        $180 (63%)
    claude-sonnet:  $85 (30%)
    gpt-4o-mini:    $20 (7%)
```
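The per-model share of spend follows directly from the dollar figures. A small sketch of that arithmetic, using the three per-model amounts from the view above (the function name `cost_shares` is hypothetical and not part of Helicone):

```python
def cost_shares(spend_by_model: dict[str, float]) -> dict[str, int]:
    """Return each model's share of total spend as a whole-number percentage."""
    total = sum(spend_by_model.values())
    if total <= 0:
        raise ValueError("total spend must be positive")
    return {model: round(100 * cost / total) for model, cost in spend_by_model.items()}


# Dollar amounts copied from the example dashboard view.
spend = {"gpt-4o": 180.0, "claude-sonnet": 85.0, "gpt-4o-mini": 20.0}
shares = cost_shares(spend)
```

Because each share is rounded independently, the percentages are not guaranteed to sum to exactly 100 for every input.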

### 4. Caching

```python
# Enable caching with a header
client = OpenAI(
    base_url="https://oai.helicone.ai/v1",
    default_headers={
        "Helicone-Auth": "Bearer hlc-...",
        "Helicone-Cache-Enabled": "true",
    },
)
# Identical requests are served from Helicone's cache instead of being sent to the provider again
```
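To illustrate what "identical requests" can mean in practice, here is a conceptual sketch of request-level caching keyed on a hash of the request body. This is not Helicone's implementation; `cache_key` and `ResponseCache` are hypothetical names, and a real proxy cache would also handle expiry and size limits.

```python
import hashlib
import json
from typing import Callable


def cache_key(request_body: dict) -> str:
    """Hash a canonical JSON form so key order does not affect the result."""
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ResponseCache:
    """In-memory cache: call the provider only for request bodies not seen before."""

    def __init__(self) -> None:
        self._store: dict[str, object] = {}

    def get_or_call(self, request_body: dict, call: Callable[[dict], object]) -> object:
        key = cache_key(request_body)
        if key not in self._store:
            self._store[key] = call(request_body)  # cache miss: forward the request
        return self._store[key]
```

Any change to the model, messages, or parameters produces a different key, so only exact repeats are served from the cache in this sketch.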

### 5. Rate Limiting

```python
headers = {
    "Helicone-RateLimit-Policy": "10;w=60",  # 10 requests per 60 seconds
}
```
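The policy string above reads as "quota, then `w=` window in seconds". The sketch below parses that form and applies it with a client-side sliding-window limiter, purely to show what such a policy means. Helicone enforces the policy in its proxy, and whether it uses a sliding or fixed window is not stated in the source; `parse_policy` and `SlidingWindowLimiter` are hypothetical names.

```python
import time
from collections import deque
from typing import Callable


def parse_policy(policy: str) -> tuple[int, int]:
    """Parse 'quota;w=seconds' into (quota, window_seconds)."""
    quota_part, *params = policy.split(";")
    fields = dict(param.split("=", 1) for param in params)
    return int(quota_part), int(fields["w"])


class SlidingWindowLimiter:
    """Allow at most `quota` calls within any `window_seconds` span."""

    def __init__(self, quota: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._quota = quota
        self._window = window_seconds
        self._clock = clock
        self._stamps: deque = deque()

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) >= self._quota:
            return False
        self._stamps.append(now)
        return True
```

Usage: `limiter = SlidingWindowLimiter(*parse_policy("10;w=60"))`, then check `limiter.allow()` before each request.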

### 6. Custom Properties

```python
headers = {
    "Helicone-Property-User": "user-123",
    "Helicone-Property-Feature": "chat",
    "Helicone-Property-Environment": "production",
}
# Filter and group by these properties in the dashboard
```
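When properties vary per request, building the header names by hand gets repetitive. A minimal sketch of a helper that applies the `Helicone-Property-` prefix shown above (the helper `property_headers` and its name validation rule are assumptions, not part of Helicone):

```python
def property_headers(properties: dict[str, str]) -> dict[str, str]:
    """Turn {"User": "user-123"} into {"Helicone-Property-User": "user-123"}."""
    headers = {}
    for name, value in properties.items():
        # Keep names header-safe: letters, digits, and hyphens only (assumed rule).
        if not name or not name.replace("-", "").isalnum():
            raise ValueError(f"invalid property name: {name!r}")
        headers[f"Helicone-Property-{name}"] = str(value)
    return headers
```

With the OpenAI Python SDK, the result can be attached to a single call through the `extra_headers` argument of `client.chat.completions.create(...)`, rather than to every request via `default_headers`.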

### 7. Prompt Versioning

```python
headers = {
    "Helicone-Prompt-Id": "customer-support-v3",
}
# Track performance per prompt version
```

## Self-Hosting

```bash
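git clone https://github.com/Helicone/helicone
cd helicone  # run Compose from the directory that holds the compose file (check the repo's self-hosting docs)
docker compose up -d
# Dashboard at http://localhost:3000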
```

## FAQ

**Q: Does it add latency?**
A: The Helicone proxy adds less than 50 ms of latency. Requests are logged asynchronously.

**Q: Is my data safe?**
A: Self-host for full data control. The cloud version is SOC 2 Type II compliant.

**Q: Can I use it with Anthropic Claude?**
A: Yes, change base URL to `https://anthropic.helicone.ai`.
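
The latency answer above can be checked against your own workload by timing the same request with and without the proxy. A rough sketch, assuming you supply two zero-argument callables that each send one request (`median_latency_ms` and `proxy_overhead_ms` are hypothetical helpers, and network jitter will affect small run counts):

```python
import statistics
import time
from typing import Callable


def median_latency_ms(call: Callable[[], object], runs: int = 5) -> float:
    """Run `call` several times and return the median wall-clock time in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)


def proxy_overhead_ms(direct: Callable[[], object],
                      proxied: Callable[[], object], runs: int = 5) -> float:
    """Estimate added latency as the difference between the two medians."""
    return median_latency_ms(proxied, runs) - median_latency_ms(direct, runs)
```

Disable caching for this comparison, since a cache hit would make the proxied path look faster than the direct one.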

## Source & Thanks

> Created by [Helicone](https://github.com/Helicone). Licensed under Apache 2.0.
>
> [Helicone/helicone](https://github.com/Helicone/helicone) — 5k+ stars

---
Source: https://tokrepo.com/en/workflows/helicone-llm-observability-prompt-management-8a35faad
Author: Helicone