# Datadog LLM Observability — Trace Cost, Latency, Drift

> Datadog LLM Observability traces OpenAI / Anthropic / Bedrock calls, tracks per-user cost, and surfaces drift. Dashboards and span-level prompt view.

## Install

Copy the content below into your project:

## Quick Use

1. `pip install ddtrace`
2. Set `DD_LLMOBS_ENABLED=1`, `DD_LLMOBS_ML_APP`, `DD_API_KEY`, and `DD_SITE`
3. `patch(openai=True)` — every call now traces to Datadog

---

## Intro

Datadog LLM Observability (formerly LLM Monitoring) is a turn-key tracing layer for AI apps that already live in Datadog. Drop in the ddtrace SDK and every OpenAI / Anthropic / Bedrock / LangChain call generates a span with prompt, completion, cost, latency, model name, user, and session ID. Built-in dashboards cover top-cost users, p95 latency by model, error rate, and drift detection.

Best for: teams with Datadog APM/logs already wired into the product; enterprise security review where prompt logging needs central retention.

Works with: Python ddtrace, Node dd-trace, OpenTelemetry exporter for any language.

Setup time: 10 minutes.

---

### Python install

```bash
pip install ddtrace
```

### Auto-instrument OpenAI

```python
import os, ddtrace
from ddtrace import patch

patch(openai=True)  # auto-instrument the openai SDK

# In production, set these in the environment before the process starts
os.environ["DD_LLMOBS_ENABLED"] = "1"
os.environ["DD_LLMOBS_ML_APP"] = "my-rag-app"
os.environ["DD_API_KEY"] = "..."
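# Alternatively (a sketch, assuming ddtrace's programmatic LLMObs API),
# the same settings can be applied in code instead of via env vars:
#   from ddtrace.llmobs import LLMObs
#   LLMObs.enable(ml_app="my-rag-app", api_key="...", site="datadoghq.com")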
os.environ["DD_SITE"] = "datadoghq.com"

# Now use OpenAI normally — every call gets traced
from openai import OpenAI

OpenAI().chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain BPE tokenization"}],
)
```

### Tag traces with user / session

```python
from ddtrace.llmobs import LLMObs

with LLMObs.workflow(name="support_chat", session_id=session_id, user_id=user_id):
    # All LLM calls inside this block carry the session_id and user_id tags
    answer = run_my_rag_pipeline(question)
```

### Custom span for a non-instrumented call

```python
@LLMObs.llm(name="custom-call", model_name="gpt-4o", model_provider="openai")
def call_my_proxy(prompt):
    return my_internal_proxy.complete(prompt)
```

### Built-in views (LLM Observability tab)

- **Traces** — every call with prompt, completion, cost, latency
- **Topology** — agent graph showing which tools were called per request
- **Quality** — eval scores attached to spans (hallucination, toxicity)
- **Cost** — by user / model / session, top spenders
- **Drift** — input topic distribution shift over time
- **Errors** — rate, by model, by application

### OpenTelemetry alternative

If you don't want ddtrace, send OTLP traces to Datadog using the OpenInference semantic conventions — Datadog renders them in the same LLM Observability views.

---

### FAQ

**Q: How does pricing work?**
A: LLM Observability is billed per million spans — a few cents per million. Existing Datadog APM customers can reuse the same agent infrastructure. The first 100M spans/month are typically included in Pro plans.

**Q: Will prompts and completions be stored long-term?**
A: By default, yes, with configurable retention (15 / 30 / 90 days). For PII-sensitive prompts, enable scrubbing rules at the SDK level (`DD_LLMOBS_SAMPLE_RATE` + a custom redactor) so PII is masked before it leaves the host.

**Q: Datadog vs Phoenix vs Langfuse?**
A: Datadog wins if your stack already lives there — same dashboards, alerts, on-call workflows.
Phoenix wins for OTel-native portability and free self-hosting. Langfuse wins for prompt management plus cheap self-hosting.

---

## Source & Thanks

> Built by [Datadog](https://github.com/DataDog). Docs at [docs.datadoghq.com/llm_observability](https://docs.datadoghq.com/llm_observability).
>
> [DataDog/dd-trace-py](https://github.com/DataDog/dd-trace-py) — ⭐ 700+

---

Source: https://tokrepo.com/en/workflows/datadog-llm-observability-trace-cost-latency-drift
Author: Datadog
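As a back-of-envelope check on the per-span pricing described in the FAQ, here is a minimal helper. The default rate is a placeholder standing in for "a few cents per million spans"; confirm the actual rate against your Datadog contract.

```python
def llmobs_monthly_cost_usd(spans_per_month: int,
                            usd_per_million_spans: float = 0.05) -> float:
    """Estimate monthly LLM Observability span cost.

    usd_per_million_spans is a placeholder rate, not a quoted price;
    check your Datadog contract for the real number.
    """
    return spans_per_month / 1_000_000 * usd_per_million_spans

# Example: 250M spans/month at a hypothetical $0.05 per million
print(llmobs_monthly_cost_usd(250_000_000))  # 12.5
```

Even at heavy volume, span billing stays small next to model inference cost, which is why the Cost view (per-user, per-model breakdowns) is usually the part worth tuning.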