Quick Use
pip install ddtrace
Set DD_LLMOBS_ENABLED=1, DD_LLMOBS_ML_APP, DD_API_KEY, DD_SITE
patch(openai=True) — every call now traces to Datadog
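The same configuration can be done from the shell before starting the app; a minimal sketch (the API key value is a placeholder):

```shell
# Enable LLM Observability for the ddtrace SDK
export DD_LLMOBS_ENABLED=1
export DD_LLMOBS_ML_APP="my-rag-app"   # logical app name shown in the UI
export DD_API_KEY="your-api-key"       # placeholder
export DD_SITE="datadoghq.com"
```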
Intro
Datadog LLM Observability (formerly LLM Monitoring) is a turn-key tracing layer for AI apps that already live in Datadog. Drop the ddtrace SDK in, and every OpenAI / Anthropic / Bedrock / LangChain call generates a span with prompt, completion, cost, latency, model name, user, and session ID. Built-in dashboards cover top-cost users, p95 latency by model, error rate, and drift detection. Best for: teams with Datadog APM/logs already wired into their product; enterprise security reviews where prompt logging needs central retention. Works with: Python ddtrace, Node dd-trace, or an OpenTelemetry exporter for any language. Setup time: 10 minutes.
Python install
pip install ddtrace

Auto-instrument OpenAI
import os

# Configure LLM Observability before enabling instrumentation
os.environ["DD_LLMOBS_ENABLED"] = "1"
os.environ["DD_LLMOBS_ML_APP"] = "my-rag-app"
os.environ["DD_API_KEY"] = "..."
os.environ["DD_SITE"] = "datadoghq.com"

from ddtrace import patch
patch(openai=True)
# Now use OpenAI normally — every call gets traced
from openai import OpenAI
OpenAI().chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain BPE tokenization"}],
)

Tag traces with user / session
from ddtrace.llmobs import LLMObs
with LLMObs.workflow(name="support_chat", session_id=session_id, user_id=user_id):
    # All LLM calls inside this block carry the session_id and user_id tags
    answer = run_my_rag_pipeline(question)

Custom span for non-instrumented call
from ddtrace.llmobs.decorators import llm

@llm(name="custom-call", model_name="gpt-4o", model_provider="openai")
def call_my_proxy(prompt):
    return my_internal_proxy.complete(prompt)

Built-in views (LLM Observability tab)
- Traces — every call with prompt, completion, cost, latency
- Topology — agent graph showing tools called per request
- Quality — eval scores attached to spans (hallucination, toxicity)
- Cost — by user / model / session, top spenders
- Drift — input topic distribution shift over time
- Errors — rate, by model, by application
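As a mental model for the Cost view, the aggregation is essentially a group-by over exported spans. A minimal sketch in plain Python — the span records and field names here are hypothetical, not Datadog's export format:

```python
from collections import defaultdict

# Hypothetical span records, roughly what the Cost view aggregates over
spans = [
    {"user_id": "u1", "model": "gpt-4o", "cost_usd": 0.012},
    {"user_id": "u1", "model": "gpt-4o", "cost_usd": 0.020},
    {"user_id": "u2", "model": "gpt-4o-mini", "cost_usd": 0.001},
]

def cost_by(spans, key):
    """Total cost grouped by a span field (user_id, model, session_id...)."""
    totals = defaultdict(float)
    for span in spans:
        totals[span[key]] += span["cost_usd"]
    return dict(totals)

# Top spenders: sort the per-user totals descending
top_spenders = sorted(cost_by(spans, "user_id").items(),
                      key=lambda kv: kv[1], reverse=True)
print(top_spenders)
```

The same `cost_by` call with `"model"` or `"session_id"` gives the other two breakdowns the view offers.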
OpenTelemetry alternative
If you don't want ddtrace, send OTLP traces to Datadog with the OpenInference semantic conventions — Datadog renders them in the same LLM Observability views.
FAQ
Q: How does pricing work? A: LLM Observability is billed per million spans — a few cents per million. Existing Datadog APM customers can reuse the same agent infra. The first 100M spans/month are typically included in Pro plans.
Q: Will prompts and completions be stored long-term?
A: By default yes, with configurable retention (15 / 30 / 90 days). For PII-sensitive prompts, enable scrubbing rules at SDK level (DD_LLMOBS_SAMPLE_RATE + custom redactor) so PII is masked before it leaves the host.
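The "custom redactor" part can be as simple as a regex pass over the prompt before it is recorded. A minimal sketch — the patterns and function name are illustrative, not a ddtrace API:

```python
import re

# Illustrative patterns; extend for your own PII categories
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
_PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    """Mask emails and phone numbers before the prompt leaves the host."""
    text = _EMAIL.sub("[EMAIL]", text)
    text = _PHONE.sub("[PHONE]", text)
    return text

print(redact("Contact jane.doe@example.com or +1 415 555 0100"))
# → Contact [EMAIL] or [PHONE]
```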
Q: Datadog vs Phoenix vs Langfuse? A: Datadog wins if your stack already lives there — same dashboards, alerts, on-call workflows. Phoenix wins for OTel-native portability and free self-host. Langfuse wins for prompt management + cheap self-host.
Source & Thanks
Built by Datadog. Docs at docs.datadoghq.com/llm_observability.
DataDog/dd-trace-py — ⭐ 700+