Knowledge · May 8, 2026 · 4 min read

Langfuse Prompt Management — Versioned Prompts + A/B Tests

Langfuse Prompt Management lets you version, label, and A/B test prompts. Edit in the UI, fetch via the SDK, and swap models without code deploys.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, a per-adapter plan, and the raw content so agents can assess compatibility, risk, and next steps.

Needs Confirmation · 64/100 · Policy: confirm
Agent surface: Any MCP/CLI agent
Type: Knowledge
Installation: Single
Trust: New
Entry: Asset

Universal CLI command
npx tokrepo install 461fddc5-0788-4539-b701-831dc2906c91
Introduction

Langfuse Prompt Management stores prompts as versioned, labeled artifacts in Langfuse — edit in the UI, fetch from the SDK at runtime, swap models or templates without code deploys. Each prompt has a label (production, staging, experiment-v3) you point your code at. Best for: teams iterating on prompts faster than they can ship code, A/B testing prompts, granting non-engineers safe edit access. Works with: Python and JS SDKs, LangChain Hub-compatible, OpenAI and Anthropic message format. Setup time: 5 minutes.


Push a prompt programmatically

from langfuse import Langfuse
lf = Langfuse()  # picks up LANGFUSE_PUBLIC_KEY / SECRET_KEY / HOST

lf.create_prompt(
    name="support-triage",
    type="chat",  # required when prompt is a list of role/content messages
    prompt=[
        {"role": "system", "content": "You triage support tickets into urgent / billing / general."},
        {"role": "user",   "content": "Ticket: {{ticket_text}}"},
    ],
    config={"model": "gpt-4o-mini", "temperature": 0.2},  # an OpenAI model id, since it's executed with the OpenAI client below
    labels=["production"],
)

Fetch + execute (with caching)

from langfuse import Langfuse
from langfuse.openai import openai  # auto-tracing wrapper

lf = Langfuse()
prompt = lf.get_prompt("support-triage", label="production", type="chat", cache_ttl_seconds=60)

messages = prompt.compile(ticket_text="My card was charged twice for order #4521")
resp = openai.chat.completions.create(
    model=prompt.config["model"],
    messages=messages,
    temperature=prompt.config["temperature"],
    langfuse_prompt=prompt,  # links this generation to the exact prompt version in traces
)

A/B testing two prompt versions

Tag two versions with separate labels (production-a, production-b), split traffic in code by user_id hash, then compare success metrics in the Langfuse Scores tab.

import hashlib

def variant(user_id: str) -> str:
    # md5 is used only as a stable hash for a 50/50 split, not for security
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 2
    return "production-a" if bucket == 0 else "production-b"

prompt = lf.get_prompt("support-triage", label=variant(user_id))

Why labels, not version pins

Approach              Behavior
label="production"    Latest production-tagged version. Edits in the UI go live without a code deploy.
version=7             Hard pin. A code change is required to move forward. Use only for compliance freezes.
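Both fetch styles side by side; the version number here is illustrative:

live   = lf.get_prompt("support-triage", label="production")  # follows the label as it moves
frozen = lf.get_prompt("support-triage", version=7)           # pinned until the code changes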

Self-hosted Langfuse

git clone https://github.com/langfuse/langfuse
cd langfuse
docker compose up -d
# Open http://localhost:3000, create org + project, copy keys into .env
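The copied keys land in the three environment variables the SDK reads automatically. A typical .env for the self-hosted instance looks like this (key values are placeholders):

LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=http://localhost:3000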

FAQ

Q: How is this different from LangChain Hub? A: Langfuse Prompt Management ships with full traces — you see which prompt version produced which output, with cost and latency. LangChain Hub is registry-only with no observability. Langfuse also self-hosts; Hub is LangSmith-tied.

Q: Can non-engineers edit prompts safely? A: Yes — give them Langfuse project access with the Editor role. They edit in the UI, and each save creates a new version. The production label only moves when an admin promotes it, so junior edits can't ship without review.

Q: What's the cache_ttl_seconds default? A: 60 seconds. The SDK fetches and caches; subsequent calls within 60s use the cache. Tune lower for fast-iteration dev (1s) or higher for stable prod (300s+) to reduce control-plane load.
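A small sketch of that tuning, assuming an ENV variable distinguishes dev from prod (the variable name is an assumption):

import os

# 1s in dev so UI edits show up almost immediately; 300s in prod to cut control-plane calls
ttl = 1 if os.getenv("ENV") == "dev" else 300
prompt = lf.get_prompt("support-triage", label="production", cache_ttl_seconds=ttl)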


Quick Use

  1. pip install langfuse
  2. Set LANGFUSE_PUBLIC_KEY / SECRET_KEY / HOST in env
  3. Author prompt in UI, label production, fetch with lf.get_prompt(name, label='production')

Source & Thanks

Built by Langfuse. Licensed under MIT.

langfuse/langfuse — ⭐ 8,000+



