LangSmith — Prompt Debugging and LLM Observability
Debug, test, and monitor LLM applications in production. LangSmith provides trace visualization, prompt playground, dataset evaluation, and regression testing for AI.
Safe staging for this asset
This asset is staged first. The copied prompt asks you to inspect the staged files before activating scripts, MCP config, or global config.
npx -y tokrepo@latest install 4d9432ea-330f-44b6-a629-5b29627f746a --target codex
The command first places the files in staging; activation requires reviewing the README and the staged plan.
What it is
LangSmith is LangChain's observability and evaluation platform for LLM applications. It provides trace visualization for every LLM call, a prompt playground for rapid iteration, dataset-driven evaluation, and regression testing. You see exactly what prompts were sent, what the model returned, how long it took, and how many tokens it consumed.
This tool is for developers building LLM-powered applications who need visibility into model behavior. It works with LangChain, LangGraph, and standalone LLM calls.
How it saves time or tokens
Without observability, debugging LLM applications is guesswork. LangSmith shows the full trace of every chain, agent, or tool call, making it easy to spot where things go wrong. The prompt playground lets you test variations without redeploying. Dataset evaluation automates regression testing across model or prompt changes. The estimated token cost for the monitoring workflow is around 4,100 tokens.
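To make the idea of dataset-driven regression testing concrete, here is a minimal sketch in plain Python. It does not use LangSmith's evaluation API; the dataset, the two stand-in "versions", and the exact-match scoring are all hypothetical and chosen only for illustration.

```python
from typing import Callable

# Hypothetical test cases; a real dataset would be curated from actual usage.
DATASET = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "10 / 2", "expected": "5"},
    {"input": "3 * 3", "expected": "9"},
]


def pass_rate(target: Callable[[str], str], dataset: list[dict]) -> float:
    """Return the fraction of cases whose output matches the expected value exactly."""
    passed = sum(
        1 for case in dataset if target(case["input"]).strip() == case["expected"]
    )
    return passed / len(dataset)


def has_regressed(baseline: float, candidate: float, tolerance: float = 0.0) -> bool:
    """Flag a regression when the candidate scores below the baseline minus a tolerance."""
    return candidate < baseline - tolerance


# Stand-ins for two prompt or model versions (no real LLM call is made here).
def version_a(question: str) -> str:
    return {"2 + 2": "4", "10 / 2": "5", "3 * 3": "9"}.get(question, "")


def version_b(question: str) -> str:
    return {"2 + 2": "4", "10 / 2": "5", "3 * 3": "6"}.get(question, "")


baseline = pass_rate(version_a, DATASET)
candidate = pass_rate(version_b, DATASET)
print(f"baseline={baseline:.2f} candidate={candidate:.2f} "
      f"regressed={has_regressed(baseline, candidate)}")
```

Exact match is the simplest possible scorer; free-form LLM output usually needs a looser comparison, which is one reason the quality of the test cases matters so much.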
How to use
- Create a LangSmith account and get an API key.
- Set the environment variables in your application.
- Run your LangChain application; traces are captured automatically.
- View traces, run evaluations, and iterate in the dashboard.
# Set environment variables
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY='your-langsmith-api-key'
export LANGCHAIN_PROJECT='my-project'
# Your LangChain code is now automatically traced
python my_app.py
# Open LangSmith dashboard to view traces
# https://smith.langchain.com
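A small pre-flight check can catch a missing variable before you wonder why no traces appear. The sketch below uses only the standard library and the variable names from the setup commands above; the helper name is hypothetical, and treating all three variables as required is an assumption (the project name may be optional in practice).

```python
import os

# Variable names taken from the setup commands above.
REQUIRED_VARS = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")


def missing_tracing_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_tracing_vars()
    if missing:
        print("Tracing may be disabled; missing:", ", ".join(missing))
    else:
        print("All tracing variables are set.")
```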
Example
Adding custom tracing to non-LangChain code:
from langsmith import traceable
import openai

client = openai.OpenAI()

@traceable(name='generate_summary')
def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model='gpt-4',
        messages=[{'role': 'user', 'content': f'Summarize: {text}'}]
    )
    return response.choices[0].message.content

# Every call is now traced in LangSmith
result = summarize('Long article text here...')
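The same decorator can be applied to several functions that call each other, in which case the inner calls should appear as child steps of the outer one in the trace. The following is a sketch under assumptions: the function names and the placeholder retrieval step are hypothetical, no model is called, and the `ImportError` fallback exists only so the snippet runs without the SDK installed.

```python
try:
    from langsmith import traceable
except ImportError:
    # Fallback for environments without the SDK: a no-op decorator that
    # accepts both @traceable and @traceable(...) call styles.
    def traceable(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@traceable(run_type="tool", name="fetch_context")
def fetch_context(question: str) -> list[str]:
    # Hypothetical lookup; a real app might query a vector store or search API.
    return [f"Background note about {question}"]


@traceable(name="build_prompt")
def build_prompt(question: str, context: list[str]) -> str:
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}"


@traceable(run_type="chain", name="answer_pipeline")
def answer_pipeline(question: str) -> str:
    # Each decorated call made here is recorded as a step under this function.
    context = fetch_context(question)
    return build_prompt(question, context)


print(answer_pipeline("trace retention"))
```

Decorating the intermediate steps, not just the outermost function, is what makes it possible to see which step produced a bad input for the next one.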
Related on TokRepo
- AI gateway providers — Alternative LLM observability
- Multi-agent frameworks — LangGraph deep-dive
Common pitfalls
- LangSmith sends trace data to LangChain's servers. Ensure your security policy allows this for the data being processed.
- The free tier has trace retention limits. High-volume applications may need a paid plan for full trace history.
- Automatic tracing only works with LangChain and LangGraph. For other code, use the @traceable decorator or the manual tracing API.
- Evaluation datasets need curation. Poor-quality test cases lead to misleading evaluation results.
- LangSmith is a separate service from LangChain the library. You need an account even if you already use LangChain.
- Review the official documentation before deploying to production to ensure compatibility with your specific environment and requirements.
- Start with default settings and customize incrementally. Changing too many configuration options at once makes debugging harder.
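For the first pitfall, one possible mitigation is to mask obviously sensitive values before text reaches a traced function, since a tracing decorator records the inputs it is given. The sketch below is plain Python and independent of LangSmith's API; the two patterns (an email address and an `sk-` style key) are assumptions for illustration and will not catch every kind of sensitive data.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumed key shape for illustration: "sk-" followed by 20+ URL-safe characters.
API_KEY_RE = re.compile(r"\bsk-[A-Za-z0-9_-]{20,}")


def redact(text: str) -> str:
    """Replace email addresses and key-like strings with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return API_KEY_RE.sub("[API_KEY]", text)


sample = "Contact jane.doe@example.com with key sk-abcdefghijklmnopqrstuvwx"
print(redact(sample))  # Contact [EMAIL] with key [API_KEY]
```

Pattern-based masking is a coarse filter; whether it is sufficient depends on your security policy and the data involved.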
Frequently asked questions
Can I use LangSmith without LangChain?
Yes. LangSmith provides a Python SDK with @traceable decorators that work with any LLM provider. You can trace OpenAI, Anthropic, or custom model calls without using LangChain as a framework.
What does a trace show?
A trace shows the complete execution path: input prompts, model outputs, token usage, latency per step, tool calls, intermediate results, and errors. For chains and agents, you see each step in a timeline view.
Is LangSmith free?
LangSmith offers a free tier suitable for development and small projects. Paid plans scale with trace volume and add features like longer retention, team collaboration, and higher rate limits.
Can LangSmith catch regressions when prompts or models change?
Yes. LangSmith supports dataset-driven evaluation where you define test cases with expected outputs. Run evaluations on prompt changes, model switches, or code updates to catch regressions before deployment.
How does LangSmith compare to Langfuse?
Both provide LLM observability. LangSmith is built by the LangChain team with tight LangChain integration. Langfuse is open-source and self-hostable. Choose based on your integration needs and self-hosting requirements.
Source and acknowledgments
Created by LangChain.
smith.langchain.com — LLM observability platform
Related assets
Prompt Flow — Build, Test & Deploy LLM Pipelines
Prompt Flow by Microsoft provides a visual editor and CLI for building LLM application workflows with built-in evaluation, tracing, and CI/CD integration for production deployment.
LLM Wiki Memory Upgrade Prompt
One-click prompt to upgrade your AI agent memory system to Karpathy LLM Wiki pattern. Send to Claude Code / Cursor / Windsurf — auto audits, compiles fragments, resolves contradictions, builds structured wiki.
Fabric — 100+ AI Prompt Patterns for Everything
Fabric organizes 100+ AI prompt patterns for real-world tasks. 40.3K+ GitHub stars. 20+ providers, CLI + REST API, custom patterns. MIT.
LLM Gateway Comparison — LiteLLM vs OpenRouter vs CF
In-depth comparison of LLM API gateways: LiteLLM (self-hosted proxy), OpenRouter (unified API), and Cloudflare AI Gateway (edge cache). Architecture, pricing, and when to use each.