# LangSmith — Prompt Debugging and LLM Observability

> Debug, test, and monitor LLM applications in production. LangSmith provides trace visualization, a prompt playground, dataset evaluation, and regression testing for AI.

## Quick Use

```bash
pip install langsmith
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY="lsv2_..."
```

```python
from langsmith import traceable
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

@traceable
def my_llm_call(question: str) -> str:
    # Your LLM call here — automatically traced
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

result = my_llm_call("What is RAG?")
# View the trace at smith.langchain.com
```

## What is LangSmith?

LangSmith is an observability and evaluation platform for LLM applications. It captures detailed traces of every LLM call, chain, and agent step — showing latency, token usage, inputs, and outputs. Beyond monitoring, it provides a prompt playground for iteration, dataset management for systematic evaluation, and regression testing to catch prompt regressions before deployment.

**Answer-Ready**: LangSmith is an LLM observability platform by LangChain. It provides trace visualization, a prompt playground, dataset evaluation, and regression testing. It works with any LLM framework (not just LangChain) and has a free tier. Used by thousands of AI teams in production.

**Best for**: AI teams debugging and monitoring LLM applications.

**Works with**: LangChain, OpenAI, Anthropic Claude, any Python/JS LLM app.

**Setup time**: Under 3 minutes.

## Core Features

### 1. Trace Visualization

Every LLM call is captured with:

- Input/output at each step
- Latency breakdown
- Token usage and cost
- Error details and stack traces
- Nested chain/agent visualization

### 2. Prompt Playground

```
1. Select a traced LLM call
2. Modify the prompt in the playground
3. Re-run with different models
4. Compare outputs side-by-side
5. Save the winning prompt version
```

### 3. Dataset & Evaluation

```python
from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()

# Create an evaluation dataset
dataset = client.create_dataset("qa-pairs")
client.create_examples(
    inputs=[{"question": "What is RAG?"}],
    outputs=[{"answer": "Retrieval-Augmented Generation"}],
    dataset_id=dataset.id,
)

# Run an evaluation against the dataset
results = evaluate(
    my_llm_call,
    data="qa-pairs",
    evaluators=["correctness", "helpfulness"],
)
```

### 4. Online Evaluation (Production)

```python
from langsmith import Client

client = Client()

# Attach feedback to a production trace
client.create_feedback(
    run_id="...",
    key="user_rating",
    score=1.0,
    comment="Helpful response",
)
```

### 5. Regression Testing

```python
# prompt_v1 and prompt_v2 are target functions built from each prompt version.
# Compare them on the same dataset:
results_v1 = evaluate(prompt_v1, data="test-set")
results_v2 = evaluate(prompt_v2, data="test-set")
# View the side-by-side comparison in the UI
```

## LangSmith vs Alternatives

| Feature | LangSmith | LangFuse | Helicone |
|---------|-----------|----------|----------|
| Tracing | Deep nested | Deep nested | Request-level |
| Prompt Playground | Yes | Yes | No |
| Evaluation | Built-in | Basic | No |
| Regression Testing | Yes | No | No |
| Self-hosted | Enterprise | Yes (OSS) | Yes (OSS) |
| Free tier | 5K traces/mo | Unlimited (OSS) | 100K req/mo |

## Pricing

| Tier | Traces/mo | Price |
|------|-----------|-------|
| Developer | 5,000 | Free |
| Plus | 50,000 | $39/mo |
| Enterprise | Unlimited | Custom |

## FAQ

**Q: Do I need to use LangChain?**
A: No. LangSmith works with any LLM framework. The `@traceable` decorator works with plain Python functions.

**Q: How much overhead does tracing add?**
A: Traces are sent asynchronously. Typical overhead is under 5ms per trace.

**Q: Can I self-host?**
A: The Enterprise plan includes self-hosted deployment. For open-source alternatives, see LangFuse.

## Source & Thanks

> Created by [LangChain](https://github.com/langchain-ai).
> [smith.langchain.com](https://smith.langchain.com) — LLM observability platform

---

Source: https://tokrepo.com/en/workflows/4d9432ea-330f-44b6-a629-5b29627f746a
Author: Prompt Lab
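The Dataset & Evaluation section above passes evaluator names as strings; LangSmith evaluators can also be plain callables that score a prediction against a reference. Below is a minimal, framework-free sketch of such an evaluator, assuming it receives the model's outputs and the example's reference outputs as dicts. The names `exact_match` and the `answer` key are illustrative, not part of the LangSmith API, and the snippet runs without the SDK or an API key.

```python
# Sketch of a custom evaluator callable: given a prediction and a reference,
# return a named score. Plain dicts stand in for LangSmith run/example data.

def exact_match(outputs: dict, reference_outputs: dict) -> dict:
    """Score 1.0 when the model's answer matches the reference exactly
    (case- and whitespace-insensitive), else 0.0."""
    predicted = outputs.get("answer", "").strip().lower()
    expected = reference_outputs.get("answer", "").strip().lower()
    return {"key": "exact_match", "score": 1.0 if predicted == expected else 0.0}

# Applied to the qa-pairs example from the evaluation section:
result = exact_match(
    {"answer": "Retrieval-Augmented Generation"},
    {"answer": "retrieval-augmented generation"},
)
print(result)  # {'key': 'exact_match', 'score': 1.0}
```

In an actual evaluation run you would pass such a callable in the `evaluators` list instead of a string name; the dict it returns becomes a feedback score attached to each traced run.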
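The Trace Visualization section above says every call is captured with its inputs, outputs, and latency. As a purely conceptual sketch of what a `@traceable`-style decorator records (this is not LangSmith's implementation; the `trace` decorator and `last_trace` attribute are invented for illustration):

```python
import functools
import time

def trace(fn):
    """Conceptual stand-in for @traceable: record name, inputs, output,
    and latency for each call. A real tracer would ship this record
    asynchronously to the observability backend instead of storing it."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        output = fn(*args, **kwargs)
        latency_ms = (time.perf_counter() - start) * 1000
        wrapper.last_trace = {
            "name": fn.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": output,
            "latency_ms": latency_ms,
        }
        return output
    wrapper.last_trace = None
    return wrapper

@trace
def answer(question: str) -> str:
    # Stand-in for a model call, so the sketch runs offline
    return f"echo: {question}"

answer("What is RAG?")
print(answer.last_trace["name"])  # answer
```

Because the record is assembled around the wrapped call and sent off the hot path, the function's own behavior and return value are unchanged, which is why tracing overhead stays in the low-millisecond range.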