# LangSmith: Prompt Debugging and LLM Observability

> Debug, test, and monitor LLM applications in production. LangSmith provides trace visualization, a prompt playground, dataset evaluation, and regression testing for AI applications.

## Quick Use

```bash
pip install langsmith openai
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY="lsv2_..."
```

```python
from langsmith import traceable
from openai import OpenAI

# OpenAI client used by the traced function; reads OPENAI_API_KEY from the environment
client = OpenAI()

@traceable
def my_llm_call(question: str) -> str:
    # Your LLM call here; it is traced automatically
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}]
    )
    return response.choices[0].message.content

result = my_llm_call("What is RAG?")
# View trace at smith.langchain.com
```

## What is LangSmith?

LangSmith is an observability and evaluation platform for LLM applications. It captures detailed traces of every LLM call, chain, and agent step, showing latency, token usage, inputs, and outputs. Beyond monitoring, it provides a prompt playground for iteration, dataset management for systematic evaluation, and regression testing to catch prompt regressions before deployment.

**Answer-Ready**: LangSmith is an LLM observability platform by LangChain. It provides trace visualization, a prompt playground, dataset evaluation, and regression testing. It works with any LLM framework (not just LangChain). A free tier is available. Used by thousands of AI teams in production.

**Best for**: AI teams debugging and monitoring LLM applications.

**Works with**: LangChain, OpenAI, Anthropic Claude, any Python/JS LLM app.

**Setup time**: Under 3 minutes.

## Core Features

### 1. Trace Visualization

Every LLM call is captured with:

- Input/output at each step
- Latency breakdown
- Token usage and cost
- Error details and stack traces
- Nested chain/agent visualization

### 2. Prompt Playground

```
1. Select a traced LLM call
2. Modify the prompt in the playground
3. Re-run with different models
4. Compare outputs side-by-side
5. Save winning prompt version
```

### 3. Dataset & Evaluation

```python
from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()

# Create evaluation dataset
dataset = client.create_dataset("qa-pairs")
client.create_examples(
    inputs=[{"question": "What is RAG?"}],
    outputs=[{"answer": "Retrieval-Augmented Generation"}],
    dataset_id=dataset.id,
)

# The target receives each example's inputs dict; my_llm_call is the
# traced function from Quick Use
def target(inputs: dict) -> dict:
    return {"answer": my_llm_call(inputs["question"])}

# Evaluators are callables; this one checks that the reference answer
# appears in the model output
def correctness(run, example) -> dict:
    expected = example.outputs["answer"].lower()
    return {"key": "correctness", "score": int(expected in run.outputs["answer"].lower())}

# Run evaluation
results = evaluate(
    target,
    data="qa-pairs",
    evaluators=[correctness],
)
```

### 4. Online Evaluation (Production)

```python
from langsmith import Client

client = Client()

# Add feedback to production traces
client.create_feedback(
    run_id="...",
    key="user_rating",
    score=1.0,
    comment="Helpful response",
)
```

### 5. Regression Testing

```python
from langsmith.evaluation import evaluate

# prompt_v1 and prompt_v2 are placeholder target functions, one per prompt version
# Compare prompt versions on same dataset
results_v1 = evaluate(prompt_v1, data="test-set")
results_v2 = evaluate(prompt_v2, data="test-set")
# Side-by-side comparison in UI
```

## LangSmith vs Alternatives

| Feature | LangSmith | Langfuse | Helicone |
|---------|-----------|----------|----------|
| Tracing | Deep nested | Deep nested | Request-level |
| Prompt Playground | Yes | Yes | No |
| Evaluation | Built-in | Basic | No |
| Regression Testing | Yes | No | No |
| Self-hosted | Enterprise | Yes (OSS) | Yes (OSS) |
| Free tier | 5K traces/mo | Unlimited (OSS) | 100K req/mo |

## Pricing

| Tier | Traces/mo | Price |
|------|-----------|-------|
| Developer | 5,000 | Free |
| Plus | 50,000 | $39/mo |
| Enterprise | Unlimited | Custom |

## FAQ

**Q: Do I need to use LangChain?**
A: No. LangSmith works with any LLM framework. The `@traceable` decorator works with plain Python functions.

**Q: How much overhead does tracing add?**
A: Traces are sent asynchronously. Typical overhead is <5ms per trace.

**Q: Can I self-host?**
A: The Enterprise plan includes self-hosted deployment. For open-source alternatives, see Langfuse.

## Source & Thanks

> Created by [LangChain](https://github.com/langchain-ai).
>
> [smith.langchain.com](https://smith.langchain.com): LLM observability platform

## Quick Start

```bash
pip install langsmith
export LANGCHAIN_TRACING_V2=true
```

With tracing enabled and `LANGCHAIN_API_KEY` set, LangChain calls are traced automatically; other Python code is traced once it is wrapped with `@traceable`.

## What is LangSmith?

LangSmith is LangChain's LLM observability platform, providing trace visualization, prompt debugging, dataset evaluation, and regression testing.

**In one sentence**: An LLM observability platform that combines trace visualization, a prompt playground, dataset evaluation, and regression testing. It is not limited to LangChain, and a free tier is available.

**For**: AI teams debugging and monitoring LLM applications.

## Core Features

### 1. Trace Visualization

Inputs, outputs, latency, and token usage for every LLM call.

### 2. Prompt Playground

Edit prompts, compare across models, save the best versions.

### 3. Dataset Evaluation

Systematically evaluate prompt effectiveness with custom evaluators.

### 4. Regression Testing

Compare prompt versions to catch regressions before deploying.

## FAQ

**Q: Do I have to use LangChain?**
A: No. The `@traceable` decorator works on any Python function.

**Q: Open-source alternative?**
A: Langfuse is the open-source alternative.

## Source & Thanks

> [smith.langchain.com](https://smith.langchain.com), by LangChain

---

Source: https://tokrepo.com/en/workflows/langsmith-prompt-debugging-llm-observability-4d9432ea

Author: Prompt Lab
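
### Appendix: sketch of a custom evaluator

The "custom evaluators" mentioned under Dataset Evaluation can be sketched as a plain Python function. The version below assumes an evaluator that takes `(run, example)` objects exposing `outputs` dicts and returns a dict with `key` and `score`. The name `keyword_overlap` and the `SimpleNamespace` stand-ins are hypothetical, chosen so the snippet runs without a LangSmith account; it illustrates the idea and is not the platform's built-in scoring.

```python
from types import SimpleNamespace


def keyword_overlap(run, example) -> dict:
    """Score the share of reference-answer words that appear in the model answer."""
    predicted = set(run.outputs["answer"].lower().split())
    reference = set(example.outputs["answer"].lower().split())
    if not reference:
        # No reference words to match against, so report the lowest score
        return {"key": "keyword_overlap", "score": 0.0}
    return {
        "key": "keyword_overlap",
        "score": len(predicted & reference) / len(reference),
    }


# Stand-ins for the run and example objects (assumption: real ones come from LangSmith)
run = SimpleNamespace(outputs={"answer": "RAG means retrieval-augmented generation"})
example = SimpleNamespace(outputs={"answer": "Retrieval-Augmented Generation"})

result = keyword_overlap(run, example)
print(result)
```

A function of this shape could then be listed in the `evaluators` argument of `evaluate`, provided the target function returns a dict with an `answer` key.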