# Pydantic: Data Validation for AI Agent Pipelines

> Python's most popular data validation library, essential for AI agent tool definitions. Pydantic enforces type safety in LLM structured outputs, API schemas, and config files.

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

## Quick Use

```bash
pip install pydantic
```

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class UserProfile(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(ge=0, le=150, description="Age in years")
    email: str = Field(pattern=r'^[\w.-]+@[\w.-]+\.\w+$')
    bio: Optional[str] = None


# Validation runs automatically on construction
user = UserProfile(name="Alice", age=30, email="alice@example.com")
print(user.model_dump_json())

# Invalid input raises ValidationError
try:
    bad = UserProfile(name="Bob", age=-5, email="not-email")
except ValidationError as e:
    print(e)  # reports both failures: age below 0, email not matching the pattern
```

## What is Pydantic?

Pydantic is Python's most popular data validation library with 200M+ monthly downloads. In the AI ecosystem it is foundational: it is used to define LLM tool schemas, validate structured outputs, configure agents, and build API contracts. If you are building AI agents in Python, you are almost certainly using Pydantic, directly or through a framework that depends on it.

**Answer-Ready**: Pydantic is Python's #1 data validation library (200M+ downloads/month). Essential for AI: defines LLM tool schemas, validates structured outputs, configures agents. Used by FastAPI, LangChain, Instructor, DSPy, and many other AI frameworks. V2 is 5-50x faster than V1. 22k+ GitHub stars.

**Best for**: Python developers building AI agents, APIs, or data pipelines.
**Works with**: Most Python AI frameworks.
**Setup time**: Under 1 minute.

## Why Pydantic Matters for AI

### 1. LLM Tool Definitions

```python
from pydantic import BaseModel, Field


class SearchTool(BaseModel):
    query: str = Field(description="Search query")
    max_results: int = Field(default=5, ge=1, le=20)
    language: str = Field(default="en")


# Auto-generates JSON Schema for LLM tool calling
print(SearchTool.model_json_schema())
```

### 2. Structured Output Validation

```python
from pydantic import BaseModel, Field


class ExtractedEntity(BaseModel):
    name: str
    entity_type: str = Field(description="person, org, or location")
    confidence: float = Field(ge=0, le=1)


# Validate LLM output (already parsed into a dict)
raw = {"name": "Anthropic", "entity_type": "org", "confidence": 0.95}
entity = ExtractedEntity.model_validate(raw)
```

### 3. Agent Configuration

```python
from pathlib import Path

from pydantic import BaseModel, Field


class AgentConfig(BaseModel):
    model: str = "claude-sonnet-4-20250514"
    temperature: float = Field(default=0.7, ge=0, le=2)
    max_tokens: int = Field(default=4096, ge=1)
    tools: list[str] = []
    system_prompt: str = ""


# Parse and validate the JSON config file in one step
config = AgentConfig.model_validate_json(Path("config.json").read_text())
```

### 4. API Contracts (FastAPI)

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ChatRequest(BaseModel):
    message: str
    model: str = "claude-sonnet-4-20250514"


class ChatResponse(BaseModel):
    reply: str
    tokens_used: int


@app.post("/chat", response_model=ChatResponse)
async def chat(req: ChatRequest):
    ...
```

## Pydantic V2 Performance

| Operation | V1 | V2 | Speedup |
|-----------|----|----|---------|
| Model creation | 1x | 5-10x | 5-10x |
| JSON parsing | 1x | 10-50x | 10-50x |
| Serialization | 1x | 5-20x | 5-20x |

V2 uses a Rust core (`pydantic-core`) for dramatic speed improvements.

## AI Frameworks Using Pydantic

| Framework | How It Uses Pydantic |
|-----------|---------------------|
| LangChain | Tool definitions, output parsers |
| Instructor | Structured output validation |
| DSPy | Signature definitions |
| FastAPI | Request/response models |
| Pydantic AI | Agent framework built on Pydantic |
| Guardrails AI | Validator definitions |

## FAQ

**Q: V1 or V2?**
A: Always V2.
It is 5-50x faster and the ecosystem has migrated. V1 is in maintenance mode.

**Q: How does it relate to JSON Schema?**
A: Pydantic models auto-generate JSON Schema via `model_json_schema()`. This is how LLMs understand your tool parameters.

**Q: Can I use it for runtime config?**
A: Yes, `pydantic-settings` loads from env vars, .env files, and config files with full validation.

## Source & Thanks

> Created by [Samuel Colvin](https://github.com/pydantic). Licensed under MIT.
>
> [pydantic/pydantic](https://github.com/pydantic/pydantic) (22k+ stars)

---

Source: https://tokrepo.com/en/workflows/pydantic-data-validation-ai-agent-pipelines-1960042c
Author: Pydantic
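As a supplement to the structured output validation use case described earlier, the following is a minimal sketch of turning a failed validation into short, readable messages (for example, to feed back to the model in a retry prompt). The helper name `parse_entity` and the message format are assumptions made for illustration and are not part of Pydantic; the sketch relies only on `model_validate_json()` and `ValidationError.errors()`.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class ExtractedEntity(BaseModel):
    name: str
    entity_type: str = Field(description="person, org, or location")
    confidence: float = Field(ge=0, le=1)


def parse_entity(raw_json: str) -> tuple[Optional[ExtractedEntity], list[str]]:
    """Hypothetical helper: return (entity, []) on success, (None, messages) on failure."""
    try:
        return ExtractedEntity.model_validate_json(raw_json), []
    except ValidationError as exc:
        # Each error dict carries the field path ("loc") and a human-readable "msg".
        problems = [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]
        return None, problems


# A well-formed payload validates; an out-of-range confidence is rejected.
good, good_errors = parse_entity(
    '{"name": "Anthropic", "entity_type": "org", "confidence": 0.95}'
)
bad, bad_errors = parse_entity(
    '{"name": "Anthropic", "entity_type": "org", "confidence": 1.7}'
)
print(good)
print(bad_errors)
```

Returning the messages instead of re-raising keeps the caller in control of the retry policy; whether to retry, fall back, or fail is left to the surrounding agent code.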