Scripts · Apr 8, 2026 · 2 min read

Pydantic — Data Validation for AI Agent Pipelines

Pydantic is Python's most popular data validation library and a common foundation for AI agent tool definitions. It enforces type safety in LLM structured outputs, API schemas, and config files.

Script Depot · Community
Quick Use

Use it first, then decide how deep to go

pip install pydantic

from typing import Optional

from pydantic import BaseModel, Field, ValidationError

class UserProfile(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(ge=0, le=150, description="Age in years")
    # Simple pattern check; it does not fully validate email addresses
    email: str = Field(pattern=r'^[\w.-]+@[\w.-]+\.\w+$')
    bio: Optional[str] = None

# Validation runs automatically when the model is constructed
user = UserProfile(name="Alice", age=30, email="alice@example.com")
print(user.model_dump_json())

# Invalid input raises ValidationError, listing every failing field
try:
    bad = UserProfile(name="Bob", age=-5, email="not-email")
except ValidationError as e:
    print(e)  # reports two errors: age is below 0, email does not match the pattern

What is Pydantic?

Pydantic is Python's most popular data validation library with 200M+ monthly downloads. In the AI ecosystem it is foundational: it is used to define LLM tool schemas, validate structured outputs, configure agents, and build API contracts. If you are building AI agents in Python, you are almost certainly using Pydantic, either directly or through a dependency.

Answer-Ready: Pydantic is Python's #1 data validation library (200M+ downloads/month). Essential for AI: it defines LLM tool schemas, validates structured outputs, and configures agents. Used by FastAPI, LangChain, Instructor, DSPy, and many other AI frameworks. V2 is 5-50x faster than V1. 22k+ GitHub stars.

Best for: Python developers building AI agents, APIs, or data pipelines.
Works with: Most Python AI frameworks.
Setup time: Under 1 minute.

Why Pydantic Matters for AI

1. LLM Tool Definitions

from pydantic import BaseModel, Field

class SearchTool(BaseModel):
    query: str = Field(description="Search query")
    max_results: int = Field(default=5, ge=1, le=20)
    language: str = Field(default="en")

# Auto-generates JSON Schema for LLM tool calling
print(SearchTool.model_json_schema())
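
To make the schema step concrete, here is a minimal sketch that wraps the generated JSON Schema in a tool definition. The `to_tool_spec` helper and the `name` / `description` / `parameters` layout are assumptions for illustration, not a specific provider's format; check the API you are targeting for the exact envelope it expects.

```python
import json

from pydantic import BaseModel, Field


class SearchTool(BaseModel):
    """Search the web and return the top results."""

    query: str = Field(description="Search query")
    max_results: int = Field(default=5, ge=1, le=20)
    language: str = Field(default="en")


def to_tool_spec(model: type[BaseModel], name: str) -> dict:
    # Hypothetical helper: the dict layout below is an assumed envelope,
    # not any particular provider's schema.
    schema = model.model_json_schema()
    return {
        "name": name,
        # Pydantic copies the class docstring into the schema's "description"
        "description": schema.get("description", ""),
        "parameters": schema,
    }


spec = to_tool_spec(SearchTool, "search")
print(json.dumps(spec, indent=2))
```

Fields without a default (here only `query`) appear in the schema's `required` list, and the `ge` / `le` constraints become `minimum` / `maximum`.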

2. Structured Output Validation

class ExtractedEntity(BaseModel):
    name: str
    entity_type: str = Field(description="person, org, or location")
    confidence: float = Field(ge=0, le=1)

# Validate LLM output
raw = {"name": "Anthropic", "entity_type": "org", "confidence": 0.95}
entity = ExtractedEntity.model_validate(raw)
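
LLM output usually arrives as a JSON string and is not always valid. The sketch below shows one way to parse it and collect readable error messages, for example to send back to the model in a retry prompt. The `parse_entity` helper is hypothetical; `model_validate_json`, `ValidationError`, and `errors()` are Pydantic V2 APIs.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class ExtractedEntity(BaseModel):
    name: str
    entity_type: str = Field(description="person, org, or location")
    confidence: float = Field(ge=0, le=1)


def parse_entity(llm_text: str) -> tuple[Optional[ExtractedEntity], list[str]]:
    """Return (entity, []) on success or (None, error messages) on failure."""
    try:
        return ExtractedEntity.model_validate_json(llm_text), []
    except ValidationError as exc:
        # Each error dict carries "loc" (field path) and "msg" (readable text)
        problems = [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]
        return None, problems


good, _ = parse_entity('{"name": "Anthropic", "entity_type": "org", "confidence": 0.95}')
bad, problems = parse_entity('{"name": "Anthropic", "entity_type": "org", "confidence": 1.7}')
print(good)
print(problems)
```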

3. Agent Configuration

class AgentConfig(BaseModel):
    model: str = "claude-sonnet-4-20250514"
    temperature: float = Field(default=0.7, ge=0, le=2)
    max_tokens: int = Field(default=4096, ge=1)
    tools: list[str] = []
    system_prompt: str = ""

# Read the file inside a context manager so the handle is closed promptly
with open("config.json") as f:
    config = AgentConfig.model_validate_json(f.read())
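
A short sketch of how this behaves in practice: fields missing from the file fall back to their defaults, and out-of-range values are rejected at load time. The `load_config` helper and the temporary file are assumptions made so the example can run on its own.

```python
import json
import tempfile
from pathlib import Path

from pydantic import BaseModel, Field, ValidationError


class AgentConfig(BaseModel):
    model: str = "claude-sonnet-4-20250514"
    temperature: float = Field(default=0.7, ge=0, le=2)
    max_tokens: int = Field(default=4096, ge=1)
    tools: list[str] = []
    system_prompt: str = ""


def load_config(path: Path) -> AgentConfig:
    # Hypothetical helper: parse and validate the file in one step
    return AgentConfig.model_validate_json(path.read_text())


rejected = False
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"

    # A partial config: every field not listed keeps its default
    cfg_path.write_text(json.dumps({"temperature": 0.2, "tools": ["search"]}))
    config = load_config(cfg_path)
    print(config.model_dump())

    # temperature=5 violates le=2, so loading fails instead of running misconfigured
    cfg_path.write_text(json.dumps({"temperature": 5}))
    try:
        load_config(cfg_path)
    except ValidationError as exc:
        rejected = True
        print(exc.error_count(), "validation error(s)")
```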

4. API Contracts (FastAPI)

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str
    model: str = "claude-sonnet-4-20250514"

class ChatResponse(BaseModel):
    reply: str
    tokens_used: int

@app.post("/chat", response_model=ChatResponse)
async def chat(req: ChatRequest):
    ...

Pydantic V2 Performance

Operation      | V1 | V2     | Speedup
---------------|----|--------|--------
Model creation | 1x | 5-10x  | 5-10x
JSON parsing   | 1x | 10-50x | 10-50x
Serialization  | 1x | 5-20x  | 5-20x

V2 uses a Rust core (pydantic-core) for dramatic speed improvements.
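
If you want to measure parsing cost on your own models, a rough timing sketch might look like the following. It does not reproduce the V1 versus V2 figures above; it only compares two V2 code paths, parsing JSON in Python and then validating versus letting `model_validate_json` do both in the Rust core. The `Event` model, the payload, and the iteration count are made up for illustration, and results will vary by machine.

```python
import json
import timeit

from pydantic import BaseModel


class Event(BaseModel):
    id: int
    name: str
    tags: list[str]


payload = json.dumps({"id": 1, "name": "demo", "tags": ["a", "b", "c"]})


def two_step() -> Event:
    # Parse with the stdlib json module, then validate the resulting dict
    return Event.model_validate(json.loads(payload))


def one_step() -> Event:
    # Parse and validate in a single call handled by pydantic-core
    return Event.model_validate_json(payload)


n = 20_000
t_two = timeit.timeit(two_step, number=n)
t_one = timeit.timeit(one_step, number=n)
print(f"json.loads + model_validate: {t_two:.3f}s for {n} runs")
print(f"model_validate_json:         {t_one:.3f}s for {n} runs")
```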

AI Frameworks Using Pydantic

Framework     | How It Uses Pydantic
--------------|----------------------------------
LangChain     | Tool definitions, output parsers
Instructor    | Structured output validation
DSPy          | Signature definitions
FastAPI       | Request/response models
Pydantic AI   | Agent framework built on Pydantic
Guardrails AI | Validator definitions

FAQ

Q: V1 or V2? A: Use V2 for new projects. It is 5-50x faster, and most of the ecosystem has migrated. V1 is in maintenance mode.

Q: How does it relate to JSON Schema? A: Pydantic models generate JSON Schema via model_json_schema(). Tool-calling APIs use that schema to describe a tool's parameters to the LLM.

Q: Can I use it for runtime config? A: Yes. The separate pydantic-settings package loads values from env vars, .env files, and config files, with full validation.

Source & Thanks

Created by Samuel Colvin. Licensed under MIT.

pydantic/pydantic — 22k+ stars
