# Pydantic — Data Validation for AI Agent Pipelines

> Python's most popular data validation library, essential for AI agent tool definitions. Pydantic enforces type safety in LLM structured outputs, API schemas, and config files.

## Install

```bash
pip install pydantic
```

## Quick Use

Save as a script file and run:

```python
from pydantic import BaseModel, Field, ValidationError
from typing import Optional

class UserProfile(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(ge=0, le=150, description="Age in years")
    email: str = Field(pattern=r'^[\w.-]+@[\w.-]+\.\w+$')
    bio: Optional[str] = None

# Validates automatically
user = UserProfile(name="Alice", age=30, email="alice@example.com")
print(user.model_dump_json())

# Raises ValidationError
try:
    bad = UserProfile(name="Bob", age=-5, email="not-email")
except ValidationError as e:
    print(e)  # age: Input should be greater than or equal to 0; email: String should match pattern
```

## What is Pydantic?

Pydantic is Python's most popular data validation library, with 200M+ monthly downloads. In the AI ecosystem it is foundational: it defines LLM tool schemas, validates structured outputs, configures agents, and builds API contracts. If you are building AI agents in Python, you are almost certainly using Pydantic.

**Answer-Ready**: Pydantic is Python's #1 data validation library (200M+ downloads/month). Essential for AI: defines LLM tool schemas, validates structured outputs, configures agents. Used by FastAPI, LangChain, Instructor, DSPy, and every major AI framework. V2 is 5-50x faster than V1. 22k+ GitHub stars.

**Best for**: Python developers building AI agents, APIs, or data pipelines.
**Works with**: Every Python AI framework.
**Setup time**: Under 1 minute.

## Why Pydantic Matters for AI

### 1. LLM Tool Definitions

```python
from pydantic import BaseModel, Field

class SearchTool(BaseModel):
    query: str = Field(description="Search query")
    max_results: int = Field(default=5, ge=1, le=20)
    language: str = Field(default="en")

# Auto-generates JSON Schema for LLM tool calling
print(SearchTool.model_json_schema())
```

### 2. Structured Output Validation

```python
class ExtractedEntity(BaseModel):
    name: str
    entity_type: str = Field(description="person, org, or location")
    confidence: float = Field(ge=0, le=1)

# Validate LLM output
raw = {"name": "Anthropic", "entity_type": "org", "confidence": 0.95}
entity = ExtractedEntity.model_validate(raw)
```

### 3. Agent Configuration

```python
class AgentConfig(BaseModel):
    model: str = "claude-sonnet-4-20250514"
    temperature: float = Field(default=0.7, ge=0, le=2)
    max_tokens: int = Field(default=4096, ge=1)
    tools: list[str] = []
    system_prompt: str = ""

config = AgentConfig.model_validate_json(open("config.json").read())
```

### 4. API Contracts (FastAPI)

```python
from fastapi import FastAPI

app = FastAPI()

class ChatRequest(BaseModel):
    message: str
    model: str = "claude-sonnet-4-20250514"

class ChatResponse(BaseModel):
    reply: str
    tokens_used: int

@app.post("/chat", response_model=ChatResponse)
async def chat(req: ChatRequest):
    ...
```

## Pydantic V2 Performance

| Operation | Speedup (V2 vs V1) |
|-----------|--------------------|
| Model creation | 5-10x |
| JSON parsing | 10-50x |
| Serialization | 5-20x |

V2 uses a Rust core (`pydantic-core`) for these dramatic speed improvements.

## AI Frameworks Using Pydantic

| Framework | How It Uses Pydantic |
|-----------|---------------------|
| LangChain | Tool definitions, output parsers |
| Instructor | Structured output validation |
| DSPy | Signature definitions |
| FastAPI | Request/response models |
| Pydantic AI | Agent framework built on Pydantic |
| Guardrails AI | Validator definitions |

## FAQ

**Q: V1 or V2?**
A: Always V2.
It is 5-50x faster and the ecosystem has migrated. V1 is in maintenance mode.

**Q: How does it relate to JSON Schema?**
A: Pydantic models auto-generate JSON Schema via `model_json_schema()`. This is how LLMs understand your tool parameters.

**Q: Can I use it for runtime config?**
A: Yes. `pydantic-settings` loads from env vars, `.env` files, and config files with full validation.

## Source & Thanks

> Created by [Samuel Colvin](https://github.com/pydantic). Licensed under MIT.
>
> [pydantic/pydantic](https://github.com/pydantic/pydantic) — 22k+ stars

---

Source: https://tokrepo.com/en/workflows/1960042c-de29-4831-a1e5-80177c4c9af4
Author: Script Depot