# Instructor — Structured LLM Outputs with Pydantic

> Extract structured data from LLMs using Pydantic models. Works with OpenAI, Anthropic, Gemini, and local models. The simplest way to get reliable JSON from any LLM.

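## Install

```bash
pip install instructor
```

## Quick Use

Save the following as a script file and run it. The example assumes the `openai` package is installed and `OPENAI_API_KEY` is set in the environment: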

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class User(BaseModel):
    name: str
    age: int

user = client.chat.completions.create(
    model="gpt-4o",
    response_model=User,
    messages=[{"role": "user", "content": "John is 30 years old"}],
)
print(user)  # User(name='John', age=30)
```

## What is Instructor?

Instructor patches LLM client libraries to return validated Pydantic objects instead of raw text. It handles retries, streaming, and partial responses, which makes structured extraction reliable across the supported providers.

**Answer-Ready**: Instructor is a Python library that extracts structured, validated data from LLMs using Pydantic models, supporting OpenAI, Anthropic, Gemini, and local models with automatic retry logic.
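
To make "validated Pydantic objects instead of raw text" concrete, the sketch below runs only the validation half, with no LLM call. The two JSON strings are hypothetical stand-ins for model output, and `model_validate_json` is Pydantic v2's own API. It illustrates the idea and is not Instructor's internal code.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# Hypothetical raw outputs, standing in for what an LLM might return.
good_output = '{"name": "John", "age": 30}'
bad_output = '{"name": "John", "age": "thirty"}'

# Well-formed output becomes a typed object.
user = User.model_validate_json(good_output)
print(user)

# Malformed output is rejected instead of silently passing through.
try:
    User.model_validate_json(bad_output)
except ValidationError as exc:
    error_count = exc.error_count()
    print(f"rejected with {error_count} validation error(s)")
```

The rejected case is the one that matters in practice: a validation error is a structured signal that a retry can act on.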

## Key Patterns

### 1. Multi-Provider Support

```python
import instructor

# Each block below is an alternative; pick one way to build `client`.

# Anthropic
from anthropic import Anthropic
client = instructor.from_anthropic(Anthropic())

# Gemini
import google.generativeai as genai
client = instructor.from_gemini(genai.GenerativeModel("gemini-1.5-pro"))

# Ollama (local), via its OpenAI-compatible endpoint
from openai import OpenAI
client = instructor.from_openai(
    # The OpenAI client requires an api_key; Ollama ignores its value.
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON,
)
```

### 2. Nested & Complex Types

```python
from typing import List
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    country: str

class Company(BaseModel):
    name: str
    industry: str
    addresses: List[Address]
    employee_count: int

company = client.chat.completions.create(
    model="gpt-4o",
    response_model=Company,
    messages=[{"role": "user", "content": "..."}],
)
```
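
Nested models work because Pydantic can describe the whole structure as a single JSON Schema and validate nested data against it. The sketch below shows both steps offline, using only Pydantic v2 calls (`model_json_schema`, `model_validate`). The payload is a made-up example, and the exact request Instructor builds from the schema is not shown here.

```python
from typing import List

from pydantic import BaseModel


class Address(BaseModel):
    street: str
    city: str
    country: str


class Company(BaseModel):
    name: str
    industry: str
    addresses: List[Address]
    employee_count: int


# The nested Address model appears as a reusable definition in the schema.
schema = Company.model_json_schema()
print(sorted(schema["$defs"]))
print(schema["required"])

# Hypothetical payload, standing in for structured model output.
payload = {
    "name": "Acme",
    "industry": "Manufacturing",
    "addresses": [{"street": "1 Main St", "city": "Berlin", "country": "DE"}],
    "employee_count": 250,
}
company = Company.model_validate(payload)
print(company.addresses[0].city)
```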

### 3. Streaming Partial Results

```python
for partial in client.chat.completions.create_partial(
    model="gpt-4o",
    response_model=User,
    messages=[{"role": "user", "content": "John is 30"}],
):
    print(partial)  # Progressively filled fields
```
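
When consuming partial results, the main thing to plan for is that fields may still be empty in early snapshots. The sketch below simulates that consumer side without any network call. `PartialUser` and `fake_partial_stream` are hypothetical names invented for this illustration, and it assumes unfilled fields show up as `None`.

```python
from typing import Iterator, List, Optional

from pydantic import BaseModel


class PartialUser(BaseModel):
    # Every field is optional so that an incomplete snapshot is still valid.
    name: Optional[str] = None
    age: Optional[int] = None


def fake_partial_stream() -> Iterator[PartialUser]:
    """Hypothetical stand-in for a partial stream: one snapshot per update."""
    for fields in ({}, {"name": "John"}, {"name": "John", "age": 30}):
        yield PartialUser(**fields)


snapshots: List[PartialUser] = []
for partial in fake_partial_stream():
    # A UI would re-render here; fields not yet received are still None.
    snapshots.append(partial)
    print(partial)

final = snapshots[-1]
```

Rendering code should therefore treat `None` as "not yet received" rather than as an error.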

### 4. Automatic Retries

```python
user = client.chat.completions.create(
    model="gpt-4o",
    response_model=User,
    messages=[...],
    max_retries=3,  # Retries with validation errors fed back
)
```
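
The idea behind "validation errors fed back" can be sketched as a plain loop: validate, and on failure append the error to the conversation before asking again. The code below is a conceptual illustration only. `fake_llm` and `extract_with_retries` are hypothetical helpers, the age range is an arbitrary example rule, and Instructor's actual retry semantics and message format may differ.

```python
from typing import Dict, List, Tuple

from pydantic import BaseModel, ValidationError, field_validator


class User(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_must_be_plausible(cls, value: int) -> int:
        # Example business rule; a failure here triggers a retry.
        if not 0 <= value <= 130:
            raise ValueError("age must be between 0 and 130")
        return value


def fake_llm(messages: List[Dict[str, str]]) -> str:
    """Hypothetical model: wrong at first, correct once it sees feedback."""
    saw_feedback = any("validation error" in m["content"].lower() for m in messages)
    if saw_feedback:
        return '{"name": "John", "age": 30}'
    return '{"name": "John", "age": 300}'


def extract_with_retries(
    messages: List[Dict[str, str]], max_retries: int = 3
) -> Tuple[User, int]:
    """Return the validated User and the number of retries that were needed."""
    messages = list(messages)  # do not mutate the caller's list
    for attempt in range(max_retries + 1):
        raw = fake_llm(messages)
        try:
            return User.model_validate_json(raw), attempt
        except ValidationError as exc:
            last_error = exc
            # Feed the failed output and the error text back to the model.
            messages.append({"role": "assistant", "content": raw})
            messages.append(
                {"role": "user", "content": f"Validation error, please fix:\n{exc}"}
            )
    raise last_error


user, retries_used = extract_with_retries(
    [{"role": "user", "content": "John is 30 years old"}]
)
print(user, retries_used)
```

Custom validators like `age_must_be_plausible` are where this pattern pays off, since the error message doubles as the correction hint.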

## FAQ

**Q: How does it differ from function calling?**
A: Instructor builds on function calling but adds Pydantic validation, automatic retries with error feedback, streaming, and multi-provider support.

**Q: Does it work with Claude?**
A: Yes, via `instructor.from_anthropic()` with full tool-use support.

**Q: Performance overhead?**
A: Minimal. It is a thin wrapper, and retries add latency only when validation fails.

## Source & Thanks

- GitHub: [jxnl/instructor](https://github.com/jxnl/instructor) (8k+ stars)
- Docs: [python.useinstructor.com](https://python.useinstructor.com)
- Author: Jason Liu

<!-- ZH -->

## Quick Use

```bash
pip install instructor
```

Define the output structure with a Pydantic model, and Instructor extracts validated, structured data from the LLM automatically.

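## What is Instructor?

Instructor patches LLM client libraries so that they return validated Pydantic objects instead of raw text. It supports retries, streaming output, and partial responses.

**One-sentence summary**: Instructor uses Pydantic models to extract structured data from LLMs, and supports OpenAI, Anthropic, Gemini, and local models.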

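## Core Patterns

### 1. Multi-Provider Support
Supports OpenAI, Anthropic, Gemini, Ollama, and others; switching providers takes one line of code.

### 2. Nested & Complex Types
Supports nested models, lists, optional fields, and the rest of Pydantic's feature set.

### 3. Streaming Partial Results
Fields are filled in progressively, which suits live display of large structured outputs.

### 4. Automatic Retries
When validation fails, the error message is automatically fed back to the LLM so it can regenerate the output.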

## FAQ

**Q: Does it support Claude?**
A: Yes, via `instructor.from_anthropic()` in tool-use mode.

**Q: Performance overhead?**
A: Minimal. It is only a thin wrapper, and retries add latency only when validation fails.

## Source & Thanks

- GitHub: [jxnl/instructor](https://github.com/jxnl/instructor) (8k+ stars)
- Author: Jason Liu

---
Source: https://tokrepo.com/en/workflows/9301dfb7-b047-4c15-94d2-47d349a77865
Author: Agent Toolkit