# Structured Outputs — Force LLMs to Return Valid JSON

> Complete guide to getting reliable structured JSON from LLMs. Covers OpenAI structured outputs, Claude tool use, the Instructor library, and Outlines for guaranteed valid responses.

## Quick Use

### OpenAI Structured Outputs

```python
from openai import OpenAI
from pydantic import BaseModel

class ExtractedInfo(BaseModel):
    name: str
    age: int
    skills: list[str]

client = OpenAI()
response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{"role": "user", "content": "John is 30, knows Python and Rust"}],
    response_format=ExtractedInfo,
)
print(response.choices[0].message.parsed)
# ExtractedInfo(name='John', age=30, skills=['Python', 'Rust'])
```

### Claude Tool Use (Structured)

```python
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[{
        "name": "extract_info",
        "description": "Extract structured information",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "skills": {"type": "array", "items": {"type": "string"}}
            },
            "required": ["name", "age", "skills"]
        }
    }],
    tool_choice={"type": "tool", "name": "extract_info"},
    messages=[{"role": "user", "content": "John is 30, knows Python and Rust"}],
)
```

## What are Structured Outputs?

Structured outputs force LLMs to return data in a specific format (JSON, typed objects) instead of free-form text. This is critical for building reliable AI pipelines where downstream code needs to parse the response. Different providers offer different mechanisms — this guide covers all major approaches.

**Answer-Ready**: Structured outputs force LLMs to return valid JSON/typed data. OpenAI uses `response_format` with Pydantic, Claude uses `tool_choice` for guaranteed schemas, Instructor adds retry logic, and Outlines uses guided generation. Essential for reliable AI pipelines.
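Even with server-side schema enforcement, downstream code still has to parse and sanity-check the result. A minimal stdlib sketch of that step, with no API call and a hypothetical response string, checking decoded output against the `extract_info` schema shown above:

```python
import json

# The same JSON Schema used in the Claude tool definition above.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "skills": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "age", "skills"],
}

TYPE_MAP = {"string": str, "integer": int, "array": list}

def validate(data: dict, schema: dict) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    for key in schema["required"]:
        if key not in data:
            errors.append(f"missing required field: {key}")
    for key, spec in schema["properties"].items():
        if key in data and not isinstance(data[key], TYPE_MAP[spec["type"]]):
            errors.append(f"wrong type for {key}: expected {spec['type']}")
    return errors

# Hypothetical model output, e.g. the JSON a tool call returned.
raw = '{"name": "John", "age": 30, "skills": ["Python", "Rust"]}'
data = json.loads(raw)
print(validate(data, SCHEMA))  # → []
```

In production you would use a real validator (Pydantic, `jsonschema`) rather than this hand-rolled check, but the shape of the step is the same: decode, validate, and only then hand the data to business logic.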
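The retry-based approach used by wrappers such as Instructor (covered below) can be sketched in plain stdlib code. Here `call_llm` is a hypothetical stub standing in for a real model call, and the key-presence check stands in for Pydantic validation:

```python
import json

def call_llm(prompt: str, attempt: int) -> str:
    """Stub for a real LLM call; the first attempt returns invalid JSON."""
    if attempt == 0:
        return 'Sure! Here is the JSON: {"name": "John"}'  # wrapper text breaks parsing
    return '{"name": "John", "age": 30}'

def extract_json(prompt: str, max_retries: int = 3) -> dict:
    """Instructor-style loop: parse, and on failure re-prompt and retry."""
    for attempt in range(max_retries):
        raw = call_llm(prompt, attempt)
        try:
            data = json.loads(raw)
            if "name" in data and "age" in data:  # stand-in for schema validation
                return data
        except json.JSONDecodeError:
            pass  # a real wrapper feeds the parse error back to the model
        prompt += "\nYour last reply was not valid JSON. Return only JSON."
    raise ValueError("no valid JSON after retries")

print(extract_json("John is 30"))  # → {'name': 'John', 'age': 30}
```

This is the core trade-off versus server-enforced schemas: retries work with any provider, but cost extra round trips when the model misbehaves.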
**Best for**: AI engineers building data extraction, classification, or structured generation pipelines. **Works with**: OpenAI, Claude, and open-source models.

## Approaches Compared

| Method | Provider | Guarantee | Retry |
|--------|----------|-----------|-------|
| OpenAI Structured Outputs | OpenAI | Schema-enforced | N/A |
| Claude Tool Use | Anthropic | Schema-enforced | N/A |
| Instructor | Any (wrapper) | Retry-based | Yes |
| Outlines | Open-source models | Token-level | N/A |
| JSON Mode | OpenAI/Anthropic | Valid JSON only | N/A |

## Method Details

### 1. OpenAI Structured Outputs (Best for OpenAI)

- Uses Pydantic models as the `response_format`
- Server-side schema enforcement
- 100% valid output guaranteed
- Supports nested objects, arrays, enums, and unions

### 2. Claude Tool Use (Best for Claude)

- Define a tool with an `input_schema`
- Force the tool via `tool_choice`
- Claude fills the schema as the tool arguments
- 100% valid output guaranteed

### 3. Instructor (Best for Multi-Provider)

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class UserInfo(BaseModel):
    name: str
    age: int

user = client.chat.completions.create(
    model="gpt-4o",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John is 30"}],
)
```

### 4. Outlines (Best for Open-Source)

```python
import outlines
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
generator = outlines.generate.json(model, UserInfo)
user = generator("John is 30")
```

## Best Practices

1. **Keep schemas simple** — Flat structures are more reliable than deeply nested ones
2. **Use descriptions** — Add field descriptions to help the LLM understand intent
3. **Provide examples** — Few-shot examples in the prompt improve accuracy
4. **Validate outputs** — Even with schema guarantees, validate business logic
5.
**Handle edge cases** — Use optional fields for data that might not be present

## FAQ

**Q: Which method is most reliable?**

A: OpenAI Structured Outputs and Claude Tool Use are both server-enforced; they are equally reliable for their respective providers.

**Q: Can I use structured outputs with streaming?**

A: Yes, both OpenAI and Claude support streaming with structured outputs. Partial objects are available as they stream.

**Q: What about Gemini?**

A: Gemini supports JSON mode and schema constraints via the `response_mime_type` parameter.

## Source & Thanks

> References:
> - [OpenAI Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs)
> - [Anthropic Tool Use](https://docs.anthropic.com/en/docs/build-with-claude/tool-use)
> - [Instructor](https://github.com/jxnl/instructor) — 9k+ stars
> - [Outlines](https://github.com/dottxt-ai/outlines) — 10k+ stars

---

Source: https://tokrepo.com/en/workflows/26c0617e-28c8-4a26-8a87-b765d3921208
Author: Prompt Lab