# Structured Outputs — Force LLMs to Return Valid JSON

> A guide to getting reliable structured JSON from LLMs. Covers OpenAI Structured Outputs, Claude tool use, the Instructor library, and Outlines.

## Quick Use

### OpenAI Structured Outputs

```python
from openai import OpenAI
from pydantic import BaseModel

class ExtractedInfo(BaseModel):
    name: str
    age: int
    skills: list[str]

client = OpenAI()
response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{"role": "user", "content": "John is 30, knows Python and Rust"}],
    response_format=ExtractedInfo,
)
print(response.choices[0].message.parsed)
# ExtractedInfo(name='John', age=30, skills=['Python', 'Rust'])
```

### Claude Tool Use (Structured)

```python
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[{
        "name": "extract_info",
        "description": "Extract structured information",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "skills": {"type": "array", "items": {"type": "string"}}
            },
            "required": ["name", "age", "skills"]
        }
    }],
    tool_choice={"type": "tool", "name": "extract_info"},
    messages=[{"role": "user", "content": "John is 30, knows Python and Rust"}],
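)
# Forcing the tool means the arguments arrive in a tool_use content block.
tool_use = next(block for block in response.content if block.type == "tool_use")
print(tool_use.input)  # dict with the keys defined in input_schema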
```

## What are Structured Outputs?

Structured outputs force LLMs to return data in a specific format (JSON, typed objects) instead of free-form text. This is critical for building reliable AI pipelines where downstream code needs to parse the response. Different providers offer different mechanisms — this guide covers all major approaches.

**Answer-Ready**: Structured outputs make LLMs return valid JSON or typed data. OpenAI accepts a Pydantic model as `response_format`, Claude can be forced via `tool_choice` to call a tool whose `input_schema` describes the output, Instructor adds validation with retries across providers, and Outlines constrains generation token by token. Essential for reliable AI pipelines.

**Best for**: AI engineers building data extraction, classification, or structured generation pipelines. **Works with**: OpenAI, Claude, open-source models.

## Approaches Compared

| Method | Provider | Guarantee | Retry |
|--------|----------|-----------|-------|
| OpenAI Structured Outputs | OpenAI | Schema-enforced | N/A |
| Claude Tool Use | Anthropic | Forced tool call, schema-guided arguments | N/A |
| Instructor | Any (wrapper) | Retry-based | Yes |
| Outlines | Open-source models | Token-level | N/A |
| JSON Mode | OpenAI, Gemini | Valid JSON only | N/A |

## Method Details

### 1. OpenAI Structured Outputs (Best for OpenAI)

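- Pass a Pydantic model as `response_format`
- Server-side schema enforcement
- Output conforms to the schema unless the model refuses or the response is cut off at the token limit
- Supports nested objects, arrays, enums, and unions, within the supported subset of JSON Schema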
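A refusal is one case where no parsed object comes back, so it helps to check for it before using the result. The sketch below assumes the response shape from the Quick Use example, where `message.parsed` holds the Pydantic instance and `message.refusal` holds the refusal text if there is one. `unwrap_parsed` and `fake_response` are hypothetical names, and the stand-in response exists only so the snippet runs offline.

```python
from types import SimpleNamespace


def unwrap_parsed(response):
    """Return the parsed object from a parse() response, or raise on refusal.

    Hypothetical helper; assumes response.choices[0].message exposes
    `.parsed` and `.refusal` as in the Quick Use example.
    """
    message = response.choices[0].message
    refusal = getattr(message, "refusal", None)
    if refusal:
        raise ValueError(f"Model refused: {refusal}")
    if message.parsed is None:
        raise ValueError("No parsed output in response")
    return message.parsed


def fake_response(parsed=None, refusal=None):
    """Offline stand-in that mimics the attribute layout of a real response."""
    message = SimpleNamespace(parsed=parsed, refusal=refusal)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


result = unwrap_parsed(fake_response(parsed={"name": "John", "age": 30}))
print(result)
```

With a real client, `response` would be the return value of `client.beta.chat.completions.parse(...)` and `result` would be an `ExtractedInfo` instance.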

### 2. Claude Tool Use (Best for Claude)

- Define a tool with an `input_schema`
- Force the tool via `tool_choice`
- Claude fills the schema as tool arguments
- The tool call itself is guaranteed; validate the arguments, since schema conformance is not strictly enforced by default
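Since the tool arguments come back as a plain dict, a small client-side check is a cheap safeguard. This is a minimal sketch, assuming content blocks expose `type`, `name`, and `input` as in the Quick Use example. `get_tool_input` and `check_fields` are hypothetical helpers, and the `SimpleNamespace` blocks stand in for a real response so the snippet runs offline.

```python
from types import SimpleNamespace

# Expected Python type for each required field of the extract_info schema.
REQUIRED_FIELDS = {"name": str, "age": int, "skills": list}


def get_tool_input(content_blocks, tool_name):
    """Return the input dict of the first tool_use block for `tool_name`."""
    for block in content_blocks:
        if getattr(block, "type", None) == "tool_use" and block.name == tool_name:
            return block.input
    raise LookupError(f"No tool_use block for {tool_name!r}")


def check_fields(data, required=REQUIRED_FIELDS):
    """Return a list of problems; an empty list means the dict looks valid."""
    problems = []
    for field, expected in required.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int, so reject it explicitly
        # (no field in this schema expects a bool).
        elif not isinstance(data[field], expected) or isinstance(data[field], bool):
            problems.append(f"wrong type for {field}")
    return problems


# Stand-in for response.content from the Anthropic SDK.
blocks = [
    SimpleNamespace(type="text", text="Calling the tool."),
    SimpleNamespace(
        type="tool_use",
        name="extract_info",
        input={"name": "John", "age": 30, "skills": ["Python", "Rust"]},
    ),
]
data = get_tool_input(blocks, "extract_info")
print(check_fields(data))  # []
```

In a real pipeline a Pydantic model or a JSON Schema validator would replace `check_fields`; the hand-rolled version only keeps the example dependency-free.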

### 3. Instructor (Best for Multi-Provider)

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class UserInfo(BaseModel):
    name: str
    age: int

user = client.chat.completions.create(
    model="gpt-4o",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John is 30"}],
)
```
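The retry-based guarantee listed in the comparison table can be illustrated without any library. The sketch below shows the general validate-and-re-ask idea only; it is not Instructor's actual implementation. `call_llm` is a hypothetical callable that takes a message list and returns raw text, and here a stub plays that role so the snippet runs offline.

```python
import json


def validate_user(data):
    """Raise ValueError if `data` does not match the UserInfo shape."""
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    age = data.get("age")
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("age must be an integer")
    return data


def create_with_retries(call_llm, messages, validate, max_retries=2):
    """Call the model, validate its JSON, and re-ask with the error on failure."""
    messages = list(messages)  # avoid mutating the caller's list
    for _ in range(max_retries + 1):
        raw = call_llm(messages)
        try:
            return validate(json.loads(raw))
        except ValueError as err:  # json.JSONDecodeError subclasses ValueError
            # Feed the failed output and the error back so the model can fix it.
            messages.append({"role": "assistant", "content": raw})
            messages.append(
                {"role": "user", "content": f"Invalid output: {err}. Return corrected JSON only."}
            )
    raise RuntimeError(f"No valid output after {max_retries + 1} attempts")


# Stub model: the first reply has the wrong type for age, the second is valid.
replies = iter(['{"name": "John", "age": "thirty"}', '{"name": "John", "age": 30}'])
user = create_with_retries(
    lambda msgs: next(replies),
    [{"role": "user", "content": "John is 30"}],
    validate_user,
)
print(user)  # {'name': 'John', 'age': 30}
```

The trade-off of this approach is cost: every failed attempt is another model call, which is why a retry cap matters.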

### 4. Outlines (Best for Open-Source)

```python
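import outlines

# UserInfo is the Pydantic model defined in the Instructor example above.
# This uses the pre-1.0 Outlines API; later releases changed the interface.
model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
generator = outlines.generate.json(model, UserInfo)
user = generator("John is 30")  # returns a UserInfo instance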
```
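Token-level guidance means that, at every decoding step, tokens that would break the target format are excluded before one is chosen. The toy below illustrates that idea with a hand-written state machine for `{"age":<digits>}` and a fake scoring function. It is a simplified sketch, not how Outlines is implemented, and every name in it is hypothetical.

```python
import json
import math

# Toy token-level grammar for {"age":<digits>} written as a state machine.
# Each state maps an allowed next token to the state it leads to.
DIGITS = "0123456789"
TRANSITIONS = {
    0: {'{"age":': 1},
    1: {d: 2 for d in DIGITS},
    2: {**{d: 2 for d in DIGITS}, "}": 3},
}
ACCEPT = 3


def constrained_decode(score_fn, max_steps=16):
    """Greedy decoding that only ever picks tokens the grammar allows."""
    state, tokens = 0, []
    for _ in range(max_steps):
        if state == ACCEPT:
            break
        allowed = TRANSITIONS[state]
        scores = score_fn(tokens)
        # Disallowed tokens are never candidates, however high they score.
        best = max(allowed, key=lambda tok: scores.get(tok, -math.inf))
        tokens.append(best)
        state = allowed[best]
    return "".join(tokens)


def fake_scores(prefix):
    """Stand-in for model logits: prefers chatty text, then the 'right' token."""
    wanted = ['{"age":', "3", "0", "}"]
    scores = {"Sure!": 10.0}
    if len(prefix) < len(wanted):
        scores[wanted[len(prefix)]] = 5.0
    return scores


text = constrained_decode(fake_scores)
print(json.loads(text))  # {'age': 30}
```

Even though the fake model always scores `Sure!` highest, that token is never emitted because the grammar never allows it.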

## Best Practices

1. **Keep schemas simple**: Flat structures are more reliable than deeply nested ones
2. **Use descriptions**: Add field descriptions to help the LLM understand intent
3. **Provide examples**: Few-shot examples in the prompt improve accuracy
4. **Validate outputs**: Even with schema guarantees, validate business logic
5. **Handle edge cases**: Use optional fields for data that might be absent
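Several of these practices can be seen together in one small example: a flat schema, a description on every field, a nullable field for data that may be absent, and a business-logic check that runs after parsing. The `email` field and the `business_errors` helper are hypothetical additions for illustration, and the age range is an arbitrary assumption.

```python
import json

# Flat JSON Schema with per-field descriptions and a nullable field
# for data that may be absent from the source text.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Full name as written in the text"},
        "age": {"type": "integer", "description": "Age in whole years"},
        "email": {
            "type": ["string", "null"],
            "description": "Email address, or null if not mentioned",
        },
    },
    "required": ["name", "age", "email"],
}


def business_errors(record):
    """Checks that a schema cannot express well; returns a list of messages."""
    errors = []
    if not record["name"].strip():
        errors.append("name is empty")
    if not 0 <= record["age"] <= 130:
        errors.append("age out of range")
    email = record.get("email")
    if email is not None and "@" not in email:
        errors.append("email looks malformed")
    return errors


# Schema-valid output that still fails a business rule.
record = json.loads('{"name": "John", "age": 300, "email": null}')
print(business_errors(record))  # ['age out of range']
```

The record above satisfies the schema, which is the point: schema enforcement checks shape and types, not whether the values make sense.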

## FAQ

**Q: Which method is most reliable?**
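A: OpenAI Structured Outputs enforces the schema server-side. Forcing a Claude tool call guarantees that the tool is invoked, and the arguments generally follow the schema. In both cases, validate the result before passing it downstream.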

**Q: Can I use structured outputs with streaming?**
A: Yes, both OpenAI and Claude support streaming with structured outputs. Partial objects are available as they stream.
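At the lowest level, a streamed structured response is a sequence of JSON text fragments that only parse once the object is complete. The sketch below buffers fragments and returns the object as soon as it parses. It is an assumption-laden illustration with a hypothetical helper name, and it does not replace the streaming helpers in the provider SDKs.

```python
import json


def parse_when_complete(fragments):
    """Buffer streamed JSON text and return the object once it parses.

    Intended for top-level JSON objects: a strict prefix of an object
    never parses, so the first success is the complete object.
    """
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        try:
            return json.loads(buffer)
        except json.JSONDecodeError:
            continue  # still incomplete; wait for more text
    raise ValueError("stream ended before a complete JSON object arrived")


# Stand-in for text deltas arriving from a streaming response.
chunks = ['{"name": "Jo', 'hn", "age"', ": 30}"]
print(parse_when_complete(chunks))  # {'name': 'John', 'age': 30}
```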

**Q: What about Gemini?**
A: Gemini supports JSON mode and schema constraints. Set `response_mime_type` to `application/json` in the generation config, and pass a `response_schema` to constrain the structure.

## Source & Thanks

> References:
> - [OpenAI Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs)
> - [Anthropic Tool Use](https://docs.anthropic.com/en/docs/build-with-claude/tool-use)
> - [Instructor](https://github.com/jxnl/instructor) — 9k+ stars
> - [Outlines](https://github.com/dottxt-ai/outlines) — 10k+ stars

---
Source: https://tokrepo.com/en/workflows/structured-outputs-force-llms-return-valid-json-26c0617e
Author: Prompt Lab