# Structured Outputs: Force LLMs to Return Valid JSON

> Complete guide to getting reliable structured JSON from LLMs. Covers OpenAI structured outputs, Claude tool use, the Instructor library, and Outlines for schema-valid responses.

## Quick Use

### OpenAI Structured Outputs

```python
from openai import OpenAI
from pydantic import BaseModel


class ExtractedInfo(BaseModel):
    name: str
    age: int
    skills: list[str]


client = OpenAI()

# The Pydantic model is passed as response_format and the reply is parsed into it.
# Recent SDK releases also expose this call as client.chat.completions.parse.
response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{"role": "user", "content": "John is 30, knows Python and Rust"}],
    response_format=ExtractedInfo,
)

print(response.choices[0].message.parsed)
# ExtractedInfo(name='John', age=30, skills=['Python', 'Rust'])
```

### Claude Tool Use (Structured)

```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    # The tool's input_schema doubles as the output schema.
    tools=[{
        "name": "extract_info",
        "description": "Extract structured information",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
                "skills": {"type": "array", "items": {"type": "string"}}
            },
            "required": ["name", "age", "skills"]
        }
    }],
    # Force Claude to answer by calling this tool instead of writing free text.
    tool_choice={"type": "tool", "name": "extract_info"},
    messages=[{"role": "user", "content": "John is 30, knows Python and Rust"}],
)
```

## What are Structured Outputs?

Structured outputs force LLMs to return data in a specific format (JSON, typed objects) instead of free-form text. This is critical for building reliable AI pipelines where downstream code needs to parse the response. Different providers offer different mechanisms; this guide covers the major approaches.

**Answer-Ready**: Structured outputs force LLMs to return valid JSON or typed data. OpenAI uses `response_format` with Pydantic, Claude uses a forced tool call (`tool_choice`) to return schema-shaped arguments, Instructor adds validation and retry logic, and Outlines uses guided generation. Essential for reliable AI pipelines.
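
The Claude request in Quick Use ends at the API call. As a rough sketch of the step downstream code needs next, the helper below pulls the arguments out of the `tool_use` content block and checks that the required keys are present. `extract_tool_input` and `fake_content` are hypothetical names introduced for illustration. `fake_content` stands in for `response.content` so the example runs without an API key, and the sketch assumes each block exposes `type`, `name`, and `input` attributes, as the Anthropic SDK's tool-use blocks do.

```python
from types import SimpleNamespace


def extract_tool_input(content_blocks, tool_name, required=()):
    """Return the input dict of the first tool_use block named tool_name.

    Hypothetical helper: raises ValueError if no such block exists or if
    any key listed in `required` is missing from the tool input.
    """
    for block in content_blocks:
        is_match = (
            getattr(block, "type", None) == "tool_use"
            and getattr(block, "name", None) == tool_name
        )
        if not is_match:
            continue
        data = block.input
        missing = [key for key in required if key not in data]
        if missing:
            raise ValueError(f"tool input is missing required keys: {missing}")
        return data
    raise ValueError(f"no tool_use block named {tool_name!r}")


# Assumption: a stand-in for response.content, shaped like the SDK's blocks.
fake_content = [
    SimpleNamespace(
        type="tool_use",
        name="extract_info",
        input={"name": "John", "age": 30, "skills": ["Python", "Rust"]},
    )
]

info = extract_tool_input(
    fake_content, "extract_info", required=("name", "age", "skills")
)
print(info["name"], info["age"], info["skills"])
```

With a real response, pass `response.content` in place of `fake_content`. The key check is deliberately minimal; a Pydantic model would also verify the field types.
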
**Best for**: AI engineers building data extraction, classification, or structured generation pipelines.

**Works with**: OpenAI, Claude, open-source models.

## Approaches Compared

| Method | Provider | Guarantee | Retry |
|--------|----------|-----------|-------|
| OpenAI Structured Outputs | OpenAI | Schema-enforced | N/A |
| Claude Tool Use | Anthropic | Schema-guided via forced tool call | N/A |
| Instructor | Any (wrapper) | Retry-based | Yes |
| Outlines | Open-source models | Token-level | N/A |
| JSON Mode | OpenAI/Anthropic | Valid JSON only | N/A |

## Method Details

### 1. OpenAI Structured Outputs (Best for OpenAI)

- Uses Pydantic models as `response_format`
- Server-side schema enforcement
- Completed responses match the schema; refusals and responses cut off by the token limit are the exceptions
- Supports nested objects, arrays, enums, unions

### 2. Claude Tool Use (Best for Claude)

- Define a tool with `input_schema`
- Force the tool via `tool_choice`
- Claude fills the schema as tool arguments
- Arguments normally follow the schema, but validate them: a forced tool call on its own is not a hard guarantee of conformance

### 3. Instructor (Best for Multi-Provider)

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Wrap the OpenAI client so create() accepts response_model
client = instructor.from_openai(OpenAI())


class UserInfo(BaseModel):
    name: str
    age: int


# Instructor validates the reply against UserInfo (retries are set via max_retries)
user = client.chat.completions.create(
    model="gpt-4o",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John is 30"}],
)
```

### 4. Outlines (Best for Open-Source)

```python
import outlines

# Uses the Outlines 0.x entry points; later releases changed this interface,
# so check the version you have installed.
# UserInfo is the Pydantic model from the Instructor example above.
model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
generator = outlines.generate.json(model, UserInfo)
user = generator("John is 30")
```

## Best Practices

1. **Keep schemas simple**: Flat structures are more reliable than deeply nested ones
2. **Use descriptions**: Add field descriptions to help the LLM understand intent
3. **Provide examples**: Few-shot examples in the prompt improve accuracy
4. **Validate outputs**: Even with schema enforcement, validate business logic
5. **Handle edge cases**: Use optional fields for data that might not be present

## FAQ

**Q: Which method is most reliable?**

A: OpenAI Structured Outputs enforces the schema server-side. Claude Tool Use with a forced `tool_choice` returns arguments shaped by the `input_schema`, but the request shown in this guide does not validate them, so check the result in your own code.

**Q: Can I use structured outputs with streaming?**

A: Yes, both OpenAI and Claude support streaming with structured outputs. Partial objects are available as they stream.

**Q: What about Gemini?**

A: Gemini supports JSON mode and schema constraints. Set `response_mime_type` to `application/json` and pass a `response_schema`.

## Source & Thanks

> References:
> - [OpenAI Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs)
> - [Anthropic Tool Use](https://docs.anthropic.com/en/docs/build-with-claude/tool-use)
> - [Instructor](https://github.com/jxnl/instructor) (9k+ stars)
> - [Outlines](https://github.com/dottxt-ai/outlines) (10k+ stars)

---

Source: https://tokrepo.com/en/workflows/structured-outputs-force-llms-return-valid-json-26c0617e

Author: Prompt Lab