## What are Structured Outputs?
Structured outputs force LLMs to return data in a specific format (JSON, typed objects) instead of free-form text. This is critical for building reliable AI pipelines where downstream code needs to parse the response. Different providers offer different mechanisms — this guide covers all major approaches.
**Answer-Ready:** Structured outputs force LLMs to return valid JSON/typed data. OpenAI uses response_format with Pydantic, Claude uses tool_choice for guaranteed schemas, Instructor adds retry logic, Outlines uses guided generation. Essential for reliable AI pipelines.
Best for: AI engineers building data extraction, classification, or structured generation pipelines. Works with: OpenAI, Claude, open-source models.
## Approaches Compared
| Method | Provider | Guarantee | Retry logic |
|---|---|---|---|
| OpenAI Structured Outputs | OpenAI | Schema-enforced | Not needed |
| Claude Tool Use | Anthropic | Schema-enforced | Not needed |
| Instructor | Any (wrapper) | Retry-based | Yes |
| Outlines | Open-source models | Token-level | Not needed |
| JSON Mode | OpenAI/Anthropic | Valid JSON only | No |
## Method Details
### 1. OpenAI Structured Outputs (Best for OpenAI)
- Uses Pydantic models as response_format
- Server-side schema enforcement
- 100% valid output guaranteed
- Supports nested objects, arrays, enums, unions
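A minimal sketch of this flow, assuming the `openai` Python SDK (v1.40+) and an `OPENAI_API_KEY` in the environment; the `UserInfo` model and `extract_user` helper are illustrative:

```python
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

def extract_user(text: str) -> UserInfo:
    from openai import OpenAI  # imported here so the sketch loads without the SDK installed
    client = OpenAI()
    completion = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[{"role": "user", "content": text}],
        response_format=UserInfo,  # the Pydantic model becomes the enforced schema
    )
    return completion.choices[0].message.parsed
```

The SDK converts the Pydantic model to JSON Schema, and the server constrains decoding so the response always parses back into `UserInfo`.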
### 2. Claude Tool Use (Best for Claude)
- Define a tool with input_schema
- Force the tool via tool_choice
- Claude fills the schema as tool arguments
- 100% valid output guaranteed
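A hedged sketch of the steps above, assuming the `anthropic` SDK and an API key in the environment; the `record_user` tool name, model id, and `extract_user` helper are illustrative:

```python
# Tool definition: the input_schema is the shape we want Claude to fill in
user_info_tool = {
    "name": "record_user",
    "description": "Record structured information about a user.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
        },
        "required": ["name", "age"],
    },
}

def extract_user(text: str) -> dict:
    import anthropic  # imported here so the sketch loads without the SDK installed
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        tools=[user_info_tool],
        tool_choice={"type": "tool", "name": "record_user"},  # force this tool
        messages=[{"role": "user", "content": text}],
    )
    # The forced call arrives as a tool_use content block; its input matches the schema
    return message.content[0].input
```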
### 3. Instructor (Best for Multi-Provider)

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel  # needed for the response model

client = instructor.from_openai(OpenAI())

class UserInfo(BaseModel):
    name: str
    age: int

user = client.chat.completions.create(
    model="gpt-4o",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John is 30"}],
)
```

### 4. Outlines (Best for Open-Source)
```python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
generator = outlines.generate.json(model, UserInfo)  # UserInfo: the Pydantic model above
user = generator("John is 30")
```

## Best Practices
- Keep schemas simple — Flat structures are more reliable than deeply nested
- Use descriptions — Add field descriptions to help the LLM understand intent
- Provide examples — Few-shot examples in the prompt improve accuracy
- Validate outputs — Even with guarantees, validate business logic
- Handle edge cases — Optional fields for data that might not be present
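The practices above can be sketched with Pydantic; the `Invoice` model, its fields, and the validator are hypothetical examples, not part of any provider's API:

```python
from typing import Optional
from pydantic import BaseModel, Field, field_validator

class Invoice(BaseModel):
    # Field descriptions travel into the JSON Schema the LLM sees
    customer: str = Field(description="Full name of the billed customer")
    total_cents: int = Field(description="Invoice total in cents, never negative")
    # Optional field for data that might not be present in the source text
    po_number: Optional[str] = Field(
        default=None, description="Purchase order number, if one was quoted"
    )

    @field_validator("total_cents")
    @classmethod
    def total_must_be_non_negative(cls, v: int) -> int:
        # Schema enforcement guarantees an int, not that it makes business sense
        if v < 0:
            raise ValueError("total_cents must be >= 0")
        return v
```

The validator catches values that are schema-valid but wrong for the business, which server-side enforcement alone cannot do.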
## FAQ
**Q: Which method is most reliable?**
A: OpenAI Structured Outputs and Claude Tool Use are both server-enforced and equally reliable for their respective providers.
**Q: Can I use structured outputs with streaming?**
A: Yes, both OpenAI and Claude support streaming with structured outputs; partial objects are available as they stream.
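With Instructor, for instance, the `create_partial` helper yields progressively more complete objects as tokens arrive. A sketch assuming the `instructor` and `openai` packages; model name and fields are illustrative:

```python
from typing import Optional
from pydantic import BaseModel

class UserInfo(BaseModel):
    # All fields optional so partially streamed objects still validate
    name: Optional[str] = None
    age: Optional[int] = None

def stream_user(text: str):
    import instructor  # imported here so the sketch loads without the SDKs installed
    from openai import OpenAI
    client = instructor.from_openai(OpenAI())
    # Each iteration yields a UserInfo with whatever fields have streamed so far
    for partial in client.chat.completions.create_partial(
        model="gpt-4o",
        response_model=UserInfo,
        messages=[{"role": "user", "content": text}],
    ):
        yield partial
```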
**Q: What about Gemini?**
A: Gemini supports JSON mode and schema constraints. Use the response_mime_type parameter.
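A sketch assuming the `google-generativeai` package and a configured API key; the model id and `extract_user_json` helper are illustrative:

```python
def extract_user_json(prompt: str) -> str:
    import google.generativeai as genai  # imported here so the sketch loads without the SDK
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content(
        prompt,
        generation_config={"response_mime_type": "application/json"},  # JSON mode
    )
    return response.text  # a JSON string; add a response_schema to constrain its shape
```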