Cette page est affichée en anglais. Une traduction française est en cours.

SkillsMar 30, 2026·2 min de lecture

Instructor — Structured Outputs from LLMs

Get structured, validated outputs from LLMs using Pydantic models. Works with OpenAI, Anthropic, Google, Ollama, and more. Retry logic, streaming, partial responses. 12.6K+ stars.

Script Depot · Community

Prêt pour agents

Installation avec revue préalable

Cet actif nécessite une revue. Le prompt copié demande un dry-run, affiche les écritures, puis continue seulement après confirmation.

Needs Confirmation · 66/100Policy : confirmer

Surface agent

Tout agent MCP/CLI

Type

Skill

Installation

Single

Confiance

Confiance : Established

Point d'entrée

Instructor — Structured Outputs from LLMs

Commande avec revue préalable

npx -y tokrepo@latest install 4a86c01b-d4d2-4c0f-8152-393c5685e429 --target codex

Dry-run d'abord, confirmez les écritures, puis lancez cette commande.

TL;DR

Instructor extracts structured, validated data from LLM responses using Pydantic models with retry logic and streaming.

§01

What it is

Instructor is a Python library that wraps LLM API clients to return structured, validated outputs instead of raw text. You define a Pydantic model, pass it to Instructor, and the library handles prompt injection, JSON parsing, and validation. If the LLM returns invalid data, Instructor retries automatically.

Instructor is for developers building applications that need reliable structured data from LLMs: extracting entities from text, classifying inputs, generating structured reports, or creating API responses. It supports OpenAI, Anthropic, Google, Ollama, and other providers.

§02

How it saves time or tokens

Without Instructor, you write custom parsing code for every LLM call, handle JSON extraction manually, and build retry logic yourself. Instructor replaces all of that with a single function call. The workflow provides pip install and working code examples that produce typed Python objects from LLM responses in minutes.

§03

How to use

Install Instructor:

pip install instructor

Patch your LLM client and define a Pydantic model:

import instructor
from pydantic import BaseModel
from openai import OpenAI

client = instructor.from_openai(OpenAI())

class User(BaseModel):
    name: str
    age: int
    email: str

user = client.chat.completions.create(
    model='gpt-4o',
    response_model=User,
    messages=[{'role': 'user', 'content': 'Extract: John is 30, john@example.com'}]
)

print(user.name)   # John
print(user.age)    # 30
print(user.email)  # john@example.com

The returned object is a validated Pydantic instance with type checking and field validation.

§04

Example

from typing import List
import instructor
from pydantic import BaseModel, Field
from anthropic import Anthropic

client = instructor.from_anthropic(Anthropic())

class ExtractedEntity(BaseModel):
    name: str
    entity_type: str = Field(description='person, org, or location')
    confidence: float = Field(ge=0.0, le=1.0)

class ExtractionResult(BaseModel):
    entities: List[ExtractedEntity]
    summary: str

result = client.messages.create(
    model='claude-sonnet-4-20250514',
    max_tokens=1024,
    response_model=ExtractionResult,
    messages=[{'role': 'user', 'content': 'Extract entities from: Apple CEO Tim Cook visited Berlin for a meeting with SAP executives.'}]
)

for entity in result.entities:
    print(f'{entity.name} ({entity.entity_type}): {entity.confidence}')

§05

Related on TokRepo

AI tools for coding -- Developer tools for building with LLMs
Prompt library -- Reusable prompt patterns for structured outputs

§06

Common pitfalls

Complex nested Pydantic models increase token usage because Instructor injects the schema into the prompt. Keep models flat when possible.
Retry logic consumes additional API calls and tokens. Set max_retries to a reasonable number (2-3) to avoid runaway costs.
Not all LLM providers support function calling natively. For providers without native support, Instructor falls back to JSON mode, which may be less reliable.

Questions fréquentes

Which LLM providers does Instructor support?+

Instructor supports OpenAI, Anthropic, Google (Gemini), Ollama, LiteLLM, Cohere, and any provider with an OpenAI-compatible API. Each provider has a dedicated from_ function for patching the client.

How does retry logic work?+

When the LLM returns data that fails Pydantic validation, Instructor sends the validation error back to the LLM with a corrected prompt and retries. This continues up to max_retries times. Each retry is a separate API call.

Does Instructor support streaming?+

Yes. Instructor supports partial streaming where fields are populated as the LLM generates them. You can iterate over partial results and update your UI progressively. Use the stream=True parameter with create_partial.

Can I use Instructor with local models?+

Yes. Use instructor.from_openai with an Ollama or vLLM client that exposes an OpenAI-compatible API. Local models work best when they support function calling or structured output modes.

How does Instructor compare to LangChain output parsers?+

Instructor is focused exclusively on structured extraction with Pydantic validation and automatic retries. LangChain output parsers are part of a larger framework. Instructor is simpler and more reliable for the specific task of getting validated structured data from LLMs.

Sources citées (3)

Instructor GitHub— Instructor provides structured outputs from LLMs using Pydantic
Instructor Documentation— Supports OpenAI, Anthropic, Google, Ollama and more
Pydantic Documentation— Pydantic validation for data models

En lien sur TokRepo

AI coding tools Prompt library Featured workflows

🙏

Source et remerciements

Created by Jason Liu. Licensed under MIT. instructor-ai/instructor — 12,600+ GitHub stars

Fil de discussion

Connectez-vous pour rejoindre la discussion.

Aucun commentaire pour l'instant. Soyez le premier à partager votre avis.

Actifs similaires

Instructor — Structured LLM Outputs with Pydantic

Extract structured data from LLMs using Pydantic models. Works with OpenAI, Anthropic, Gemini, and local models. The simplest way to get reliable JSON from any LLM.

Skills

Pydantic

Instructor — Typed Structured Outputs for LLMs

Instructor turns LLM replies into validated Pydantic models with retries. `pip install instructor`, then extract typed objects across major providers.

Skills

Agent Toolkit

Outlines — Structured Outputs with Any Model

Outlines generates structured outputs (Pydantic types, enums, ints) from LLMs. `pip install outlines`, connect a backend, then request typed results.

Skills

Agent Toolkit

Guardrails — Validate & Secure LLM Outputs

Guardrails is a Python framework for validating LLM inputs/outputs to detect risks and generate structured data. 6.6K+ GitHub stars. Pre-built validators, Pydantic models. Apache 2.0.

Skills

Script Depot