What is Outlines?
Outlines is a library for structured text generation from LLMs. Instead of hoping the model outputs valid JSON, Outlines guarantees it through guided generation: at each step it restricts token sampling to tokens that keep the output consistent with your schema. It works with open-source models through transformers, llama.cpp, vLLM, or MLX.
Answer-Ready: Outlines guarantees structured LLM outputs through guided generation. It supports JSON schemas, regex patterns, Pydantic models, and grammars, and works with any open-source model. No retries are needed because the output always conforms to the constraint. 10k+ GitHub stars.
Best for: AI engineers needing reliable structured extraction. Works with: HuggingFace transformers, vLLM, llama.cpp, MLX. Setup time: Under 2 minutes.
Core Features
1. JSON Schema Generation
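import outlines
from pydantic import BaseModel

# Load a model once; any supported backend works here
model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")

class Character(BaseModel):
    name: str
    age: int
    weapon: str

generator = outlines.generate.json(model, Character)
character = generator("Create an RPG character named Aria")
# The result is always a valid Character instance

2. Regex-Constrained Generation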
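A regex constraint guarantees the shape of the output, not its meaning, so it can help to check what a pattern accepts before handing it to a generator. The following is a minimal sketch using only the standard library's `re` module and the two patterns from this section; the helper name `matches` is made up for illustration and is not part of Outlines.

```python
import re

EMAIL_PATTERN = r"[a-z]+@[a-z]+\.[a-z]{2,3}"
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# Strings the constraints accept
print(matches(EMAIL_PATTERN, "support@example.com"))  # True
print(matches(DATE_PATTERN, "1991-02-20"))            # True

# Uppercase letters fall outside the email pattern
print(matches(EMAIL_PATTERN, "Support@example.com"))  # False

# The date pattern checks digits and dashes only, not calendar validity
print(matches(DATE_PATTERN, "2024-99-99"))            # True
```

The last case is the one to keep in mind: a constrained generator can only emit strings the pattern accepts, so a pattern that is too loose still admits nonsensical values.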
# Generate an email address that matches a simple lowercase pattern
email_gen = outlines.generate.regex(model, r"[a-z]+@[a-z]+\.[a-z]{2,3}")
email = email_gen("Generate an email for support")

# Generate a date in YYYY-MM-DD format
date_gen = outlines.generate.regex(model, r"\d{4}-\d{2}-\d{2}")
date = date_gen("When was Python created?")

3. Choice (Classification)
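One way to think about choice generation is as a special case of regex-constrained generation, where the pattern is an alternation over the allowed labels. The sketch below illustrates that idea with the standard library only; `choices_to_regex` is a hypothetical helper, and this is not necessarily how Outlines implements `choice` internally.

```python
import re

def choices_to_regex(choices: list[str]) -> str:
    """Build a regex that matches exactly one of the given strings."""
    # re.escape keeps characters such as "+" or "." literal
    return "(" + "|".join(re.escape(c) for c in choices) + ")"

labels = ["positive", "negative", "neutral"]
pattern = choices_to_regex(labels)
print(pattern)  # (positive|negative|neutral)

# Only the listed labels match in full
print(re.fullmatch(pattern, "neutral") is not None)  # True
print(re.fullmatch(pattern, "mixed") is not None)    # False
```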
classifier = outlines.generate.choice(model, ["positive", "negative", "neutral"])
sentiment = classifier("This product is amazing!")
# The result is always one of the three options

4. Grammar-Based Generation
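# Generate SQL constrained by a context-free grammar.
# Assumes this Outlines version bundles an SQL grammar under this name;
# outlines.generate.cfg takes the grammar as a Lark-format string.
sql_grammar = outlines.grammars.sql
sql_gen = outlines.generate.cfg(model, sql_grammar)
query = sql_gen("Write a query to find users older than 25")

5. Multiple Backend Support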
# HuggingFace Transformers
model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")

# vLLM (fast serving)
model = outlines.models.vllm("mistralai/Mistral-7B-v0.1")

# llama.cpp (CPU/Metal)
model = outlines.models.llamacpp("model.gguf")

# MLX (Apple Silicon)
model = outlines.models.mlxlm("mlx-community/Mistral-7B-v0.1-4bit")

Outlines vs Alternatives
| Feature | Outlines | Instructor | LMQL |
|---|---|---|---|
| Guaranteed valid output | Yes | Retry-based | Yes |
| Works offline | Yes | API only | Partial |
| Regex constraints | Yes | No | No |
| Grammar support | Yes | No | Yes |
| Speed overhead | Minimal | Retry cost | Moderate |
FAQ
Q: Does it work with Claude or GPT-4? A: Outlines is designed for open-source models where you control token sampling. For API models, use Instructor (retry-based) instead.
Q: How does guided generation work? A: Outlines builds a finite-state machine from your schema/regex and masks invalid tokens at each generation step, ensuring only valid tokens are sampled.
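The masking step can be illustrated with a toy example. This is a minimal sketch under simplifying assumptions, not Outlines' implementation: the vocabulary is character-level, the "model" is a random scorer (`fake_logits`), the state machine for `\d{4}-\d{2}-\d{2}` is written by hand, and the best allowed token is picked greedily rather than sampled. All names are made up for illustration.

```python
import random
import re

# Character-level toy vocabulary; real tokenizers use multi-character tokens
VOCAB = list("0123456789-abc")

# Hand-written DFA for \d{4}-\d{2}-\d{2}: state i means i characters emitted
DASH_POSITIONS = {4, 7}
FINAL_STATE = 10

def allowed_tokens(state: int) -> set[str]:
    """Tokens that keep the output inside the pattern from this state."""
    if state in DASH_POSITIONS:
        return {"-"}
    return set("0123456789")

def fake_logits(rng: random.Random) -> dict[str, float]:
    """Stand-in for a model forward pass: one random score per token."""
    return {tok: rng.random() for tok in VOCAB}

def guided_generate(seed: int) -> str:
    rng = random.Random(seed)
    state, output = 0, []
    while state != FINAL_STATE:
        logits = fake_logits(rng)
        mask = allowed_tokens(state)
        # Masking: drop every token the DFA does not allow, then pick the best
        valid = {tok: score for tok, score in logits.items() if tok in mask}
        token = max(valid, key=valid.get)
        output.append(token)
        state += 1  # each character advances this simple DFA by one state
    return "".join(output)

date = guided_generate(seed=0)
print(date)
assert re.fullmatch(r"\d{4}-\d{2}-\d{2}", date)
```

Whatever scores the stand-in model produces, letters and misplaced dashes are never selected, so the output matches the pattern on every run.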
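Q: Does it slow down generation? A: Minimal overhead. The FSM is compiled once up front, so each generation step only needs to look up the set of allowed tokens for the current state, which is a constant-time lookup on average. The one-time compilation cost grows with the size of the pattern and the vocabulary.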
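To make the precompilation idea concrete, here is a hedged sketch of how such a lookup table might be built for multi-character tokens. The DFA (for the pattern `[0-9]+\.[0-9]+`), the tiny vocabulary, and the function names are all assumptions chosen for illustration; Outlines' real index is more involved.

```python
DIGITS = "0123456789"
STATES = [0, 1, 2, 3]  # 0 start, 1 integer part, 2 after ".", 3 fraction
VOCAB = ["1", "42", ".", ".5", "3.1", "abc", "1.."]

def step(state, ch):
    """Advance the DFA by one character; None means the character is rejected."""
    if ch in DIGITS:
        return {0: 1, 1: 1, 2: 3, 3: 3}[state]
    if ch == "." and state == 1:
        return 2
    return None

def walk(state, token):
    """Feed a whole token through the DFA; None if any character is rejected."""
    for ch in token:
        state = step(state, ch)
        if state is None:
            return None
    return state

def build_index(states, vocab):
    """Precompute, once, which tokens each state allows and where they lead."""
    index = {}
    for s in states:
        index[s] = {}
        for tok in vocab:
            nxt = walk(s, tok)
            if nxt is not None:
                index[s][tok] = nxt
    return index

INDEX = build_index(STATES, VOCAB)
print(INDEX[0])  # {'1': 1, '42': 1, '3.1': 3}
print(INDEX[2])  # {'1': 3, '42': 3}
```

Building the table costs one pass over every state and token pair, but during generation the allowed set for the current state is a single dictionary lookup such as `INDEX[state]`.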