DSPy — Programming Foundation Models Declaratively
Replace hand-written prompts with modular programs. DSPy compiles declarative AI pipelines into optimized prompts automatically, boosting reliability and performance.
Review-first install path
This asset needs a review step. The copied prompt tells the agent to dry-run, show the writes, then proceed only after confirmation.
npx -y tokrepo@latest install 023c142e-eba9-40ff-b598-4c0774814726 --target codex
Dry-run first, confirm the writes, then run this command.
What it is
DSPy is a framework for programming (not prompting) language models. Instead of tweaking prompt strings, you define what your AI pipeline should do declaratively using signatures and modules, then DSPy optimizes the prompts automatically through compilation. It treats LLM calls as optimizable modules, similar to how PyTorch treats neural network layers.
DSPy was created at Stanford NLP. It targets AI engineers who build LLM pipelines and want systematic optimization instead of manual prompt engineering.
How it saves time or tokens
Manual prompt engineering is trial and error. DSPy replaces it with automatic compilation that optimizes prompts against an evaluation metric. The optimizer tries different prompt strategies and keeps the one that scores highest on your training or validation examples. This can produce better prompts in less time, and the resulting prompts can be more token-efficient when the optimizer drops unnecessary instructions.
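The selection step described above can be sketched in plain Python. This is a toy illustration, not DSPy's internal algorithm: the functions `terse` and `verbose` are hypothetical stand-ins for programs produced by two prompt strategies, and `exact_match`, `average_score`, and `select_best` are names invented for this sketch.

```python
from typing import Callable

Example = dict[str, str]


def exact_match(example: Example, prediction: str) -> float:
    """Score 1.0 when the prediction equals the gold answer, ignoring case and whitespace."""
    return float(prediction.strip().lower() == example["answer"].strip().lower())


def average_score(program: Callable[[str], str], devset: list[Example], metric) -> float:
    """Run one candidate program over every example and average the metric."""
    if not devset:
        raise ValueError("devset must not be empty")
    return sum(metric(ex, program(ex["question"])) for ex in devset) / len(devset)


def select_best(candidates: dict[str, Callable[[str], str]], devset: list[Example], metric):
    """Return the name and score of the highest-scoring candidate."""
    scored = {name: average_score(prog, devset, metric) for name, prog in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical stand-ins for programs built from two different prompt strategies.
def terse(question: str) -> str:
    return "paris" if "France" in question else "unknown"


def verbose(question: str) -> str:
    return "The answer is Paris."


devset = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What is the capital of Japan?", "answer": "Tokyo"},
]

best_name, best_score = select_best({"terse": terse, "verbose": verbose}, devset, exact_match)
print(best_name, best_score)
```

A real optimizer generates the candidates itself and calls a language model for each one, which is why the size of the example set drives compilation cost.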
How to use
- Install DSPy:
pip install dspy
- Configure a language model:
import dspy
lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)
- Define a signature and module:
qa = dspy.ChainOfThought('question -> answer')
result = qa(question='What is the capital of France?')
print(result.answer)
Example
Building a fact-checking pipeline with compilation:
import dspy
from dspy.teleprompt import BootstrapFewShot

class FactCheck(dspy.Signature):
    claim: str = dspy.InputField(desc='A factual claim to verify')
    evidence: str = dspy.OutputField(desc='Supporting or refuting evidence')
    verdict: str = dspy.OutputField(desc='True, False, or Uncertain')

class FactChecker(dspy.Module):
    def __init__(self):
        super().__init__()
        self.check = dspy.ChainOfThought(FactCheck)

    def forward(self, claim):
        return self.check(claim=claim)

# Compile with training examples.
# your_metric and train_data are placeholders you must define:
# a metric function and a list of labeled examples.
optimizer = BootstrapFewShot(metric=your_metric)
compiled_checker = optimizer.compile(FactChecker(), trainset=train_data)

# Use the optimized pipeline
result = compiled_checker(claim='The Earth is flat')
print(result.verdict)
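The example leaves `your_metric` undefined. A minimal sketch of one possible metric follows, assuming DSPy's usual metric convention of `(example, pred, trace=None)`. The name `verdict_metric` and the normalization rules are assumptions made for illustration, and `SimpleNamespace` objects stand in for DSPy examples and predictions so the snippet runs without an API key.

```python
from types import SimpleNamespace

# The three verdicts named in the FactCheck signature, lowercased.
VALID_VERDICTS = {"true", "false", "uncertain"}


def normalize_verdict(text: str) -> str:
    """Lowercase a verdict and drop surrounding whitespace and a trailing period."""
    return text.strip().rstrip(".").lower()


def verdict_metric(example, pred, trace=None) -> bool:
    """Return True when the predicted verdict is valid and matches the labeled one."""
    predicted = normalize_verdict(pred.verdict)
    return predicted in VALID_VERDICTS and predicted == normalize_verdict(example.verdict)


# Stand-ins for a labeled example and two model predictions (hypothetical data).
gold = SimpleNamespace(claim="The Earth is flat", verdict="False")
good_prediction = SimpleNamespace(verdict=" false. ")
bad_prediction = SimpleNamespace(verdict="Probably not")

print(verdict_metric(gold, good_prediction))
print(verdict_metric(gold, bad_prediction))
```

With DSPy installed, labeled training examples are typically built as `dspy.Example(claim=..., verdict=...).with_inputs('claim')` and passed as `trainset`, with a function like this passed as `metric`.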
Related on TokRepo
- Prompt library — curated prompts and prompt patterns
- AI tools for research — research and analysis tools
Common pitfalls
- Compilation requires a training set with labeled examples; without good examples, the optimizer cannot improve over the default prompt
- DSPy adds abstraction overhead; for simple one-shot tasks, direct prompting is simpler and equally effective
- The optimizer runs multiple LLM calls during compilation, which can be expensive; start with a small training set and a cheaper model
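One way to follow the last point is to compile against a small, reproducible sample before committing to the full training set. The helper below is a sketch in plain Python; `sample_trainset` is a hypothetical name, not part of DSPy, and the right sample size depends on your task.

```python
import random


def sample_trainset(examples, k, seed=0):
    """Return up to k examples, chosen reproducibly so repeated runs are comparable."""
    examples = list(examples)
    if k >= len(examples):
        return examples
    # A dedicated Random instance avoids disturbing the global random state.
    return random.Random(seed).sample(examples, k)


# Placeholder data standing in for labeled training examples.
all_examples = [f"example-{i}" for i in range(200)]
small_trainset = sample_trainset(all_examples, k=20)
print(len(small_trainset))
```

Fixing the seed keeps the sample stable across runs, so score changes reflect the optimizer or model rather than a different subset.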
Frequently Asked Questions
Which language models does DSPy support?
DSPy works with OpenAI (GPT-4, GPT-4o), Anthropic Claude, Google Gemini, and local models via Ollama or vLLM. The dspy.LM class accepts model identifiers in the format provider/model-name. Any model with a chat or completion API is compatible.
What is a Signature?
A Signature defines the input and output fields of an LLM call. It can be a simple string like 'question -> answer' or a class with typed InputField and OutputField attributes. Signatures describe what the module does without specifying how.
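To make the string form concrete, the toy parser below splits a signature string into input and output field names. It is an illustration only, not DSPy's parser: `parse_signature` is a hypothetical helper, and it ignores the type annotations that DSPy also accepts in signature strings.

```python
def parse_signature(spec: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c' into (['a', 'b'], ['c'])."""
    if spec.count("->") != 1:
        raise ValueError("expected exactly one '->' in the signature")
    left, right = spec.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    if not inputs or not outputs:
        raise ValueError("a signature needs at least one input and one output")
    return inputs, outputs


print(parse_signature("question -> answer"))
print(parse_signature("context, question -> answer"))
```

The field names are what a module receives as keyword arguments and exposes as attributes on its result, as in `qa(question=...)` and `result.answer`.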
How does compilation work?
Compilation runs an optimizer (like BootstrapFewShot or MIPRO) that tries different prompt strategies on your training data. It evaluates each strategy using your metric function and selects the one that scores highest. The result is an optimized module with better prompts.
Can DSPy be used for retrieval-augmented generation (RAG)?
Yes. DSPy has built-in retrieval modules that integrate with search engines and vector stores. You can compose retrieval and generation modules into a single optimizable pipeline, and the compiler optimizes both the retrieval query and the generation prompt.
How does DSPy differ from LangChain?
LangChain provides building blocks for LLM applications (chains, tools, memory). DSPy focuses on automatic prompt optimization through compilation. They serve different needs: LangChain for application architecture, DSPy for systematic prompt engineering. They can be used together.
Source & Thanks
Created by Stanford NLP. Licensed under MIT.
stanfordnlp/dspy — 22k+ stars