# What is DSPy?
DSPy is a framework that replaces hand-written prompts with modular, compilable programs. Instead of tweaking prompt strings, you define what your AI pipeline should do declaratively — then DSPy optimizes the prompts automatically through compilation. It treats LLM calls as optimizable modules, similar to how PyTorch treats neural network layers.
Answer-Ready: DSPy is a framework for programming (not prompting) LLMs. Define AI pipelines declaratively, compile them into optimized prompts automatically. Created at Stanford NLP. Replaces prompt engineering with systematic optimization. 22k+ GitHub stars.
Best for: AI engineers building reliable LLM pipelines. Works with: OpenAI, Anthropic Claude, local models. Setup time: Under 3 minutes.
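Getting started is a single configure call. A minimal sketch, assuming a recent DSPy release; the model identifier is illustrative, so substitute any supported provider:

```python
# pip install dspy   (older releases were published on PyPI as dspy-ai)
import dspy

# Configure a language model once; every DSPy module then uses it.
# The model string below is an example -- swap in your own provider/model.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Any signature now runs against the configured model:
classify = dspy.Predict("sentence -> sentiment")
```

This is configuration rather than a runnable demo (calling the model requires an API key), so treat it as a starting template.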
## Core Concepts
### 1. Signatures (Define I/O)
```python
# Simple signature: inline string form
classify = dspy.Predict("sentence -> sentiment")

# Detailed signature: class form with per-field descriptions
class FactCheck(dspy.Signature):
    claim = dspy.InputField(desc="A factual claim to verify")
    evidence = dspy.OutputField(desc="Supporting or refuting evidence")
    verdict = dspy.OutputField(desc="True, False, or Uncertain")
```

### 2. Modules (Build Pipelines)
```python
class RAGPipeline(dspy.Module):
    def __init__(self):
        super().__init__()  # required when subclassing dspy.Module
        self.retrieve = dspy.Retrieve(k=5)
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        context = self.retrieve(question).passages
        return self.generate(context=context, question=question)
```

### 3. Optimizers (Compile Prompts)
```python
from dspy.teleprompt import BootstrapFewShot

# Provide training examples
trainset = [
    dspy.Example(question="...", answer="...").with_inputs("question"),
]

# Compile: auto-generate optimized prompts
optimizer = BootstrapFewShot(metric=my_metric, max_bootstrapped_demos=4)
compiled_rag = optimizer.compile(RAGPipeline(), trainset=trainset)
```

### 4. Metrics
```python
def my_metric(example, prediction, trace=None):
    return prediction.answer.lower() == example.answer.lower()
```

## Why DSPy over Prompt Engineering?
| Aspect | Prompt Engineering | DSPy |
|---|---|---|
| Approach | Manual string tweaking | Declarative programming |
| Optimization | Trial and error | Automatic compilation |
| Reliability | Fragile | Systematic |
| Modularity | Copy-paste | Composable modules |
| Model switching | Rewrite prompts | Recompile |
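The last row of the table deserves a sketch: switching providers is a configuration change followed by a recompile, not a prompt rewrite. This assumes the `RAGPipeline`, `optimizer`, and `trainset` defined above, with the model string following DSPy's `provider/model` convention:

```python
import dspy

# Point DSPy at a different provider; the program code is unchanged.
dspy.configure(lm=dspy.LM("anthropic/claude-sonnet-4-20250514"))

# Then recompile the same pipeline against the new model:
# compiled_rag = optimizer.compile(RAGPipeline(), trainset=trainset)
```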
## FAQ
Q: Does it work with Claude?
A: Yes. It supports Anthropic Claude via `dspy.LM("anthropic/claude-sonnet-4-20250514")`.

Q: How is it different from LangChain?
A: LangChain chains manually written prompts together. DSPy optimizes prompts automatically through compilation: you define the task, and DSPy figures out the best prompt.

Q: Is it production-ready?
A: Yes. It is used by companies for production RAG, classification, and extraction pipelines.
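One practical note on metrics: the exact-match metric under Core Concepts is strict, and since metrics are plain Python functions, tolerant variants are easy to write. A minimal sketch with stand-in objects (real code would receive `dspy.Example` and `dspy.Prediction` instances):

```python
import string
from types import SimpleNamespace

def normalized_match_metric(example, prediction, trace=None):
    """Exact match that ignores case, surrounding whitespace, and punctuation."""
    def norm(text):
        text = text.lower().strip()
        return text.translate(str.maketrans("", "", string.punctuation))
    return norm(prediction.answer) == norm(example.answer)

# Stand-ins for dspy.Example / dspy.Prediction, for illustration only
ex = SimpleNamespace(answer="Paris")
pred = SimpleNamespace(answer="  paris. ")
print(normalized_match_metric(ex, pred))  # True
```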