Core Concepts
Composable Pipelines
Build RAG, search, and agent pipelines by connecting components:
from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore
# Create a RAG pipeline
store = InMemoryDocumentStore()
pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=store))
pipeline.add_component("prompt", PromptBuilder(
    template="Context: {{documents}}\n\nQuestion: {{query}}\nAnswer:"
))
pipeline.add_component("llm", OpenAIGenerator())
pipeline.connect("retriever", "prompt.documents")
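pipeline.connect("prompt", "llm")
# Run the pipeline (OpenAIGenerator reads the OPENAI_API_KEY environment variable)
question = "What is Haystack?"
result = pipeline.run({"retriever": {"query": question}, "prompt": {"query": question}})
print(result["llm"]["replies"][0])
Document Processing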
Convert, clean, and split documents before indexing them (a PDF converter is shown here):
from haystack.components.converters import PyPDFToDocument
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
converter = PyPDFToDocument()
cleaner = DocumentCleaner()
splitter = DocumentSplitter(split_by="sentence", split_length=3)
Multiple Retrieval Strategies
- BM25 (keyword search)
- Dense retrieval (semantic search)
- Hybrid (keyword + semantic)
- Sparse retrieval
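One common way to build the hybrid strategy listed above is to fuse a keyword ranking and a semantic ranking with reciprocal rank fusion (RRF). The sketch below is plain Python, not Haystack's own implementation; the function name, the document IDs, and the two input rankings are hypothetical, and `k=60` is simply a commonly used constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (BM25) retriever and a dense retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_c`) rise above documents that appear in only one, which is the behavior a hybrid setup is usually after.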
30+ Integrations
| Category | Integrations |
|---|---|
| LLMs | OpenAI, Anthropic, Cohere, Ollama |
| Vector DBs | Pinecone, Weaviate, Qdrant, Chroma |
| Search | Elasticsearch, OpenSearch |
| Storage | S3, Azure Blob, Google Cloud |
| Evaluation | RAGAS, DeepEval |
Agent Capabilities
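from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
# search_tool, calculator_tool, and web_tool are placeholders for Tool instances defined elsewhere
agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[search_tool, calculator_tool, web_tool],
)
# The agent takes a list of chat messages rather than a bare string
result = agent.run(messages=[ChatMessage.from_user("Research the latest AI trends and summarize")])
Key Stats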
- 20,000+ GitHub stars
- 30+ integrations
- 500+ contributors
- Production-ready since 2019
- Used by Fortune 500 companies
FAQ
Q: What is Haystack? A: Haystack is an open-source Python framework by deepset for building production RAG pipelines, search systems, and AI agents with composable, pluggable components.
Q: Is Haystack free? A: Yes, it is fully open source under the Apache 2.0 license. deepset also offers managed cloud hosting.
Q: How is Haystack different from LangChain? A: Haystack focuses on production-grade pipelines with strong typing and component validation. LangChain is more flexible but less opinionated. Haystack excels at search and retrieval use cases.
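To make "strong typing and component validation" concrete, here is a toy sketch of the idea: connections are checked against declared input and output types when the pipeline is wired, before anything runs. `Component` and `ToyPipeline` are hypothetical names invented for this illustration and are not Haystack classes; the sketch also assumes components are added in the order they should run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Component:
    """Hypothetical component: a function plus declared input/output types."""
    func: Callable[..., dict]
    inputs: Dict[str, type]
    outputs: Dict[str, type]


@dataclass
class ToyPipeline:
    components: Dict[str, Component] = field(default_factory=dict)
    edges: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def add_component(self, name: str, component: Component) -> None:
        if name in self.components:
            raise ValueError(f"Component '{name}' already exists")
        self.components[name] = component

    def connect(self, sender: str, receiver: str) -> None:
        """Connect 'component.output' to 'component.input', checking types up front."""
        s_name, s_socket = sender.split(".")
        r_name, r_socket = receiver.split(".")
        out_type = self.components[s_name].outputs[s_socket]
        in_type = self.components[r_name].inputs[r_socket]
        if out_type is not in_type:
            raise TypeError(
                f"Cannot connect {sender} ({out_type.__name__}) "
                f"to {receiver} ({in_type.__name__})"
            )
        self.edges.append((s_name, s_socket, r_name, r_socket))

    def run(self, data: Dict[str, dict]) -> Dict[str, dict]:
        """Run components in insertion order, forwarding outputs along edges."""
        inputs = {name: dict(data.get(name, {})) for name in self.components}
        results: Dict[str, dict] = {}
        for name, component in self.components.items():
            results[name] = component.func(**inputs[name])
            for s_name, s_socket, r_name, r_socket in self.edges:
                if s_name == name:
                    inputs[r_name][r_socket] = results[name][s_socket]
        return results


# Stand-in components; the retriever just fabricates one document string.
retriever = Component(
    func=lambda query: {"documents": [f"Doc about {query}"]},
    inputs={"query": str},
    outputs={"documents": list},
)
prompt = Component(
    func=lambda documents, query: {"prompt": f"Context: {documents}\nQuestion: {query}"},
    inputs={"documents": list, "query": str},
    outputs={"prompt": str},
)

pipe = ToyPipeline()
pipe.add_component("retriever", retriever)
pipe.add_component("prompt", prompt)
pipe.connect("retriever.documents", "prompt.documents")

# A list output cannot feed a str input, so this is rejected at wiring time.
mismatch_caught = False
try:
    pipe.connect("retriever.documents", "prompt.query")
except TypeError:
    mismatch_caught = True

results = pipe.run({"retriever": {"query": "RAG"}, "prompt": {"query": "RAG"}})
print(results["prompt"]["prompt"])
```

Catching the bad connection at wiring time rather than mid-run is the practical benefit: a misconfigured pipeline fails before any retrieval or LLM call is made.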