# R2R: Production-Ready Agentic RAG System

> A state-of-the-art production-ready retrieval-augmented generation system with agentic capabilities, a RESTful API, and built-in document processing, vector search, and knowledge graph support.

## Quick Use

```bash
pip install r2r

# Start the R2R server
r2r serve --docker

# Ingest a document
r2r ingest-files --file-paths /path/to/doc.pdf

# Query
r2r rag --query "What does this document cover?"
```

## Introduction

R2R (Reason to Retrieve) is a production-grade retrieval-augmented generation framework built by SciPhi AI. It provides everything needed to go from raw documents to an agentic RAG pipeline with a single deployable service, removing the need to stitch together separate vector databases, embedding services, and LLM orchestration layers.

## What R2R Does

- Ingests documents in 20+ formats (PDF, DOCX, HTML, Markdown, images with OCR) and chunks them automatically
- Runs hybrid search combining vector similarity and keyword matching with reciprocal rank fusion
- Builds and queries a knowledge graph alongside the vector index for multi-hop reasoning
- Exposes a full RESTful API for document management, search, RAG, and agent interactions
- Supports agentic RAG where the system plans retrieval strategies and iterates on answers

## Architecture Overview

R2R runs as a containerized service with three main subsystems:

- an ingestion pipeline that parses, chunks, and embeds documents into PostgreSQL with pgvector;
- a retrieval engine that performs hybrid search and optional graph traversal;
- an agentic orchestrator that chains retrieval, reasoning, and tool use.

The system uses Hatchet for async task orchestration and exposes all functionality through a FastAPI-based REST interface. A Python SDK and CLI wrap the API for developer convenience.
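
The hybrid search described above merges a vector-similarity ranking and a keyword ranking with reciprocal rank fusion. The following is a minimal sketch of the standard RRF formula, not R2R's own implementation: the function name, the sample document IDs, and the constant `k=60` (a commonly used default) are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not an R2R setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists (`doc_b`, `doc_a`) rise above those found by only one retriever, which is why the fusion step needs only rank positions and not comparable raw scores.
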
## Self-Hosting & Configuration

- Deploy with Docker Compose for a single-command setup including PostgreSQL, pgvector, and the R2R server
- Configure LLM and embedding providers via environment variables (supports OpenAI, Anthropic, local models)
- Customize chunking strategy, overlap, and embedding dimensions in the TOML config
- Enable the knowledge graph module by setting the graph provider configuration
- Scale horizontally by adding worker instances behind the task queue

## Key Features

- End-to-end RAG in a single service: ingestion, embedding, search, generation, and agent orchestration
- Hybrid retrieval with vector search, full-text search, and knowledge graph traversal
- Multi-tenant architecture with user-level document permissions and access control
- Agentic RAG mode where the system autonomously decides when and how to retrieve
- Built-in evaluation endpoints for measuring retrieval and generation quality

## Comparison with Similar Tools

- **LangChain**: general-purpose LLM framework requiring assembly; R2R is an integrated, deployable RAG service
- **LlamaIndex**: strong indexing library but needs external infrastructure; R2R bundles everything in one container
- **Haystack**: modular pipeline framework; R2R trades modularity for faster time-to-production
- **RAGFlow**: document-focused RAG engine; R2R adds agentic capabilities and knowledge graph support
- **Verba**: Weaviate-based RAG UI; R2R is backend-focused with a full API and more retrieval strategies

## FAQ

**Q: What database does R2R use?**
A: PostgreSQL with the pgvector extension for vector storage and optional graph storage. Everything runs in the provided Docker Compose stack.

**Q: Can I use local models instead of OpenAI?**
A: Yes. R2R supports any OpenAI-compatible endpoint including Ollama, vLLM, and other local inference servers.

**Q: How does the knowledge graph work?**
A: R2R extracts entities and relationships from ingested documents and stores them in a graph structure.
During retrieval, the agent can traverse the graph for multi-hop reasoning alongside vector search.

**Q: Is R2R suitable for production workloads?**
A: Yes. It includes authentication, multi-tenancy, async task processing, and horizontal scaling support.

## Sources

- https://github.com/SciPhi-AI/R2R
- https://r2r-docs.sciphi.ai

---

Source: https://tokrepo.com/en/workflows/asset-4d12517d

Author: AI Open Source