# R2R — Production-Ready Agentic RAG System

> A state-of-the-art production-ready retrieval-augmented generation system with agentic capabilities, a RESTful API, and built-in document processing, vector search, and knowledge graph support.

## Install

Save in your project root:

# R2R — Production-Ready Agentic RAG System

## Quick Use
```bash
pip install r2r
# Start the R2R server
r2r serve --docker
# Ingest a document
r2r ingest-files --file-paths /path/to/doc.pdf
# Query
r2r rag --query "What does this document cover?"
```

## Introduction
R2R (Reason to Retrieve) is a production-grade retrieval-augmented generation framework built by SciPhi AI. It provides everything needed to go from raw documents to an agentic RAG pipeline with a single deployable service, removing the need to stitch together separate vector databases, embedding services, and LLM orchestration layers.

## What R2R Does
- Ingests documents in 20+ formats (PDF, DOCX, HTML, Markdown, images with OCR) and chunks them automatically
- Runs hybrid search combining vector similarity and keyword matching with reciprocal rank fusion
- Builds and queries a knowledge graph alongside the vector index for multi-hop reasoning
- Exposes a full RESTful API for document management, search, RAG, and agent interactions
- Supports agentic RAG where the system plans retrieval strategies and iterates on answers

## Architecture Overview
R2R runs as a containerized service with three main subsystems: an ingestion pipeline that parses, chunks, and embeds documents into PostgreSQL with pgvector; a retrieval engine that performs hybrid search and optional graph traversal; and an agentic orchestrator that chains retrieval, reasoning, and tool use. The system uses Hatchet for async task orchestration and exposes all functionality through a FastAPI-based REST interface. A Python SDK and CLI wrap the API for developer convenience.

## Self-Hosting & Configuration
- Deploy with Docker Compose for a single-command setup including PostgreSQL, pgvector, and the R2R server
- Configure LLM and embedding providers via environment variables (supports OpenAI, Anthropic, local models)
- Customize chunking strategy, overlap, and embedding dimensions in the TOML config
- Enable the knowledge graph module by setting the graph provider configuration
- Scale horizontally by adding worker instances behind the task queue

## Key Features
- End-to-end RAG in a single service: ingestion, embedding, search, generation, and agent orchestration
- Hybrid retrieval with vector search, full-text search, and knowledge graph traversal
- Multi-tenant architecture with user-level document permissions and access control
- Agentic RAG mode where the system autonomously decides when and how to retrieve
- Built-in evaluation endpoints for measuring retrieval and generation quality

## Comparison with Similar Tools
- **LangChain** — general-purpose LLM framework requiring assembly; R2R is an integrated, deployable RAG service
- **LlamaIndex** — strong indexing library but needs external infrastructure; R2R bundles everything in one container
- **Haystack** — modular pipeline framework; R2R trades modularity for faster time-to-production
- **RAGFlow** — document-focused RAG engine; R2R adds agentic capabilities and knowledge graph support
- **Verba** — Weaviate-based RAG UI; R2R is backend-focused with a full API and more retrieval strategies

## FAQ
**Q: What database does R2R use?**
A: PostgreSQL with the pgvector extension for vector storage and optional graph storage. Everything runs in the provided Docker Compose stack.

**Q: Can I use local models instead of OpenAI?**
A: Yes. R2R supports any OpenAI-compatible endpoint including Ollama, vLLM, and other local inference servers.

**Q: How does the knowledge graph work?**
A: R2R extracts entities and relationships from ingested documents and stores them in a graph structure. During retrieval, the agent can traverse the graph for multi-hop reasoning alongside vector search.

**Q: Is R2R suitable for production workloads?**
A: Yes. It includes authentication, multi-tenancy, async task processing, and horizontal scaling support.

## Sources
- https://github.com/SciPhi-AI/R2R
- https://r2r-docs.sciphi.ai

---
Source: https://tokrepo.com/en/workflows/asset-4d12517d
Author: AI Open Source