Configs · May 21, 2026 · 3 min read

R2R — Production-Ready Agentic RAG System

A production-ready retrieval-augmented generation (RAG) system with agentic capabilities, a RESTful API, and built-in document processing, vector search, and knowledge graph support.

Agent-ready

This asset can be read and installed directly by agents.

TokRepo exposes a universal CLI command, an install contract, JSON metadata, an adapter-specific plan, and the raw content to help agents assess fit, risk, and next actions.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: Skill
Installation: Single
Trust: Established
Entry point: R2R
Universal CLI command: npx tokrepo install 4d12517d-5530-11f1-9bc6-00163e2b0d79

Introduction

R2R (Reason to Retrieve) is a production-grade retrieval-augmented generation framework built by SciPhi AI. It covers the path from raw documents to an agentic RAG pipeline in a single deployable service, so there is no need to stitch together separate vector databases, embedding services, and LLM orchestration layers.

What R2R Does

  • Ingests documents in 20+ formats (PDF, DOCX, HTML, Markdown, images with OCR) and chunks them automatically
  • Runs hybrid search combining vector similarity and keyword matching with reciprocal rank fusion
  • Builds and queries a knowledge graph alongside the vector index for multi-hop reasoning
  • Exposes a full RESTful API for document management, search, RAG, and agent interactions
  • Supports agentic RAG where the system plans retrieval strategies and iterates on answers
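The hybrid search item above mentions reciprocal rank fusion (RRF). As a minimal sketch of the general technique, not of R2R's internal code, the function below merges two ranked lists of document IDs. The function name, the sample IDs, and the constant k=60 (a value commonly used for RRF) are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a keyword search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why RRF needs no score normalization between the vector and keyword retrievers.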

Architecture Overview

R2R runs as a containerized service with three main subsystems: an ingestion pipeline that parses, chunks, and embeds documents into PostgreSQL with pgvector; a retrieval engine that performs hybrid search and optional graph traversal; and an agentic orchestrator that chains retrieval, reasoning, and tool use. The system uses Hatchet for async task orchestration and exposes all functionality through a FastAPI-based REST interface. A Python SDK and CLI wrap the API for developer convenience.
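To make the orchestrator's role more concrete, here is a minimal sketch of a retrieve, check, and reformulate loop. It is an illustration under stated assumptions, not R2R's implementation: `run_agent`, `AgentState`, and the three callbacks are hypothetical names, and in a real system the sufficiency check and query reformulation would typically be delegated to an LLM rather than the toy lambdas used here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentState:
    question: str
    context: List[str] = field(default_factory=list)
    steps: int = 0


def run_agent(
    question: str,
    retrieve: Callable[[str], List[str]],
    is_sufficient: Callable[[AgentState], bool],
    reformulate: Callable[[AgentState], str],
    max_steps: int = 3,
) -> AgentState:
    """Retrieve, check whether the context is enough, and retry if not."""
    state = AgentState(question=question)
    query = question
    while state.steps < max_steps:
        state.steps += 1
        for passage in retrieve(query):
            if passage not in state.context:  # skip duplicates
                state.context.append(passage)
        if is_sufficient(state):
            break
        query = reformulate(state)
    return state


# Toy stand-ins for the retrieval engine and the LLM's decisions.
corpus = {
    "hatchet": "Hatchet orchestrates async tasks.",
    "pgvector": "pgvector stores embeddings in PostgreSQL.",
}
state = run_agent(
    "How does Hatchet fit in?",
    retrieve=lambda q: [text for key, text in corpus.items() if key in q.lower()],
    is_sufficient=lambda s: len(s.context) >= 2,
    reformulate=lambda s: s.question + " pgvector",
)
print(state.steps, state.context)
```

The `max_steps` bound is the important design choice: it keeps an agent that never judges its context sufficient from looping indefinitely.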

Self-Hosting & Configuration

  • Deploy with Docker Compose for a single-command setup including PostgreSQL, pgvector, and the R2R server
  • Configure LLM and embedding providers via environment variables (supports OpenAI, Anthropic, local models)
  • Customize chunking strategy, overlap, and embedding dimensions in the TOML config
  • Enable the knowledge graph module by setting the graph provider configuration
  • Scale horizontally by adding worker instances behind the task queue
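The chunking options above interact in a simple way: chunk size sets the window length and overlap sets how much consecutive windows share. The sketch below shows fixed-size character chunking with overlap. It is a simplified illustration; `chunk_text`, `chunk_size`, and `overlap` are hypothetical names, and a production chunker would more likely split on tokens or sentence boundaries.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
```

A larger overlap lowers the chance that a sentence is cut off from its context at a chunk boundary, at the cost of storing and embedding more chunks.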

Key Features

  • End-to-end RAG in a single service: ingestion, embedding, search, generation, and agent orchestration
  • Hybrid retrieval with vector search, full-text search, and knowledge graph traversal
  • Multi-tenant architecture with user-level document permissions and access control
  • Agentic RAG mode where the system autonomously decides when and how to retrieve
  • Built-in evaluation endpoints for measuring retrieval and generation quality
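For a sense of what measuring retrieval quality can involve, the sketch below computes two common metrics, recall@k and reciprocal rank, for a single query. These are standard information-retrieval measures chosen for illustration; the source does not say which metrics R2R's evaluation endpoints report, and the function names and document IDs are hypothetical.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(set(retrieved[:k]) & relevant_set) / len(relevant_set)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant_set = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant_set:
            return 1.0 / rank
    return 0.0


# Hypothetical search output and ground-truth labels for one query.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = ["d1", "d2"]

print(recall_at_k(retrieved, relevant, k=3))
print(reciprocal_rank(retrieved, relevant))
```

Averaging the reciprocal rank over many queries gives the mean reciprocal rank (MRR), a convenient single number for comparing retrieval configurations.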

Comparison with Similar Tools

  • LangChain — general-purpose LLM framework requiring assembly; R2R is an integrated, deployable RAG service
  • LlamaIndex — strong indexing library but needs external infrastructure; R2R bundles everything in one container
  • Haystack — modular pipeline framework; R2R trades modularity for faster time-to-production
  • RAGFlow — document-focused RAG engine; R2R adds agentic capabilities and knowledge graph support
  • Verba — Weaviate-based RAG UI; R2R is backend-focused with a full API and more retrieval strategies

FAQ

Q: What database does R2R use? A: PostgreSQL with the pgvector extension. It holds the vector index and, when the knowledge graph is enabled, the graph data as well. Everything runs in the provided Docker Compose stack.

Q: Can I use local models instead of OpenAI? A: Yes. R2R supports any OpenAI-compatible endpoint including Ollama, vLLM, and other local inference servers.

Q: How does the knowledge graph work? A: R2R extracts entities and relationships from ingested documents and stores them in a graph structure. During retrieval, the agent can traverse the graph for multi-hop reasoning alongside vector search.

Q: Is R2R suitable for production workloads? A: Yes. It includes authentication, multi-tenancy, async task processing, and horizontal scaling support.
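The knowledge graph answer above describes multi-hop traversal over extracted entities and relationships. A minimal sketch of that idea, assuming the graph is a list of (subject, relation, object) triples, is a breadth-first walk bounded by a hop limit. The function name and the sample triples are illustrative and do not reflect R2R's storage format.

```python
from collections import deque


def edges_within_hops(triples, start, max_hops=2):
    """Collect every edge reachable from `start` in at most `max_hops` hops."""
    adjacency = {}
    for subject, relation, obj in triples:
        adjacency.setdefault(subject, []).append((relation, obj))

    depth_of = {start: 0}  # entity -> hop distance from start
    queue = deque([start])
    edges = []
    while queue:
        node = queue.popleft()
        if depth_of[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, target in adjacency.get(node, []):
            edges.append((node, relation, target))
            if target not in depth_of:
                depth_of[target] = depth_of[node] + 1
                queue.append(target)
    return edges


# Illustrative triples, as an extraction step might produce them.
triples = [
    ("R2R", "built_by", "SciPhi AI"),
    ("R2R", "stores_vectors_in", "pgvector"),
    ("pgvector", "extends", "PostgreSQL"),
    ("PostgreSQL", "written_in", "C"),
]
hops = edges_within_hops(triples, "R2R", max_hops=2)
print(hops)
```

With a limit of two hops, the walk connects R2R to PostgreSQL through pgvector, a link that a single vector lookup on "R2R" might not surface, while the third-hop edge is left out.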
