Configs · May 21, 2026 · 3 min read

R2R — Production-Ready Agentic RAG System

A state-of-the-art production-ready retrieval-augmented generation system with agentic capabilities, a RESTful API, and built-in document processing, vector search, and knowledge graph support.

Agent-ready

This asset can be read and installed directly by agents

TokRepo exposes a universal CLI command, an install contract, JSON metadata, a per-adapter plan, and raw content so that agents can evaluate compatibility, risk, and next steps.

Native · 98/100 · Policy: allow
Agent surface: Any MCP/CLI agent
Type: Skill
Installation: Single
Trust: Established
Entry: R2R
Universal CLI command:
npx tokrepo install 4d12517d-5530-11f1-9bc6-00163e2b0d79

Introduction

R2R (Reason to Retrieve) is a production-grade retrieval-augmented generation framework built by SciPhi AI. It provides everything needed to go from raw documents to an agentic RAG pipeline with a single deployable service, removing the need to stitch together separate vector databases, embedding services, and LLM orchestration layers.

What R2R Does

  • Ingests documents in 20+ formats (PDF, DOCX, HTML, Markdown, images with OCR) and chunks them automatically
  • Runs hybrid search combining vector similarity and keyword matching with reciprocal rank fusion
  • Builds and queries a knowledge graph alongside the vector index for multi-hop reasoning
  • Exposes a full RESTful API for document management, search, RAG, and agent interactions
  • Supports agentic RAG where the system plans retrieval strategies and iterates on answers
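Reciprocal rank fusion, mentioned in the hybrid search bullet above, can be illustrated with a minimal sketch. This is a generic version of the technique, not R2R's internal code: the function name, the document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    and the per-list scores are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector search and a keyword search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_b`) rise above documents that appear in only one. Because the method uses ranks rather than raw scores, it needs no score normalization between the vector and keyword retrievers.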

Architecture Overview

R2R runs as a containerized service with three main subsystems: an ingestion pipeline that parses, chunks, and embeds documents into PostgreSQL with pgvector; a retrieval engine that performs hybrid search and optional graph traversal; and an agentic orchestrator that chains retrieval, reasoning, and tool use. The system uses Hatchet for async task orchestration and exposes all functionality through a FastAPI-based REST interface. A Python SDK and CLI wrap the API for developer convenience.
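The orchestrator's retrieve, reason, and iterate cycle can be pictured with a toy loop. This is a sketch under stated assumptions: every function name here (`agentic_answer`, `retrieve`, `draft`, `is_sufficient`, `refine_query`) is hypothetical, and simple string-matching stubs stand in for the search backend and the LLM calls. It does not reproduce R2R's actual orchestrator.

```python
def agentic_answer(question, retrieve, draft, is_sufficient, refine_query, max_rounds=3):
    """Alternate retrieval and drafting until the answer looks sufficient."""
    query = question
    context = []
    answer = ""
    for _ in range(max_rounds):
        # Gather new passages for the current query, skipping duplicates.
        for passage in retrieve(query):
            if passage not in context:
                context.append(passage)
        answer = draft(question, context)
        if is_sufficient(answer, context):
            break
        # Otherwise reformulate the query and try another retrieval round.
        query = refine_query(question, answer)
    return answer, context


# Toy stand-ins for the search backend and the LLM (illustrative only).
corpus = {
    "database": "R2R stores embeddings in PostgreSQL with pgvector.",
    "queue": "R2R uses Hatchet for async task orchestration.",
}


def retrieve(query):
    return [text for key, text in corpus.items() if key in query]


def draft(question, context):
    return " ".join(context) or "unknown"


def is_sufficient(answer, context):
    return "Hatchet" in answer


def refine_query(question, answer):
    return question + " queue"


answer, context = agentic_answer(
    "which database", retrieve, draft, is_sufficient, refine_query
)
print(answer)
```

The `max_rounds` cap is the important design choice: it bounds cost and latency when the sufficiency check never passes.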

Self-Hosting & Configuration

  • Deploy with Docker Compose for a single-command setup including PostgreSQL, pgvector, and the R2R server
  • Configure LLM and embedding providers via environment variables (supports OpenAI, Anthropic, local models)
  • Customize chunking strategy, overlap, and embedding dimensions in the TOML config
  • Enable the knowledge graph module by setting the graph provider configuration
  • Scale horizontally by adding worker instances behind the task queue
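To make the chunk size and overlap settings concrete, here is a minimal character-based sliding-window chunker. It is an illustrative sketch only: the function name and parameters are hypothetical, and R2R's own chunking strategies may split on tokens or document structure rather than raw characters.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks where neighbors share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(sample_chunks)
```

A larger overlap reduces the chance that a sentence relevant to a query is split across a chunk boundary, at the cost of storing and embedding more redundant text.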

Key Features

  • End-to-end RAG in a single service: ingestion, embedding, search, generation, and agent orchestration
  • Hybrid retrieval with vector search, full-text search, and knowledge graph traversal
  • Multi-tenant architecture with user-level document permissions and access control
  • Agentic RAG mode where the system autonomously decides when and how to retrieve
  • Built-in evaluation endpoints for measuring retrieval and generation quality
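As a rough idea of what measuring retrieval quality involves, the sketch below computes two standard metrics, recall@k and reciprocal rank, from a ranked result list and a set of known-relevant document IDs. The function names and sample IDs are hypothetical. This shows the metrics themselves and does not describe what R2R's evaluation endpoints return.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical ranked results and ground-truth labels for one query.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

print(recall_at_k(retrieved, relevant, k=2))
print(reciprocal_rank(retrieved, relevant))
```

Averaging the reciprocal rank over a set of labeled queries gives mean reciprocal rank (MRR), a common single-number summary for comparing retrieval configurations.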

Comparison with Similar Tools

  • LangChain — general-purpose LLM framework requiring assembly; R2R is an integrated, deployable RAG service
  • LlamaIndex — strong indexing library but needs external infrastructure; R2R bundles the server, PostgreSQL, and pgvector in one Docker Compose stack
  • Haystack — modular pipeline framework; R2R trades modularity for faster time-to-production
  • RAGFlow — document-focused RAG engine; R2R adds agentic capabilities and knowledge graph support
  • Verba — Weaviate-based RAG UI; R2R is backend-focused with a full API and more retrieval strategies

FAQ

Q: What database does R2R use? A: PostgreSQL with the pgvector extension for vector storage and optional graph storage. Everything runs in the provided Docker Compose stack.

Q: Can I use local models instead of OpenAI? A: Yes. R2R supports any OpenAI-compatible endpoint including Ollama, vLLM, and other local inference servers.

Q: How does the knowledge graph work? A: R2R extracts entities and relationships from ingested documents and stores them in a graph structure. During retrieval, the agent can traverse the graph for multi-hop reasoning alongside vector search.
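Multi-hop traversal over extracted entities and relationships can be sketched as a breadth-first walk over (subject, relation, object) triples. The triples, relation names, and function names below are made up for illustration and do not reflect R2R's storage schema or extraction output.

```python
from collections import defaultdict, deque


def build_graph(triples):
    """Index (subject, relation, object) triples by subject."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def multi_hop(graph, start, max_hops=2):
    """Collect every fact reachable from `start` within `max_hops` edges."""
    seen = {start}
    queue = deque([(start, 0)])
    facts = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # hop budget exhausted; do not expand this node
        for relation, neighbor in graph.get(node, []):
            facts.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts


# Hypothetical triples of the kind an extraction step might produce.
triples = [
    ("R2R", "built_by", "SciPhi AI"),
    ("R2R", "stores_data_in", "PostgreSQL"),
    ("PostgreSQL", "extended_by", "pgvector"),
    ("pgvector", "provides", "vector search"),
]
graph = build_graph(triples)

for fact in multi_hop(graph, "R2R", max_hops=2):
    print(fact)
```

With `max_hops=2`, the walk links R2R to pgvector through PostgreSQL, a connection that a single vector lookup on "R2R" alone might miss.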

Q: Is R2R suitable for production workloads? A: Yes. It includes authentication, multi-tenancy, async task processing, and horizontal scaling support.
